Adam is like RMSProp with momentum: it keeps an exponential moving average of the gradient `m` (the momentum-like term) alongside the exponential moving average of squared gradients `v` that RMSProp uses. The (simplified) update [[Stanford CS231n]](https://cs231n.github.io/neural-networks-3/#ada) looks as follows:
```
# m: smoothed gradient (momentum-like first moment)
m = beta1*m + (1-beta1)*dx
# v: smoothed squared gradient (RMSProp-like second moment)
v = beta2*v + (1-beta2)*(dx**2)
# per-parameter step, scaled down by the root of the second moment
x += - learning_rate * m / (np.sqrt(v) + eps)
```
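The simplified form above omits Adam's bias correction, which compensates for `m` and `v` being initialized at zero and therefore biased toward zero during the first steps. Below is a minimal, self-contained sketch of one full update; the function name `adam_step` is chosen here for illustration, while the hyperparameter defaults (`beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are the values recommended in the Adam paper:

```
import numpy as np

def adam_step(x, dx, m, v, t, learning_rate=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8):
    # Illustrative sketch of one Adam step with bias correction (t starts at 1).
    m = beta1 * m + (1 - beta1) * dx          # first moment (smoothed gradient)
    v = beta2 * v + (1 - beta2) * (dx ** 2)   # second moment (smoothed squared gradient)
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    x = x - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v
```

The correction only matters early in training: as `t` grows, the factors `1 - beta**t` approach 1 and the corrected and simplified forms coincide.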