Adam Optimizer
Introduction

Adam is one of the most popular optimization methods in deep learning. It is an extension of Stochastic Gradient Descent (SGD) that computes an individual adaptive learning rate for each parameter, using running estimates of the first and second moments of the past gradients.
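To make the update concrete, here is a minimal NumPy sketch of a single Adam step. The function name `adam_update`, the toy quadratic example, and the default hyperparameters (learning rate 1e-3, beta1 0.9, beta2 0.999, eps 1e-8, which are the commonly cited defaults) are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def adam_update(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step for a single parameter array.

    m and v are running estimates of the first and second moments of the
    gradient; t is the 1-based step count used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.0
x = np.array([5.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 2001):
    grad = 2 * x                    # gradient of x^2
    x, m, v = adam_update(x, grad, m, v, t)
print(x)                            # approaches 0 after enough steps
```

The per-parameter division by the square root of the second-moment estimate is what makes the effective learning rate adaptive: parameters with consistently large gradients take smaller steps, and vice versa.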

