A journey into Optimization algorithms for Deep Neural Networks
An overview of the most popular optimization algorithms for training deep neural networks, from stochastic gradient descent to Adam, AdaBelief, and second-order optimization.
How can we efficiently train very deep neural network architectures? What are the best in-layer normalization options? We gathered all you need to know about normalization in transformers, recurrent neural nets, and convolutional neural networks.