Machine learning and stochastic gradient descent

I briefly describe gradient descent (GD) and the form it takes in machine learning: the stochastic gradient descent algorithm (SGD). The idea is simple: divide your data into randomly chosen mini-batches and use one mini-batch at a time to estimate the gradient of your cost function. Use that estimate to do GD iterations at fixed …
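
To make the mini-batch idea concrete, here is a minimal sketch in plain Python. Everything specific in it is an assumption chosen for illustration and not taken from the text: the model (a straight line y = w*x + b), the cost (mean squared error), the toy data, the function name sgd_linear_fit, and the parameters step, batch_size, epochs and seed. In particular, the sketch keeps the step size constant across iterations, which is only one possible choice.

```python
import random


def sgd_linear_fit(xs, ys, step=0.1, batch_size=4, epochs=200, seed=0):
    """Fit y ~ w*x + b by mini-batch SGD on the mean squared error.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Reshuffle so that each pass uses freshly chosen random mini-batches.
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Estimate the gradient of the cost from this mini-batch only.
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                residual = w * xs[i] + b - ys[i]
                grad_w += 2.0 * residual * xs[i]
                grad_b += 2.0 * residual
            grad_w /= len(batch)
            grad_b /= len(batch)
            # Ordinary GD update, driven by the estimated gradient.
            w -= step * grad_w
            b -= step * grad_b
    return w, b


# Toy, noise-free data on the line y = 3x + 1 (assumed for the example).
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)
```

Because the toy data are noise-free, every mini-batch gradient vanishes at the true parameters, so the iterates settle near w = 3 and b = 1. With noisy data and a constant step, the iterates would instead keep fluctuating around the minimizer.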