This work presents a dimensionality reduction network (MMINet) training procedure based on the stochastic estimate of the mutual information gradient and introduces emerging information theoretic feature transformation protocols as an end-to-end neural network training approach.