From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification