Sparsifying Transformer Models with Trainable Representation Pooling - Citation Graph | Papersgraph