Self-Supervised Video Representation Learning by Uncovering Spatio-Temporal Statistics - Citation Graph | Papersgraph