Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors - Citation Graph | Papersgraph