Word Discovery in Visually Grounded, Self-Supervised Speech Models