Spatio-Temporal Video Grounding | State-of-the-Art