ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy