3260 papers • 126 benchmarks • 313 datasets
These leaderboards are used to track progress in network-interpretation-4
This work first generates question-guided, informative image captions, then passes the captions to a pretrained language model (PLM) as context for question answering, achieving state-of-the-art zero-shot results on VQAv2 and GQA.
It is claimed that the stability of a neural network interpretation method under the authors' adversarial model manipulation is an important criterion to check when developing robust and reliable interpretation methods.
It is demonstrated that training networks to have interpretable gradients improves their robustness to adversarial perturbations, especially under cross-norm attacks and heavy perturbation.
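One common way to encourage interpretable gradients is to penalize the norm of the loss gradient with respect to the input during training. Below is a minimal, hypothetical sketch for logistic regression, where the input gradient has a closed form; it is a toy stand-in, not the cited paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_with_input_gradient_penalty(w, x, y, lam=0.1):
    """Cross-entropy loss plus a penalty on the input gradient.

    For logistic regression p = sigmoid(w @ x), the gradient of the
    cross-entropy loss with respect to the input x is (p - y) * w,
    so the penalty term can be written in closed form.
    """
    p = sigmoid(w @ x)
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    input_grad = (p - y) * w          # d(loss)/dx in closed form
    return ce + lam * np.sum(input_grad ** 2)

w = np.array([0.5, -1.0])
x = np.array([1.0, 2.0])
base = loss_with_input_gradient_penalty(w, x, y=1.0, lam=0.0)
reg = loss_with_input_gradient_penalty(w, x, y=1.0, lam=0.1)
assert reg >= base  # the penalty only adds nonnegative mass
```

In deep networks the same idea is usually realized with double backpropagation, differentiating through the input gradient itself.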
This paper shows theoretically that, under a proper measure of interpretation, it is difficult to prevent prediction-evasion adversarial attacks from also causing interpretation discrepancy, and it develops an interpretability-aware defense built solely on promoting robust interpretation, without resorting to adversarial loss minimization.
This paper presents a framework that preserves attributions while compressing a network by employing a Weighted Collapsed Attribution Matching regularizer, and demonstrates the algorithm's effectiveness both quantitatively and qualitatively across diverse compression methods.
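The core idea of attribution matching is to penalize the distance between the attributions of the original network and its compressed counterpart. A minimal, hypothetical sketch using gradient-times-input attributions for linear scorers is shown below; the paper's Weighted Collapsed Attribution Matching regularizer is more involved.

```python
import numpy as np

def grad_times_input(w, x):
    """Gradient-x-input attribution for a linear scorer f(x) = w @ x."""
    return w * x

def attribution_matching_penalty(w_teacher, w_student, x):
    """Squared distance between the original (teacher) model's
    attributions and the compressed (student) model's attributions.
    Adding this term to the compression objective discourages the
    compressed model from changing its explanations."""
    a_t = grad_times_input(w_teacher, x)
    a_s = grad_times_input(w_student, x)
    return np.sum((a_t - a_s) ** 2)

x = np.array([1.0, -2.0, 0.5])
w_t = np.array([0.3, 0.1, -0.4])
assert attribution_matching_penalty(w_t, w_t.copy(), x) == 0.0
```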
This work formulates a normalization-invariant, cosine-distance-based criterion and derives its upper bound, giving insight into why simply minimizing the Hessian norm at the input, as done in previous work, is not sufficient for attaining robust feature attribution.
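Cosine distance is a natural choice here because it ignores vector magnitude: rescaling (normalizing) either attribution map leaves the criterion unchanged. A minimal sketch of such a criterion, as a hypothetical helper rather than the authors' code:

```python
import numpy as np

def cosine_distance(a, b, eps=1e-12):
    """Cosine distance between two flattened attribution maps.

    Because cosine similarity depends only on direction, this
    criterion is invariant to rescaling either map -- the
    normalization-invariance property referred to above."""
    a, b = a.ravel(), b.ravel()
    sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return 1.0 - sim

a = np.array([1.0, 2.0, 3.0])
assert abs(cosine_distance(a, 5.0 * a)) < 1e-6  # scale-invariant
```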