Deep multi-metric learning is applied to text-independent speaker verification. Three losses are introduced for this problem, namely triplet loss, n-pair loss, and angular loss, which work cooperatively to train a feature extraction network equipped with residual connections and squeeze-and-excitation attention.
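To make the idea of cooperating losses concrete, the sketch below computes the three losses on toy speaker embeddings and adds them together. It uses the commonly cited textbook forms of each loss, which may differ in detail from the formulation used here. The function names, the margin of 0.2, the 45-degree angle bound, and the equal weights are all illustrative assumptions; the summary above does not state how the losses are combined.

```python
import math

def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def dot(u, v):
    """Inner product of two embedding vectors."""
    return sum(x * y for x, y in zip(u, v))

def triplet_loss(a, p, n, margin=0.2):
    # Hinge on distances: the positive should be closer to the anchor
    # than the negative by at least `margin` (assumed value).
    return max(0.0, sq_dist(a, p) - sq_dist(a, n) + margin)

def n_pair_loss(a, p, negatives):
    # One positive against several negatives at once:
    # log(1 + sum_i exp(a.n_i - a.p)).
    return math.log1p(sum(math.exp(dot(a, n) - dot(a, p)) for n in negatives))

def angular_loss(a, p, n, alpha_deg=45.0):
    # Bounds the angle at the negative in the triangle formed with the
    # anchor-positive pair; c is the anchor-positive midpoint.
    c = [(x + y) / 2 for x, y in zip(a, p)]
    tan_sq = math.tan(math.radians(alpha_deg)) ** 2
    return max(0.0, sq_dist(a, p) - 4 * tan_sq * sq_dist(n, c))

def multi_metric_loss(a, p, negatives, weights=(1.0, 1.0, 1.0)):
    # Hypothetical combination: a weighted sum with equal weights.
    # Triplet and angular terms use the first negative only (assumption).
    w_t, w_n, w_a = weights
    n = negatives[0]
    return (w_t * triplet_loss(a, p, n)
            + w_n * n_pair_loss(a, p, negatives)
            + w_a * angular_loss(a, p, n))

if __name__ == "__main__":
    anchor, positive = [1.0, 0.0], [0.9, 0.1]   # same speaker (toy values)
    negatives = [[-1.0, 0.0], [0.0, -1.0]]      # other speakers (toy values)
    print(multi_metric_loss(anchor, positive, negatives))
```

In a real training setup the embeddings would come from the feature extraction network and the combined loss would be backpropagated through it; here the vectors are hand-picked only to show how the three terms are evaluated on the same anchor, positive, and negatives.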