Calculate a numerical score for the semantic similarity between two words.
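A minimal sketch of the task interface, assuming pretrained word vectors are already available (the `vectors` dict below is a hypothetical stand-in):

```python
import numpy as np

def word_similarity(w1: str, w2: str, vectors: dict) -> float:
    """Score semantic similarity of two words as the cosine of their vectors."""
    v1, v2 = vectors[w1], vectors[w2]
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))

# Toy usage with made-up 3-d vectors; real systems load pretrained embeddings.
vectors = {"cat": np.array([0.9, 0.1, 0.0]), "dog": np.array([0.8, 0.2, 0.1])}
print(word_similarity("cat", "dog", vectors))  # close to 1.0 for related words
```

Systems are typically scored by the correlation (e.g. Spearman) between these predicted scores and human similarity judgments.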
A new approach based on the skip-gram model in which each word is represented as a bag of character n-grams and the word vector is computed as the sum of these n-gram representations; it achieves state-of-the-art performance on word similarity and analogy tasks.
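A minimal sketch of the subword composition step, assuming the n-gram vectors have already been trained with a skip-gram objective (the `ngram_vecs` lookup is hypothetical; fastText additionally includes the whole word itself as one of the units):

```python
import numpy as np

def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list:
    """Character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def word_vector(word: str, ngram_vecs: dict, dim: int = 100) -> np.ndarray:
    """Represent a word as the sum of its character n-gram vectors."""
    vec = np.zeros(dim)
    for g in char_ngrams(word):
        vec += ngram_vecs.get(g, np.zeros(dim))  # unseen n-grams contribute nothing
    return vec
```

Because the vector is composed from subword units, the model can produce embeddings even for words never seen during training.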
This paper demonstrates a counter-intuitive postprocessing technique -- eliminating the common mean vector and a few top dominating directions from the word vectors -- that renders off-the-shelf representations even stronger.
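The technique itself is simple enough to sketch directly (the function name is ours; `d` is a small constant, on the order of `dim/100` per the paper):

```python
import numpy as np

def all_but_the_top(X: np.ndarray, d: int = 3) -> np.ndarray:
    """Postprocess word vectors X (vocab_size x dim): remove the common mean
    vector, then project out the top-d dominating directions."""
    X = X - X.mean(axis=0)                       # eliminate the common mean
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    top = Vt[:d]                                 # top-d principal directions
    return X - (X @ top.T) @ top                 # remove them from every vector
```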
It is proposed that the evaluation of word representations should focus on data efficiency and simple supervised tasks, where the amount of available data is varied and the scores of a supervised model are reported for each subset (as is commonly done in transfer learning).
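A minimal sketch of that protocol, assuming embedding-derived features `X` and task labels `y` (a simple logistic-regression probe stands in for the supervised model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def data_efficiency_curve(X: np.ndarray, y: np.ndarray,
                          sizes=(100, 300, 1000, 3000)) -> dict:
    """Score a simple supervised probe at several training-data budgets."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    scores = {}
    for n in sizes:
        n = min(n, len(X_tr))
        clf = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
        scores[n] = clf.score(X_te, y_te)  # accuracy at this data budget
    return scores
```

A representation that reaches high scores from small subsets is the data-efficient one under this protocol.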
The proposed method follows an edge-based approach over a lexical database and gives the highest correlation values for both word and sentence similarity, outperforming other similar models.
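The paper's exact measure is not reproduced here, but a readily available edge-based baseline in the same spirit is WordNet path similarity via NLTK (requires `nltk.download("wordnet")` first):

```python
from nltk.corpus import wordnet as wn

def path_sim(w1: str, w2: str) -> float:
    """Best path-based similarity over all synset pairs of the two words."""
    scores = [s1.path_similarity(s2)
              for s1 in wn.synsets(w1) for s2 in wn.synsets(w2)]
    return max((s for s in scores if s is not None), default=0.0)

print(path_sim("car", "automobile"))  # 1.0: the words share a synset
```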
The proposed Speech2Vec model, a novel deep neural network architecture for learning fixed-length vector representations of audio segments excised from a speech corpus, is based on an RNN Encoder-Decoder framework and borrows the methodology of skip-grams or continuous bag-of-words for training.
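A minimal PyTorch sketch of the architecture's shape, not the paper's exact configuration (feature and embedding sizes below are illustrative):

```python
import torch
import torch.nn as nn

class Speech2VecSketch(nn.Module):
    """RNN encoder-decoder: the encoder compresses a variable-length audio
    segment (acoustic feature frames) into one fixed-length vector, and the
    decoder reconstructs a *neighboring* segment from it, mirroring skip-gram."""
    def __init__(self, feat_dim: int = 13, emb_dim: int = 50):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, emb_dim, batch_first=True)
        self.decoder = nn.GRU(feat_dim, emb_dim, batch_first=True)
        self.out = nn.Linear(emb_dim, feat_dim)

    def forward(self, segment, neighbor):
        _, h = self.encoder(segment)             # fixed-length segment code
        dec_out, _ = self.decoder(neighbor, h)   # teacher-forced reconstruction
        return self.out(dec_out), h.squeeze(0)   # predicted frames, embedding

# Training would minimize e.g. MSE between predicted and true neighbor frames.
```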
This work proposes a fully unsupervised framework for learning multilingual word embeddings (MWEs) that directly exploits the relations between all language pairs and substantially outperforms previous approaches in experiments on multilingual word translation and cross-lingual word similarity.
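The full framework is more involved than a snippet can show; a common building block in cross-lingual embedding work (not this paper's specific method) is an orthogonal Procrustes map between two embedding spaces, fit on paired vectors:

```python
import numpy as np

def procrustes(X_src: np.ndarray, Y_tgt: np.ndarray) -> np.ndarray:
    """Orthogonal matrix W minimizing ||X_src @ W - Y_tgt||_F over paired rows."""
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt  # apply to the whole source space as: X_src @ W
```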
SemGloVe is proposed, which distills semantic co-occurrences from BERT into static GloVe word embeddings and can define the co-occurrence weights by directly considering the semantic distance between word pairs.
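A heavily hedged sketch of the reweighting idea only; `bert_vec` is a hypothetical lookup of BERT-derived word vectors, and the exponential form is illustrative rather than the paper's formula:

```python
import numpy as np

def semantic_weight(w1: str, w2: str, bert_vec: dict, temperature: float = 1.0) -> float:
    """Co-occurrence weight from semantic proximity rather than window counts."""
    v1, v2 = bert_vec[w1], bert_vec[w2]
    cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.exp(cos / temperature))  # semantically closer pairs weigh more
```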
This paper argues that word embedding can be naturally viewed as a ranking problem due to the ranking nature of the evaluation metrics, and proposes a novel framework, WordRank, that efficiently estimates word representations via robust ranking, in which an attention mechanism and robustness to noise are readily achieved via DCG-like ranking losses.
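A toy illustration of a DCG-like ranking loss, not WordRank's exact objective: the rank of the true context word is discounted logarithmically, so mistakes near the top of the ranking dominate:

```python
import numpy as np

def dcg_like_loss(scores: np.ndarray, true_idx: int) -> float:
    """Penalize the rank of the true context among all candidate contexts."""
    rank = 1 + np.sum(scores > scores[true_idx])  # 1-based rank of true context
    return float(np.log2(1 + rank))               # small when ranked near the top
```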
The results show that a model that controls dependencies between the word being defined and the definition words performs significantly better, and that a character-level convolution layer that leverages morphology can complement word-level embeddings.
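A minimal PyTorch sketch of combining the two representations (all sizes illustrative):

```python
import torch
import torch.nn as nn

class CharWordEmbedding(nn.Module):
    """Concatenate a word-level embedding with a character-level convolution
    feature; the char-CNN captures morphology (prefixes, suffixes) that the
    word-level table misses."""
    def __init__(self, vocab=10000, n_chars=100, word_dim=300, char_dim=25, filters=50):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, word_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, filters, kernel_size=3, padding=1)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch,); char_ids: (batch, max_word_len)
        w = self.word_emb(word_ids)                     # (batch, word_dim)
        c = self.char_emb(char_ids).transpose(1, 2)     # (batch, char_dim, len)
        c = torch.relu(self.conv(c)).max(dim=2).values  # max-pool over positions
        return torch.cat([w, c], dim=-1)                # (batch, word_dim + filters)
```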