3260 papers • 126 benchmarks • 313 datasets
These leaderboards track progress in semantic-retrieval-9.
This work proposes a simple yet effective pipeline system that gives special consideration to hierarchical semantic retrieval at both the paragraph and sentence levels, and to its potential effects on the downstream task, and illustrates that intermediate semantic retrieval modules are vital for shaping the upstream data distribution and providing better data for downstream modeling.
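The two-stage idea above can be sketched with a toy lexical scorer standing in for the paper's learned retrieval modules; the `overlap_score` function and all names here are illustrative assumptions, not the authors' method:

```python
# Hypothetical sketch of a two-stage hierarchical retrieval pipeline:
# stage 1 ranks paragraphs, stage 2 ranks sentences within the survivors.
# Token overlap stands in for learned semantic retrieval modules.

def overlap_score(query, text):
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / (len(q) or 1)

def hierarchical_retrieve(query, paragraphs, top_paras=2, top_sents=2):
    # Stage 1: paragraph-level retrieval shapes the candidate pool
    ranked = sorted(paragraphs, key=lambda p: overlap_score(query, p),
                    reverse=True)
    candidates = ranked[:top_paras]
    # Stage 2: sentence-level retrieval over the surviving paragraphs
    sentences = [s.strip() for p in candidates
                 for s in p.split(".") if s.strip()]
    sentences.sort(key=lambda s: overlap_score(query, s), reverse=True)
    return sentences[:top_sents]
```

Restricting stage 2 to the top-ranked paragraphs is what lets the intermediate module shape the data seen downstream.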
Experimental results show that MedCPT sets new state-of-the-art performance on six biomedical IR tasks, outperforming various baselines, including much larger models such as the GPT-3-sized cpt-text-XL.
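Dense biomedical retrievers of this kind are typically trained with an in-batch contrastive objective; the InfoNCE-style loss below is a generic sketch assumed for illustration, not taken from the MedCPT paper:

```python
import numpy as np

# Generic in-batch contrastive (InfoNCE-style) loss for dense retrieval.
# q, d are placeholder (batch, dim) query / document embeddings; the i-th
# document is treated as the positive for the i-th query, all others in
# the batch as negatives.

def in_batch_contrastive_loss(q, d, temperature=0.05):
    sims = q @ d.T / temperature
    sims -= sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    # Cross-entropy against the diagonal (matching) pairs
    return float(-np.mean(np.diag(log_probs)))
```

Lower temperatures sharpen the softmax over in-batch negatives, which is why retrieval models are sensitive to this hyperparameter.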
A segmental QbE approach where variable-duration speech segments (queries, search utterances) are mapped to fixed-dimensional embedding vectors; it is shown that a QbE system using an embedding function trained on visually grounded speech data outperforms a purely acoustic QbE system in terms of both exact and semantic retrieval performance.
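Once variable-length segments have been mapped to fixed-dimensional vectors, QbE search reduces to nearest-neighbour ranking; a minimal sketch, assuming cosine similarity and placeholder embeddings rather than a real acoustic embedding network:

```python
import numpy as np

# Sketch of segmental QbE search: query and search-utterance segments are
# assumed to already be fixed-dimensional embeddings; retrieval is then
# nearest-neighbour ranking under cosine similarity.

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def qbe_search(query_emb, segment_embs, top_k=1):
    scores = [cosine(query_emb, e) for e in segment_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]  # indices of the best-matching segments
```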
This work uses a multilingual knowledge distillation approach to train BERT models to produce sentence embeddings for Ancient Greek text, evaluates the models on translation search, semantic similarity, and semantic retrieval tasks, and investigates translation bias.
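The multilingual knowledge-distillation objective can be sketched as matching the student's embeddings of a sentence and of its translation to the teacher's embedding of the source; the vectors below are illustrative placeholders, not real model outputs:

```python
import numpy as np

# Generic multilingual distillation loss: the student should embed both
# the source sentence and its translation close (in MSE) to the teacher's
# embedding of the source, so translations land near each other.

def distillation_loss(teacher_emb, student_src_emb, student_tgt_emb):
    return float(np.mean((teacher_emb - student_src_emb) ** 2)
                 + np.mean((teacher_emb - student_tgt_emb) ** 2))
```

Because both terms pull toward the same teacher vector, the student's cross-lingual embedding space aligns as a side effect of the regression.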
It is shown that state-of-the-art pretrained encoders fail to provide satisfactory results on the proposed task, while language-model-based solutions perform better, especially when unsupervised fine-tuning is applied.
The current landscape of first-stage retrieval models is described under a unified framework to clarify the connections between classical term-based retrieval methods, early semantic retrieval methods, and neural semantic retrieval methods.
This work proposes an unsupervised deep hashing layer called Bi-Half Net that maximizes the entropy of the binary codes, and designs a new parameter-free network layer that explicitly forces continuous image features to approximate the optimal half-half bit distribution.
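The half-half bit distribution can be illustrated with per-dimension median thresholding; the real Bi-Half layer enforces this during training, so the snippet below is only a simple stand-in for the idea:

```python
import numpy as np

# Sketch of the half-half bit idea: binarize each feature dimension
# against its per-dimension median, so half the codes get +1 and half
# get -1 in every bit, which maximizes the per-bit entropy.

def half_half_binarize(features):
    # features: (n_samples, n_bits) continuous values
    medians = np.median(features, axis=0)
    return np.where(features > medians, 1, -1)
```

A balanced bit carries one full bit of information, whereas a bit that is almost always +1 distinguishes almost nothing.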
This work formalizes the alignment problem in terms of an audiovisual alignment tensor that is based on earlier VGS work, introduces systematic metrics for evaluating model performance in aligning visual objects and spoken words, and proposes a new VGS model variant for the alignment task utilizing a cross-modal attention layer.
This paper proposes Homomorphic Projective Distillation to learn compressed sentence embeddings, augmenting a small Transformer encoder model with learnable projection layers that produce compact representations while mimicking a large pre-trained language model, so as to retain sentence representation quality.
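The projection component can be sketched as learning a linear map from a small encoder's features onto compact target embeddings; least squares stands in here for the gradient-trained projection layers of the paper, and all data is illustrative:

```python
import numpy as np

# Sketch: fit a linear projection W so that student features X, mapped
# through W, approximate compact target embeddings Y. Least squares is a
# stand-in for training a projection layer by gradient descent.

def fit_projection(student_feats, teacher_targets):
    W, *_ = np.linalg.lstsq(student_feats, teacher_targets, rcond=None)
    return W  # (feat_dim, target_dim) projection matrix
```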
This work models semantics by means of hidden random variables and defines the semantic communication task as the data-reduced and reliable transmission of messages over a communication channel such that semantics are best preserved; it treats this task as an end-to-end Information Bottleneck problem, enabling compression while preserving relevant information.
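The Information Bottleneck objective referred to above can be stated compactly in its standard form (generic notation, assumed rather than taken from this paper):

```latex
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)
```

Here $Z$ is the compressed representation of the source $X$, $Y$ is the semantically relevant variable, and $\beta$ trades compression (small $I(X;Z)$) against preserved relevance (large $I(Z;Y)$).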