Molecular description generation (also called molecule captioning) is the task of producing a textual description of a molecule's structure, properties, biological activity, and applications from a molecular descriptor such as a SMILES string or molecular graph. It gives chemists and biologists quick access to essential molecular information, efficiently guiding their research and experiments.
(Image credit: Papersgraph)
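As a concrete illustration of the task's input/output format, here is a minimal sketch in Python. The SMILES string and the ChEBI-style caption are illustrative examples, and `format_training_pair` is a hypothetical helper reflecting the text-to-text framing used by models such as MolT5, not code from any specific repository:

```python
# Hypothetical (SMILES, caption) record for the molecule-captioning task.
example = {
    "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",  # aspirin
    "caption": (
        "The molecule is a member of the class of benzoic acids obtained by "
        "acetylation of the hydroxy group of salicylic acid. It has a role "
        "as a non-steroidal anti-inflammatory drug."
    ),
}

def format_training_pair(record: dict) -> tuple[str, str]:
    """Turn a (SMILES, caption) record into a seq2seq training pair:
    the source is a natural-language instruction plus the descriptor,
    the target is the reference caption."""
    source = f"Describe the following molecule: {record['smiles']}"
    target = record["caption"]
    return source, target

src, tgt = format_training_pair(example)
```

A model trained on such pairs maps the source string to the target caption; at inference time only the source side is given.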
These leaderboards are used to track progress in Molecule Captioning
Use these libraries to find Molecule Captioning models and implementations
No subtasks available.
Proposes a molecular multimodal foundation model pretrained on molecular graphs and their semantically related textual data via contrastive learning; the model enhances molecular property prediction and can generate meaningful molecular graphs from natural-language descriptions.
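The graph-text contrastive objective behind such pretraining can be sketched as a symmetric InfoNCE loss over a batch of paired embeddings, where matching (graph, text) pairs sit on the diagonal of the similarity matrix. This is a generic NumPy illustration under assumed conventions (the function name and temperature are not from the paper):

```python
import numpy as np

def info_nce_loss(graph_emb: np.ndarray, text_emb: np.ndarray,
                  tau: float = 0.07) -> float:
    """Symmetric InfoNCE loss for paired (graph, text) embeddings.
    Row i of each array embeds the same molecule, so the i-th diagonal
    entry of the similarity matrix is the positive pair."""
    g = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = g @ t.T / tau              # temperature-scaled cosine similarities
    labels = np.arange(len(g))

    def cross_entropy(lg: np.ndarray) -> float:
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the graph-to-text and text-to-graph directions.
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

Minimizing this loss pulls each molecule's graph embedding toward its own text embedding and away from the other texts in the batch.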
Introduces MolFM, a multimodal molecular foundation model for joint representation learning from molecular structures, biomedical texts, and knowledge graphs, with a theoretical analysis showing that MolFM captures local and global molecular knowledge by minimizing the feature-space distance between different modalities of the same molecule.
Proposes the first multi-domain, multi-task language model that can solve a wide range of tasks in both the chemical and natural-language domains, suggesting that such models can robustly and efficiently accelerate discovery in the physical sciences by superseding problem-specific fine-tuning and enhancing human-model interaction.
Shows that MolT5-based models can generate outputs, both molecules and captions, that are in many cases high quality, and evaluates molecule captioning and text-based molecule generation with several metrics, including a new cross-modal embedding-based metric.
Proposes a novel LLM-based framework (MolReGPT) for molecule-caption translation, introducing an In-Context Few-Shot Molecule Learning paradigm that empowers LLMs such as ChatGPT to perform molecule discovery through in-context learning, without domain-specific pre-training or fine-tuning.
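The in-context few-shot paradigm amounts to assembling retrieved (molecule, caption) example pairs into a prompt for a general-purpose LLM. The template below is a hypothetical sketch of that idea, not MolReGPT's actual prompt or retrieval logic:

```python
def build_caption_prompt(query_smiles: str,
                         examples: list[tuple[str, str]]) -> str:
    """Assemble an in-context few-shot prompt for molecule captioning.
    `examples` holds retrieved (SMILES, caption) pairs that demonstrate
    the mapping; the query molecule comes last with an open-ended
    'Description:' slot for the LLM to complete."""
    blocks = ["You are an expert chemist. Describe each molecule."]
    for smiles, caption in examples:
        blocks.append(f"Molecule: {smiles}\nDescription: {caption}")
    blocks.append(f"Molecule: {query_smiles}\nDescription:")
    return "\n\n".join(blocks)

prompt = build_caption_prompt(
    "C1=CC=CC=C1",
    [("CCO", "The molecule is ethanol, a primary alcohol.")],
)
```

The retrieval step (choosing which example pairs to include, e.g. by molecular similarity to the query) is what makes the few-shot context informative; here the examples are simply passed in.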
Introduces GIT-Mol, a multimodal large language model that integrates graph, image, and text information, along with GIT-Former, a novel architecture capable of aligning all modalities into a unified latent space.
First introduces a retrieval-based prompting strategy to construct high-quality pseudo data, then explores how to leverage this pseudo data most effectively, addressing the low-resource challenge with artificially-real data generated by large language models (LLMs).
Proposes a comprehensive pre-training framework that enriches cross-modal integration in biology with chemical knowledge and natural-language associations, and that distinguishes between structured and unstructured knowledge for more effective use of information.
MolCA retains the LM's ability for open-ended text generation while augmenting it with 2D graph information; extensive benchmarks on molecule captioning, IUPAC name prediction, and molecule-text retrieval show that MolCA significantly outperforms the baselines.
InstructMol, a multi-modal LLM, effectively aligns molecular structures with natural language via an instruction-tuning approach, utilizing a two-stage training strategy that adeptly combines limited domain-specific data with molecular and textual information.