These leaderboards are used to track progress in Text-Based De Novo Molecule Generation.
Use these libraries to find Text-Based De Novo Molecule Generation models and implementations.
This study introduces MolFM, a multimodal molecular foundation model designed for joint representation learning over molecular structures, biomedical texts, and knowledge graphs, and provides a theoretical analysis showing that MolFM captures both local and global molecular knowledge by minimizing the distance in feature space between different modalities of the same molecule.
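To illustrate the cross-modal alignment idea described above, the following is a minimal sketch of an InfoNCE-style contrastive loss that pulls embeddings of the same molecule's structure and text description together in a shared space. The encoder outputs, embedding dimension, and temperature are illustrative assumptions, not MolFM's actual implementation.

```python
# Minimal sketch of cross-modal contrastive alignment (illustrative, not MolFM's code).
# struct_emb and text_emb stand in for the outputs of a structure encoder and a text
# encoder applied to the same batch of molecules.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(struct_emb, text_emb, temperature=0.07):
    """InfoNCE-style loss over a batch of paired (structure, text) embeddings."""
    struct_emb = F.normalize(struct_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = struct_emb @ text_emb.t() / temperature   # pairwise cosine similarities
    targets = torch.arange(struct_emb.size(0))         # i-th structure pairs with i-th text
    # Symmetric loss: structure-to-text and text-to-structure
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Random embeddings standing in for real encoder outputs
loss = contrastive_alignment_loss(torch.randn(8, 256), torch.randn(8, 256))
```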
This work considers several metrics, including a new cross-modal embedding-based metric, to evaluate the tasks of molecule captioning and text-based molecule generation; the results show that MolT5-based models are able to generate outputs, both molecules and captions, that are in many cases of high quality.
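For intuition about what an embedding-based metric does, here is a minimal sketch that scores a generation by the cosine similarity between the input caption's embedding and the generated molecule's embedding in a shared space. The encoders that would produce these vectors are placeholders and are not shown; this is not the metric's actual implementation.

```python
# Hedged sketch of an embedding-based cross-modal metric. The embeddings are assumed
# to come from pretrained text and molecule encoders that map into a shared space.
import numpy as np

def embedding_similarity(text_vec: np.ndarray, mol_vec: np.ndarray) -> float:
    """Cosine similarity between a caption embedding and a molecule embedding."""
    return float(np.dot(text_vec, mol_vec) /
                 (np.linalg.norm(text_vec) * np.linalg.norm(mol_vec) + 1e-8))

def score_generations(text_embs, mol_embs) -> float:
    """Average similarity over a batch of (caption, generated molecule) pairs."""
    return float(np.mean([embedding_similarity(t, m) for t, m in zip(text_embs, mol_embs)]))
```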
This work proposes the first multi-domain, multi-task language model that can solve a wide range of tasks in both the chemical and natural language domains, and suggests that such models can robustly and efficiently accelerate discovery in the physical sciences by superseding problem-specific fine-tuning and enhancing human-model interactions.
This work proposes a novel LLM-based framework (MolReGPT) for molecule-caption translation, in which an In-Context Few-Shot Molecule Learning paradigm is introduced to empower molecule discovery with LLMs like ChatGPT, exploiting their in-context learning capability without domain-specific pre-training or fine-tuning.
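The following is a minimal sketch of in-context few-shot prompting for text-based molecule generation in the spirit of the framework above; the string-similarity retrieval heuristic, prompt wording, and example pairs are illustrative assumptions rather than the paper's exact setup.

```python
# Illustrative sketch of few-shot prompt construction for text-to-SMILES generation.
# The retrieval heuristic and prompt template are simplifying assumptions.
from difflib import SequenceMatcher

def retrieve_examples(query_caption, corpus, k=3):
    """Pick the k reference pairs whose captions are most similar to the query."""
    return sorted(corpus,
                  key=lambda ex: SequenceMatcher(None, query_caption, ex["caption"]).ratio(),
                  reverse=True)[:k]

def build_prompt(query_caption, examples):
    """Assemble a few-shot prompt that a general-purpose LLM completes with a SMILES string."""
    shots = "\n\n".join(f"Description: {ex['caption']}\nSMILES: {ex['smiles']}" for ex in examples)
    return (f"Generate a SMILES string for the molecule described.\n\n"
            f"{shots}\n\nDescription: {query_caption}\nSMILES:")

corpus = [{"caption": "a simple aromatic hydrocarbon", "smiles": "c1ccccc1"},
          {"caption": "ethanol, a two-carbon primary alcohol", "smiles": "CCO"}]
query = "a three-carbon primary alcohol"
prompt = build_prompt(query, retrieve_examples(query, corpus, k=2))
```

The assembled prompt would then be sent to a general-purpose LLM; no domain-specific pre-training or fine-tuning is involved, which is the point of the in-context approach.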
GIT-Mol, a multi-modal large language model that integrates graph, image, and text information, is introduced, together with GIT-Former, a novel architecture capable of aligning all modalities into a unified latent space.
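To make the unified-latent-space idea concrete, the sketch below projects graph, image, and text embeddings into a common dimension and fuses them; the dimensions, projection heads, and averaging fusion are illustrative assumptions, not GIT-Former's actual architecture.

```python
# Hedged sketch of projecting several modality embeddings into one shared latent space.
import torch
import torch.nn as nn

class SharedSpaceProjector(nn.Module):
    def __init__(self, graph_dim=300, image_dim=512, text_dim=768, latent_dim=256):
        super().__init__()
        self.graph_proj = nn.Linear(graph_dim, latent_dim)
        self.image_proj = nn.Linear(image_dim, latent_dim)
        self.text_proj = nn.Linear(text_dim, latent_dim)

    def forward(self, graph_emb, image_emb, text_emb):
        # Map each modality into the same latent dimension, then fuse naively by averaging.
        zs = [self.graph_proj(graph_emb), self.image_proj(image_emb), self.text_proj(text_emb)]
        return torch.stack(zs).mean(dim=0)

# Random tensors standing in for per-modality encoder outputs
fused = SharedSpaceProjector()(torch.randn(4, 300), torch.randn(4, 512), torch.randn(4, 768))
```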
This work presents a comprehensive pre-training framework that enriches cross-modal integration in biology with chemical knowledge and natural language associations, and that distinguishes between structured and unstructured knowledge, leading to more effective utilization of information.
It is demonstrated that TGM-DLM outperforms MolT5-Base, an autoregressive model, without the need for additional data resources; these results underscore the remarkable effectiveness of TGM-DLM in generating coherent and precise molecules with specific properties, opening new avenues in drug discovery and related scientific domains.
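As a rough illustration of iterative denoising for text-guided generation, the sketch below starts from noise and repeatedly refines a sequence of continuous token embeddings conditioned on a text description; the stand-in denoiser and linear noise schedule are deliberate simplifications and do not reproduce TGM-DLM's actual model or sampling procedure.

```python
# Simplified sketch of diffusion-style iterative denoising for text-guided generation.
# The denoiser is a placeholder; in practice it would be a transformer attending to the
# text condition, and the refined embeddings would be decoded into SMILES tokens.
import torch
import torch.nn as nn

class DummyDenoiser(nn.Module):
    """Stand-in for the actual text-conditioned denoising network."""
    def __init__(self, dim=128):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, noisy_emb, text_cond, t):
        return self.proj(noisy_emb)  # pretend to predict clean embeddings

@torch.no_grad()
def sample(denoiser, text_cond, seq_len=64, dim=128, steps=50):
    x = torch.randn(1, seq_len, dim)                          # start from pure noise
    for t in reversed(range(1, steps + 1)):
        x0_pred = denoiser(x, text_cond, t)                   # predict clean embeddings given the text
        noise_level = (t - 1) / steps                         # linearly shrinking noise (simplified)
        x = x0_pred + noise_level * torch.randn_like(x0_pred) # re-noise and refine again
    return x0_pred                                            # decode to SMILES tokens downstream

embeddings = sample(DummyDenoiser(), text_cond=None)
```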
BioT5+ is introduced as an extension of the BioT5 framework, tailored to enhance biological research and drug discovery; it stands out for its ability to capture intricate relationships in biological data, thereby contributing significantly to bioinformatics and computational biology.