This work proposes Deep Aligned Clustering, a method that discovers new intents with the aid of limited known-intent data; it uses a few labeled known-intent samples as prior knowledge to pre-train the model.
This work finds that when the data is projected into a feature space whose dimensionality equals the target number of clusters, the rows and columns of the resulting feature matrix correspond to the instance and cluster representations, respectively.
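A small illustration of the row/column reading described above, as a sketch rather than the paper's method: the toy `features` matrix, the softmax normalization, and the helper names are assumptions made for this example.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def instance_views(features):
    """Row view: each row becomes one instance's distribution over K clusters."""
    return [softmax(row) for row in features]

def cluster_views(features):
    """Column view: each column describes one cluster across all N instances."""
    return [list(col) for col in zip(*features)]

def hard_assignments(features):
    """Assign each instance to the cluster with the largest row entry."""
    return [max(range(len(row)), key=row.__getitem__) for row in features]

# Toy N x K feature matrix (N = 4 instances, K = 2 clusters); values are made up.
features = [
    [2.0, 0.1],
    [1.5, 0.3],
    [0.2, 1.9],
    [0.0, 2.2],
]

rows = instance_views(features)      # 4 vectors of length 2
cols = cluster_views(features)       # 2 vectors of length 4
labels = hard_assignments(features)
```

Reading the same matrix by rows or by columns is what lets instance-level and cluster-level objectives share one projection; how each view is trained is not specified in the summary above.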
This work proposes Supporting Clustering with Contrastive Learning (SCCL), a novel framework that leverages contrastive learning to promote better separation in distance-based clustering, and demonstrates that SCCL combines the strengths of bottom-up instance discrimination and top-down clustering to achieve better intra-cluster and inter-cluster distances.
This work proposes a flexible Self-Taught Convolutional neural network framework for Short Text Clustering that can incorporate more useful semantic features and learn non-biased deep text representations in an unsupervised manner.
This work proposes a method that learns discriminative features from both an autoencoder and a sentence embedding, then uses assignments from a clustering algorithm as supervision to update the weights of the encoder network.
This work proposes constrained deep adaptive clustering with cluster refinement (CDAC+), an end-to-end clustering method that can naturally incorporate pairwise constraints as prior knowledge to guide the clustering process.
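To make the idea of pairwise constraints as prior knowledge concrete, here is a generic must-link / cannot-link loss. It is an assumption-laden sketch, not the CDAC+ objective: the cosine similarity, the binary cross-entropy form, the clamping, and the function names are all choices made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_constraint_loss(embeddings, must_link, cannot_link, eps=1e-7):
    """Mean binary cross-entropy between pair similarity and constraint label.

    must_link pairs have target 1 (should be similar);
    cannot_link pairs have target 0 (should be dissimilar).
    """
    terms = []
    for pairs, target in ((must_link, 1.0), (cannot_link, 0.0)):
        for i, j in pairs:
            s = cosine(embeddings[i], embeddings[j])
            # Clamp into (0, 1) so the logarithms stay finite; this also maps
            # negative similarities to eps.
            s = min(max(s, eps), 1.0 - eps)
            terms.append(-(target * math.log(s) + (1.0 - target) * math.log(1.0 - s)))
    return sum(terms) / len(terms) if terms else 0.0

# Toy embeddings: items 0 and 1 point the same way, item 2 is orthogonal to item 0.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

consistent = pairwise_constraint_loss(embeddings, must_link=[(0, 1)], cannot_link=[(0, 2)])
violated = pairwise_constraint_loss(embeddings, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

Constraints that agree with the embedding geometry give a loss near zero, while violated constraints are penalized heavily, which is the signal a constrained clustering model can train on.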
The proposed clustering enhancement method not only improves the clustering quality of different baseline clustering methods but also outperforms the state-of-the-art short text clustering method on several short text datasets by a statistically significant margin.
This paper presents an intent discovery framework that can mine a vast amount of conversational logs and generate labeled data sets for training intent models, and introduces ITER-DBSCAN, a density-based clustering algorithm that extends DBSCAN to unbalanced data.
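For reference, the baseline that ITER-DBSCAN extends can be sketched as follows. This is plain DBSCAN only; the summary above does not describe the iterative extension, so it is not attempted here. The toy points and parameter values are assumptions for illustration.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one label per point, NOISE (-1) for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

# Two tight groups and one isolated point (toy 2-D data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

With a single global `eps` and `min_pts`, sparse clusters are easily labeled as noise, which is one reason unbalanced data is hard for the plain algorithm.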
This work proposes an efficient indexing structure that improves the scalability of Spherical k-Means with respect to k, and exploits the sparsity of the input vectors and the convergence behavior of k-Means to significantly reduce the number of comparisons in each iteration.
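The baseline being accelerated can be sketched as below. This is a minimal Spherical k-Means over sparse vectors stored as dictionaries, and it compares every document with every centroid in each iteration. The paper's indexing structure is not reproduced, and the function names and toy documents are assumptions for illustration.

```python
import math

def normalize(vec):
    """Scale a sparse vector (dict: term -> weight) to unit L2 norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else dict(vec)

def sparse_dot(a, b):
    """Dot product that iterates only over the shorter sparse vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def spherical_kmeans(docs, init_centroids, max_iter=20):
    """Cluster unit-normalized sparse vectors by cosine similarity."""
    docs = [normalize(d) for d in docs]
    centroids = [normalize(c) for c in init_centroids]
    labels = [-1] * len(docs)
    for _ in range(max_iter):
        # Assignment step: k comparisons per document (the cost the paper targets).
        new_labels = [
            max(range(len(centroids)), key=lambda j: sparse_dot(d, centroids[j]))
            for d in docs
        ]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: sum the members of each cluster, then re-normalize.
        for j in range(len(centroids)):
            total = {}
            for d, lab in zip(docs, labels):
                if lab == j:
                    for t, w in d.items():
                        total[t] = total.get(t, 0.0) + w
            if total:
                centroids[j] = normalize(total)
    return labels, centroids

docs = [
    {"cat": 1.0, "pet": 1.0},
    {"cat": 2.0},
    {"stock": 1.0, "market": 1.0},
    {"market": 3.0},
]
labels, centroids = spherical_kmeans(docs, init_centroids=[docs[0], docs[2]])
```

Because all vectors have unit length, maximizing the dot product is equivalent to maximizing cosine similarity, and the sparse dictionaries keep each comparison proportional to the number of non-zero terms.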