Description: | A project for clustering text streams using locality-sensitive hashing (LSH) in Python. |
---|---|
Author: | Krishna Y. Kamath |
For a demo of how to use this class, take a look at the streamingLSHClusteringDemo() method in the Python module demo/StreamingLSHClusteringDemo.py.
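To give a rough idea of what clustering a text stream with LSH can look like, here is a minimal, self-contained sketch. It is not this project's code or API: the names `signature`, `hamming` and `StreamingLSHSketch`, and the parameters `num_bits`, `band_size` and `max_distance`, are hypothetical and chosen only for illustration. It assumes random-hyperplane bit signatures in the spirit of Charikar (2002), with banding to find candidate clusters for each incoming document.

```python
import hashlib
from collections import Counter, defaultdict


def _sign(token, plane):
    # Pseudo-random +1/-1 component of hyperplane `plane` along dimension `token`.
    digest = hashlib.md5(f"{plane}:{token}".encode("utf-8")).digest()
    return 1 if digest[0] & 1 else -1


def signature(text, num_bits=32):
    # One bit per hyperplane: the sign of the dot product between the
    # document's term-frequency vector and that hyperplane.
    counts = Counter(text.lower().split())
    bits = []
    for plane in range(num_bits):
        dot = sum(freq * _sign(token, plane) for token, freq in counts.items())
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def hamming(a, b):
    # Number of positions at which two equal-length bit strings differ.
    return sum(x != y for x, y in zip(a, b))


class StreamingLSHSketch:
    """Hypothetical illustration only, not the class shipped with this project."""

    def __init__(self, num_bits=32, band_size=8, max_distance=8):
        self.num_bits = num_bits
        self.band_size = band_size
        self.max_distance = max_distance
        self.cluster_signatures = []        # cluster id -> signature of its first document
        self.band_index = defaultdict(set)  # (band offset, band bits) -> cluster ids

    def _bands(self, sig):
        # Split the signature into fixed-size bands used as hash-table keys.
        for start in range(0, self.num_bits, self.band_size):
            yield start, sig[start:start + self.band_size]

    def add(self, text):
        # Assign one incoming document to a cluster and return the cluster id.
        sig = signature(text, self.num_bits)
        candidates = set()
        for band in self._bands(sig):
            candidates |= self.band_index.get(band, set())
        best = min(
            candidates,
            key=lambda c: hamming(sig, self.cluster_signatures[c]),
            default=None,
        )
        if best is not None and hamming(sig, self.cluster_signatures[best]) <= self.max_distance:
            return best
        # No close enough candidate: start a new cluster and index its bands.
        cluster_id = len(self.cluster_signatures)
        self.cluster_signatures.append(sig)
        for band in self._bands(sig):
            self.band_index[band].add(cluster_id)
        return cluster_id
```

In this sketch, documents are fed one at a time through `add()`, and a document is only compared against clusters that share at least one signature band with it, so the cost per document does not grow with the total number of clusters in the common case. A real implementation would typically also update cluster representatives over time and expire stale clusters, which this sketch leaves out.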
Applications currently using Streaming LSH:
Note: If you want your application listed here, contact the author.
References:
- Aristides Gionis, Piotr Indyk, and Rajeev Motwani. 1999. Similarity Search in High Dimensions via Hashing. In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB '99), Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, and Michael L. Brodie (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 518-529.
- Moses S. Charikar. 2002. Similarity estimation techniques from rounding algorithms. In Proceedings of the thirty-fourth annual ACM symposium on Theory of computing (STOC '02). ACM, New York, NY, USA, 380-388. DOI=10.1145/509907.509965
- Deepak Ravichandran, Patrick Pantel, and Eduard Hovy. 2005. Randomized algorithms and NLP: using locality sensitive hash function for high speed noun clustering. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics (ACL '05). Association for Computational Linguistics, Stroudsburg, PA, USA, 622-629. DOI=10.3115/1219840.1219917
- F. Morchen, K. Brinker, and C. Neubauer. 2007. Any-time clustering of high frequency news streams. In the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: Data Mining Case Studies Workshop (DMCS), August 2007.
- Benjamin Van Durme and Ashwin Lall. 2010. Online generation of locality sensitive hash signatures. In Proceedings of the ACL 2010 Conference Short Papers (ACLShort '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 231-235.