
Jan. 18, 2016
TF-IDF Word Cloud
Python
scikit-learn, data visualization

Term frequency–inverse document frequency (TF-IDF) word clouds. This project combines the TfidfVectorizer from scikit-learn with Andreas Mueller's excellent word cloud generator.

Compare a document with a corpus of documents, extracting the keywords that set it apart from the rest. Generate an immediate word cloud summary of that document's "uniqueness". Useful in many scenarios, e.g. comparing a user's text against all text within a chat room, or a single chapter against all other chapters within a book.
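A minimal sketch of the idea, not the project's actual code. It assumes the target document is fitted together with the rest of the corpus, and that its TF-IDF weights are passed straight to the word cloud generator as frequencies. The helper names `unique_keywords` and `render_cloud`, the `top_n` cutoff, the English stop-word list, and the output path are all hypothetical choices made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer


def unique_keywords(target, others, top_n=50):
    """Return {term: tf-idf weight} for the highest-weighted terms of `target`.

    Hypothetical helper: `target` is one document (a string) and `others`
    is the rest of the corpus (an iterable of strings).
    """
    corpus = [target] + list(others)
    # Assumption: drop common English stop words before weighting.
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(corpus)

    # Row 0 holds the weights for the target document.
    row = matrix[0].toarray().ravel()
    weights = {term: float(row[idx]) for term, idx in vectorizer.vocabulary_.items()}

    # Keep only terms that occur in the target, highest weight first.
    ranked = sorted(weights.items(), key=lambda pair: pair[1], reverse=True)
    return {term: w for term, w in ranked[:top_n] if w > 0}


def render_cloud(weights, path="cloud.png"):
    """Draw the weights as a word cloud image and save it to `path`."""
    # Imported here so the keyword step works without the wordcloud package.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(weights)
    cloud.to_file(path)
    return path
```

For the book scenario, something like `render_cloud(unique_keywords(chapter_14, other_chapters))` would produce one image per chapter, where `chapter_14` and `other_chapters` stand in for text you have already loaded. Terms that appear in every document are not removed outright; they just receive a lower weight and so render smaller.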

Integrated with my Python Powered mIRC: clicking a user's name in chat generates a word cloud for that user.

Example

Chapter 14

Chapter 15