Last Updated on November 20, 2022 by David Vause
Machine learning algorithms typically need numerical inputs to build their models. The challenge arises when the features of the model are textual. In this case, vectorizers are used to “vectorize” or turn the input text into numerical vectors. One such vectorizer is the Term Frequency Inverse Document Frequency (TF-IDF) Vectorizer.
I had trouble understanding how it worked until I built one and used it on data so simple that I could see what was going on in the data.
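To make "turning text into numerical vectors" concrete, here is a minimal from-scratch sketch in plain Python. It is an illustration, not the implementation used in this post. It assumes whitespace tokenization, lowercasing, and the textbook weighting tf × log(N / df); the function name `tfidf_vectors` and the three sample sentences are made up for this example. Library vectorizers such as scikit-learn's `TfidfVectorizer` use a smoothed IDF and normalize each vector by default, so their numbers will differ from these.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using the textbook weight tf * log(N / df).

    Hypothetical helper for illustration only; real libraries smooth the
    IDF term and normalize the resulting vectors.
    """
    # Assumption: lowercase and split on whitespace as the tokenizer.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on an empty document
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, vecs = tfidf_vectors(docs)

print(vocab)
for doc, vec in zip(docs, vecs):
    print(doc, [round(weight, 3) for weight in vec])
```

With data this small the behavior is easy to follow by hand: "the" appears in every document, so its IDF is log(3/3) = 0 and it contributes nothing to any vector, while "dog" and "ran" each appear in only one document and receive the largest weights.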