TF-IDF: What It Is & How It Can Help SEO
Like many other concepts in SEO, TF-IDF is a much-debated topic.
First, you read about it being a silver bullet to rank your content on Google.
Then, immediately, you hear that TF-IDF is so old-school that it isn’t worth any effort.
The truth usually lies somewhere in the middle.
This post will explore why you shouldn’t expect TF-IDF to substitute for a comprehensive optimization strategy and what the real benefits of using it for SEO are.
TF-IDF: What Kind of Beast Is That?
For a human brain, it doesn’t take any math to tell what my article is about. It’s about TF-IDF, right?
But when relevancy is evaluated (and, most importantly, compared for several articles) by a machine, we need a numeric representation to see that:
- Article A is about TF-IDF (as opposed to, say, link building).
- Article A is more about TF-IDF than article B.
Could we simply count the number of times our keyword, TF-IDF, appears in each document?
No, because that would ignore the size of the documents: a longer document can mention the keyword more often simply because it has more words.
Could we compare the count of our keyword to the total number of words?
This is what we call keyword density – a widely used content optimization metric of the past.
But relying on keyword density alone would make me think that the verb “to be” (not “TF-IDF”) is the most prominent term in this article.
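To make the keyword density problem concrete, here is a minimal Python sketch. The sample text and the `keyword_density` helper are made up for illustration and are not taken from any SEO tool; the tokenizer is a deliberately crude assumption.

```python
import re
from collections import Counter


def keyword_density(text):
    """Return {word: share of all words} for a piece of text."""
    # Crude tokenizer (an assumption): lowercase, keep letters, digits,
    # apostrophes and hyphens, so "TF-IDF" stays a single token.
    words = re.findall(r"[a-z0-9'-]+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


# Made-up sample text, for illustration only.
sample = (
    "TF-IDF is a metric that is used to weigh terms. "
    "It is easy to compute and it is easy to misread."
)

density = keyword_density(sample)
top_word = max(density, key=density.get)
print("Densest word:", top_word, round(density[top_word], 3))
print("Density of tf-idf:", round(density["tf-idf"], 3))
```

In this sample, a form of “to be” comes out on top, while “tf-idf” barely registers, even though the text is plainly about TF-IDF.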
Is there a way to adjust my calculations for the fact that some words appear more frequently in speech in general?
This is where TF-IDF comes into play, letting us see how the frequency of “TF-IDF” in this article compares to how common the term is across other documents on the Web.
Thus, we’re able to pay less attention to all the commonly used words and distinguish a very specific topic for a particular piece of content.
The formula for my calculations looks like this:
TF-IDF = Term Frequency × Inverse Document Frequency
Or, to put it simply (disclaimer: I’m purposefully oversimplifying here for the sake of conveying the basic idea), we’re taking:
- Term Frequency = (count of the term) / (total word count in the document)
- Inverse Document Frequency = log ((number of docs) / (docs containing the keyword))
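The simplified definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-document corpus is invented, the function names are hypothetical, the logarithm is the natural log (the base is a convention, not something the definitions fix), and real tools typically add smoothing and normalization that are left out here.

```python
import math
import re


def tokenize(text):
    # Crude tokenizer (an assumption): lowercase, keep hyphenated terms whole.
    return re.findall(r"[a-z0-9'-]+", text.lower())


def term_frequency(term, tokens):
    # (count of the term) / (total word count in the document)
    return tokens.count(term) / len(tokens)


def inverse_document_frequency(term, corpus_tokens):
    # log((number of docs) / (docs containing the term))
    containing = sum(1 for tokens in corpus_tokens if term in tokens)
    if containing == 0:
        return 0.0  # assumption: a term found nowhere gets zero weight
    return math.log(len(corpus_tokens) / containing)


def tf_idf(term, tokens, corpus_tokens):
    return term_frequency(term, tokens) * inverse_document_frequency(term, corpus_tokens)


# Invented mini-corpus, for illustration only.
corpus = [
    "TF-IDF is a way to weigh terms, and TF-IDF is easy to compute.",
    "Link building is a way to earn authority.",
    "Keyword density is easy to compute and easy to game.",
]
corpus_tokens = [tokenize(doc) for doc in corpus]
article = corpus_tokens[0]

for term in ("tf-idf", "is", "to"):
    print(term, round(tf_idf(term, article, corpus_tokens), 3))
```

Because “is” and “to” appear in every document of this toy corpus, their inverse document frequency is log(1) = 0 and they drop out, while “tf-idf”, which appears in only one document, keeps a positive score.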
Read more: https://www.searchenginejournal.com/tf-idf-can-it-really-help-your-seo/331075/?ver=331075X3