I discussed LSI (Latent Semantic Indexing) in my previous blog post, “What is LSI”, but did not go into detail. Today, let’s look at some of the history of LSI.
Applied Semantics, the creator of CIRCA, is a company whose software technology extracts and organizes information from websites in a manner similar to the way a human might. The technology arose from the difficulty of finding valuable content on which to show AdSense ads. As you may know, AdSense matched the keywords on a page to the keywords in the ads, and a webmaster earned money for every click on an ad shown on his website. Google soon ran into a problem: some webpages simply contained relevant keyword phrases to capture traffic for AdSense ads, without offering any valuable content. The software technology of Applied Semantics thus played an important role in solving this problem. In 2003, Google purchased Applied Semantics.
The exact mathematical formula used for LSI is complicated. It analyzes pages not only for individual keywords but also for the themes those keywords form across a document. Its real function, then, is to determine whether the content of a site is of value to the visitor. Since Google found LSI technology useful and has been building it into its ranking algorithm, we should understand it and apply it on our websites.
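To give a rough feel for the math, here is a minimal sketch of textbook LSI: build a term-document matrix and reduce it with a truncated singular value decomposition (SVD). This is not Google's actual algorithm, which is not public. The toy documents, the choice of k = 2 themes, and the helper name similarity are all assumptions made for illustration.

```python
import numpy as np

# Toy corpus (hypothetical documents, for illustration only).
docs = [
    "laptop notebook computer",
    "laptop computer dell",
    "notebook computer toshiba",
    "recipe flour sugar",
    "recipe sugar butter",
]

# Vocabulary and a lookup from each word to its row in the matrix.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}

# Term-document matrix: rows are terms, columns are documents,
# each cell counts how often the term appears in the document.
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[index[word], j] += 1

# Truncated SVD: keep only the k strongest "themes" (k is an assumption).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]  # each term as a point in theme space


def similarity(a, b):
    """Cosine similarity of two terms in the reduced theme space."""
    va, vb = term_vecs[index[a]], term_vecs[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))


# "dell" and "toshiba" never appear in the same document, yet they share a theme.
print("dell vs toshiba:", round(similarity("dell", "toshiba"), 3))
print("dell vs sugar:  ", round(similarity("dell", "sugar"), 3))
```

In this toy corpus, “dell” and “toshiba” never appear together, yet they come out as closely related because both occur alongside “computer”, while “dell” and “sugar” come out as unrelated. That is the sense in which LSI works with themes rather than exact keywords.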
Now that we have some basic knowledge of LSI (Latent Semantic Indexing), let’s see how some words carry other meanings to Google under LSI.
When we search for the word “laptop” using Google, the results returned on the search page are as follows:
Google BOLDS the search word “laptop”
If we run the query with the LSI command, which is a ~ sign in front of the search word (“~laptop”), Google shows the search results as follows:
With LSI, you can see that to Google, “laptop” also means: Dell, VAIO, Toshiba, Computers.
LSI has changed the way Google looks at sites: the focus has shifted from “keywords” to “themes”. It is based on the concept that humans are not looking for pages that contain specific keywords but for sites built around a theme.