Query the obscure

Gaining a better understanding of queries is a top priority of the search industry.

When it comes to common searches that repeat millions of times, like “Britney Spears” or “Hybrid Cars,” returning the most appropriate results, or advertisements, is not difficult. But what about queries that are exceptionally rare and may never be issued more than once? These queries are far harder for a search engine to understand.

Andrei Broder, Yahoo! Research Fellow and Vice President of Search Technology and Computational Advertising, and a team of Yahoo! researchers set out to tackle this problem. Their work is outlined in the paper “Robust Classification of Rare Queries Using Web Knowledge,” which appeared at SIGIR 2007.

To address the problem, the Yahoo! team proposed a methodology for using search results, as well as information available on the Web, as a source of external knowledge. To this end, they sent rare queries to a search engine and assumed that a majority of the highest-ranking search results were relevant to the query. Categorizing these results allowed the team to classify the original query with high accuracy.

The results confirmed that using the Web as a repository of world knowledge contributes valuable information about the query and aids in its correct classification. “We discovered the best source of information to understand what these rare queries are about is to look at the search results,” Broder explains. “If you look at each returned page as a vote on what the query is about, you find that the majority tends to be correct even though many individual pages are wrong.”
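The voting idea Broder describes can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function `classify_rare_query`, its `search` and `classify_page` parameters, the `top_k` cutoff, and the toy data are all hypothetical stand-ins chosen for illustration. The sketch only shows how treating each top-ranked result as one vote lets the majority category win even when an individual page is off-topic.

```python
from collections import Counter
from typing import Callable, List, Optional


def classify_rare_query(
    query: str,
    search: Callable[[str], List[str]],
    classify_page: Callable[[str], str],
    top_k: int = 10,
) -> Optional[str]:
    """Label a query with the category that most of its top results fall into."""
    # Keep only the highest-ranking results; they are assumed mostly relevant.
    pages = search(query)[:top_k]
    if not pages:
        return None
    # Each result page casts one vote for its own category.
    votes = Counter(classify_page(page) for page in pages)
    category, _count = votes.most_common(1)[0]
    return category


# Hypothetical stand-in for a search engine: returns page texts, best first.
def toy_search(query: str) -> List[str]:
    return [
        "review of a hybrid sedan engine",
        "hybrid battery replacement cost for cars",
        "recipe for hybrid tea rose jam",  # an off-topic result
    ]


# Hypothetical stand-in for a page classifier.
def toy_classify_page(text: str) -> str:
    return "autos" if ("cars" in text or "sedan" in text) else "other"


print(classify_rare_query("obscure hybrid question", toy_search, toy_classify_page))
# prints "autos"
```

Here two of the three toy results vote for "autos", so the off-topic third page is outvoted. A real system would need an actual search backend and a trained page classifier, and could weight votes by rank or classifier confidence rather than counting them equally.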


