Tutorial: Object Ranking

Roelof van Zwol - Yahoo! Research
Srinivas Vadrevu - Yahoo! Labs

Object ranking is an emerging discipline within information retrieval that is concerned with the ranking of objects, e.g. named entities and their attributes, in the context of a given user query or application. In this tutorial we will address the different aspects involved in building an object ranking system. We will present the state-of-the-art research in object ranking, and go into detail about our hands-on experience in designing and developing the object ranking system that is in production at Yahoo! today. This allows for a unique mixture of research and development that will give participants in-depth insight into the problem of object ranking.

The focus of current Web search engines is to retrieve relevant documents on the Web, and more precisely documents that match the query intent of the user. Some users are looking for specific information, while others just want to access rich media content (images, videos, etc.) or explore a topic. In the latter scenario, users do not have a fixed or pre-determined information need, but are using the search engine to discover information related to a particular object of interest. In this scenario one can say that the user is in an exploratory mode.

To support users in their exploratory search, search engines offer semantic search suggestions. In this tutorial, we will present a generic framework for ranking related objects. This framework ranks related entities according to two dimensions: a lateral dimension and a faceted dimension. In the lateral dimension, related entities are of the same nature as the queried entity (e.g. Barcelona and Madrid, or Angelina Jolie and Jessica Alba). In the faceted dimension, related entities are usually not of the same type as the queried entity, and refer to a specific aspect of the queried entity (e.g. Jennifer Aniston and the TV show Friends).
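The distinction between the two dimensions can be illustrated with a minimal sketch. It assumes each entity carries a single type label and treats "same type" as lateral and "different type" as faceted, which simplifies the description above (faceted entities are only "usually" of a different type). The `Entity` class, the type labels, and `split_by_dimension` are hypothetical names chosen for illustration, not part of the system described in this tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    type: str  # hypothetical type label, e.g. "actor" or "tv_show"


def split_by_dimension(query, related):
    """Partition related entities into lateral and faceted groups.

    Assumption: lateral = same type as the queried entity,
    faceted = any other type.
    """
    lateral = [e for e in related if e.type == query.type]
    faceted = [e for e in related if e.type != query.type]
    return lateral, faceted


query = Entity("Jennifer Aniston", "actor")
related = [
    Entity("Angelina Jolie", "actor"),
    Entity("Friends", "tv_show"),
    Entity("Jessica Alba", "actor"),
]

lateral, faceted = split_by_dimension(query, related)
print([e.name for e in lateral])  # entities of the same type as the query
print([e.name for e in faceted])  # entities describing an aspect of the query
```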

In this tutorial we will describe the process of building a Web-scale object ranking system. In particular we will address the construction of a knowledge base that forms the basis for the object ranking, and the generation of ranking features using external sources such as search engine query logs, photo annotations in Flickr, and tweets on Twitter. Next, we will discuss machine-learned ranking models using an ensemble of pair-wise preference models, and address various aspects of object ranking, including multimedia extensions, vertical solutions, attribute-aware ranking, and the importance of freshness. Last but not least, we will address the evaluation methodologies used to tune the performance of Web-scale object ranking strategies.
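To give a rough idea of how an ensemble of pair-wise preference models can produce a ranking, the sketch below combines three toy preference functions by majority vote and orders candidates by the number of pairwise wins. This is an illustrative stand-in, not the machine-learned models covered in the tutorial: the feature values are invented, and each "model" simply compares one hypothetical feature.

```python
from itertools import permutations

# Invented feature values for entities related to a query such as "Barcelona".
# The feature names loosely mirror the sources mentioned above; the numbers
# are made up for illustration only.
features = {
    "Madrid": {"query_log": 0.9, "flickr": 0.4, "twitter": 0.7},
    "Valencia": {"query_log": 0.5, "flickr": 0.6, "twitter": 0.2},
    "Seville": {"query_log": 0.3, "flickr": 0.5, "twitter": 0.1},
}


def make_preference(feature_name):
    """Build a toy preference model that prefers a over b on one feature."""
    def prefers(a, b):
        return features[a][feature_name] > features[b][feature_name]
    return prefers


ensemble = [make_preference(n) for n in ("query_log", "flickr", "twitter")]


def ensemble_prefers(a, b):
    """Majority vote: a beats b if more than half of the models prefer a."""
    votes = sum(model(a, b) for model in ensemble)
    return votes > len(ensemble) / 2


def rank(candidates):
    """Order candidates by the number of pairwise contests they win."""
    wins = {c: 0 for c in candidates}
    for a, b in permutations(candidates, 2):
        if ensemble_prefers(a, b):
            wins[a] += 1
    return sorted(candidates, key=lambda c: wins[c], reverse=True)


ranking = rank(list(features))
print(ranking)
```

Counting pairwise wins is only one simple way to turn preferences into an ordering; a learned model would typically weight the individual preference models rather than give each an equal vote.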

 

Speakers:

 

Roelof van Zwol - Yahoo! Research

Dr. Roelof van Zwol is a senior research scientist at Yahoo! Research, where he manages the multimedia research team. He has more than 10 years of international research experience in multimedia, information retrieval, (social) media mining, object ranking, spatial search, XML, databases, and machine learning. He is passionate about conducting research in an industrial context and applying the outcomes in high-impact end-user services. His work on object ranking now powers the left-rail search suggestions in Yahoo!'s Web and image search engines. Prior to joining Yahoo!, he was an assistant professor at Utrecht University in the Netherlands. He received his Ph.D. in Computer Science in 2002 from the University of Twente in the Netherlands. Roelof van Zwol is the author of more than 70 peer-reviewed publications. He was co-chair of CIVR in 2009, and organizer of the second CHORUS conference on Multimedia Search, as well as the organizer of 5 workshops themed around Web search, multimedia, and information retrieval.

Srinivas Vadrevu - Yahoo! Labs

Dr. Srinivas Vadrevu is a senior research scientist at Yahoo! Labs. His current work at Yahoo! involves ranking-related problems such as Web search ranking, cross-domain ranking for different countries, and ranking and recommendation of related entities. During this time, he has authored several publications at venues such as the WWW, CIKM, and KDD conferences and the Machine Learning journal. His recent work on object ranking powers entity search experiences in various parts of Yahoo!. Previously he worked on information extraction, particularly on extracting structured data from Web pages, where he published a number of papers. He received his Ph.D. in Computer Science from Arizona State University and his M.S. in Computer Science from the University of Minnesota, Duluth.