Last time I wrote about my take on the trends at this year's SIGIR. This time I want to go over some of the papers that caught my eye, including the Best Paper Award winner as well as a number of long/short papers on session and task retrieval.
(To those wondering why I'm suddenly so active: writing helps me organize my thoughts about a paper and analyze it more carefully.)
"Beliefs and Biases in Web Search" by Ryen White: This paper thoroughly deserved to win the Best Paper Award, as it provides excellent insights into the biases displayed by both searchers and search engines. While psychological studies have shown that beliefs and (subconscious) biases influence human behavior and decision-making, this is the first paper to address the issue in the context of search. Unlike other work on IR biases (which deals with presentation effects introduced by the search engine), this paper focuses on biases due to skews in the result list as well as seemingly irrational search behavior. Focusing on yes-no questions from the medical domain, the paper studies biases in searchers' click behavior as well as skews in search engine result lists relative to ground-truth labels obtained from two medical experts. It shows that both parties (the searcher and the search engine) are biased towards positive information (i.e., a Yes response). Findings that demonstrate this include:
- Users strongly favor Yes responses when they are uncertain to begin with.
- A survey showed that users exhibit strong confirmation biases: they were unlikely to change strong prior beliefs.
- Search engines tend to disproportionately rank Yes-results higher.
- Users tend to skip No-results in order to click on Yes-results.
Putting everything together, Ryen showed that this can lead to highly sub-optimal search outcomes, with users winding up with the incorrect answer a large fraction of the time (potentially 45-55%). These findings have strong implications for search engine ranking and demonstrate the need for (unbiased) ranking functions. While the focus over the past few years has been on satisfying users (and hence training on click data and other indicators of satisfaction), if user satisfaction is driven by confirmation bias, this can lead users to inaccurate answers. Hence there is a need to either account for these biases (say, via the snippets shown) when using user-satisfaction signals, or to explicitly de-bias the rankings, or both. Our work on stable coactive learning could be useful for this purpose. Alternatively, ranking lists could be diversified to show more opinions/perspectives, say via a combination of ideas from work on user intent diversification and diversification due to opinion biases.
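To make the de-biasing idea concrete, here is a minimal sketch of what such a re-ranker might look like. To be clear, this is my own construction, not anything from the paper: the greedy strategy, the `answer_of` stance labels (which would have to come from some classifier), and the skew budget are all assumptions for illustration. The idea is simply to cap how far ahead Yes-results can get of No-results at any rank cutoff:

```python
def debias_ranking(results, answer_of, max_skew=1):
    """Greedy re-ranker: walk the original (possibly biased) ranking in
    order, but never let the running count of Yes-results exceed
    No-results (or vice versa) by more than max_skew at any cutoff.

    results:   list of result IDs in original rank order.
    answer_of: dict mapping result ID -> 'yes' or 'no' (a hypothetical
               stance classifier's output; the paper does not prescribe one).
    """
    remaining = list(results)
    reranked = []
    counts = {"yes": 0, "no": 0}
    while remaining:
        picked = None
        for r in remaining:
            answer = answer_of[r]
            other = "no" if answer == "yes" else "yes"
            if counts[answer] - counts[other] < max_skew:
                picked = r  # best-ranked result that stays within the skew budget
                break
        if picked is None:
            picked = remaining[0]  # only one answer type left; emit it anyway
        remaining.remove(picked)
        counts[answer_of[picked]] += 1
        reranked.append(picked)
    return reranked

# Example: a Yes-heavy ranking gets a No-result promoted to position 2.
# debias_ranking(['d1','d2','d3','d4'],
#                {'d1':'yes','d2':'yes','d3':'no','d4':'yes'})
# -> ['d1', 'd3', 'd2', 'd4']
```

A real system would want soft trade-offs against relevance rather than a hard cap, but even this toy version illustrates how a skew constraint can surface the minority answer earlier.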
Session Search and Complex Search Tasks:
As I noted earlier, there was a noticeable spike in work on this topic. The TREC Session track rightfully deserves some of the credit, as it has helped facilitate much of this research.
There were quite a few papers at SIGIR that dealt with different aspects of session or task retrieval, including our work on Intrinsically Diverse Tasks (which won us the Best Student Paper Award).
Here are some other papers that caught my eye:
"Task-Aware Query Recommendation" by Feild and Allan: The first talk of the conference. This paper studied the effect of task-based context on query recommendation performance. Their method of incorporating multiple previous queries as context is similar to that used in session search, which is based on 2 factors: a) Relevance of context to current query/task, b) How far back was the context. They showed that while on-task queries as context can help improve recommendations, even a single off-task query as context can be harmful. Their experiments also indicate that state-of-the-art same-task classification methods can be used to remove such off-task queries and improve overall performance.
"Characterizing Stages of a Multi-session Complex Search Task through Direct and Indirect Query Modiļ¬cations" by He et. al.: While previous work have shown that user interaction behavior changes across different stages of a search task, this short paper from the CWI group tries to account for differences in interface functionalities. Based on a user study performed in a course involving 25 media studies students, they studied behavioral changes across task stages from two sources: 1) Those seen explicitly via (manual) query reformulations : the universal feature available across all search interfaces; 2) Changes based on richer search interface features/functionalities (beyond just the search box), such as search filters. Their findings indicate that the behavioral patterns observed are similar across these two sources, which indicates that search engines could use signals from various components beyond the search box to better identify the user search context.
"Aggregated Search Interface Preferences in Multi-Session Search Tasks" by Bron et. al. : This paper, from the ILPS group, is on a related topic i.e., aggregate user interface preferences during multi-session search tasks. In particular this paper studies user preferences between three aggregate interfaces/displays: 1. Tabbed 2. Blended and 3. Similarity: Blended with Find-Similar functionality. Using the same 25 student media-study class data as the above work, they found significant differences in usage of the different interfaces as the search task proceeds. While the tabbed display was found to be the most popular (esp. at the start and end of the search tasks), the similarity display was the least. While the tabbed display was favored at the start of the task for more general exploratory searches, blended displays usage increased during the middle as users looked to find more specific content. A second lab study found that the tabbed display was regarded as the easiest to use with blended displays being harder. This was especially pronounced when the information need of the searcher was the same for the different displays, indicating that interface switching is more likely when there is a change in information need.
"Query Change as Relevance Feedback in Session Search" by Zhang et al.: This short paper from Grace Yang's group tackles the problem of improving session search using both previously seen documents and previously issued queries within a relevance feedback (RF) framework. Their method, a set of heuristics that alter retrieval for the subsequent query based on how it differs from the current one, is demonstrated on the TREC 2012 Session track data. Incorporating more than one previously issued query is one avenue for improving on this work.
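The flavor of such query-change heuristics is easy to convey with a toy sketch. The weight constants and the bag-of-words treatment below are illustrative assumptions of mine, not the paper's actual formulation or values:

```python
def query_change_weights(prev_query, curr_query,
                         theme=1.0, added=2.0, removed=-0.5):
    """Assign retrieval weights to terms based on how the query changed
    from one session step to the next: terms retained from the previous
    query (the session 'theme') keep a baseline weight, newly added
    terms are boosted, and dropped terms get a negative weight.
    """
    prev_terms = set(prev_query.lower().split())
    curr_terms = set(curr_query.lower().split())
    weights = {}
    for t in curr_terms & prev_terms:
        weights[t] = theme    # carried-over theme terms
    for t in curr_terms - prev_terms:
        weights[t] = added    # newly introduced terms
    for t in prev_terms - curr_terms:
        weights[t] = removed  # terms the user abandoned
    return weights

# e.g. query_change_weights("jaguar car price", "jaguar car reviews")
# -> {'jaguar': 1.0, 'car': 1.0, 'reviews': 2.0, 'price': -0.5}
```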
"Utilizing Query Change for Session Search" by Guan et. al. : This is essentially a longer version of the above paper. The paper again uses term-weight modification heuristics from the previous paper with a Markov Decision Process viewpoint of modeling query change.
You may also be interested in:
- Recapping SIGIR 2013: Overview
EDIT: Corrected an error pointed out to me by Prof. Arjen de Vries about the paper by Bron et al. that I had incorrectly attributed to the CWI group instead of the ILPS group. Apologies for my oversight, especially to the authors of the paper.