Terry Vaughn
School of Information
The University of Texas at Austin
April 14, 2004
1. Introduction
2. Query Formulation
3. Results Presentation
4. Query Reformulation
5. Conclusion
In spite of a growing body of research suggesting new designs and modes of interaction for search, the majority of commercial search interfaces have remained relatively stagnant over the past decade, reflecting underlying data and system architecture rather than supporting a human mental model of search. Hearst (1999) speaks of designing interfaces that reinforce a user's "internal locus of control" (p. 259) and Nielsen (2001) asserts that users should "control their own destiny" (¶ 1), but the standard text box and button provide little toward this end. Among the websites that offer advanced search, only a fraction of users use it to their advantage and many others use it incorrectly (Nielsen, 2001, ¶ 13).
This paper will survey new search interface design techniques that have been shown to improve user comprehension, performance, control, and satisfaction. In 1997, Shneiderman proposed a four-phase search interaction model (query formulation, action, results presentation, and query reformulation) that still holds true today. Except for the action phase, the structure of this paper will follow the Shneiderman model as it presents new research and design techniques.
Little research has been done to discover how to help users get search right the first time or engage users in the process of reformulation. Based on a study of people shopping on various e-commerce sites, Nielsen (2001) concluded that users are very poor at query reformulation. If users did not find the desired result on the first try, they often gave up. His study revealed a success rate of 51% on the first query, followed by 32% on the second query, and 18% on the third. In a similar study, Spool (2001) achieved almost identical results. In other words, users were progressively less likely to find the desired result on subsequent queries, and almost half quit after their first unsuccessful search.
Nielsen suggests that more effort should be invested to increase users' success on the first search attempt. Studies (Belkin, 2001; Belkin, 2002) have revealed that this can be achieved by assisting users in the initial query formulation. Simple features like spellchecking support, improved labels in the user interface, and a more generous text input area could potentially improve search performance.
When users make spelling mistakes, their search often yields no results. This happens when search terms are not in the index, the user misspells the terms, or the user does not know the proper inflection of a word entered as a search term. In a study conducted by Dalianis (2002), approximately 10% of all queries sampled were potentially erroneous. Current search engines such as Google and Yahoo! Search analyze terms only after the query has been submitted by the client application. For example, Google implements software that parses a query to see if each term matches the most common version of a word's spelling. If it determines that alternative spellings of the query term(s) would generate more relevant search results, the results page links to a new search based upon a spelling suggestion, preceded by the familiar phrase, "Did you mean:" (Google, 2004a).
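The server-side suggestion step described above can be sketched in a few lines. This is a minimal illustration, not Google's actual method: it assumes a hypothetical set of indexed terms (`vocabulary`) and uses the string-similarity matcher in Python's standard `difflib` module as a stand-in for whatever spelling model a real engine uses. The function name `suggest_query` and the `cutoff` value are likewise assumptions.

```python
from difflib import get_close_matches


def suggest_query(query, vocabulary, cutoff=0.8):
    """Return a "Did you mean" query string, or None if nothing changes.

    vocabulary is a hypothetical set of correctly spelled index terms.
    """
    candidates = sorted(vocabulary)  # sorted for deterministic matching
    corrected = []
    changed = False
    for term in query.lower().split():
        if term in vocabulary:
            corrected.append(term)  # term is already in the index
            continue
        # Find the single closest index term above the similarity cutoff.
        matches = get_close_matches(term, candidates, n=1, cutoff=cutoff)
        if matches:
            corrected.append(matches[0])
            changed = True
        else:
            corrected.append(term)  # no plausible correction; keep as typed
    return " ".join(corrected) if changed else None


vocabulary = {"search", "interface", "design"}
suggestion = suggest_query("serch interfase", vocabulary)
if suggestion is not None:
    print("Did you mean:", suggestion)
```

A production system would more likely rank candidates by how often each spelling occurs in the index or in query logs, which this sketch does not attempt.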
Over the past couple of years, there has been a trend to integrate search functionality into client applications. Google makes a toolbar that docks into Microsoft Internet Explorer 5.0 or later (Google, 2004b), and Apple's Safari browser ships with a built-in Google search field (Apple, 2004). Many commonly misspelled queries could be avoided entirely by adding a spellchecker feature that "listens" to the search field in these client applications. A passive error indicator, like the red, zig-zag underline found in Microsoft Word, could signal a potential misspelling without interrupting the typing flow of the user. Figure 1 illustrates the proposed spelling error indicator. Neither Safari nor the Google Toolbar implements this feature. To date, no research has been conducted to determine what percentage of query errors this feature could prevent.
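As a rough sketch of how a client-side search field might drive such a passive indicator, the function below returns the character offsets of words that are not in a dictionary, so the interface could underline them as the user types. The function name `misspelled_spans` and the `dictionary` set are hypothetical, and a real client would use the platform's own spellchecking service rather than a simple set lookup.

```python
import re


def misspelled_spans(text, dictionary):
    """Return (start, end) character offsets of words not in dictionary.

    dictionary is a hypothetical set of lowercase, correctly spelled words.
    The interface would draw an underline over each returned span.
    """
    spans = []
    for match in re.finditer(r"[A-Za-z']+", text):
        if match.group().lower() not in dictionary:
            spans.append((match.start(), match.end()))
    return spans


# Called on each change to the search field; nothing interrupts the typing.
print(misspelled_spans("mispelled query", {"misspelled", "query"}))
```

Because the check only returns spans and never modifies the text, the user stays in control of whether to correct the query.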
Belkin (2002) recommends providing search interface labels that elicit verbose information problem descriptions. He asserts that asking users to enter queries as complete sentences or questions, as opposed to lists of keywords or phrases, "led to significantly longer queries (even after non-content words were removed)." For example, in an experimental search interface, Belkin (2002) labeled the query input field with the phrase, "Please describe your information problem in detail," instead of the Web standard "Search" or "Go." This subtle difference in labeling resulted in substantially and significantly longer queries, significantly increased satisfaction with search results, and significantly fewer query iterations per search.
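To make the "non-content words removed" comparison concrete, the following sketch counts only the content words in a query. The word list `NON_CONTENT_WORDS` is a small illustrative assumption, not the list used in Belkin's study, and `content_length` is a hypothetical helper name.

```python
# Illustrative stop list only; the study's actual list is not given in the source.
NON_CONTENT_WORDS = {
    "i", "am", "a", "an", "the", "for", "to", "of", "about",
    "please", "find", "me", "on", "in", "looking",
}


def content_length(query):
    """Count the words in a query that are not in the non-content word list."""
    count = 0
    for word in query.lower().split():
        word = word.strip(".,?!")  # ignore trailing punctuation
        if word and word not in NON_CONTENT_WORDS:
            count += 1
    return count


print(content_length("jaguar dealer"))
print(content_length("I am looking for a jaguar dealer in Austin"))
```

Under this measure, a sentence-style query counts as longer only if it contributes additional content words, not merely extra function words.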
Standard Web search input boxes constrain queries by limiting both character length and the visible area of the query, which is often only large enough to show two words before scrolling. Most likely applying Norman's (1988) principle of affordance, Belkin (2001) found that larger, multi-line text input boxes (five lines in his experiment) yielded longer queries than the de facto standard single-line text input box. Investigating the relationship between query length and search performance, Belkin (2001) also found that longer queries led to better performance and greater satisfaction.
The majority of commercial Web search systems present results as a lengthy ranked list, sometimes numbering in the millions. Such daunting lists can intimidate users and lower their confidence. In 1993, Kuhlthau noted that user emotions are an important element in the search interaction process. New research has attempted to present results in a way that not only increases user confidence, but also invites interaction to assist in query reformulation.
3.1 Categorized Results
For years now, Web searches using popular systems such as Google and Alta Vista have returned long lists of results on multitudes of undifferentiated topics. In 2001, Dumais, Cutrell, and Chen tested new interfaces that categorized search results to help users sort through them and make better decisions in selecting resources. Their system implemented an automatic Web page classifier that identified textual patterns in documents and assigned them to categories based upon a predefined category model. The study found that every category interface tested was more effective than a list interface. Participants not only preferred a categorized interface, but also were up to 50% faster using this interface.
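The idea of grouping a ranked list under category headings can be sketched as follows. The keyword-overlap `classify` function is a deliberately simple stand-in for the automatic classifier used in the study, and the category names and keyword sets in `CATEGORY_KEYWORDS` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical category model: category name -> indicative keywords.
CATEGORY_KEYWORDS = {
    "Automotive": {"car", "engine", "dealer"},
    "Animals": {"cat", "wildlife", "habitat"},
}


def classify(snippet):
    """Assign a snippet to the category whose keywords it overlaps most."""
    words = set(snippet.lower().split())
    best, best_overlap = "Other", 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best, best_overlap = category, overlap
    return best


def group_results(results):
    """Group (title, snippet) pairs by category, keeping rank order within each."""
    grouped = defaultdict(list)
    for title, snippet in results:
        grouped[classify(snippet)].append(title)
    return dict(grouped)


results = [
    ("Jaguar Cars", "official car dealer site"),
    ("Jaguar facts", "big cat habitat and wildlife"),
    ("Jaguar OS", "operating system release notes"),
]
for category, titles in group_results(results).items():
    print(category, titles)
```

Keeping the original rank order inside each category preserves the relevance ranking while letting the user skip whole groups that do not match the intended sense of an ambiguous query.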
If search results do not satisfy a user's information need, the user can either abandon the search or refine the query. Relevance feedback provides a mechanism to automatically or manually refine a user's query. Studies (Belkin, 2001; Belkin, 2002; Vechtomova, Karamuftuoglu & Lam, 2003) have shown that user-controlled or pseudo relevance feedback is most effective.
For those intrepid users who are willing to attempt more than one query, relevance feedback can be incorporated into the interface. Relevance feedback is based upon the assumption that users do not know the optimal descriptive phrase that would yield the desired results, and describes an iterative, interactive process whereby users first enter a query, then select relevant results or terms related to those results to reformulate the query. Theoretically, each successive reformulation yields better results (Hearst, 1999, p. 303). Users are usually able to make judgments of whether items which are retrieved are relevant, or not, to their interests or information problems. Relevance feedback takes advantage of this ability, by asking users to make such judgments, and then modifying the initial query on the basis of characteristics of the documents which have been judged relevant, or not. The typical modification is to increase the weight of query terms which occur in relevant documents, to decrease the weight of query terms which occur in non-relevant documents, and, most significantly, to add new terms to the query which are “important” in the relevant documents. All of this is typically understood to be accomplished without the user’s intervention, or even knowledge (p. 304).
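The query modification described above (raise the weight of terms found in relevant documents, lower the weight of terms found in non-relevant ones, and add new terms) is commonly implemented with a Rocchio-style update. The source does not name a specific formula, so the sketch below is one plausible instance; the function name and the coefficients `alpha`, `beta`, and `gamma` are illustrative assumptions.

```python
from collections import Counter, defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return reweighted query terms after one round of relevance feedback.

    query: dict mapping term -> weight.
    relevant, non_relevant: lists of documents, each a list of tokens,
    as judged by the user. Coefficient values are illustrative only.
    """
    weights = defaultdict(float)
    for term, weight in query.items():
        weights[term] = alpha * weight
    for docs, sign, coeff in ((relevant, 1, beta), (non_relevant, -1, gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, tf in Counter(doc).items():
                # Add (or subtract) the term's average frequency across docs.
                weights[term] += sign * coeff * tf / len(docs)
    # Terms with non-positive weight are dropped from the new query.
    return {term: w for term, w in weights.items() if w > 0}


new_query = rocchio(
    {"jaguar": 1.0},
    relevant=[["jaguar", "car", "dealer"], ["jaguar", "car"]],
    non_relevant=[["jaguar", "cat"]],
)
print(sorted(new_query.items(), key=lambda kv: -kv[1]))
```

In this example the new terms "car" and "dealer" enter the query with positive weight, while "cat", which appears only in the non-relevant document, is excluded.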
4.2 Pseudo Relevance Feedback
In 2001, Belkin investigated how to implement user-controlled relevance feedback in a search system interface. This study revealed that systems which suggested terms for users to add to a query (with either positive or negative weights) based on relevance feedback were reasonably effective and usable. By contrast, systems that suggested terms to be added without asking for relevance judgments (using a pseudo relevance feedback technique, which assumes that the top n retrieved documents are relevant) were better accepted, led to increased satisfaction with the search results, and facilitated better performance. Taken together, these results suggest specific ways in which term suggestion for supporting query modification can be implemented in interface design to make searching more effective. For example, in an experimental interface, Belkin presented a list of relevant terms under the title, "Good terms to add" (see Figure 3).
This research was later reinforced by Vechtomova, Karamuftuoglu, and Lam (2003) in a study that tested a search interface that showed users a list of noun phrases extracted from relevant documents in the result set. Users could then expand their query by selecting additional phrases. The authors found that allowing users to expand queries by selecting related phrases significantly improved performance.
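The pseudo relevance feedback technique described in this section (assume the top n retrieved documents are relevant, then offer terms from them for the user to add) might be sketched as below. The stop list, the function name `suggest_terms`, and the use of document frequency as the ranking criterion are all assumptions made for illustration; the studies cited do not specify their term-selection method here.

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "in", "for", "to"}  # illustrative only


def suggest_terms(query_terms, ranked_docs, top_n=3, max_terms=5):
    """Suggest terms to add, treating the top_n ranked documents as relevant.

    query_terms: set of terms already in the query.
    ranked_docs: list of document texts in ranked order.
    """
    counts = Counter()
    for doc in ranked_docs[:top_n]:
        # Count each term once per document (document frequency).
        for term in set(doc.lower().split()):
            if term not in STOPWORDS and term not in query_terms:
                counts[term] += 1
    # Most widely shared terms first; ties broken alphabetically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:max_terms]]


ranked_docs = [
    "jaguar car dealer",
    "jaguar car review",
    "jaguar car price",
    "jaguar cat habitat",
]
print("Good terms to add:", suggest_terms({"jaguar"}, ranked_docs))
```

The returned list would be shown to the user for selection rather than applied automatically, which keeps the reformulation under user control.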
Many searches are doomed before they even get off the ground. Based upon the findings presented in this paper, poor query formulation accounts for the majority of failed searches. In light of the fact that more than half of users attempt search only once then quit if their desired result is not presented, user interface designers should place a greater emphasis on helping users get search right the first time. Minor adjustments to the user interface, such as labeling search input boxes to elicit more verbose information problem descriptions or simply making the search input box wider and multi-lined, improve task performance and satisfaction. More complex implementations of relevance feedback also produce significant performance and satisfaction improvements. The common thread among all these improvements is that they encourage users to interact with the system so that they may better formulate and refine their queries.
Apple (2004). Safari. Retrieved April 12, 2004, from http://www.apple.com/safari/
Belkin, N.J., et al. (2001). Rutgers’ TREC 2001 Interactive track experience. In E.M. Voorhees & D.K. Harman (Eds.), The tenth text retrieval conference, TREC 2001. Retrieved March 1, 2004, from http://trec.nist.gov/pubs/trec10/papers/rutgers-interactive-paper.pdf
Belkin, N.J., et al. (2002). Rutgers interactive track at TREC 2002. In E.M. Voorhees & D.K. Harman (Eds.), The eleventh text retrieval conference, TREC 2002. Retrieved March 1, 2004, from http://trec.nist.gov/pubs/trec11/papers/rutgers.belkin.pdf
Dalianis, H. (2001). Evaluating a spelling support in a search engine. Retrieved March 1, 2004, from http://www.nada.kth.se/~hercules/papers/SpellingIR.pdf
Dumais, S., Cutrell, E. & Chen, H. (2001). Bringing order to the web: Optimizing search by showing results in context. In Proceedings of CHI '01, Human Factors in Computing Systems, April 2001, pp. 277-283. Retrieved April 2, 2004, from http://microsoft.com/~sdumais/chi2001.pdf
Google (2004a). Google web search features. Retrieved April 7, 2004, from http://www.google.com/help/features.html
Google (2004b). Google toolbar. Retrieved April 7, 2004, from http://toolbar.google.com/
Hearst, M. A. (1999). User interfaces and visualization. In R. Baeza-Yates & B. Ribeiro-Neto (Eds.), Modern information retrieval (pp. 257-323). New York: ACM Press.
Kuhlthau, C. C. (1993). Seeking meaning: A process approach to library and information services. Norwood, NJ: Ablex.
Nielsen, J. (2001). Search: visible and simple. Retrieved February 29, 2004, from http://www.useit.com/alertbox/20010513.html
Norman, D. A. (1988). The design of everyday things. New York: Basic Books.
Shneiderman, B. (1997, January). Clarifying search: A user-interface framework for text searches. D-Lib Magazine. Retrieved April 1, 2004, from http://www.dlib.org/dlib/january97/retrieval/01shneiderman.html
Spool, J. (2001). Users don't learn to search better. Retrieved March 2, 2004, from http://www.uie.com/articles/learn_to_search/
Vechtomova, O., Karamuftuoglu, M., & Lam, E. (2003). Interactive search refinement techniques for HARD tasks. In E.M. Voorhees & D.K. Harman (Eds.), The twelfth text retrieval conference, TREC 2003. http://trec.nist.gov/pubs/trec12/papers/uwaterloo-olga.hard.pdf