The Effects of Information Scent on Visual Search in the Hyperbolic Tree Browser


PETER PIROLLI, STUART K. CARD, and MIJA M. VAN DER WEGE

Palo Alto Research Center

The Hyperbolic Tree is a focus + context information visualization that has been developed to amplify users’ ability to navigate large tree-structured information systems. Information scent is a theoretical construct that captures one kind of interaction between task and display. Information scent is provided by task-relevant display cues, such as node labels on a tree, that influence a user’s visual search behavior and navigation decisions. An empirical Accuracy of Scent (AOS) score was developed to characterize a set of tasks that required users to find (Retrieval Tasks) or compare (Comparison Tasks) information in tree structures. Two experiments investigated the effect of information scent (tasks with different AOS scores) on performance with the Hyperbolic Tree Browser and the Microsoft Windows File Browser, which is a widely available conventional browser. Experiment 1 found no overall difference in performance time between the two browsers, but did reveal a marginal interaction of information scent with browser performance on Retrieval Tasks. On high AOS tasks the Hyperbolic Tree showed faster performance, but on low AOS tasks the Windows File Browser showed faster performance. Experiment 2 focused only on the Retrieval Tasks and revealed that Hyperbolic Tree users examined more tree nodes at a faster rate and visually searched through the tree hierarchy at a faster rate than users of a Windows File Browser lookalike. However, visual search paths were shortened in dense areas of the Hyperbolic Tree display when information scent was low. Two processes appear to affect visual search in the Hyperbolic display: strong information scent improves visual search, and the crowding of targets in a compressed region degrades visual search, especially when there is weak information scent.

Categories and Subject Descriptors: H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems—hypertext navigation and maps; H.5.2 [Information Interfaces and Presentation]: Theory and Methods—screen design; D.2.2 [Software Engineering]: Design Tools and Techniques—user interfaces; H.1.2 [Models and Principles]: User/Machine Systems—human information processing; I.3.6 [Computer Graphics]: Methodology and Techniques—interaction techniques

General Terms: Theory, Experimentation, Human Factors

Additional Key Words and Phrases: Information visualization, focus+context, fisheye lens, visual search, interactive computer graphics, Hyperbolic Tree, information foraging, information scent

This work was supported by Office of Naval Research contract No. N00014-96-C-0097.

Authors’ addresses: P. Pirolli, S. K. Card, Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304; email: pirolli@parc.com; M. Van Der Wege, Department of Psychology, Carleton College, Northfield, MN 55057.

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. © 2003 ACM 1073-0616/03/0300-0020 $5.00

ACM Transactions on Computer-Human Interaction, Vol. 10, No. 1, March 2003, Pages 20–53.

1. INTRODUCTION

Research into information visualization techniques aims to discover and develop ways of amplifying human cognition. Cognition involving even modest problems usually requires repeated access to and interaction with information outside the mind [Card et al. 1999]. One way that cognition can be amplified is by increasing the speed of this access (and the amount of information quickly accessible) thereby increasing the capacity and rate of information processing [Card et al. 1999, 1991]. Advances in interactive computer graphics systems enable the creation of interactive information visualizations for this purpose that use a combination of compact visual representations and interactive responses over a sequence of changing displays.

To increase the amount of information that can be quickly accessed, a class of interactive computer graphic methods, focus + context visualizations [Card et al. 1999], displays most information compactly with lesser detail (the context part), but some of the information with greater detail (the focus part). The two types of displays are integrated so that the relationship of the focus part and the context part is clear, and animated transitions transform context parts into focus parts and vice versa as the user’s interest changes. Sometimes the distinction between focus and context is graduated rather than being two distinct regions. For instance, the Cone Tree [Robertson et al. 1993] uses a three-dimensional tree whose branches articulate via animation to the front when touched, making focus branches unoccluded and larger. Cone trees increase the number of nodes displayable on a screen by more than an order of magnitude compared to common two-dimensional layouts. The Perspective Wall [Mackinlay et al. 1991] uses perspective to bring forward and make readable a small focus area of a long movable tabular strip, thereby increasing by more than three times the size of very wide tables, such as timelines, that can be displayed.

Roughly, these visualizations try to amplify cognition by lowering the average time to access large amounts of information. Essentially, they partition the limited access capacity of the user between the conflicting needs for detail information and access to a wide environment of information in approximate analogy to the moving high-resolution fovea and low-resolution periphery design of visual systems [Resnikoff 1989]. Focus + context systems rest on a number of assumptions.

Single Display and Overview. Keeping an overview of the whole search space on a single display speeds search compared to multiple separate displays, even if detail is reduced, because the overview of the space gives cues (including overall structure) that improve the probability of searching the right part of the space and because more of the space can be searched visually (which is faster than moves that change the display).

Multiple Levels. Dividing information into focus and context parts allows room for more information at minimal representation levels and hence more information that can be searched at high visual speeds, while allowing more detailed information around the point of attention of the user.

Fig. 1. The Hyperbolic Tree browser.

Integrated Display. Integrating overview and detailed information into the same display avoids the visual search costs of going back and forth between two displays. Animated transitions as the focus moves around the context area avoid the costs of reorientation across display states.

Nondistortion. It is possible to integrate focus and context on a display without disorienting distortions to the user that degrade performance.

Although there has been a great deal of research on designing and implementing focus + context techniques, researchers have neither analyzed these underlying assumptions nor empirically tested the techniques’ impact on cognition and attention. It is especially unknown how the subtleties of task requirements interact with visual search in focus + context displays.

This article presents studies on the use of the Hyperbolic Tree browser [Lamping et al. 1995], an example of a focus + context technique (see Figure 1). These studies had several broad purposes.

Performance and learning. They aimed to evaluate performance and learning with the Hyperbolic Tree browser. As a benchmark for measurement, the Microsoft Windows File Browser was also tested on the same set of tasks.

Information scent. The notion of information scent has been used in developing models of people seeking information in document-clustering browsers [Pirolli 1997] and the World Wide Web [Card et al. 2001; Chi et al. 2001, 2000]. Information scent refers to the detection and use of cues, such as World Wide Web (WWW) links or bibliographic citations that provide users with concise information about content that is not immediately in focal attention. We expect that changes in level of information scent may have pronounced effects on the cost structure of information foraging in a tree structure. These changes in cost structure may then be reflected in performance and learning with different kinds of browsers.

Tests of the focus + context assumptions concerning visual search: Information scent and display density. Research [Drury and Clement 1978] has shown that increasing the density of background items on a display diminishes the efficiency of visual search for a target item. Cognitive engineering models of alphanumeric displays [Tullis 1985] suggest that local information density and grouping factors have a substantial impact on visual search. Thus, gains from having more information in the visual search area could be countered by the effects of visual density. Other research [Ware 1999], however, shows that task variables can ameliorate such effects of background display items. Thus, the focus + context displays may be superior in some tasks but not others. The studies presented here test the impact of the density of background information in the Hyperbolic display, and whether these effects interact with information scent.

1.1 The Hyperbolic Tree Browser

The Hyperbolic Tree, presented in Figure 1, is an example of a focus + context technique. This browser is used to display large hierarchical tree structures. As can be seen in the figure, more display space is assigned to one part of the hierarchy (focus) than others (context), in a graduated manner. Lamping and Rao [1994] discuss how this is achieved by laying out the hierarchy in a uniform way on an imaginary non-Euclidean hyperbolic plane and then mapping this plane to the Euclidean space of a circular display region. Hyperbolic geometry is non-Euclidean because it is defined such that parallel lines diverge. Because of this property, the circumference of a circle grows exponentially as a function of its radius. Consequently, exponentially more space is available as a function of radial distance from the center. This matches well to tree structures, which usually tend to grow exponentially in number of leaf nodes with increasing depth from the root. The effect of this mapping, which can be seen in Figure 1, is that more space is allocated to nodes presented in the center of the display, whereas nodes are packed more densely in the peripheral areas of the display. Initially, the Hyperbolic Tree browser presents a tree with the root at the center. New parts of the tree hierarchy can be brought into view, in smooth animation, by using the mouse.
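The exponential growth of available space can be checked numerically. The sketch below assumes a hyperbolic plane of curvature −1, on which a circle of radius r has circumference 2π·sinh(r), and it assumes the Poincaré disk model (Euclidean radius tanh(r/2)) as the mapping onto the circular display region. The function names are illustrative and are not taken from the browser's implementation.

```python
import math


def hyperbolic_circumference(r):
    """Circumference of a circle of hyperbolic radius r (curvature -1)."""
    return 2.0 * math.pi * math.sinh(r)


def euclidean_circumference(r):
    """Circumference of an ordinary Euclidean circle, for comparison."""
    return 2.0 * math.pi * r


def poincare_disk_radius(r):
    """Euclidean distance from the display center (unit disk) of a point
    at hyperbolic distance r from the focus, assuming the Poincare model."""
    return math.tanh(r / 2.0)


if __name__ == "__main__":
    for r in (1.0, 2.0, 4.0, 8.0):
        print(
            r,
            round(hyperbolic_circumference(r), 1),
            round(euclidean_circumference(r), 1),
            round(poincare_disk_radius(r), 4),
        )
```

For large r the hyperbolic circumference grows by a factor of about e per unit of radius, while the mapped display radius approaches but never reaches 1. This is one way to see why nodes far from the focus are packed ever more densely toward the rim of the display.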

1.2 Previous Evaluations of the Hyperbolic Tree Browser

Lamping et al. [1995] evaluated the earliest versions of the Hyperbolic Tree using a set of tasks that required users to find nodes in a large tree structure. To provide a comparison, users also performed tasks with a standard two-dimensional tree layout presented in a scrollable window. There were no performance or learning differences between the two browsers, although the experiment did detect practice effects. Survey questions revealed that users preferred the Hyperbolic Tree browser and indicated that it gave them a better sense of the overall tree structure. Qualitative observations from the study brought about a number of design improvements. Czerwinski and Larson [1997] evaluated an improved version of the Hyperbolic Tree and also were unable to find performance gains relative to a standard 2-D browser similar to the Microsoft File Browser. In contrast to Lamping et al. [1995], Czerwinski and Larson [1997] found that users did not prefer the Hyperbolic Tree browser when asked about familiarity, likeability, and ease of use. In summary, the results of the qualitative survey questions were equivocal across the studies. However, neither study found quantitative performance or learning differences between the two types of browsers. As always, one explanation for such null results could be that the power of the experiments was too small to detect any such differences.

In contrast to these experimental evaluations, the CHI ’97 meeting in Atlanta, GA presented a panel called The Great CHI ’97 Browse-Off [Mullet et al. 1997]. Six leading browsers competed live in a series of head-to-head races. The Hyperbolic Tree was the clear winner in a field of research prototypes and off-the-shelf systems. The runner-up in the Great CHI ’97 Browse-Off was the Microsoft Windows Explorer File Browser (hereafter called just the File Browser). A rematch of the two browsers, held at a local meeting of the San Francisco Bay Area chapter of CHI, had the same result; the Hyperbolic Tree again beat the File Browser (with the same users on the Hyperbolic as in the CHI ’97 Browse-Off). One motivation for the following studies was the desire to understand why the Hyperbolic Tree browser performed so well in this kind of competition, which seemed to contrast with the previous laboratory studies [Czerwinski and Larson 1997; Lamping et al. 1995] that failed to find superior performance for the Hyperbolic Tree browser.

To this end, these experiments compared performance with the Hyperbolic Tree browser with the runner-up File Browser. The File Browser is a widely used standard application used to view the file system in the Microsoft Windows operating systems (Figure 2). It is a variant on what has become the conventional way of showing file hierarchies, based on the Apple Macintosh Hierarchical File system. The File Browser has two views. On the left side of the window is a left-to-right treelike layout. In this tree view, folders may be opened or closed by a mouse-click to view or hide subfolders. On the right, files and folders in the currently selected folder are listed. Clicking on folders in either the tree view or file view changes both views. Folders (but not files) can be viewed in either view.

Fig. 2. The Microsoft File Browser.

1.3 Information Foraging and Information Scent

Information foraging theory [Pirolli and Card 1999] is an approach to understanding how user strategies and technologies for information seeking, gathering, and consumption are adapted to the flux of information in the environment. The framework borrows from biology, especially from the field of optimal foraging theory [Stephens and Krebs 1986]. The theory has prompted research that has found mathematical predictions of user strategies based on the assumption that users optimize the expected costs and returns of their information-seeking behavior [Pirolli and Card 1999], computational cognitive models of moment-by-moment user information-seeking activity [Pirolli 1997], engineering models of alternative information system designs [Pirolli 1998], and qualitative predictions concerning knowledge crystallization tasks performed by groups [Pirolli and Card 1999].

One class of information foraging activities—of particular relevance to the studies presented here—is information scent-following activities. Information foraging often involves navigating through spaces (physical or virtual) to find high-yield patches of information. Consider the example of foraging for information in a full-text bibliographic database. A user may browse through information displays making choices based on snippets of text that summarize subcollections in different subject areas or by different authors, or based on bibliographic citations that summarize a full-text document. The intermediate proximal cues provided by these snippets of text are used by foragers to decide on their browsing paths and their selection of texts to read in full. Such intermediate information has been referred to as “residue” by Furnas [1997]. In keeping with foraging terminology, we have elsewhere called this “information scent” [Pirolli 1997]. Information scent is the (imperfect) perception of the value, cost, or access path of information sources obtained from proximal cues, such as bibliographic citations, WWW links, or icons representing the sources. In the Hyperbolic Tree browser (Figure 1) and the File Browser (Figure 2), information scent is provided by labels on the nodes and the graphical structure of the trees.

Scent-following is very much like heuristic search strategies that have been studied in human problem solving and in artificial intelligence. Huberman and Hogg [1987] developed an analysis of phase transitions in the cost structure of heuristic search functions that is applicable to the relation of information scent to search through hierarchical information structures. On the World Wide Web, the information structure would be the lattice of linked pages, and the hierarchical search process would be the series of decisions at each encountered page about which link to follow from the presented set of alternatives. In a visual display, such as the Hyperbolic Tree, one imagines a series of decisions about what location to fixate next from the set of alternative visual items that are within the visual attention span surrounding the current fixation. Consider a general idealized case in which a user navigates an information structure using a hierarchical search process that has an average branching factor b that characterizes the number of alternatives available at each decision point. If information scent were sufficiently strong (i.e., “perfect”), the forager would not make incorrect choices. In this case, the false alarm factor f would be f = 0. If the desired target information were located at a depth d (the number of search decisions from the start point), then the search cost would be a simple linear function of d. If information scent were nonexistent, the user would choose any branch with equal probability, so we define the false alarm factor to be f = 1, and the forager would perform a random walk in the search space. According to Huberman and Hogg [1987], the average number of nodes examined before a desired target is found will be

N(d, b, f) = [expression not recoverable from the source text]. (1)

Equation (1) indicates the interaction of the average number of alternatives available in search (the branching factor b) with the quality of the information scent (the false alarm factor f ). On the WWW, the branching factor may reflect the number of links available, and the false alarm factor may reflect the discriminability of the links from one another. In a visual display, such as the Hyperbolic Tree, the branching factor may reflect the density of items in the span of visual attention and the quality of information scent reflects the ability of the visual search to pick out the correct item to fixate upon next.
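The interaction of b and f can be illustrated with a simple branching-process sketch. It is a stand-in built on stated assumptions, not necessarily the exact expression derived by Huberman and Hogg [1987]: the forager always visits the d nodes on the correct path; at every level each of the b − 1 incorrect siblings is entered with probability f; and inside an incorrect subtree each child is again followed with probability f, so a wrong subtree grows with an effective branching factor of b·f down to depth d. The function name `expected_nodes_examined` is hypothetical.

```python
def expected_nodes_examined(d, b, f):
    """Expected number of nodes examined before reaching a target at depth d.

    Assumed model (illustrative only):
      * the d nodes on the correct path are always examined;
      * each of the (b - 1) wrong siblings at a level is entered with
        probability f;
      * within a wrong subtree every child is followed with probability f,
        so the expected number of nodes k levels further down is (b * f) ** k.
    """
    total = float(d)  # nodes on the correct path
    for level in range(d):
        remaining = d - level - 1  # levels below a wrong child at this level
        # Expected size of one wrong subtree, given that it was entered.
        subtree = sum((b * f) ** k for k in range(remaining + 1))
        total += (b - 1) * f * subtree
    return total


if __name__ == "__main__":
    # Branching factor and false alarm factors taken from the Figure 3 setup.
    for f in (0.0, 0.100, 0.125, 0.150):
        costs = [round(expected_nodes_examined(d, 10, f), 1) for d in range(1, 11)]
        print(f, costs)
```

Under these assumptions the cost is exactly d when f = 0, stays close to linear in d while b·f < 1, and grows exponentially with d once b·f > 1, which is the qualitative behavior the text attributes to Equation (1).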

Figure 3 illustrates search cost functions for a hypothetical forager searching through a tree hierarchy that is 10 levels deep and that has an average branching factor of b = 10. The curves in Figure 3 assume false alarm factors (f = .100, .125, .150) that are within the range found for link descriptors on World Wide Web pages [Woodruff et al. 2001]. As discussed in detail in Huberman and Hogg [1987], b·f = 1 is a critical threshold separating search that is characteristically linear cost from search that is characteristically exponential cost. Huberman and Hogg define this threshold as the locus of a phase transition in search costs. In Figure 3, very small perturbations around a false alarm rate of .100 (with a branching factor of 10) produce very large changes in the cost function. Very small changes in information scent can have an enormous effect on search costs.

Fig. 3. Search costs as a function of information scent (horizontal axis: depth). Search cost is in terms of number of nodes visited in a hypothetical tree structure with 10 average branches per level. Quantities labeling the curves indicate a false alarm factor in making a decision to follow a branch leading from an arbitrary node. For a tree of this branching factor, a phase transition from linear search cost functions to exponential cost functions occurs when the false alarm factor grows larger than .100.

In the natural world, organisms employ different mechanisms and different search strategies that change with the amount of available scent [Bell 1991]. By analogy, and based on the theoretical cost structure analysis associated with Figure 3, browser users may employ different strategies depending on the amount of available information scent. One browser might be superior to others in high information scent tasks due to browser-specific and task-specific search strategies that are highly efficient. However, the same browser might be inferior to others on low information scent tasks because the highly efficient, task-specific search strategy fails.

1.4 Visual Search

The tree displayed in Figure 1 contains 6091 nodes. A standard two-dimensional tree-layout algorithm would not be able to give a detailed presentation of the entire tree on a standard display screen. Like many focus + context techniques, the Hyperbolic Tree uses distortion to fit all the information (the tree structure) into the display space. Information in the focus is “stretched” to occupy more pixels, whereas information in the context is “squeezed” to occupy fewer pixels. Thus the user can view detailed information in the focus while having much more contextual information present on the display.

Because more of the tree structure is accessible on the display, the Hyperbolic Tree browser might be expected to accelerate users’ browsing performance, compared to more conventional tree browsers. Distortion effects might enable users to visually search more of the information structure per unit time and would enable users to move through greater distances in the tree structure on each mouse-drag or mouse-click.

Results from the field of visual search and attention, however, suggest that focus + context display distortions may actually have a complicated relationship with the efficiency of visual search. Drury and Clement [1978] showed that the efficiency of visual search can sometimes be affected by the density of information on the display. Participants in the Drury and Clement [1978] studies were asked to find a target letter in displays that also contained background distracter letters. Among other things, the studies examined the effects of display density on visual attention and search. Drury and Clement [1978] found decreases in the efficiency of visual search (see Figure 4) as display density increased.

A variety of display and task factors can also modulate the effects of background distracters on the visual search for a target. For instance, under some task conditions visual search appears to pick out target information at a rate that is largely unaffected by the number of distracters in the visual display. In other words, visual search is seemingly done in parallel over all items in the display. This phenomenon, commonly known as a “pop-out” effect, is usually found when the target and distracters may be visually discriminated on the basis of preattentive features, such as color. For instance, a red X can be found in a field of black Xs at a rate that is largely unaffected by the number of black Xs on the display. Subjectively, the red X seems to “pop out” from the display. On the other hand, attentive visual search for a target occurs at a rate that is affected by the number of distractor items. In these situations, visual search is seemingly done serially over items in the display. Subjectively, there is no “pop-out” effect in attentive search. Recent meta-analysis of the visual search literature [Wolfe 1998] suggests that there is no clear evidence for a hard distinction between preattentive (pop-out) and attentive search.

Some recent models of visual attention, such as CODE-TVA [Logan 1996] or Guided Search 2.0 [Wolfe 1994], incorporate both top-down (conceptual) and bottom-up (perceptual) factors as influences on the rate of processing of visual items. In previous work [Pirolli 1997], information scent has been modeled using a spreading activation mechanism in which goal concepts (i.e., the user’s information need) spread activation in a top-down fashion and text presented on a display spreads activation in a bottom-up fashion. The information scent judgment for an item is based on the combined top-down and bottom-up activation for an item, compared to the combined activations of other items. One possible extension of this activation-based model of scent judgment is to assume that the rate of visual item processing also depends on the combined top-down and bottom-up levels of activation [Pirolli et al. 2001]. In other words, higher levels of information scent may also produce faster rates of visual search through a kind of activation-controlled mechanism [Chun and Wolfe 1996]. Overall, research on visual search suggests that top-down task-related variables such as information scent might improve visual search, whereas the bottom-up display-related variable of density of visual items can degrade visual search.

Fig. 4. The effect of the density of visual items (horizontal axis: density in characters per deg²) on visual search time. Adapted from Drury and Clement [1978].

1.5 Overview

The experiments presented in this article are aimed at (a) evaluating performance and practice effects with the Hyperbolic Tree browser, (b) testing whether this performance and practice varies with information scent, (c) determining if display density has an impact on visual attention and search, and (d) testing if the effects of density vary with information scent. In previous studies [Pirolli and Card 1999], the information scent of text snippets on a display has been computed a priori. In this set of experiments, users made judgments about the information scent for a pool of tasks intended for use in subsequent experiments. Using the information scent scores obtained in this manner, two experiments were conducted. Both of these experiments contrasted the use of the Hyperbolic Tree with the File Browser. Participants in both of these experiments were instrumented with an eye-tracker, which provided detailed data for the analysis of visual attention and search.

Table I. Sample Tasks Used in the Studies

2. PRELIMINARY TASK ANALYSIS: MEASUREMENT OF INFORMATION SCENT

The Great CHI ’97 Browse-Off [Mullet et al. 1997] provided entrants with a large tree data structure, which was to be displayed in the browsers, and a set of tasks to be performed using this tree data structure. The tree was compiled in an ad hoc manner from a variety of sources. A portion of this hierarchy can be seen in Figure 1. A set of N = 128 tasks was acquired from the organizers of the CHI ’97 Browse-Off. Not all of these tasks had been used in the CHI ’97 event. The organizers had divided these tasks into four types. A sample of these tasks is presented in Table I. Simple retrieval tasks required finding a leaf node in the tree. Complex retrieval tasks also involved finding leaf nodes, but involved either some ambiguity and lack of familiarity or a degree of depth in the hierarchy. Local comparison tasks involved comparison of several nodes that were reasonably close together in the tree structures. Global comparison tasks required comparison of several nodes in disparate parts of the tree.

This categorization was done on an ad hoc basis by the CHI ’97 Browse-Off organizers [Mullet et al. 1997]. Following initial pilot studies using the Browse-Off tasks, some tasks seemed to involve less familiarity and greater ambiguity (in terms of knowing a priori their location in the hierarchy) than others. Specifically, a task such as, “What’s the highest rank you can achieve in Freemasonry?” seemed less familiar and more ambiguous than, “Find a hammer.” Moreover, these properties seemed to have a large influence on performance, and thus should be controlled in the experiments. Consequently, some normative data were obtained for these tasks, and later used to operationalize the notion of information scent in the experiments. More generally, we were collecting data to establish some empirical psychological properties of the tasks to supplement the ad hoc categorization of tasks used in the CHI ’97 Browse-Off.

2.1 Method

Pilot studies with the tasks and data tree indicated that some tasks were inherently more likely to lead users to choose the wrong subtrees to browse. Consequently, we wanted some measure related to (a) the ability of users to discriminate the information scent associated with different subtrees to explore and (b) the correctness of those choices with respect to the task. We called the metric we developed an Accuracy of Scent (AOS) score. We chose to have users rate only the top four levels of the tree structure, partly for logistical reasons. There are 6091 leaves in the tree. If we eliminated the leaves and asked users to rate the likely location of terms in the remaining subtrees, users would still have had to process 1436 node labels. It is also generally (although not universally) the case that once one sees the lower-level tree labels one can make a very accurate guess about the likely location of items. When using a browser, however, it is usually a wrong decision at the top of the tree that leads the user to waste time browsing an incorrect subtree. The top four levels therefore seemed both manageable for the raters and the section of the tree most likely to cause large differences in task performance. The fourth level of this reduced tree contained 66 nodes, and these were the nodes that were rated by our participants. Chance performance in guessing the correct location of a target node was 1/66.

Participants. N = 48 Stanford University students and members of the BayCHI organization were paid to answer our survey.

Materials. Each questionnaire contained two questions for each of the 128 Browse-Off tasks (a term, such as “Ebola virus” to be found in the tree). Along the left side of each page of the questionnaire was a tree diagram depicting the top four levels of the Browse-Off tree data (the actual terms to be found were farther down in the tree). Beside the 66 fourth-level nodes were identification codes. The tasks were counterbalanced in 24 different presentation orders.

Procedure. For each of the 128 tasks, the instructions asked participants to identify their top choices of categories for locating the answer to tasks (using the identification codes on the diagram).

2.2 Results and Discussion

For each task term we calculated an Accuracy of Scent score.

Fig. 5. The distribution of Accuracy of Scent scores for the N = 128 tasks developed for the CHI ’97 Browse-Off database.

Accuracy of Scent (AOS) = the proportion of participants who correctly identified the location of the task answer from looking at upper branches in the tree.

The chance of correctly guessing the location is 1/66 = .015, but node labeling could falsely lead participants to choose an incorrect node and thereby exhibit poorer than chance probabilities of selecting the target node.
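The AOS definition can be expressed as a short computation. The sketch below is a minimal illustration under assumptions: it scores a single top choice per rater (the survey asked for top choices, and the scoring rule for multiple choices is not specified here), and the node codes and ratings are invented for the example rather than taken from the survey data.

```python
def accuracy_of_scent(chosen_nodes, correct_node):
    """Proportion of raters whose chosen fourth-level node is the one
    under which the task answer is actually located."""
    if not chosen_nodes:
        raise ValueError("at least one rating is required")
    hits = sum(1 for node in chosen_nodes if node == correct_node)
    return hits / len(chosen_nodes)


# 66 fourth-level nodes were rated, so a blind guess succeeds 1 time in 66.
CHANCE_LEVEL = 1 / 66

# Hypothetical ratings from 48 raters for one task; "N12" is assumed to be
# the node that really contains the answer.
ratings = ["N12"] * 12 + ["N40"] * 30 + ["N03"] * 6
aos = accuracy_of_scent(ratings, "N12")

if __name__ == "__main__":
    print("AOS:", aos, "chance:", round(CHANCE_LEVEL, 3))
    # A misleading label can push a task below chance, as noted in the text.
    print("below chance:", accuracy_of_scent(["N40"] * 48, "N12") < CHANCE_LEVEL)
```

In this invented example most raters are drawn to a wrong node, yet the task still scores well above chance. A task whose labels mislead nearly every rater would score below 1/66.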

Figure 5 presents a histogram of the AOS scores for the N = 128 tasks from the CHI ’97 Browse-Off. The histogram in Figure 5 includes data for all task types presented in Table I. From this overall set of tasks a smaller set was selected for use in the experiments reported below. These tasks were selected such that there were two lists, with each list containing the four types of tasks in Table I, with each task type stratified into tasks with seven levels of Accuracy of Scent centered on AOS = 0.00, 0.10, 0.15, 0.25, 0.30, 0.35, and 0.40. In summary, there were 2 (lists) × 4 (task types) × 7 (AOS levels) = 56 tasks selected for the experiments reported below.

EXPERIMENT 1. EFFECT OF BROWSER DESIGN AND INFORMATION SCENT ON TREE SEARCH

In the first experiment, the primary aim was to help understand how performance was affected by the design of the visualization and interaction components of a browser. A secondary aim was to study how performance with different browsers varied with different levels of information scent. An eye-tracker collected participants’ eye fixations during the tasks.

3.1 Method

Participants. N = 8 participants were recruited from the Stanford University Psychology Graduate program or from Xerox PARC and completed Experiment 1. Additional participants were recruited but were excluded because of problems with eye-tracking. The Stanford students were paid $50 for their participation.

Apparatus. Both the Hyperbolic Tree Browser and File Browser described above were used. An ISCAN RK-426PC eye-tracker was used to record eye fixations and saccades.

Materials. For the test portions of Experiment 1, 56 of the 128 Browse-Off tasks pretested were selected as described above. These 56 were divided into two test lists, with each list containing 7 tasks of each type: simple retrieval, complex retrieval, local relational, and global relational. To the extent possible, the tasks were matched on their scent scores across lists and across task types. These scent scores were the ones obtained from the survey discussed above. An additional 56 tasks were selected to use as practice tasks and these were also divided into two lists. Tasks on the two practice lists were also matched for their scent scores. For the purposes of this matching we collected tasks into seven levels of scent scores centered around scent = 0.00, 0.10, 0.15, 0.25, 0.30, 0.35, and 0.40 (see Figure 5).

Procedure. The participants proceeded through (a) a familiarization phase, (b) a practice phase, (c) a test Session 1 phase, and (d) a test Session 2 phase. During the familiarization phase, participants read onscreen instructions that described each browser’s basic functions. They were then invited to become familiar with each browser by exploring a hierarchy unrelated to the tasks in the experiment. Participants practiced with each browser until they felt comfortable.

After the participant expressed a degree of comfort with the browser, the practice phase began. During the practice phase, participants were presented with one list of practice tasks with one browser and the other list with the other browser. Each of the two lists was a randomized block of 28 tasks. Experimental tasks were counterbalanced so that half the participants began with the File Browser and half began with the Hyperbolic Tree browser.

After the practice tasks, the participants’ eyes were tracked, using the ISCAN eye movement monitoring system. A brief session was devoted to calibrating the tracking system along a nine-point grid (the four corners of the screen, the midpoint of each side, and the centerpoint). Following every set of 14 questions, the eye-tracking was verified by having the participant retrace the calibration grid. If the eye-tracking had drifted from the grid, the system was recalibrated. The subject took a break every 20 to 30 minutes (after completing each set of 14 questions).

The test Session 1 phase was conducted in the same way as the practice phase, except the two test lists of 28 tasks each were used instead of the practice lists. For each participant, one test list was presented with one browser, and then the second test list of 28 items with the other browser. List order and browser order were counterbalanced across participants. The presentation order of test items within each list was randomized for each participant.

Table II. Geometric Mean Performance Times in Experiment 1 by Task Type and Browser

The test Session 2 phase occurred one to three weeks after the initial test phase. Each participant saw the same item lists combined with the same browsers, in the same orders, as in test Session 1.

3.2 Results and Discussion

Preliminary analyses. Exploratory analyses showed that performance times had lognormal distributions, as is typical of human performance times. Consequently, we performed logarithmic transformations on the raw performance times prior to conducting statistical analyses. In analysis of variance, t-tests, regressions, and other least squares error techniques, statistical inferences are computed for linear contrasts between the locations of the mean values of different experimental conditions. The mean of log-transformed data is equivalent to the log of the geometric mean of the data. In addition, geometric means tend to be less affected by large values in skewed data such as performance times. So, we report geometric means (using the notation GM) when we report averages of data that are logarithm transformed prior to statistical analysis.
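The equivalence between the mean of the logs and the log of the geometric mean can be illustrated with a short sketch. The times below are hypothetical values chosen to be right-skewed; they are not data from the experiment.

```python
import math


def geometric_mean(times):
    """Geometric mean computed as exp of the arithmetic mean of the logs."""
    if not times or any(t <= 0 for t in times):
        raise ValueError("times must be a non-empty list of positive values")
    logs = [math.log(t) for t in times]
    return math.exp(sum(logs) / len(logs))


# Hypothetical performance times in seconds, with one very slow trial.
times = [20.0, 25.0, 30.0, 40.0, 200.0]
arithmetic = sum(times) / len(times)
gm = geometric_mean(times)
print(f"arithmetic mean = {arithmetic:.1f} s, geometric mean = {gm:.1f} s")
```

The single slow trial pulls the arithmetic mean well above the bulk of the data, while the geometric mean stays closer to the typical trial.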

A preliminary analysis of variance was conducted on the performance times with a focus on the analysis of the Question Types to determine if the ad hoc categorization presented in Table I made empirical sense. This preliminary analysis of variance was conducted based on the Browser (File, Hyperbolic) × Question Type (Simple Retrieval, Complex Retrieval, Local Relational, Global Relational) × Accuracy of Scent (AOS, seven levels) × Test Session (Session 1, Session 2) factorial design. Table II presents the geometric mean time to complete tasks of different Question Types in the different browsers. Linear contrasts revealed that the pooled Comparison tasks (Global, Local) yielded significantly slower times (GM = 63.60 s) than the pooled Retrieval tasks (Simple, Complex; GM = 43.26 s), t(18) = 38.16, s.d. = .01, p < .00001. So, analyses of Question Types reported below use only the distinction between Retrieval and Comparison question types.

Use of the min F′ statistic. Throughout our report, we make use of three F statistics (F1, F2, and min F′) motivated by criticisms originally noted by Clark [1973] regarding the use of analysis of variance in psychological studies of language. These criticisms are often equally applicable to analyses conducted in studies of human–computer interaction. In many studies, researchers use some fixed set of tasks to study effects of variations in user interface design. Often, an analysis of variance is conducted that treats the tasks as a fixed effect rather than a random effect. That is, the tasks are treated statistically as if they are the entire universe of tasks rather than just a sample (on the other hand, user-participants in studies are typically assumed to be randomly sampled from some population). When an analysis is conducted incorrectly in this manner, it can lead to incorrect inferences about user interface effects: A “significant” F statistic concerning a user interface may actually be due to the effects of the particular tasks used in the study, or some confounded mixture of task and interface effects.

The first statistic F1 tests experimental Treatments (e.g., effects of different user interface designs) against a mean square error (MSE) term associated with the Treatments by Participant (users) interaction. F1 would be the appropriate F statistic to report if Participants were the only random effect in an experimental design. F1 is what is typically reported in human–computer interaction experiments. The second statistic F2 tests Treatments effects against an MSE associated with the Tasks-within-Treatments effects. Roughly speaking, F1 is a statistical inference about the likelihood that the same experimental effects would happen if the same set of tasks and treatments were given to a new (random) sample of participants, whereas F2 is an inference about what would happen if the same participants in the same experimental treatments were given a new (random) sample of tasks. The quasi-F statistic, F′, is an approximation to a “true” F statistic that indicates what Treatment effects would happen if both a new sample of participants and a new sample of tasks were tested. Unfortunately, it is often difficult to obtain all the components necessary to compute F′, and that is true for the studies presented here. A minimum bound on F′ is the min F′ statistic, which is computed as the ratio of Treatment effects to an MSE that is the sum of the Treatment by Participant effects and Tasks-within-Treatments effects. It should be emphasized that min F′ is a conservative lower bound on the quasi-F′ value that is itself an approximation to a “true” F statistic. It is common practice to report F1 and F2, and to assume that if both of these are significant then the obtained effect is reliable over both subjects and tasks; however, there can be cases where this conclusion is not warranted, so it is advisable to also report the conservative min F′ [Raaijmakers et al. 1999]. We report all three statistics, F1, F2, and min F′.
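Because F1 and F2 share the same Treatment mean square in the numerator, dividing that numerator by the sum of the two error terms gives the identity min F′ = F1 · F2 / (F1 + F2) described by Clark [1973]. The sketch below applies that identity. The function name is hypothetical, and the degrees of freedom associated with min F′ are deliberately not computed.

```python
def min_f_prime(f1, f2):
    """Lower bound on quasi-F from the by-participants (F1) and by-tasks (F2) ratios."""
    if f1 < 0 or f2 < 0:
        raise ValueError("F ratios must be non-negative")
    if f1 + f2 == 0:
        return 0.0
    return (f1 * f2) / (f1 + f2)


# F ratios reported for the practice effect in Experiment 1.
practice = min_f_prime(165.90, 364.29)
print(round(practice, 1))
```

Since min F′ can never exceed the smaller of F1 and F2, an effect that is strong by participants but weak by tasks yields a small min F′.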

Browsers. A new analysis of variance was calculated using our new definition of Question Type. Specifically, this was an analysis of the Browser (File, Hyperbolic) × Question Type (Comparison, Retrieval) × Accuracy of Scent (AOS, seven levels) × Test Session (Session 1, Session 2) factorial design. There was no significant difference in the overall task times between the Hyperbolic and File browsers (F1(1, 6) = 0.05, MSE1 = 0.04; F2(1, 28) = 0.26, MSE2 = 0.14; min F′(1, 34) = 0.04, MSE = 0.85). Our failure to find an overall difference between the two browsers was somewhat surprising. At the Great CHI ’97 Browse-Off, the users of the Hyperbolic Tree had completed the tasks much more quickly than the users of the File Browser. The victory was repeated in a separate contest at the BayCHI meeting. Furthermore, our participants expressed a preference for the Hyperbolic over the File Browser.

The first factor we investigated to account for this difference in outcomes was individual differences among browser users. In the Great CHI ’97 Browse-Off each browser was operated by a person who was an expert in using that browser. Therefore, the performances seen at the Great CHI ’97 Browse-Off could have been due to differences among the individual experts rather than to differences among the browsers.

Recall that each of our participants worked with both the Microsoft File Browser and the Hyperbolic Tree Browser. To test whether it was individual aptitudes that largely determined performance with the browsers, we ranked participants’ performance with the File Browser. We then ranked the participants by their performance on the Hyperbolic Tree. The correlation in the two rankings, by Spearman rank correlation, was ρ = 0.78, which is significant, p < 0.01. This suggests that individual abilities can overwhelm browser design for the overall task. One possible reason this might be true is that a number of the tasks involved finding information in unobvious places and remembering where it was, thereby giving a role to individual factors such as participant preparation, ability to remember locations, and performance tricks with the browsers.
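The rank comparison can be sketched as follows. The per-participant times are hypothetical and are not the experimental data. The sketch uses the shortcut formula ρ = 1 − 6Σd²/(n(n² − 1)), which assumes there are no tied ranks.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical geometric-mean times (s) for eight participants on each browser.
file_times = [38.0, 42.5, 47.1, 51.3, 55.0, 60.2, 66.8, 71.4]
hyper_times = [36.2, 45.0, 41.9, 50.7, 63.1, 58.4, 62.0, 74.9]
rho = spearman_rho(file_times, hyper_times)
print(f"rho = {rho:.2f}")
```

A value near 1 means that participants who were fast with one browser also tended to be fast with the other.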

Another way to understand the relative contribution of browser versus user factors to performance time is to examine the sums of squares (SS) in our analysis of variance in which participants are considered a random effect. In Experiment 1, the participant SS = 8.58, the Browser SS = 0.0363, the total sum of squares was SS = 345.67, and the error sum of squares was SS = 12.58. That means (for Experiment 1) that individual differences contributed more to the performance times than differences between browsers, although neither contribution is large (because of the amount of variance in other factors such as task differences and learning). The contest participants were more highly trained than our experimental participants, potentially magnifying individual differences.
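One conventional way to read these figures is as each effect's share of the total sum of squares. The paper does not report these proportions; the sketch below derives them from the SS values quoted above, and the variable and function names are illustrative.

```python
# Sums of squares reported for Experiment 1.
SS_TOTAL = 345.67
SS_PARTICIPANT = 8.58
SS_BROWSER = 0.0363


def share_of_total(ss_effect, ss_total):
    """Fraction of the total sum of squares attributable to one effect."""
    if ss_total <= 0:
        raise ValueError("total sum of squares must be positive")
    return ss_effect / ss_total


participant_share = share_of_total(SS_PARTICIPANT, SS_TOTAL)
browser_share = share_of_total(SS_BROWSER, SS_TOTAL)
ratio = SS_PARTICIPANT / SS_BROWSER

print(f"participants: {participant_share:.2%} of total SS")
print(f"browser:      {browser_share:.3%} of total SS")
print(f"participant SS is about {ratio:.0f} times the browser SS")
```

Both shares are small relative to the total, consistent with the observation that task differences and learning account for most of the variance.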

Overall, therefore, there were no net differences between the browsers. Performance with both browsers improved with practice (F1(1, 6) = 165.90, MSE1 = 0.15, p1 < 0.01; F2(1, 28) = 364.29, MSE2 = 0.07, p2 < 0.01; min F′(1, 30) = 114.0, MSE = .22, p < 0.01). However, there were interesting differences between the browsers that are hidden in the average result, which we discuss below.

Question Type. Variations in task contributed most to performance times. Figure 6 shows the (geometric) mean performance times for Comparison and Retrieval tasks across AOS levels. Overall, Retrieval tasks were faster to complete than Comparison tasks (F1(1, 2) = 86.69, MSE1 = 0.41, p1 < .01; F2(1, 28) = 15.86, MSE2 = 2.24, p2 < .01; min F′(1, 8) = 13.41, MSE = 2.65, p < 0.01). Figure 6 suggests that although information scent has little effect on time for complex comparisons, it reduces time for retrieval tasks; however, this interaction is only significant on the commonly used F1 statistic, when participants are considered random effects and tasks considered fixed effects (F1(6, 36) = 17.22, MSE1 = 0.19, p < 0.01; F2(6, 28) = 1.45, MSE2 = 2.24; min F′(6, 42) = 1.34, MSE = 2.43).

Fig. 6. Performance times in Experiment 1 as a function of question type and accuracy of information scent (AOS). Error bars indicate the MSE used to calculate min F′.

Information Scent. Browser differences emerge when we examine the effects of different question types and the effects of information scent. Overall, AOS does not reliably reduce task time: the effect is significant by F1 but not by F2 or min F′ (F1(6, 36) = 7.66, MSE1 = 0.29, p1 < .01; F2(6, 28) = 0.98, MSE2 = 2.24; min F′(6, 58) = 0.87, MSE = 2.53). However, in retrieval tasks, information scent seems to act on the two browsers differently. As Figure 7 shows, both browsers are faster when there is higher information scent (Figure 7 collapses all tasks with scent scores less than or equal to 0.16 into low scent scores and all tasks with scent scores greater than 0.16 into high scent scores). But the Hyperbolic Tree seems to be faster than the File Browser at high scent tasks and slower than the File Browser at low scent tasks. The analysis of variance of this interaction between scent and browser is marginally significant by F1 (F1(6, 36) = 2.04, MSE1 = 0.27, p < 0.06), significant by F2 (F2(6, 14) = 3.59, MSE2 = 0.17, p < 0.05), but not significant by the conservative min F′ (min F′(6, 32) = 1.40, MSE = 0.44).

Effect of Information Scent on Eye Movements. The ISCAN eye-tracker software segments eye-movement data into fixations and saccades. A number of hardware and calibration problems led to missing data for some task trials. Consequently, we had to run less complete analyses of variance. For the data available for each task trial, we analyzed the number of fixations. Exploratory data analysis revealed that the number of fixations showed lognormal distributions. Therefore, an analysis of variance was conducted on logarithm-transformed fixations. This analysis of variance was conducted based on the Browser × Question Type × Scent × Test Session factorial design. Analyses were performed using the two broader categories of question type (Retrieval vs. Comparison). Participants used more fixations for comparison questions than retrieval questions (F1(1, 4) = 23.90, MSE1 = 0.55, p1 < .01; F2(1, 28) = 7.51, MSE2 = 1.41, p2 = .01; min F′(1, 7) = 5.72, MSE = 1.96, p = 0.05), and used fewer fixations with more practice (F1(1, 4) = 23.63, MSE1 = 1.52, p1 < .01; F2(1, 28) = 66.53, MSE2 = 0.42, p2 < .01; min F′(1, 27) = 17.44, MSE = 1.94, p < 0.01). There were no other main effects or interactions.

Fig. 7. Geometric mean performance times for retrieval tasks as a function of browser and accuracy of information scent (AOS). Error bars indicate the MSE used to calculate min F′. Axis label: AOS.

Scent-Finding and Scent-Following. To understand more about how the browser designs affected users’ eye movements, more detailed analyses of eye-scan paths were conducted for a high scent and a low scent Simple Retrieval task using each of the two browsers. These detailed analyses were done by hand from videotapes recorded during the test sessions of Experiment 1. The videotapes recorded the users’ screens, as well as the point-of-regard as determined by the eye-tracker (i.e., where the users’ eyes were gazing at the screen). Because of the difficulty of such hand coding, we were only able to analyze a small subset of tasks in this manner. The high scent Simple Retrieval task analyzed was “Find the Ebola Virus,” which has an AOS of 0.44. The low scent task was “Find the Library of Congress,” which has an AOS of 0.0. These tasks were chosen because they were the easiest (Ebola) and hardest (Library) Simple Retrieval tasks. Data from all eight participants were included in this hand analysis of the eye-tracking data.

Figure 8 shows typical scan patterns for the Hyperbolic and File Browsers. The circles represent eye fixations and their area is proportional to fixation duration. The lines represent eye movements. The Hyperbolic Tree uses more of the screen. The File Browser involves concentration on two smaller regions of the screen corresponding to the tree view and the folder view.

Fig. 8. Typical eye scan and fixation patterns for (a) the Hyperbolic Tree and (b) the File Browser.

Watching these recordings of eye movements recorded by the eye-tracker gives the impression that there are at least two modes of visual search activity. The first we call scent-following [Pirolli 1997]. This kind of activity seems to be very directional, as if the eye focus were following cues up a gradient towards a maximum reward. The activity is very reminiscent of an organism following a stimulus gradient (scent) towards a reward [Bell 1991]. The other mode of visual search seems to be a nondirectional scent-finding activity, aimed at finding directional cues. This activity is very reminiscent of organisms that have been alerted to a scent (e.g., a puff of pheromones), but must acquire additional cues to identify the direction of the reward [Bell 1991].

Figure 9 shows a typical pattern of simple scent-following. The x-axis measures time in seconds. The y-axis shows the depth of the node in the tree. The line indicates the level of the node on which the participants’ eyes are fixated as a function of time. Attention to a node was inferred from an eye fixation or a mouse-click on a node. The curve in Figure 9 moves monotonically and rapidly upward, indicating that the user progressed deeper into the tree with no backtracking. The short plateau on each level indicates that the participant had multiple fixations at the second, third, and fourth levels.

Fig. 9. Scent-following in the Hyperbolic Tree. Level 1 indicates looking back at the question.

Fig. 10. Scent following in the Microsoft File Browser on the Ebola task.

Figure 10 shows the eye movements for the same high scent search plotted for a participant using the File Browser. The pattern is similar. Triangles indicate mouse-clicks. We have also plotted as “Level –1” whenever the user looked back to reread the question. The pattern in Figures 9 and 10 seems to be more or less common across the two browsers, with the File Browser being somewhat slower. Figure 11 displays what a difficult case looks like for the Hyperbolic Tree. We have added symbols for the use of the mouse for dragging the display. The case starts out with scent-following, which we see in the series of mouse-clicks. Notice that the eye movements indicate that the user is typically looking one or two levels ahead. When the scent-following fails, this participant begins dragging the display to reveal different places for examination. This is typical of scent-finding activity.

Fig. 11. Scent finding and scent following on the Hyperbolic Tree for the Library of Congress task.

Fig. 12. Scent finding and scent following on the File Browser for the Library of Congress task.

Figure 12 is the same difficult case for the File Browser. “Level –1” now also includes eye movements and mouse-clicks on the File Browser controls and the scrolling bar. We see that superimposed on the search behavior there is some behavior devoted to the manipulation of controls, especially to scrolling. Scrolling involves taking the eyes off the primary display and added control manipulation. We have also indicated when the user is using the tree and when the user is using the list view of the File Browser. This user uses both displays, with the tree view being used to explore into deeper levels. Each node is more expensive to explore in the Microsoft File Browser because it involves more control manipulation, so the user looks at fewer nodes. The descriptive statistics are in accord for the 16 tasks (4 participants × 2 task types (low scent, high scent) × 2 browser types (Hyperbolic, File)) whose eye movements we examined by hand. If we look at scent-following moves (moves down the hierarchy on the current path) we find that the Hyperbolic users move down at a rate of 1.2 forward links/move as opposed to File Browser users who move at a rate of 1.0 links/move. The Hyperbolic users also explore more of the hierarchy (14.5 total distinct paths from root to leaves) than File Browser users (11.9 total distinct paths).

Fig. 13. Regression line fits to visual search data on the task “Find the Ebola virus.” Axis label: Time (Sec).

Figure 13 presents solid regression lines fit through data from seven individual cases (one of the eight participants had instrumentation problems on this task) of the high scent task (R² = .84 for the File Browser regression and R² = .50 for the Hyperbolic Browser). Figure 13 illustrates that search in the high-AOS case with the Hyperbolic Tree is much faster, t(3) = 11.76, SE = .045, p < .0001. The Hyperbolic Tree requires only 0.92 sec/level (compared to 1.75 sec/level for the File Browser) or 53% as long.
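A rate in seconds per level can be read off as the slope of a least-squares line fit to elapsed time against tree level. The sketch below shows one way such a fit might be computed. It assumes time is regressed on level, and the observations are hypothetical values, not the eye-tracking data. The last line reproduces the 53% figure from the two reported rates.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


# Hypothetical observations: tree level reached and elapsed time in seconds.
levels = [1, 2, 3, 4, 5]
seconds = [1.0, 2.1, 2.9, 4.2, 5.0]
slope, intercept, r_squared = fit_line(levels, seconds)
print(f"{slope:.2f} sec/level, R^2 = {r_squared:.2f}")

# Ratio of the two reported rates (Hyperbolic vs. File Browser).
relative_time = 0.92 / 1.75
print(f"{relative_time:.0%} as long")
```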

3.3 Summary

Over all the question types, there was no difference in the time required by the Hyperbolic and Microsoft File Browsers. This is generally consistent with earlier published findings [Czerwinski and Larson 1997; Lamping et al. 1995]. It was not, however, what we expected given the results of the CHI ’97 Browse-Off for the collection of tasks used in Experiment 1. It is possible that the Hyperbolic Tree won its tournaments because of more skilled users (which of course includes the possibility that the users were able to exploit advanced techniques available through the browser). This possibility is more credible given that Experiment 1 found the effects of individual differences to be stronger than the effects of the browsers.

The two browsers differ, however, in the way they can take advantage of information scent. When we examine users’ eye movements, under high scent conditions and for simple retrieval tasks, the Hyperbolic Tree users can traverse levels almost twice as fast as the Microsoft File Browser users can. But the Hyperbolic is slower than the File Browser under low scent conditions. In addition, participants using the Hyperbolic Tree use more fixations to do the task, but their fixations are shorter.

EXPERIMENT 2

The aim of Experiment 2 was to provide a more rigorous statistical analysis of differences in visual search between browsers on various information scent tasks. The hand-coded analyses presented in Figures 9 to 13 provided descriptive data that were highly suggestive, but these analyses were too time consuming to carry out on enough tasks to perform statistical analysis. We decided to pursue a semiautomatic technique for coding users’ visual search over the information visualizations.

This approach required developing instrumented versions of each browser. These instrumented browsers would provide logs of the display states of the visualization and the mouse-clicks and mouse-drags of the user. The space and time coordinates of the browser logs would then be synchronized with the space and time coordinates of the eye-tracking logs. This synchronization was done by an analyst with the use of a playback simulator that integrated the two logs and allowed the analyst to coordinate time and space scaling parameters associated with each log. Once the logs were synchronized, they could be analyzed automatically to determine such things as the interface objects (e.g., tree nodes) being fixated by the eye.
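The following sketch illustrates the general idea of the automatic step: once time and space scaling parameters have been chosen, each fixation is mapped into the browser log's frame of reference and matched against the node bounding boxes of the display state that was current at that moment. All class names, fields, and the linear form of the scaling are assumptions made for illustration; the actual log formats and the playback simulator are not described at this level of detail.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fixation:
    t: float  # eye-tracker clock, seconds
    x: float  # eye-tracker screen coordinates
    y: float


@dataclass
class NodeBox:
    node_id: str
    left: float
    top: float
    right: float
    bottom: float


@dataclass
class DisplayState:
    t: float  # browser clock time at which this layout appeared
    boxes: List[NodeBox]


@dataclass
class SyncParams:
    # Hypothetical linear scaling parameters set by the analyst.
    time_scale: float = 1.0
    time_offset: float = 0.0
    space_scale: float = 1.0
    x_offset: float = 0.0
    y_offset: float = 0.0


def to_browser_frame(fix: Fixation, p: SyncParams) -> Fixation:
    """Map a fixation from eye-tracker units into browser-log units."""
    return Fixation(
        t=fix.t * p.time_scale + p.time_offset,
        x=fix.x * p.space_scale + p.x_offset,
        y=fix.y * p.space_scale + p.y_offset,
    )


def state_at(states: List[DisplayState], t: float) -> Optional[DisplayState]:
    """Most recent display state at or before time t (states sorted by time)."""
    current = None
    for state in states:
        if state.t <= t:
            current = state
        else:
            break
    return current


def fixated_node(fix: Fixation, states: List[DisplayState],
                 p: SyncParams) -> Optional[str]:
    """Identifier of the node under the fixation, or None if there is none."""
    mapped = to_browser_frame(fix, p)
    state = state_at(states, mapped.t)
    if state is None:
        return None
    for box in state.boxes:
        if box.left <= mapped.x <= box.right and box.top <= mapped.y <= box.bottom:
            return box.node_id
    return None


# Hypothetical logs: the node under the same screen region changes at t = 5 s.
states = [
    DisplayState(0.0, [NodeBox("root", 0, 0, 100, 20)]),
    DisplayState(5.0, [NodeBox("science", 0, 0, 100, 20)]),
]
params = SyncParams(time_offset=-2.0)  # eye-tracker clock runs 2 s ahead
first = fixated_node(Fixation(3.0, 50.0, 10.0), states, params)
second = fixated_node(Fixation(8.0, 50.0, 10.0), states, params)
missed = fixated_node(Fixation(8.0, 500.0, 10.0), states, params)
print(first, second, missed)
```

Keeping the display states as a time-ordered list matters for a focus + context display, because the same screen location holds different nodes as the user drags the tree.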

In Experiment 1, retrieval questions produced browser effects and an interaction of browser with information scent effects. Consequently in Experiment 2, only Simple Retrieval questions were used, and these tasks were selected from the extremes of high and low information scent. Our analyses focused on differences in performance time and aspects of visual search, for example, the number of nodes visited, number of paths explored, and the range of paths explored.

4.1 Method

Participants. Eight participants were recruited from Xerox PARC and Stanford University. Four participants (Experts) were experienced in the use of both browser systems and the hierarchical tree structure. The other four participants (Novices) were unfamiliar with the Hyperbolic Tree and the dataset, although they reported that they were familiar with the File Browser display.

Fig. 14. VFM lookalike to the Microsoft File Browser.

Materials. A subset of the tasks used in Experiment 1 was selected for study in Experiment 2. Of the original 56 test tasks, 8 were selected. All selected tasks were from the Simple Retrieval category. Half were of low information scent (0 to .10); the other half were of high information scent (.35 to .40). These were divided into two test lists of four questions each, matched for level of information scent (two low scent and two high scent each). Eight equivalent practice items were also selected and divided into two practice lists based on scent.

Apparatus. An instrumented version of the Hyperbolic Tree was developed using source code obtained from Inxight Software, Inc. We were unable to develop a way of directly instrumenting the Microsoft File Browser, so we obtained an instrumented prototype browser called the Visual File Manager, or VFM (Figure 14), also provided by Inxight. The VFM contains a set of windows that operate in the same way as the Microsoft File Browser. One significant limitation of this instrumented prototype is that the users could not double-click in the list view window (the right-hand window) as they could with the original Microsoft File Browser program. Doing this caused the VFM program to crash. Otherwise the two browsers were equivalent for purposes of this experiment. The instrumented versions of the Hyperbolic and VFM browsers produced logs that contained records of the location of every window and object on the screen, all mouse actions, and all keyboard actions. Each display update and each action was timestamped. The ISCAN RK-426PC eye-tracker was used to record eye fixations and saccades.

Table III. Geometric Mean Reaction Times (sec) for Experiment 2

Procedure. As in Experiment 1, the participants proceeded through (a) a familiarization phase, (b) a practice phase, (c) a test Session 1 phase, and (d) a test Session 2 phase. The familiarization phase was the same as in Experiment 1.

The practice phase was similar to Experiment 1, but participants only saw four questions with each browser. The questions they saw were different with each browser, and the order of presentation of the two browsers was counterbalanced across subjects.

The eye-tracking system calibration was also the same as in Experiment 1. Subjects were calibrated to a nine-point grid, and the eye-tracking accuracy was verified after each set of four questions.

The test Session 1 phase was again the same as in Experiment 1, except the two test lists were composed of four tasks each. For each participant, one test list was presented with one browser, then the second list with the other browser. List order and browser order were counterbalanced across participants, and the test items in each list were presented in a randomized order for each participant.

The test Session 2 phase occurred approximately one hour after the initial test phase and was identical to test Session 1. Each participant saw the same lists with the same browsers in the same order as in test Session 1.

4.2 Results and Discussion

Performance Time. In contrast to Experiment 1, the Hyperbolic Tree obtained faster performance times than the VFM browser by about 38% (see Table III), as tested in an analysis of variance on log performance times (F1(1, 4) = 62.40, MSE1 = 0.12, p1 < .01; F2(1, 4) = 133.0, MSE2 = 0.06, p2 < .01; min F′(1, 7) = 42.5, MSE = .18, p < 0.01). When we use the data of Experiment 1 to examine the same eight tasks used in Experiment 2, the Hyperbolic Tree is faster (by about 9%) but not significantly so (F1(1, 6) = 0.24, MSE1 = 1.22; F2(1, 4) = 1.57, MSE2 = 0.19; min F′(1, 5) = 0.21, MSE = 1.41). The Hyperbolic Tree is faster on the eight tasks (by about 25%) when all the Simple Retrieval Task data are combined across Experiment 1, Experiment 2, and a third experiment not reported here (F1(1, 18) = 7.48, MSE1 = 0.69, p1 = .01; F2(1, 4) = 13.67, MSE2 = 0.27, p2 < .01; min F′(1, 9) = 4.8, MSE = 1.1, p = 0.06). It is possible that strategic differences among the users of the VFM versus the File Browser contributed to the larger difference in performance times in Experiment 2. Recall that users of the VFM were asked to limit their mouse-clicks to the tree portion of the VFM browser and this may have made VFM users more cautious in Experiment 2. High scent tasks were faster than low scent tasks (Table III; F1(1, 4) = 81.90, MSE1 = 0.49, p1 < .01; F2(1, 4) = 30.90, MSE2 = 1.32, p2 = .01; min F′(1, 7) = 22.42, MSE = 1.8, p < .005). There was no significant effect due to expertise.

Fig. 15. Performance times in Experiment 2 as a function of browser, information scent, and practice session.

Figure 15 shows performance times across practice sessions, browsers, and information scent (AOS scores). There was no overall effect of practice. In Session 1, the geometric mean performance time was GM = 39.14 sec and in Session 2 GM = 30.15 sec (F1(1, 4) = 4.44, MSE1 = 0.49; F2(1, 4) = 5.46, MSE2 = .40; min F′(1, 8) = 2.45, MSE = .89, p = .16). There was also no interaction of Browser × Test Session, although the F2 test indicated significance (F1(1, 4) = 0.40, MSE1 = 1.68; F2(1, 4) = 13.9, MSE2 = 0.05, p2 < .05; min F′(1, 4) = .39, MSE = 1.72). There was, however, a three-way interaction of Browser × Scent × Test Session, with the Hyperbolic showing much greater improvement on the Low AOS tasks over sessions than on the High AOS tasks, or compared to practice improvements for the VFM on either kind of task (F1(1, 4) = 16.50, MSE1 = 0.07, p1 < .05; F2(1, 4) = 24.30, MSE2 = 0.05, p2 = .01; min F′(1, 11) = 9.8, MSE = 0.12, p < 0.01). There were no other main effects or interactions. The High AOS tasks may have been exhibiting ceiling effects due to the limits of the speed of user actions. On Low AOS tasks, the Hyperbolic users seem to show greater practice effects than the VFM users. One possible explanation for this greater practice effect is that Hyperbolic users may have been learning the tree structure at a faster rate than VFM users. Recall that tasks are repeated from Session 1 to Session 2 and consequently users have the opportunity to remember something about the ultimate location of the target information, and to learn which paths not to follow (especially for the Low AOS tasks).

Fig. 16. Number of fixations per task in Experiment 2 as a function of browser, information scent, and practice session.

Fixations. Figure 16 presents the geometric mean fixation frequencies across practice sessions, browsers, and information scent (AOS scores). Overall there was no significant difference between the browsers in terms of the number of fixations made per task, with the Hyperbolic GM = 657 fixations versus VFM GM = 375 fixations (F1(1, 4) < .01, MSE1 = 0.48; F2(1, 4) = .31, MSE2 = 0.07; min F′(1, 4) = 0.01, MSE = 0.36). Low AOS tasks (GM = 641) required significantly more fixations than high AOS tasks (GM = 218; F1(1, 4) = 66.16, MSE1 = 0.54, p1 < .01; F2(1, 4) = 33.21, MSE2 = 1.02, p2 < .01; min F′(1, 7) = 22.11, MSE = 1.56, p < 0.01).

As with performance times, there was no overall effect of practice on fixations per task: In Session 1 there were GM = 436 fixations versus GM = 324 fixations in Session 2 (F1(1, 4) = 2.44, MSE1 = 0.48; F2(1, 4) = 3.94, MSE2 = 0.39; min F′(1, 8) = 1.51, MSE = .87). There was no interaction of Browser × Test Session (F1(1, 4) = 0.22, MSE1 = 1.54; F2(1, 4) = 6.82, MSE2 = 0.02; min F′(1, 4) = .22, MSE = 1.54). As with the performance times, there was a three-way interaction of Browser × Scent × Test Session, with the Hyperbolic showing much greater reduction in fixations on the Low AOS tasks over sessions compared to the High AOS tasks and compared to the reduction in fixations across sessions for the VFM on either kind of task (F1(1, 4) = 31.30, MSE1 = 0.03, p1 < .01; F2(1, 4) = 71.82, MSE2 = 0.02, p2 < .01; min F′(1, 7) = 21.80, MSE = 0.05, p < 0.01). There were no other main effects or interactions.

The pattern of results for fixations does not entirely match the pattern of results for performance times. In particular, the significant superiority of the Hyperbolic over the VFM in performance times is not reflected in a significant reduction of fixations in the Hyperbolic versus the VFM. Consequently, we also examined the duration of fixations.

Fixation Durations. Users showed significantly shorter eye fixations when using the Hyperbolic (GM = 74 msec) as opposed to the VFM (GM = 102 msec; F1(1, 4) = 34.73, MSE1 = 2.61, p1 < .01; F2(1, 4) = 361.84, MSE2 = 0.24, p2 < .01; min F′(1, 5) = 31.68, MSE = 2.85, p < 0.01). There was also a marginally significant reduction in fixation durations for Low AOS tasks (GM = 85 msec) versus High AOS tasks (GM = 89 msec), with a significant F1(1, 4) = 10.05, MSE1 = 0.125, p1 < .05, a marginally significant F2(1, 4) = 5.50, MSE2 = 0.26, p2 = .08, and a marginally significant min F′(1, 7) = 3.55, MSE = 0.39, p = 0.1. There were no other main effects or interactions in the analysis of fixation durations. Although there was no significant difference in number of fixations between the Hyperbolic and VFM, this analysis showed that the fixation durations were significantly shorter when using the Hyperbolic.

Number of Nodes Visited. Figure 17 presents the mean number of tree nodes visually searched by users in the two browsers across tasks at the two levels of information scent. On a single task, users may visually search nodes presented across many different views of the tree. The data in Figure 17 count all unique tree nodes visited across all views of the tree in a task. Overall, the VFM eye movements were less dispersed over the tree structure than the Hyperbolic Tree eye movements. VFM users visually searched M = 30.3 nodes whereas Hyperbolic users visually searched M = 57.0 nodes, which was significant (F1(1, 4) = 25.04, MSE1 = 691.04, p1 < .01; F2(1, 4) = 150.09, MSE2 = 117.80, p2 < .01; min F′(1, 5) = 21.46, MSE = 809, p < 0.01). The interaction of Browser × Scent that is apparent in Figure 17 is significant by F1(1, 4) = 30.58, MSE1 = 285.81, p1 < .01, marginally significant by F2(1, 4) = 5.66, MSE2 = 117.77, p2 = .08, and marginally significant by min F′(1, 5) = 4.78, MSE = 404, p = 0.08. There were no other significant main effects or interactions. Hyperbolic users visually searched more of the tree structure than VFM users while performing the tasks at a faster rate.

Rate of Downward Tree Search (Scent-Following). In our hand analysis of fixation paths in Experiment 1, we found that Hyperbolic users were moving from the root of the tree down to the leaves at a faster rate than the File Browser users. In Experiment 2, the average number of levels traversed down the tree in a single move was: Hyperbolic mean = 1.3 levels and VFM mean = 1.1 levels, which is a significant difference (F(1, 106) = 9.13, MSE = 0.10, p < 0.005). These values are also very close to those obtained in Experiment 1. Users of the Hyperbolic are able to visually search the tree in bigger jumps than users of the VFM because they can see ahead. This is one reason why searches are faster in the Hyperbolic tree if the information scent is strong.

Fig. 17. Number of nodes visited in Experiment 2 as a function of browser and information scent.

Eye Movements in the Context Area of the Hyperbolic Tree. The results of Experiment 1 suggested that under low information scent (low AOS) conditions Hyperbolic users seemed to be detrimentally affected by the visual density of the context area.

To test this hypothesis we divided the Hyperbolic display into (a) a central (Focus) region and (b) a peripheral (Context) region. The Focus was defined by a circle with a radius equal to 0.8 of the radius for the entire display, and the Context was the display region outside the Focus. This radius, in addition to all fixation movement distances, was computed from the ISCAN eye-tracker coordinate system. The coordinate system of the ISCAN eye-tracker is a 511 × 511 grid overlaid on the 19-inch effective display area of the computer used by participants. Consequently, 1 inch = 38 ISCAN units (approximately). We selected fixation-to-fixation movements that terminated in the peripheral Context area. As hypothesized, in the High AOS condition, fixation movements were longer than in the Low AOS condition: Low AOS movements had a median length of 7.21 (eye-tracker distance units), and High AOS movements had a median length of 8.94 (eye-tracker distance units). This was about a 25% increase in the length of fixation-to-fixation movements with increased information scent, which was statistically significant (F(1, 7) = 10.76, MSE = .062, p < 0.02).
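The region split and movement-length measure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the instrumented analysis used in the study: it assumes the display circle is centered on the 511 × 511 eye-tracker grid with a radius of half the grid, and all function names and sample fixations are hypothetical.

```python
import math
from statistics import median

GRID_SIZE = 511          # ISCAN coordinate grid (from the text)
FOCUS_FRACTION = 0.8     # Focus radius as a fraction of the display radius
UNITS_PER_INCH = 38      # approximate conversion given in the text

# Assumptions: display circle centered on the grid, radius = half the grid.
CENTER = (GRID_SIZE / 2, GRID_SIZE / 2)
DISPLAY_RADIUS = GRID_SIZE / 2


def region(x, y):
    """Classify a fixation as falling in the Focus or the Context region."""
    distance = math.hypot(x - CENTER[0], y - CENTER[1])
    return "focus" if distance <= FOCUS_FRACTION * DISPLAY_RADIUS else "context"


def context_movement_lengths(fixations):
    """Lengths of fixation-to-fixation movements that end in the Context."""
    lengths = []
    for (x0, y0), (x1, y1) in zip(fixations, fixations[1:]):
        if region(x1, y1) == "context":
            lengths.append(math.hypot(x1 - x0, y1 - y0))
    return lengths


def to_inches(units):
    """Convert eye-tracker distance units to (approximate) inches."""
    return units / UNITS_PER_INCH


# Hypothetical fixation sequence in eye-tracker coordinates.
fixations = [(255.5, 255.5), (255.5, 500.0), (262.5, 500.0), (255.5, 255.5)]
lengths = context_movement_lengths(fixations)
print("median context movement:", median(lengths), "units")
print("reported Low AOS median in inches:", to_inches(7.21))
```

In this sample the last movement returns to the center of the display and is therefore excluded, since only movements terminating in the Context region are counted.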

4.3 Summary

In Experiment 2, we used a more restricted set of tasks than Experiment 1 and developed a semiautomatic method of analyzing visual search over browser interfaces. In Experiment 2, we found the following.

—The Hyperbolic Tree browser yielded faster overall search performance than the VFM.

—Hyperbolic users examined more of the tree nodes at a faster rate than VFM users.

—On the Low AOS tasks Hyperbolic users showed greater practice effects than VFM users. One explanation would be that the Hyperbolic users were learning more of the tree structure and benefited more from repeating low AOS tasks than the VFM users. This seems plausible given that Hyperbolic users are visually attending to more nodes.

—Under low information scent conditions Hyperbolic Tree users examined more tree nodes than VFM users.

—It appears that visual attention can be adversely affected by focus + context distortion techniques, under conditions of low information scent.

Under conditions of low information scent, Hyperbolic Tree users appeared to perform less efficient search of the denser context areas, which is in line with the results of Drury and Clement [1978] (see Figure 4 above). Under low information scent conditions, Hyperbolic Tree users examined many more tree nodes than VFM users, which is in line with our analysis of the interaction of the number of visual items to choose from (the branching factor b) and information scent (the false alarm factor f) discussed in connection with the analysis of hierarchical search in Figure 3.

We should be clear about the nature of our findings regarding the superiority of the Hyperbolic Tree browser in Experiment 2. The results of Experiment 1 were very much in line with the results of earlier studies [Czerwinski and Larson 1997; Lamping et al. 1995], which were unable to find superior performance for the Hyperbolic Tree browser. In Experiment 2, we studied simple retrieval tasks at extreme ranges of information scent, and under these conditions we were able to find superior performance for the Hyperbolic Tree browser. Experiment 2 suggests that performance with the Hyperbolic Tree is greatly enhanced by clear information scent cues and greater practice. In general, the superiority (or lack of superiority) of the Hyperbolic Tree over conventional tree browsers depends on task conditions such as type of task (e.g., comparison or retrieval) or information scent cues. Note that in all analyses of performance time or fixations in Experiments 1 and 2, the effects due to tasks are much larger than the effects due to differences in browsers.

5. GENERAL DISCUSSION

Focus + context techniques attempt to deliver more information into the span of human attention. One important subclass of such methods uses distortion of the display to achieve this effect. The Hyperbolic Tree browser, as an instance of these methods, showed some superior aspects relative to the more conventional File Browser and VFM Browser under certain task conditions, but not all. On retrieval tasks, which required users to simply find a specific labeled node in a tree, the Hyperbolic Tree yielded better performance times and stronger practice effects compared to the VFM Browser in Experiment 2. On comparison tasks, which required users to assess and compare information from different parts of a tree, there were no performance differences between the Hyperbolic Tree and File Browsers in Experiment 1. On retrieval tasks, visual search is faster overall in the Hyperbolic Tree than the VFM Browser, in terms of number of unique nodes fixated per unit time. On retrieval tasks, when there are strong information scent cues, Hyperbolic Tree users can search and navigate through the tree structure to the information they seek at more than twice the rate of users of the VFM Browser. Poor information scent cues, however, cause visual search and navigation in the Hyperbolic to become much less efficient. Under poor information scent, the density of visual items in the context portions of the Hyperbolic Tree appears to have a detrimental effect on the visual search and navigation process.

The effect of information scent on the use of the Hyperbolic Tree leads to a reexamination of some of its underlying design assumptions. The Hyperbolic Tree, like many other information visualizations, seems to assume that “squeezing” more information into the display “squeezes” more information into the mind. The studies reported here suggest that this simple assumption is probably wrong. Visual attention and visual search interact in complex ways with the density of information on the display as well as task-specific factors such as information scent or “pop out” of information from the display. In the case of the Hyperbolic Tree browser, there are a number of intuitive design improvements that make sense in light of our finding about the role of information scent. Providing landmarks is often proposed as a way of aiding navigation. Commercial versions of the Hyperbolic Tree use color coding to identify subtrees and landmarks. Both of these intuitive design improvements have the effect of improving the “pop out” of information scent or making the information scent more discriminable.

There appear to be countervailing processes affecting visual attention in focus + context displays: (1) strong information scent improves visual search, whereas (2) crowding of targets in the compressed region of the focus + context display degrades visual search. The effects of information scent and the density of visual information have also been noted in usability studies of World Wide Web sites [User Interface Engineering 1999]. Generally, WWW pages that provide links with high information scent (relative to some common user task associated with the site) lead to higher rates of task success and higher user satisfaction ratings. WWW pages that provide a higher density of high scent links often have higher task success rates and higher satisfaction ratings than lower density pages. However, if the density is extremely high then success rates and satisfaction ratings go down. In other words, information scent seems to have an effect on the WWW, density has an effect on the WWW, and there is an interaction of density and information scent effects on the WWW. Further empirical studies of information visualizations, informed by basic research on visual search and visual attention, may provide more complex formal models from which new design principles may emerge.

ACKNOWLEDGMENTS

We thank Ramana Rao and the Inxight Corporation for their cooperation in providing the Hyperbolic and VFM source code. We would like to thank Paul Whitmore for his assistance in carrying out Experiment 1, Gwyneth Card for her assistance in coding the eyepath data in Experiment 1, and Ahna Girshick for developing the instrumented Hyperbolic and VFM browsers. We thank John R. Anderson for suggesting one of the analyses of individual differences.

REFERENCES

BELL, W. J. 1991. Searching Behavior: The Behavioral Ecology of Finding Resources. Chapman and Hall, London.

CARD, S., PIROLLI, P., VAN DER WEGE, M., MORRISON, J., REEDER, R., SCHRAEDLEY, P., and BOSHART, J. 2001. Information scent as a driver of Web behavior graphs: Results of a protocol analysis method for Web usability. In Proceedings of Human Factors in Computing Systems (Seattle, WA).

CARD, S. K., MACKINLAY, J. D., and SHNEIDERMAN, B. 1999. Information Visualization: Using Vision to Think. Morgan-Kaufmann, San Francisco.

CARD, S. K., ROBERTSON, G. G., and MACKINLAY, J. D. 1991. The information visualizer: An information workspace. In Proceedings of Conference on Human Factors in Computing Systems, CHI ’91 (New Orleans).

CHI, E., PIROLLI, P., CHEN, K., and PITKOW, J. E. 2001. Using information scent to model user needs and actions on the Web. In Proceedings of Human Factors in Computing Systems, CHI 2001 (Seattle).

CHI, E., PIROLLI, P., and PITKOW, J. 2000. The scent of a site: A system for analyzing and predicting information scent, usage, and usability of a Web site. In Proceedings of Human Factors in Computing Systems, CHI 2000 (The Hague).

CHUN, M. M., and WOLFE, J. M. 1996. Just say no: How are visual searches terminated when there is no target present? Cogn. Psychol. 30, 39–78.

CLARK, H. H. 1973. The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. J. Verbal Learn. Verbal Behav. 12, 335–359.

CZERWINSKI, M. and LARSON, K. 1997. The new Web browsers: They’re cool but are they useful? In People and Computers XII: Proceedings of the HCI ’97 Conference, H. Thimbleby, B. O’Conaill, and P. Thomas, Eds. Springer Verlag, Berlin.

DRURY, C. G. and CLEMENT, M. R. 1978. The effect of area, density, and number of background characters on visual search. Hum. Factors 20, 597–602.

FURNAS, G. W. 1997. Effective view navigation. In Proceedings of Human Factors in Computing Systems, CHI ’97 (Atlanta).

HUBERMAN, B. A. and HOGG, T. 1987. Phase transitions in artificial intelligence systems. Artif. Intell. 33, 155–171.

LAMPING, J. and RAO, R. 1994. Laying out and visualizing large trees using a hyperbolic space. In Proceedings of UIST ’94 (Marina del Rey, CA).

LAMPING, J., RAO, R., and PIROLLI, P. 1995. A focus + context technique based on hyperbolic geometry for visualizing large hierarchies. In Proceedings of CHI ’95, ACM Conference on Human Factors in Computing Systems. ACM, New York.

LOGAN, G. D. 1996. The CODE theory of visual attention. Psychol. Rev. 103, 603–649.

MACKINLAY, J. D., ROBERTSON, G. G., and CARD, S. K. 1991. The perspective wall: Detail and context smoothly integrated. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. ACM, New York.

MULLET, K., FRY, C., and SCHIANO, D. 1997. On your marks, get set, browse! In Proceedings of Human Factors in Computing Systems, CHI ’97 (Extended Abstracts, Atlanta).

PIROLLI, P. 1997. Computational models of information scent-following in a very large browsable text collection. In Proceedings of the Conference on Human Factors in Computing Systems, CHI ’97 (Atlanta).

PIROLLI, P. 1998. Exploring browser design trade-offs using a dynamical model of optimal information foraging. In Proceedings of the Conference on Human Factors in Computing Systems, CHI ’98 (Los Angeles).

PIROLLI, P. and CARD, S. K. 1999. Information foraging. Psychol. Rev. 106, 643–675.

PIROLLI, P., CARD, S., and VAN DER WEGE, M. 2001. Visual information foraging in a focus+context visualization. In Proceedings of the Human Factors in Computing Systems, CHI 2001 (Seattle).

RAAIJMAKERS, J. G. W., SCHRIJNEMAKERS, J. M. C., and GREMMEN, F. 1999. How to deal with “The language-as-fixed-effect fallacy”: Common misconceptions and alternative solutions. J. Memory Lang. 41, 416–426.

RESNIKOFF, H. L. 1989. The Illusion of Reality. Springer-Verlag, New York.

ROBERTSON, G. G., CARD, S. K., and MACKINLAY, J. D. 1993. Information visualization using 3D interactive animation. Commun. ACM 36, 57–71.

STEPHENS, D. W. and KREBS, J. R. 1986. Foraging Theory. Princeton University Press, Princeton, N.J.

TULLIS, T. S. 1985. A computer-based tool for evaluating alphanumeric displays. In Interact ’84, B. Shackel, Ed. Elsevier Science, Amsterdam, 719–723.

USER INTERFACE ENGINEERING. 1999. Designing Information-Rich Web Sites. Author, Cambridge, MA.

WARE, C. 1999. Information Visualization: Perception for Design. Morgan-Kaufmann, San Francisco.

WOLFE, J. M. 1994. Guided Search 2.0: A revised model of visual search. Psychonom. Bull. Rev. 1, 202–238.

WOLFE, J. M. 1998. What can 1 million trials tell us about visual search? Psychol. Sci. 9, 33–39.

WOODRUFF, A., FAULRING, A., ROSENHOLTZ, R., MORRISON, J., and PIROLLI, P. 2001. Using thumbnails to search the Web. In Proceedings of the Conference on Human Factors in Computing Systems, CHI 2001 (Seattle).

Received March 2001; revised August 2002; accepted October 2002
