PRELIMINARY PROPOSAL


Dissertation Preliminary Proposal:

Multi-Modal Cues for Enhancing Web Search Memory

BY

Cynthia Ann Sikora




Jose Perez-Carballo _____________________

date

Linda Roberts _____________________

Outside member date

Zenon Pylyshyn _____________________

Outside member date

Nick Belkin _____________________

Chair date

Nick Belkin _____________________

Director date



Fall 1999












Abstract

People need to find things and they must be able to move from one thing to another and return to things they have already seen. If there is a direct analogy between navigating physical space and information space, information systems can be improved by using the same cues used to navigate in the natural world. In order to investigate the analogy between physical and information spaces, this dissertation explores the role of providing structural representation and multi-modal cues to improve web search memory during navigation of information spaces. In particular, I am going to investigate the appropriateness of providing hierarchical or linear displays of history information with associated thumbnail views and unique sounds.

Introduction

A woman goes to a shopping mall to buy her mother the perfect birthday gift. She is not sure what the gift will be, but expects to know it when she sees it as she wanders around the stores in the mall. After visiting department stores, clothes stores, jewelry stores, variety stores and specialty shops, she has browsed through dozens of stores. The search ends without the perfect gift. However, she does recall several items that may be appropriate. Unfortunately, she must return to the stores where they were located to decide which one to purchase. How does the woman know which stores to go back to? What kind of mental model might the woman have used to represent the relationships or associations between the stores she visited? Was it a linear representation based on the order in which she visited the stores? Perhaps she uses a categorical or hierarchical representation based on the store type or the merchandise sold there. What other ambient cues might assist her while trying to find the stores with the previously seen items? Will seeing the store colors or hearing a nearby merry-go-round provide cues?

Let us explore a similar example. A woman logs on to a web browser to search for the perfect birthday gift to buy for her mother. She is not sure what the gift will be, but expects to know it when she sees it as she clicks on various links that take her to different types of online stores. After visiting online department stores, clothes stores, jewelry stores, variety stores and specialty shops, she has browsed through dozens of e-commerce websites. The search ends without the perfect gift. However, she does recall several items that may be appropriate. Unfortunately, she must return to the websites where they were located to decide which one to purchase. How does the woman know which websites to go back to? What kind of mental model might the woman have used to represent the relationships or associations between the websites she visited? What cues or tools were available to assist her in her navigation effort?

The first example addresses issues associated with navigating through physical space and the cues that assist people. There is an established literature in psychology on navigating through physical space and the cues that are used (Fitz, 1998). The second example addresses the problems associated with navigating through information spaces. People need to find things and they must be able to move from one thing to another and return to things they have already seen. The analogy between physical spaces and information spaces seems intuitive; however, some researchers have argued that a physical metaphor is inappropriate (Stanton & Baber, 1994). Some studies have been able to show benefits of tools typically used in physical space (e.g., maps) when navigating in information space (Fitz, 1998). The question of the extent of that analogy remains open. If there is a direct analogy between navigating physical space and information space, information systems can be improved by using the same cues used to navigate in the natural world. The characteristics of memory and attention that support people in their navigation of physical space should also support them in navigating information spaces.

Can the multi-modal cues and mental models that help us navigate through the physical world assist in our efforts to navigate through the information space we wander through while searching for information? In order to investigate the analogy between physical and information spaces, this dissertation explores the role of providing structural representation and multi-modal cues to improve web search memory during navigation of information spaces. In particular, I am going to investigate the appropriateness of providing hierarchical or linear displays of history information with associated thumbnail views and unique sounds. To the extent that these cues can assist in navigating information space, support for a direct analogy between physical and information space navigation will be demonstrated.

Literature Review

Research is reviewed in the areas of information and physical spaces, web navigation, representational structures and multi-modal cues. The research on information and physical spaces investigates existing support for an analogy between the two types of spaces. The section reviewing web navigation research demonstrates the current difficulties experienced by people searching information spaces without explicit navigation cues. Research on representational structure is reviewed to examine the use of linear or hierarchical representations when navigating through information space. Research on multi-modal cues explores the usefulness of these cues in physical spaces. The effectiveness of visual and auditory cues in physical space demonstrates the applicability of these cues in an information space, if indeed, the two spaces are analogous.

Information and Physical Spaces

There are many references in the literature to the metaphor of navigating through physical space when discussing navigation through information space. Phillip (1994) argues that “the same psychological processes used in navigating physical space can be used in navigating electronic information space” (p. xiii). It is suggested that spatial navigation involves understanding our location in space and using that knowledge to move through space. One argument for the analogy between physical and information spaces is the use of maps in both cases.

Phillip (1994) demonstrated that when subjects were provided with a spatial navigation interface for an information space they were able to construct a cognitive map. Subjects were able to draw from the knowledge of the cognitive map to perform tasks involving representing the information and finding alternate routes through the information space. The author concluded that spatially-based cognitive maps can be used to solve problems in electronic information spaces. This research supports the analogy between physical and electronic information spaces.

Phillip (1994) suggests that “it is critical for designers of computer systems and applications to consider metaphors in designing the interface if the system is to be easily used” (p.20). Metaphors allow people to bring prior skills and knowledge to bear when using the interface. This will happen whether the designer intends it to or not if a metaphor is assumed or inferred. To the extent that people naturally conceptualize information spaces to be analogous to physical spaces, it is important to capitalize on that expectation. Providing navigation cues that occur naturally in the physical world in the interface to an information space allows users to draw on pre-existing knowledge of how to navigate in physical space.

Although maps are used in both physical and information spaces, Gibson (1979) argues that cognitive maps are not needed for navigation in physical space. He suggests that the natural world provides sufficient perceptual cues to direct our movements in physical space. Information space does not provide rich perceptual cues, which leads users to construct conceptual maps of the space. This may suggest that providing sufficient perceptual cues within an information space would lead users to be less reliant on cognitive maps, which are often inaccurate.

Web Navigation

Navigation in the World Wide Web has been identified as an issue of fundamental importance. There is a great advantage of being able to immediately traverse information links. However, the disadvantage is that users experience difficulty trying “to maintain an intuitive sense of where they are, and how they got there” (Forsythe, Grose and Ratner, 1998, p.225). This confusion leads to a sense of disorientation within the information space. Helander, Landauer and Prabhu (1997) describe disorientation as the common difficulty experienced by users of web information spaces. Often users express concern about where they are in information space and feel that they must read each page that is relevant for fear they will not be able to get back to it. Four types of disorientation are distinguished: not knowing where to go next, not knowing where one is in the information space, not knowing how one arrived at the current point, and knowing where the information is, but not knowing how to get there.

Forsythe, Grose and Ratner (1998) attribute the disorienting effect on users to the transition from a windowed interface to a hypertext system. Typically, windowed environments show only a few pages at a time, each in a separate window, with no information about the relationship between the windows. Shubin and Meehan (1997) assert that the model of navigation on the web conflicts with people’s mental model. Their mental model is based on a history of using other platforms, such as Windows, Macintosh and Unix. The difference in the model of navigation in the web causes users to get lost and confused. One of the biggest problems with the web is “knowing where you are and where you are going” (Shubin and Meehan, 1997, p.14). The less structured nature of the web makes it easy to get lost. Users follow links to explore new topics and forget what they were doing or where they were. It is suggested that a clear structure should be used such as a hierarchy or star pattern. It is also suggested that clear and consistent navigational aids be employed. This research argues for providing cognitive representations of information space. Without an explicit representation provided, the user will create their own cognitive representation, which will by nature reflect their experience with other systems. The users have a high probability of constructing a misleading or inappropriate mental map of the information space.

Navigating through the hypertext structure of the Web requires subjects to traverse links to get to the desired node (Helander, Landauer and Prabhu, 1997). The link-based navigation of the Web is easy for people to understand and they are quick to take advantage of exploring the links to other pages. Unfortunately, accessing information and keeping track of where pages are is often confusing and difficult. Helander, Landauer and Prabhu (1997) reported that, on a WWW user survey, more than a third of the people thought “finding known information” was a serious problem. “Being able to find Web pages already visited” was also identified as a big problem by 13.41 percent of the WWW users surveyed (p. 903). Navigation was reported as a significant problematic aspect of the Web, second only to response time.

Another area of confusion for users of the web involves the way information access features in browsers have traditionally been constructed. Cockburn and Jones (1996) describe the navigation options in web browsers. Users may enter the information space in many ways based on the features of the system. Once in the information space they may revisit or recall pages. To revisit a page, the user explicitly reloads the page with the reload button. Revisiting a page refers to actually going out to the source location and displaying an updated view. To recall a page, the user navigates through previously visited pages. To navigate through previously seen pages, most browsers include something equivalent to a backward button, a forward button and a history list of pages visited. A recalled page can be displayed from the local cache rather than going to the source location.

There is some variability from one application to the next, and from feature to feature within the applications, but there continues to be confusion regarding how to access previously visited pages. Unfortunately, the list of recently visited pages that the navigation features traditionally use to recall pages is not ordered chronologically by when the pages were visited. The most recently loaded page is considered the top page in a stack of pages. The page at the bottom of the stack was least recently loaded. This list of recallable pages is not necessarily a stack of all the pages previously visited. If a page is loaded when the user is not at the top of the stack, everything above the user’s current position in the stack is removed from the list. Once those pages are lost from the list, there is no navigation feature that allows users to find them again. People who assume the list is a chronological history of the pages they visited find navigation difficult and confusing with the stack-based list (Cockburn and Jones, 1996). The weight of the problem of the stacked list is made clear by research showing that the back button is the most commonly used function for navigating through previously seen pages. Not including selecting links, the back button is responsible for 41% of all page requests. The history function is not frequently used for accessing pages (Catledge and Pitkow, 1995).
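The stack behavior described above can be illustrated with a minimal sketch. It is a simplified model written for this discussion, not the implementation of any particular browser; the class name StackHistory, its methods and the page labels "A" through "D" are hypothetical.

```python
class StackHistory:
    """Hypothetical model of a stack-based browser history list."""

    def __init__(self):
        self.pages = []      # recallable pages, bottom of the stack first
        self.position = -1   # index of the page currently displayed

    def load(self, url):
        # Loading a page discards everything above the current position.
        del self.pages[self.position + 1:]
        self.pages.append(url)
        self.position = len(self.pages) - 1

    def back(self):
        # Move one step down the stack, stopping at the bottom.
        if self.position > 0:
            self.position -= 1
        return self.pages[self.position]


def session_demo():
    """Visit A, B, C, go back twice, then load D (assumed example session)."""
    history = StackHistory()
    chronological = []  # the full visit order many users assume they will see
    for url in ["A", "B", "C"]:
        history.load(url)
        chronological.append(url)
    chronological.append(history.back())  # back to B
    chronological.append(history.back())  # back to A
    history.load("D")                     # B and C are dropped from the stack
    chronological.append("D")
    return history.pages, chronological
```

In this assumed session the stack ends up holding only A and D, while the chronological record holds A, B, C, B, A, D. Pages B and C were visited but can no longer be recalled through the list, which is the mismatch with users’ expectations that the studies above report.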

Cockburn and Jones (1996) found that subjects were unable to predict accurately whether or not pages could be recovered from the history list. When asked to describe the structure of the history function, most subjects referred to it as a list. Considering the history structure as a list is consistent with the inference made by the “Back” and “Forward” buttons. This led the researchers to construct a graphical representation to display a visual memory of the information of interest. It provides a graphical overview of the web subspace through which the user is navigating. They believe there is currently not enough of a context within web subspaces to support the users’ awareness of where they are. The use of graphical or spatial overviews to assist users in understanding where they are within hypertext was recommended. This is consistent with many researchers’ views in the area of web navigation. It has been made clear throughout the research that a cognitive representation must be supplied to ensure that the user is operating from the most appropriate perspective.

Tauscher and Greenberg (1997a) discuss the advantages of reducing cognitive and physical overhead when navigation back to previously visited pages can be accomplished through a history function rather than repeating the search. It was found that the majority of the visits to Web pages are revisits. They point out that the traditional history functions available do not meet the needs of the users for the way they want to use them. The stack-based method of providing history in browsers was found to be inferior to providing a list of recently visited sites with the duplicates removed. It is suggested that a graphical overview map or a persistent index page may reduce the excessive backtracking that users engage in. The reasons people give for revisiting Web pages include changes in the information the pages contain, a wish to explore the page further, use of the page for a special purpose, authorship of the page, and the page lying on the path to another page. The six most recently visited pages have the highest probability of being the next page visited (Tauscher and Greenberg, 1997b). One example of a methodology for having subjects find a previously visited page was to have them browse until they found something interesting and then move them to a different system and ask them to find the same page (Chen, Housten, Sewell and Schatz, 1998).
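A list of recently visited pages with the duplicates removed can be sketched as follows. This is an illustration under assumptions of my own: the list is ordered most recent first, each page appears at the position of its latest visit, and the function name recency_list and its limit parameter are hypothetical rather than taken from the cited work.

```python
def recency_list(visits, limit=None):
    """Return visited pages, most recent first, with duplicates removed.

    visits is the chronological sequence of page visits (oldest first).
    limit optionally truncates the list, e.g. to the six most recent pages.
    """
    seen = set()
    result = []
    for url in reversed(visits):
        if url not in seen:   # keep only the latest visit of each page
            seen.add(url)
            result.append(url)
    return result if limit is None else result[:limit]
```

For the visit sequence A, B, C, B, A, D this yields D, A, B, C. Unlike a stack-based list, every visited page remains reachable, and a limit of six would correspond to the finding that the most recently visited pages are the likeliest next destinations.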

Dillon (1995) attempted to identify characteristics of presenting information spaces that provide users with a sense of location and order. It is suggested that current research is emphasizing the issues of navigation, salience and structure in the World Wide Web environments. The need to navigate complex information spaces is given as a primary source of cognitive overhead for web users. This overhead is thought to cause the frequent findings of disorientation experienced by users. It is thought that navigating through an information space can be treated as analogous to moving through physical space and that both present a need to develop a sense of orientation. The best treatment of navigation is to study users’ perceptions of the structure of information space.

There have been many efforts to provide representations of information space to assist users with navigation and orientation. Examples of interfaces designed to provide a graphical representation of the relationships of the items include WebMap and WebBook (Doemel, 1994; Card, Robertson and York, 1996). To address the difficulty in visualizing links back to the starting page, representations of hierarchical structures are often used (Mukherjea, Foley and Hudson, 1995; Furnas and Zacks, 1994). Attempts to show detail while maintaining context within a representation of information space have included fisheye views (Furnas, 1986; Sarkar and Brown, 1994), three dimensions (MacKinlay, Robertson and Card, 1991), hyperbolic representations (Lamping, Rao and Pirolli, 1995), animation (Chang and Ungar, 1993) and zooming (Bederson and Hollan, 1994; Bederson, Hollan, Perlin, Meyer, Bacon and Furnas, 1996; Perlin and Fox, 1993).

Furnas and Zacks (1994) use multitrees as a structure for representing information. Multitrees are graphs of information spaces with easily identifiable substructures, each of which has a tree structure. It is suggested that the multitree graph provides hierarchical context for information and that people are familiar with tree-based graphical interactions. Many of the graphs used to represent information space in hypertext systems allow many routes between elements, but the structures are not easily laid out. It is easy for users to get lost with these fully general graphs. Standard tree graphs are limited to traversing to each node by a single connection. The tree representation of information space does not account for shortcuts or alternative organizations (Furnas and Zacks, 1994).

Tauscher and Greenberg (1997a) tested multiple structural representations. The two hierarchical methods had the best performance, but there is concern regarding the increased cognitive and physical effort with hierarchical structures. One of the methods was WebNet, which displays a scrollable graphical overview of the information sub-space visited. Page titles are used to label nodes that appear and a line connecting the source and destination nodes represents the navigation. The other method was MosaicG, which provided the pages a user had visited in a two-dimensional tree structure. It provided titles, URLs and thumbnail images of the pages. The flow from left to right and the visual cues were intended to provide a spatial and temporal context for navigating.

The use of hierarchical representation continues to be pursued with different characteristics to improve the use and understanding of the structural representation of the information space. Pirolli and Card (1995) discuss Information Foraging Theory as it relates to following an information scent. The scent refers to the amount of information the user can derive about the relative location of the target from the design of the information structure. Category labels and number of levels can influence the scent of information within a hierarchical structure. It is suggested that structures offering a stronger scent of information at the top levels of the hierarchy lead to better performance in search tasks. An information interface is discussed that presents an overview of the documents in a cluster hierarchy. It is automatically computed and users can navigate through the space.

Mukherjea, Foley and Hudson (1995) advocate providing the user with multiple hierarchical representations each giving different perspectives of the underlying information space. It is suggested that it is very difficult to visualize and comprehend diagrams representing complex network structures. “One of the best ways to comprehend a large complicated information structure is to form multiple simpler structures each highlighting different aspects of the original structure” (Mukherjea, Foley and Hudson, 1995, p.336).

The importance of the issue of finding previously seen sites is illustrated in research conducted by Tauscher and Greenberg (1997). By analyzing six weeks of data detailing use of a commonly used Web browser by 23 users, they were able to determine that 58% of the Web sites that people visit are re-visited. Research has also shown that navigation problems in using Web applications have been associated with designing the application interface based on incorrect understanding of the users’ mental model (Cockburn and Jones, 1996). It is necessary to investigate the user needs and mental models associated with Web-based navigation during information searching.
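One plausible way to compute a revisit figure of this kind from a browsing log is sketched below. The definition used here, the fraction of page visits that return to an already-seen URL, is an assumption made for illustration; the text above does not state the exact formula behind the 58% figure, and the function name recurrence_rate is hypothetical.

```python
def recurrence_rate(visits):
    """Fraction of visits that return to a URL already seen in the log.

    Assumed definition: (total visits - distinct URLs) / total visits.
    """
    if not visits:
        return 0.0
    return (len(visits) - len(set(visits))) / len(visits)
```

Under this assumed definition, the sequence A, B, C, B, A, D contains six visits to four distinct pages, so two of the six visits, about 33%, are revisits.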

Very few empirical studies have been conducted to identify the usability problems related to Web browser interfaces (Helander, Landauer and Prabhu, 1997). Alternatives to the problems clearly need to be studied. Specifically, the navigation model for “history” in most browsers is stack based and not based on chronological or reverse chronological order. Studies by Cockburn and Jones (1996) and Tauscher and Greenberg (1997a) have shown that most users do not understand how the “history” function works or what model was used to include or remove the Web pages they visited.

Representational Structures

The use of cognitive representations for navigation has been identified as one of multiple methods animals use to find their way back to previously visited places. Many animals use a cognitive map, which is a representation of the spatial relationships of landmarks and surfaces with the goal place. Animals use these relations to navigate more than they rely on scent trails. Evidence suggests that rats use the geometric relationships between the goal place and the general shape of the environment to navigate (Cheng, 1986). Spatial relationship research asserts that people develop cognitive representations of their environment. These representations are constructed by obtaining knowledge about the spatial setting by navigating through it, getting descriptions of it or using geographical maps (Wender, Wagener-Wender and Rothkegel, 1997). However, without an explicit representation of the information space, people are likely to make the same geometric relationship mistakes that rats do. When traversing links on the web, the shape of the space and the relationships between sites can look similar even when users are not where they think they are. This disorientation is particularly salient when navigating back through previously seen pages, because the same page can be represented multiple times. One instance of a page could confuse someone into thinking they know where they are when they do not.

Additional evidence for the construction of cognitive representations is provided by Morrow, Greenspan and Bower (1987) who found a tendency for people to develop situation models, a kind of cognitive map, for understanding narratives. These models tend to be organized around the sequence of events occurring in the story. By probing for information about the story as the situation model was being constructed, the researchers were able to determine that objects most recently discussed in the story were most easily accessible by subjects. This research reinforces the use and existence of cognitive maps and provides evidence for trying to promote the formation of appropriate representations of information space. However, some research argues that the cognitive maps available do not include spatial relationships. Subjects tend to use a route-map or plan to navigate through physical space. The order of locations is clear, but an accurate representation of how the individual components relate is not available (Gruneberg, Morris and Sykes, 1978).

It is important to recognize not only that cognitive representations are created and used, but also how they are used and what their limitations are. Farrell and Robertson (1998) consider the impact on the cognitive representation when the spatial relationship between the person and the world changes as a person moves around the world. The person updates their frame of reference continuously as their position in the world changes. People orient to their surroundings and update their orientation as they move through space. They integrate the information they collect as they move through space to construct integrated information about their surroundings. In the absence of direct perceptual information, people use symbolic media to construct spatial knowledge (Presson, DeLange and Hazelrigg, 1989). The spatial representation constructed is often referred to as the cognitive map. This research suggests that in the absence of an explicit structural representation of the information space, users will construct spatial knowledge that may be inappropriate and cause errors or confusion.

There is variability in the research concerning the types of cognitive representations created and those that are considered best. Wender, Wagener-Wender and Rothkegel (1997) looked at whether the temporal sequence of learning determines the mental representation of the spatial configuration, given that when learning a new environment, knowledge about spatial relations is necessarily acquired in one particular temporal order. They found that there is an influence of temporal order of presentation. People use routes to aid memory to construct a cognitive schema when learning a new spatial configuration. “Schemata like these help people to memorize locations and relocate objects” (Wender, Wagener-Wender and Rothkegel, 1997, p.270).

Abrams, Baecker and Chignell (1998) tried to understand the cognitive representations that people have by looking at how they organized their bookmarks of web pages. Thirty-seven percent of users do not organize their bookmarks at all. This method of maintaining the list of bookmarks in the order they were created was particularly popular with those who had fewer than 35 bookmarks. Users with more than 100, but fewer than 300, tended to organize the bookmarks into single-tiered sets of folders. The proportion of users who organized the bookmarks into hierarchies, that is folders within folders, varied depending on the number of bookmarks. Approximately a quarter of the users having 26-100 or 101-300 bookmarks used a hierarchical organization. Nearly half of the users with 300+ bookmarks used the hierarchical structure. The list structure was associated with the least experienced users. The hierarchical structure was popular with the much more experienced users. The use of the hierarchical structure has the difficulty of remembering where the item is stored within the structure. “Finding an item in a deeply nested hierarchy is often challenging for users” (Abrams, Baecker and Chignell, 1998, p.47).

Ausubel (1963) asserts that knowledge is organized in memory hierarchically. Hunt (1982) also argues that cognitive representations are hierarchical. It is suggested that the use of images, sounds, symbols, meanings and the relationships between things are involved in every act of thinking and that these elements are all stored in memory. Memory is described as a reference library within which experiences can be cataloged and classified. This metaphor suggests a hierarchical structure to memory. Larson and Czerwinski (1998) found that neither a very shallow, nor a very deep hierarchical structure led to the best performance during Web searching. It was a condition of moderate depth and breadth that led to the best performance. It may be that the current Web interfaces poorly support deep or shallow structures. By addressing the mental models of searchers and providing tools to assist them within the interface, the depth and breadth of the data structure is less critical.

It is common in the literature for researchers to recognize the existence of multiple representations of space in memory. Presson, DeLange and Hazelrigg (1989) found that estimates of where items are spatially located vary in accuracy based on the size and orientation of the stimulus displayed. The research supports the view that people use distinct spatial representations for different task demands to orient while moving through space. One of the representations is episodic and perceptually based while the other is more integrated and model-like. Similarly, Stanney and Salvendy (1994) describe two types of cognitive styles. One type characteristically analyzes and structures information while the other perceives information globally without identifying the elements and their relationships to one another. This difference in cognitive style leads to individual differences in one’s need to understand the underlying structure of the information being searched. It is also suggested that “any task for which there are multiple levels of information may need to be visually conceptualized in order to perform efficiently” (Stanney and Salvendy, 1994, p.596).

Lipman and Caplan (1992) suggest that people create a representation of a route by attending to landmarks, turns and the sequential ordering of events. Over time, people can construct a spatial representation that includes the relationships of the elements to one another. It is suggested that these two kinds of representations can be thought of as scene and layout representations. It was found that route memory involves scene and layout representations and that it is influenced by individual differences (Lipman and Caplan, 1992). It has also been suggested that there are two processes for estimating one’s position and orientation in the world, continuous and episodic. Continuous processing, or dead reckoning, is determining the changes in one’s position and orientation by tracking time and velocity. Episodic processing is tracking one’s position relative to other objects (Gallistel, 1990).
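The two processes can be contrasted in a short sketch. It is a simplified two-dimensional illustration under assumptions of my own, not a model from Gallistel (1990): movement is given as hypothetical (heading, speed, duration) segments, and an episodic fix is taken from a landmark whose position is known.

```python
import math


def dead_reckon(start, segments):
    """Continuous processing: integrate velocity over time.

    start is an (x, y) position; each segment is (heading_degrees, speed,
    duration), with heading measured counterclockwise from the x axis.
    """
    x, y = start
    for heading_degrees, speed, duration in segments:
        theta = math.radians(heading_degrees)
        x += speed * duration * math.cos(theta)
        y += speed * duration * math.sin(theta)
    return x, y


def episodic_fix(landmark, offset_to_landmark):
    """Episodic processing: locate oneself relative to a known object.

    offset_to_landmark is the observed (dx, dy) from oneself to the landmark.
    """
    return (landmark[0] - offset_to_landmark[0],
            landmark[1] - offset_to_landmark[1])
```

For example, starting at the origin and moving along the x axis at speed 1 for 3 time units, then along the y axis at speed 2 for 2 time units, dead reckoning places the traveler at approximately (3, 4). An episodic fix reaches the same estimate from a single observation, such as seeing a landmark known to be at (5, 4) lying 2 units ahead along the x axis, without any record of the path taken.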

Hunt and Einstein (1981) differentiate between item-specific and relational information in memory. Item-specific information in memory is based on unique characteristics of items and events that are encoded. Relational information in memory requires abstracting the relationships or common features between the items or events. It is suggested that optimal memory performance depends on item-specific and relational information being used to generate a memory trace. This memory structure includes the notion of hierarchical representation, which is supported by Elio and Anderson’s (1981) schema abstraction model. They found evidence of specific category instances and higher order category information abstracted from those instances. Episodic knowledge keeps a record of experience from both a historical and spatial perspective (Taylor and Evans, 1985). Overall, the research provides strong support for the existence of multiple cognitive representations of space. The importance of these findings to navigating in information spaces is that the structural representation provided to the user may need to be different depending on the task and the individual. It may be necessary to provide both types of structural representations to meet the variety of needs of unique individuals during various tasks.

A related area of research is the consideration of the discrepancies between cognitive and actual representations of space. One study found that memory for information learned from maps is orientation specific. Memory for information that is gained from more direct learning is not constrained by the orientation of the information when it was learned (Presson, DeLange and Hazelrigg, 1989). Levine (1982) investigated you-are-here maps and found that people had a tendency to interpret the upper part of the map as the area they have not yet visited. The predisposition to orient a cognitive map relative to oneself can contribute to people getting lost. Consistent with those findings, Palij, Levine and Kahan (1984) found evidence to suggest that cognitive maps move upward from the starting point, regardless of where the starting point occurred. It was also found that when making judgments about items on a path, subjects were faster when the actual path and their cognitive map of the path were aligned. Cognitive maps are aligned relative to the position of the subject’s body.

Misaligned map tasks have demonstrated that people have difficulty orienting themselves when their true position in the environment differs from their personal representation of their position. In these tasks, subjects are given a map of a path to study and then judge the direction of particular points on the real path. Levine, Jankovic and Palij (1982) found that difficulties arise when the real path and the map of the path are misaligned. Rossano and Warren (1989), using the same task, showed that subjects require additional processing to align the two frames of reference.

Rieser (1989) had subjects point to one of the targets in a circular array around them after being blindfolded. The subjects were rotated to a new position, or asked to imagine being rotated to a new position, and then asked to point to the target item from the real or imagined new position. The imagined position led to more errors and longer response times. Response times in the imagined rotation condition increased as the angular distance between the actual and imagined positions increased. Easton and Sholl (1995), using the same task, replicated this finding. Shepard and Hurwitz (1984) obtained similar findings with schematically presented paths: decisions about turning left or right took longer with greater differences between real and imagined points of view. The additional processing required is attributed to mental rotation.

Farrell and Robertson (1998) provided additional support for this argument by investigating the same phenomenon from a different perspective. The task was the same as in Rieser (1989), but all subjects were actually rotated. Half the subjects were instructed to update their position relative to the target during rotation and half were asked to ignore the rotation and imagine they were in their original position. The longer response times associated with bigger angular disparities in the imagination condition indicate that after updating target positions during rotation, a mental rotation back to the original orientation occurs. The research indicates that physical movement automatically updates spatial orientation and that having to imagine a spatial orientation that is not actually provided to the person requires additional cognitive processing. The implications for navigation and orientation in information spaces are considerable. Clearly, imagining a space imposes additional cognitive processing. Information spaces by default require users to rely on their imagination. The overhead associated with keeping track of where they are and where items are in relation to where they are is too great a cost. The result is that users concentrate on their work at hand, get lost in hyperspace and feel frustrated with the systems.

Snowberry, Parkinson and Sisson (1983) found a strong advantage for structurally grouping like objects. Categorical groupings improved the speed and accuracy of finding items. Practice effects did not influence this finding. This finding is inconsistent with Tauscher and Greenberg’s (1997a) opinion that “selecting items from a hierarchical sublist would be more effort than choosing from a sequential list” (p. 404). In practice, hypertext navigation is rarely linear. The existing linear history displays do not reflect the branching nature of following links. A hierarchical history display provides more information about the structure of the information space that the user has visited. It also provides information about where the branching points are and what sub-paths were followed (Tauscher and Greenberg, 1997b).

Chen, Houston, Sewell and Schatz (1998) explored the usability of techniques for improving the organization and categorization of large volumes of information. They compared performance on a self-organizing map representation to performance on an existing hierarchically structured browsing service. They had people browse for a relevant homepage on the web on a system with either a hierarchical or a map-based structure and then asked them to find the same homepage using the other structure. Generally, people were able to find the homepage again when moving from the map-like structure to the hierarchical structure, but unable to find it when the order was reversed. It was suggested that the map-like structure facilitated browsing, but was not appropriate for known item searching. In both the hierarchical and the map-like information space, subjects complained about getting lost when looking for information.

Stanney and Salvendy (1994) found that differences in performance time associated with the organizational structure of the task information were controlled for by allowing subjects time to understand the system’s structure. It was determined that providing users with tutorials about the system structure would improve computer performance. Stanney and Salvendy (1994) assert that performance in searching for information on the computer is influenced by the ability of people to recognize an existing structure, impose structure when it does not exist, and restructure the organization when it is apparent. Whether a map-like or chain-like representation is used, it is important to make the underlying structure obvious to the user.

Multi-Modal Cues

Trumbo (1998) suggests that through memory the user of an information space creates a sense of organization. It is emphasized that we rely on memory not only to get us from one place to another, but also to get us back again. The challenge presented is to create a memorable environment of sensory elements that assists memory during navigation. The sensory experience needs to be memorable specifically to promote understanding and access to content. The elements currently presented by browsers as memory triggers for navigation in the Web (e.g., URLs, link names) are generally not significant enough to move from short- to long-term memory.

The use of images, sounds, symbols, meanings and the relationships between things are involved in every act of thinking and it is suggested that these elements are all stored in memory (Hunt, 1982). Memory is equated to a reference library within which experiences can be cataloged and classified. It is further suggested that stimuli of different modalities would be stored separately in memory and that each type would be classified and retrieved in its own way. Based on this perspective, it would be valuable to have redundant cues in different modalities to provide information or feedback about items that need to be remembered.

The use of cues to supplement the visual representations can provide both new and redundant information to assist in navigation. The use of visual aids to help users know where they are is supported by research demonstrating that people have very high recognition accuracy for visual material (Shepard, 1967). It has been argued that visual cues can be effective in helping people to encode items into memory (Reed, 1982; Standing, 1973; Shepard, 1967). Even the visual cue of color could be helpful. Chen, Houston, Sewell and Schatz (1998) found in a map-like representation of information that subjects wanted colors to be used to differentiate and provide more meaning. Incorporating visual feedback in a structural representation of an information space can capitalize on people’s natural ability to relate things spatially by providing visual hooks to the temporal order of the sites. Providing a visual cue can encourage the use of mnemonic memory by linking the sites to a spatial path, as suggested by the method of loci. The method of loci has traditionally been employed by using imagery to help people remember lists of items by linking each item with a previously learned sequence of locations (Klatzky, 1980). For the best results, each path item must have visual precision to reality and be highly distinctive (Reed, 1982). The additional visual cue of location on the screen may assist the user in finding previously identified sites.

Auditory feedback associated with navigational aids can also provide a navigational cue. In the vast amount of research on systems to help the user navigate and orient in information space, one interface enhancement was conspicuously absent. Auditory feedback was not addressed in any of the applications presented or the theories discussed. Some of the more current papers discussed the importance of designing web interfaces from a multimedia perspective. However, sound as feedback within the framework of providing assistance with navigation in information spaces was never considered. Technical limitations on providing auditory feedback on the web do exist, but given the rate at which technology grows and changes, research should be studying the anticipated direction of systems, not the history.

The importance of using auditory feedback in user interface design cannot be overlooked. Sound has unique properties that make it very desirable particularly when the user’s attention is directed away from the primary task. Several sources of sound can be monitored simultaneously while the user is performing a task that requires both motor and visual attention (Buxton, 1989). A user could be talking on the phone while looking through a document and the auditory feedback regarding the task on the computer would still be perceived and processed. Selective attention is often used by people to focus their attention on one of multiple stimuli as appropriate for the situation. The “cocktail party” effect of suddenly hearing your name across a crowded room is often given as an example of attending to one foreground sound and not others even when they are of the same intensity (Hereford and Winn, 1994). Sound has the advantage of being available 360 degrees around the user. The ears are also unique in that they are a continuously open channel. Even in sleep, the auditory modality encodes sound. These qualities of sound have led to technology like the alarm clock, the doorbell and the ringing telephone.

Proposed Research

The analogy between physical and information spaces needs to be studied within the framework of searching an information space with the assistance of cues proven to be useful in physical spaces. The effectiveness of the physical space cues in an information space will provide a metric of the analogy between the two types of spaces. To measure the effectiveness of these cues for navigating in the information space, there will be an emphasis on the process of users getting back to where they have been before. The proposed investigation will focus on providing a navigation tool that makes explicit structural representations of the pages visited and associates specific visual and auditory cues with those pages. Humans and animals frequently use two types of structural representations of physical space. These are similar to a linear representation by page order visited and a hierarchical grouping by site visited. Each of these structures will be provided explicitly in this study. Visual and auditory cues will be provided within the tool displaying the structural representation. These cues can be provided to users at the time a page is visited and when a node on the structural representation tool is explored. Mapping the sounds and visual cues from the initial visit to the node on the tool will provide cues to navigating back to previously visited sites. If the use of visual and auditory cues can be shown to assist people in their search tasks, the analogy between physical and information spaces is supported.

The specific research question addresses how structural and ambient cues assist people in finding something that they have previously encountered. Specifically, it is hypothesized that cues generally used to navigate in physical space will be useful in assisting users to navigate in an information space. The proposed research will investigate the analogy between physical and information spaces by focusing on techniques or methods to assist the user in finding relevant information previously encountered while exploring the Web. A structural representation of where the searcher has been will be displayed linearly or hierarchically, so that the two structures can be compared. It is expected that a hierarchical organizing scheme of representation will provide a clearer sense of structure and will therefore lead to better performance than a linear representation. Each of these structural representations will be enhanced with visual and auditory cues identified in this research. Visual cues will be a thumbnail representation of the web page visited, displayed next to the item in the history mechanism. The specific auditory feedback used will be identified in Experiment I. The investigation will not seek the optimal sounds possible, but rather sounds that are appropriate and feasible based on existing knowledge and sound design expertise. The use of a visual or auditory cue is expected to provide an additional memory trigger and should result in better performance with either modality cue. A combination of a hierarchical representation, visual cues and auditory cues should provide the richest information for navigating in the information space and should therefore realize the best performance, supporting the notion that information spaces are analogous to physical spaces. This would allow system designers to provide navigation tools for information spaces based on those that are effective for navigating in physical spaces.

Experiment I

Method

A 2x2 between-subjects study will be conducted with a total of 40 subjects, ten subjects in each of four conditions. The subjects will be tested in four groups. Due to constraints imposed by the patent process, the methodology will be explained in complete detail in the final proposal.

Subjects

Forty knowledge workers at Lucent Technologies will participate as subjects during paid work hours and will be additionally compensated for their time by receiving a small gift. A demographic questionnaire will assess the level of the subjects’ Web use and their musical background.

Stimuli

Due to constraints imposed by the patent process, the sounds will be described in complete detail in the final proposal.

Procedure

After signing the consent form and filling out the demographic questionnaire, the subjects will be instructed on how to do the study task. Due to constraints imposed by the patent process, the task will be described in detail in the final proposal.

Dependent Measures

Due to constraints imposed by the patent process, the performance and satisfaction measures will be described in more detail in the final proposal.

Analyses

Due to constraints imposed by the patent process, the analysis of the data for Experiment I will be described in complete detail in the final proposal.

Experiment II

Method

A 2x2x2 between-subjects study will be conducted with a total of 80 subjects, with ten subjects in each of eight conditions. The two levels of representational structure are a linear and a hierarchical representation. A visual cue for navigation will be either present or absent. An auditory cue will be provided to half of the subjects in each condition. A preliminary investigation will be conducted to identify the appropriate way to present the structural representations and the specific visual and auditory cues to be used. The study design is shown in Figure 1. The linear condition without auditory or visual feedback is equivalent to a control condition. Typically, neither auditory nor visual feedback is provided within current history mechanisms. Although most browsers have traditionally provided a stack-based cache, research has shown the inadequacies of using the stack-based history (Cockburn and Jones, 1996; Tauscher and Greenberg, 1997). The latest versions of browsers are now providing a linear history as the default, which will be considered the baseline situation. To the extent that explicit hierarchical structure, visual cues and auditory cues improve performance over the baseline condition, the analogy between physical and information spaces will be supported.


Figure 1: Study design with two levels of each of three variables including structural representation, visual enhancement and auditory enhancement.

Structural Representation                  No Auditory Aid    Auditory Aid    Total
Linear (n = 40)         No Visual Aid      10                 10              20
                        Visual Aid         10                 10              20
Hierarchical (n = 40)   No Visual Aid      10                 10              20
                        Visual Aid         10                 10              20
Total                                      40                 40              80
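As an illustration of the 2x2x2 design in Figure 1, the eight cells can be enumerated and subjects allocated so that each cell receives exactly ten. This is only a sketch: the proposal does not state how subjects will be allocated, so the seeded random assignment, the function name `assign_conditions` and the subject identifiers are assumptions made for the example.

```python
import itertools
import random
from collections import Counter
from typing import Dict, List, Tuple

# (structure, visual aid present, auditory aid present)
Condition = Tuple[str, bool, bool]


def assign_conditions(
    subject_ids: List[str], per_cell: int = 10, seed: int = 0
) -> Dict[str, Condition]:
    """Assign subjects to the eight cells with an equal number per cell.

    Random assignment is an assumption of this sketch, not a stated
    part of the proposed procedure.
    """
    conditions = list(
        itertools.product(("linear", "hierarchical"), (False, True), (False, True))
    )
    if len(subject_ids) != per_cell * len(conditions):
        raise ValueError("need exactly per_cell subjects for each condition")
    # One slot per subject, then shuffle the slots with a fixed seed
    # so the allocation can be reproduced.
    slots = [cond for cond in conditions for _ in range(per_cell)]
    random.Random(seed).shuffle(slots)
    return dict(zip(subject_ids, slots))


if __name__ == "__main__":
    subjects = [f"S{i:02d}" for i in range(80)]  # hypothetical subject IDs
    assignment = assign_conditions(subjects)
    for cond, n in sorted(Counter(assignment.values()).items()):
        print(cond, n)
```

The marginal totals in Figure 1 follow directly from the cell size: each level of each factor covers four cells, hence 40 subjects.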


Subjects

Eighty knowledge workers at Lucent Technologies will participate as subjects during paid work hours and will be additionally compensated for their time by receiving a small gift. All subjects will have at least some familiarity with using the Web. A demographic questionnaire will assess the level of the subjects’ Web use.

System

The subjects will conduct their searches on the World Wide Web in real time. By performing a consistent task before and after each subject, the speed of the web can be calibrated. Prior to each subject, all file and history caches will be cleared to ensure that each subject begins the search with no stored pages or addresses. Visual Basic will be used to provide an Internet Explorer 5.0 browser and the history mechanism with the structural representations, thumbnail images and sounds. The linear representation will duplicate web pages that are re-visited in the list. The hierarchical structure will be based on the conceptual search path, not on the site structure. A link matrix will be used to determine if the current page was accessed from the previous page, so that it can be indicated as the next level down the hierarchy. Miniature representations of the actual web page (i.e., thumbnails) will be used to provide the visual cues. The auditory cues used in Experiment II will be determined by Experiment I.
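The difference between the two history structures can be sketched in a few lines. The example below is written in Python for brevity rather than in the Visual Basic planned for the study, and it is not the study's implementation. The link matrix is represented as a set of (source, target) pairs. Two behaviors are assumptions, since the text above does not specify them: returning to a page already on the current path moves back up the hierarchy without adding a node, and a page not linked from any page on the current path starts a new top-level entry. The function name `build_histories` and the page names are hypothetical.

```python
from typing import List, Set, Tuple


def build_histories(
    visits: List[str], links: Set[Tuple[str, str]]
) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Return (linear, hierarchical) histories for a sequence of page visits.

    visits: pages in the order they were displayed (hypothetical log format).
    links:  the link matrix as (source, target) pairs, meaning the source
            page contains a link to the target page.
    """
    # Linear history: every visit is listed, so re-visited pages repeat.
    linear = list(visits)

    hierarchical: List[Tuple[str, int]] = []  # (page, depth) in display order
    path: List[str] = []  # chain of pages from the top level to the current page

    for page in visits:
        if page in path:
            # Assumption: going back to a page on the current path moves
            # up the hierarchy and adds no new node.
            del path[path.index(page) + 1:]
            continue
        # Assumption: climb until a page that links to this one is found;
        # if none is found, the page becomes a new top-level entry.
        while path and (path[-1], page) not in links:
            path.pop()
        hierarchical.append((page, len(path)))
        path.append(page)

    return linear, hierarchical


if __name__ == "__main__":
    visits = ["search", "siteA", "siteA/p1", "siteA", "siteA/p2"]
    links = {("search", "siteA"), ("siteA", "siteA/p1"), ("siteA", "siteA/p2")}
    linear, tree = build_histories(visits, links)
    print(linear)
    for page, depth in tree:
        print("  " * depth + page)
```

In this example the linear list shows "siteA" twice, while the hierarchy shows it once with two sub-paths beneath it, which is the branching information a linear display does not convey.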

Procedure

After signing the consent form and filling out the demographic questionnaire, the subjects will be administered a standard test of spatial imagery. They will then be instructed on how to do the study task. The task will require the users to search for some information that is the same for all subjects. The search for information will lead the subjects through multiple levels of links within and between sites. The task must be carefully constructed to avoid relevant sites that are accessible from the same higher-level site. During the course of searching, the subjects must indicate each time relevant information is found. During the search, the linear or hierarchical representational structure and the visual and auditory navigational aids will be provided in the appropriate conditions. The experimental sessions will be videotaped.

Subjects will be asked to find examples of a particular search topic and determine which one is the best example. They will be required to go back to the previously visited web pages where they found relevant examples. To eliminate the cognitive processing associated with rating sites as relevant or not relevant, subjects will be asked to print any relevant pages as they find them. To simulate an actual situation where users return to relevant sites without knowing in advance that they will need to do so, a minor deceptive maneuver will be employed. The computer will be configured to print to a printer in a different room from where the subject is performing the task. When the subject is finished with the task, the experimenter will leave the room to check the printouts of the relevant sites. The experimenter will return to the subject, indicate that the printer did not work and ask the subject to reprint all the pages. The computer will be reconfigured to print on the printer in the room with the subject, so that the first and second prints of the sites will be separated. After the study, the subjects will be debriefed on the deceptive tactic that was used and asked questions about the difficulty associated with finding the relevant sites again.

Dependent Measures

Both performance and satisfaction measures will be taken. Performance measures will include items found, search time and errors. Items found measures the number of relevant items printed during the search compared to the number of items they go back and print again. Search time refers to the time spent looking for the items previously printed divided by the number of items printed. A measure of overall speed can be attained by looking at the time it takes to do the search the first time through and the time it takes to go back and print the previously printed sites. Measures related to time will be evaluated with respect to the calibration measure of the speed of the web for each subject. Relative measures of recall and precision can be computed by investigating the number of sites printed on the second search compared to the number of sites printed during the initial search. Errors will address the number of sites visited or printed, which did not contain a previously printed relevant item. The satisfaction measures will include usability of the system, structure and aids. Usability of the system will be assessed by collecting subjective ratings on a nine-point scale of how effective this system was for finding the relevant information needed to complete this task. Usability of the structure and aids will be measured by asking for subjective ratings on a nine-point scale of how helpful the navigational structure, spatial and auditory enhancements were for assisting them in finding the previously printed relevant information.
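One possible reading of the performance measures is sketched below. It is an interpretation under stated assumptions, not the study's scoring procedure: relative recall is taken as the share of first-pass pages that were reprinted, relative precision as the share of second-pass prints that had been printed in the first pass, errors are counted only from printed pages (pages that were visited but not printed are ignored here), and search time is divided by the number of pages printed on the second pass. Calibration against web speed is left out because the text does not specify how it will be applied. The function name `refinding_measures` and the page labels are hypothetical.

```python
from typing import Dict, Iterable, Optional


def refinding_measures(
    first_prints: Iterable[str],
    second_prints: Iterable[str],
    refind_seconds: float,
) -> Dict[str, Optional[float]]:
    """Compute re-finding measures from the two sets of printed pages."""
    first = set(first_prints)    # pages printed during the initial search
    second = set(second_prints)  # pages printed when asked to reprint
    refound = first & second     # pages successfully found again

    return {
        # Share of originally printed pages that were found again.
        "relative_recall": len(refound) / len(first) if first else None,
        # Share of second-pass prints that were printed originally.
        "relative_precision": len(refound) / len(second) if second else None,
        # Second-pass prints that were never printed the first time.
        "errors": float(len(second - first)),
        # Assumption: time is divided by the number of second-pass prints.
        "seconds_per_item": refind_seconds / len(second) if second else None,
    }


if __name__ == "__main__":
    result = refinding_measures(
        first_prints=["p1", "p2", "p3", "p4"],
        second_prints=["p1", "p2", "p9"],
        refind_seconds=120.0,
    )
    print(result)
```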

Analyses

Multivariate analysis of variance (MANOVA) and post hoc comparisons will be computed to determine the differences and interactions between the eight groups on each of the six dependent measures. Specifically, the difference between linear and hierarchical representations will be assessed. The use of visual and auditory cues will be evaluated, as well as the interactions of providing these navigational aids. If there are demographic variables such as spatial imagery score, sex or Web use that are evenly distributed across groups, those variables will be evaluated for differences.

Implications

To the extent that people are able to use structural representation, visual and auditory cues to navigate in information space, the argument can be made that information space is analogous to physical space. Given that analogy, information systems can be designed to capitalize on the users’ pre-existing knowledge about navigating through physical space. Moving through an information space can become as intuitive as moving through physical space. The “lost in hyperspace” phenomenon will no longer be the expectation, but the exception. The frequency of getting lost in an information space should be similar to the occurrence of getting lost when navigating physical space. Once the analogy between the two types of space is understood, appropriate analogous tools can be provided for navigation in information space, like route maps and landmarks. In the event that people are not able to use their knowledge of physical space to navigate through information space, other ways to approach the problem of navigating through information space can be investigated.

References

Abrams, D., Baecker, R. & Chignell, M. (1998). Information archiving with bookmarks: Personal web space construction and organization. Proceedings of ACM CHI'98 Conference on Human Factors in Computing Systems, USA, 41-48.

Ausubel, D. P. (1963). The Psychology of Meaningful Verbal Learning. New York: Grune & Stratton.

Bederson, B. B. & Hollan, J. D. (1994). Pad++: A zooming graphical interface for exploring alternate interface physics. Proceedings of UIST ’94 User Interface Software and Technology, New York, 17-26.

Bederson, B. B., Hollan, J. D., Perlin, K., Meyer, J., Bacon, D. & Furnas, G. (1996). Pad++: A zoomable graphical sketchpad for exploring alternate interface physics. Journal of Visual Languages and Computing, 7, 3-31.

Brewster, S. (1997). Using non-speech sounds to provide navigation cues.

Buxton, W. (1989). Introduction to this special issue on nonspeech audio. Human-Computer Interaction, 4, 1-9.

Card, S. K., Robertson, G. G. & York, W. (1996). The WebBook and the Web Forager: An information workspace for the World-Wide Web. Proceedings of ACM CHI'96 Conference on Human Factors in Computing Systems, USA, 111-117.

Catledge, L. D. & Pitkow, J. E. (1995). Characterizing browsing strategies in the World Wide Web. Computer Networks and ISDN Systems: Proceedings of the Third International World Wide Web Conference, Germany, 27, 1065-1073.

Chang, B. & Ungar, D. (1993). Animation: From cartoons to the user interface. Proceedings of UIST ‘93 User Interface Software and Technology, New York, 45-55.

Chen, H., Houston, A.L., Sewell, R.R. & Schatz, B.R. (1998). Internet browsing and searching: user evaluations of category map and concept space techniques. Journal of the American Society for Information Science, 49 (7), 582-603.

Cheng, K. (1986). A purely geometrical module in the rat's spatial representation. Cognition, 23, 149-178.

Cockburn, A. & Jones, S. (1996). Which way now? Analysing and easing inadequacies in WWW navigation. International Journal of Human-Computer Studies, 45 (1), 105-129.

Dillon, A. (1995). What is the shape of information? Human factors in the development and use of digital libraries. SIGOIS Bulletin, 32-34.

Doemel, P. (1994). WebMap – A graphical hypertext navigation tool. Proceedings of the Second International Conference on the World-Wide Web, USA, 785-789.

Easton, R. D. & Sholl, M. J. (1995). Object-array structure, frames of reference, and retrieval of spatial knowledge. Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 483-500.

Elio, R. & Anderson, J. R. (1981). The effects of category generalizations and instance similarity on schema abstraction. Journal of Experimental Psychology: Human Learning and Memory, 7 (6), 397-417.

Farrell, M. J. & Robertson, I. H. (1998). Mental rotation and the automatic updating of body-centered spatial relationships. Journal of Experimental Psychology: Learning Memory and Cognition, 24, 227-233.

Fitz, P. E. (1994). Cognitive maps of electronic information space. Ph.D. Dissertation. Order number AAD94-31166.

Forsythe, C., Grose, E. & Ratner J. (Eds.). (1998). Human factors and Web development. Mahwah, NJ: Lawrence Erlbaum Associates, Inc. Publishers.

Furnas, G. W. (1986). Generalized fisheye views. Proceedings of ACM CHI'86 Conference on Human Factors in Computing Systems, USA, 16-23.

Furnas, G. W. & Zacks, J. (1994). Multitrees: Enriching and reusing hierarchical structure. Proceedings of ACM CHI'94 Conference on Human Factors in Computing Systems, USA, 330-336.

Gallistel, C. R. (1990). The Organization of Learning. MIT Press.

Gibson, J. J. (1979). The Ecological Approach to Visual Perception. Boston: Houghton-Mifflin.

Gruneberg, M. M., Morris, P. E. & Sykes, R. N. (1978). Practical Aspects of Memory. Academic Press.

Helander, M. G., Landauer, T. K., & Prabhu, P. V. (Eds.). (1997). Handbook of Human-Computer Interaction. NY: Elsevier.

Hereford, J. & Winn, W. (1994). Non-speech sound in human-computer interaction: A review and design guidelines. Journal of Educational Computing Research, 11 (3), 211-233.

Hunt, M. (1982). The Universe Within: A New Science Explores The Human Mind. New York: Simon and Schuster.

Hunt, R. R. & Einstein, G. O. (1981). Relational and item-specific information in memory. Journal of Verbal Learning and Verbal Behavior, 20, 497-514.

Jones, S. & Cockburn, A. (1996). A study of navigational support provided by two World Wide Web browsing applications. Hypertext ’96 Conference Proceedings, 161-169.

Lamping, J., Rao, R. & Pirolli, P. (1995). A focus+context technique based on hyperbolic geometry for visualizing large hierarchies. Proceedings of ACM CHI'95 Conference on Human Factors in Computing Systems, USA, 401-408.

Larson, K. & Czerwinski, M. (1998). Web page design: implications of memory, structure and scent for information retrieval. Proceedings of ACM CHI 98 Conference on Human Factors in Computing Systems, USA, 2, 25-32.

Levine, M. (1982). You-are-here-maps: Psychological considerations. Environment and Behavior, 14, 221-237.

Levine, M., Jankovic, I. N. & Palij, M. (1982). Principles of spatial problem solving. Journal of Experimental Psychology: General, 111, 157-175.

Lipman, P. D. & Caplan, L. J. (1992). Adult age differences in memory for routes: Effects of instruction and spatial diagram. Psychology & Aging, 7 (3), 435-442.

Mackinlay, J. D., Robertson, G. G. & Card, S. K. (1991). The perspective wall: Detail and context smoothly integrated. Proceedings of ACM CHI'91 Conference on Human Factors in Computing Systems, USA, 173-179.

Morrow, D. G., Greenspan, S. L. & Bower, G. H. (1987). Accessibility and situation models in narrative comprehension. Journal of Memory & Language, 26 (2), 165-187.

Mukherjea, S., Foley, J. D., & Hudson, S. (1995). Visualizing complex hypermedia networks through multiple hierarchical views. Proceedings of ACM CHI'95 Conference on Human Factors in Computing Systems, USA, 331-337.

Palij, M., Levine, M. & Kahan, T. (1984). The orientation of cognitive maps. Bulletin of the Psychonomic Society, 22 (2), 105-108.

Perlin, K. & Fox, D. (1993). Pad: An alternative approach to the computer interface. Proceedings of SIGGRAPH ‘ 93 Computer Graphics, New York, 57-64.

Pirolli, P. & Card, S. (1995). Information foraging in information access environments. Proceedings of ACM CHI 95 Conference on Human Factors in Computing Systems, Denver, 51-58.

Presson, C. C., DeLange, N. & Hazelrigg, M. D. (1989). Orientation specificity in spatial memory: What makes a path different from a map of the path? Journal of Experimental Psychology: Learning, Memory, & Cognition. 15 (5), 887-897.

Reed, S. K. (1982). Cognition: Theory and application. Brooks/Cole Publishing Company, CA.

Rieser, J. J. (1989). Access to knowledge of spatial structure at novel points of observation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1157-1165.

Rossano, M. J. & Warren, D. H. (1989). Misaligned maps lead to predictable errors. Perception, 18, 215-229.

Sarkar, M. & Brown, M. H. (1994). Graphical fisheye views. Communications of the ACM, 37 (12), 73-84.

Shepard, R. N. (1967). Recognition memory for words, sentences, and pictures. Journal of Verbal Learning and Verbal Behavior, 6, 156-163.

Shepard, R. N. & Hurwitz, S. (1984). Upward direction, mental rotation, and discrimination of left and right turns in maps. Cognition, 18, 161-193.

Shubin, H. & Meehan, M. M. (1997). Navigation in web applications. Interactions, 4 (6), 13-17.

Snowberry, K., Parkinson, S. R. & Sisson, W. (1983). Computer display menus. Ergonomics, 26 (7), 699-712.

Standing, L. (1973). Learning 10,000 pictures. Quarterly Journal of Experimental Psychology, 25, 207-222.

Stanney, K. M. & Salvendy, G. (1994). Effects of diversity in cognitive restructuring skills on human-computer performance. Ergonomics, 37 (4), 595-609.

Tauscher, L. & Greenberg, S. (1997a). How people revisit Web pages: Empirical findings and implications for the design of history systems. International Journal of Human-Computer Studies, 47 (1), 97-137.

Tauscher, L. & Greenberg, S. (1997b). Revisitation patterns in World Wide Web navigation. Proceedings of ACM CHI 97 Conference on Human Factors in Computing Systems, 1, 399-406.

Taylor, J. C. & Evans, G. (1985). The architecture of human information processing: Empirical evidence. Instructional Science, 13 (4), 347-359.

Trumbo, J. (1998). Spatial memory and design: A conceptual approach to the creation of navigable space in multimedia design. Interactions, July-August, 26-34.

Wender, K. F., Wagener-Wender, M. & Rothkegel, R. (1997). Measures of spatial memory and routes of learning. Psychological Research, 59 (4), 269-278.