HTML version of paper presented at the 5th Australian Conference on Information Systems (ACIS '94), Melbourne, Australia, September 27 - 29, 1994. (c) Andrew Treloar, 1994. Last revised 23/7/94.
NOTE: This electronic version contains hyperlinks only to those references for which a URL is possible. Where a choice of formats is available, the text version will take precedence. All other references have been left as plain text. Readers can refer to the reference list at the end of the document for more details.
People have always processed information in information spaces of one kind or another. The Internet provides a range of possible information spaces to enable users to interact with each other and with networked information. This paper analyses these information spaces in terms of the affordances they provide to the user, and distinguishes five different sets of spaces. These support the activities of communication, retrieval, searching, browsing and organising. The characteristics of these information spaces are considered and some possible implications of this analysis for the design of Internet access tools discussed.
Throughout history one of the distinguishing characteristics of Homo sapiens has been our use of information (Burke 1985). Indeed it has been argued (Goody 1981 and Diamond 1992) that it was our ability to work with symbolic information through the medium of language that started our species on its rapid and continuing process of cultural (as opposed to biological) evolution.
This work with information has always occurred in what might be called an 'information space'. Such a space is the location where the human mind interacts with information or communicates it to another. Location should be understood fairly broadly in this context, and not necessarily seen as tied to a particular physical place. Information spaces facilitate the storage and retrieval of data and information, processing of data into information, communication of information, navigation through structured information and the linking of different pieces of information. In a sense, such spaces are where information 'exists'.
Originally, such an information space would only have been within the brain, in the air transmitting sound between people, or as part of artistic/cultural ritual manifestations like painting and dance. The information technologies used then were 'soft' technologies, based on what has been called the 'wetware' of our brains rather than on any hardware. Communication occurred through the use of spoken language, and processing took place without any technological assistance. Storage and retrieval used techniques such as rote memorisation of poems or stories, and the use of recall methodologies that linked concepts to imagined spatial locations (Burke 1985 and Yates 1966).
The development of external representations for information such as graphic art and writing freed information from the confines of the human brain and allowed the storage and communication of information independently of a specific individual. A range of information could now be readily manipulated in symbolic form in information spaces external to the brain. The development of printing enhanced this by enabling the rapid duplication and transmission of information, as well as redundant storage in multiple locations.
The second half of this century has seen explosive development in both computing and communications technologies. Communications technologies such as fax machines, mobile phones, satellite systems, and fibre-optics have enhanced our ability to move information rapidly and in a range of forms. Computer technologies have enormously expanded our ability to store and retrieve information, as well as to analyse it in an increasing variety of ways. These two technologies are now converging to fundamentally change the ways in which we work with information, and dramatically increase the range of possible information spaces. Nowhere are this convergence, and some of the changes it might bring, clearer than on the Internet.
The Global Multiprotocol Open Internet includes the IP Internet (what most people think of as the Internet proper), Bitnet, FIDONet, MHSNet, UUCP, and OSI. It can best be thought of as a 'network of networks', linking over 14,000 separate networks around the world. These networks share backbones in places and interconnect through gateways. Some parts (FIDONet) have access to email only, while others (the IP Internet) have direct access to a much larger range of services. Gateways exist that enable some email access to ftp sites and even Web servers.
This paper will focus on the facilities provided by the IP Internet. The significance of the Internet with respect to information is the way in which computer storage and retrieval and networked communications are together enabling the development of new information spaces. The challenge is how best to understand this emerging entity. Existing approaches to considering the Internet have either focussed on the content of the networked information stores, the ways in which these stores are provided to the network, or the tools used to access them.
The problems of cataloguing networked information content relate both to the need for better meta-data about the content (Lynch and Preston 1992) and the difficulties of working with information that may appear in multiple forms at multiple locations (Caplan 1993). These two concerns are being discussed at present in the context of what is being called the 'virtual library' (Rooks 1993).
The ways in which networked information is made accessible have been tackled both from the perspective of particular technologies and attempts to produce general frameworks. Some of the currently deployed technologies are the World-Wide Web (Berners-Lee et al. 1992), the Archie database for anonymous ftp sites (Deutsch 1992), Gopher (Wiggins 1993), and WAIS (Kahle et al. 1992). General approaches include the Prospero system (Neuman 1992 and 1993), the taxonomy proposed by Schwartz et al. (1992) which looks at both data and meta-data design choices, and Rheingold's (1993) distinction between the underlying protocols, the services layered onto these protocols, and the interfaces used to access these services.
Much of the literature deals with the interface tools used to access the Internet. A number of ways have been proposed to group these tools. December (1994a and 1994b) suggests a division into networked information retrieval (NIR) tools, computer-mediated communication (CMC), and other services. Foster et al. (1993) propose Interactive Information Delivery Services (such as gopher and WWW), Directory Services (such as WHOIS and X.500) and Indexing Services (such as archie, WAIS, and online library catalogues). Neuman and Augart (1993) analyse information retrieval tools by their core functions: Storage, Access, Search, and Organisation. Obraczka et al. (1993) propose a taxonomy based on a different set of functions (Query/Browse/Organise), together with the granularity of the objects manipulated, the way in which the information space is organised, how the data is stored, and the preferred interface(s) to the system. Janes and Rosenfeld (1994) consider all of the above as aspects of navigation through a networked environment.
Too much of the discussion of the Internet has focussed on the tools or the technology, rather than on the information itself. This in part reflects the orientation of those who have been instrumental in making the Internet what it is today. As the Internet continues to develop, there is a need to focus more on the ends and less on the means. This paper attempts to do this by proposing a new way of looking at the Internet.
Any technology, indeed any item, can be considered in terms of the affordances it offers to users. Affordances are the properties of an environment or entity that enable particular types of interactions and activities (Norman 1993, Gaver 1992). Thus a pencil affords writing, as well as less obvious things like tapping, poking, turning and so on. As should be obvious, the perceived primary affordances of a technology are the most important to its users (Gaver 1991), although secondary affordances may be useful. For example, a knife primarily affords cutting, but can be used as a screwdriver if required. The perceived affordances of a technology determine its exploitation by defining its suitability for particular uses.
The Internet is often discussed as a single entity. This is a convenient shorthand but ultimately misleading. The Internet 'consists' in some sense of the information accessible through it, the hardware that supports this access (both communications and computer hardware), the software used to create, provide and work with networked information, and perhaps even the users themselves. The users are not necessarily aware of the full dimensions of the Internet. They are primarily focussed on the remote information sources they are accessing, other users with whom they are communicating, and the software tools they are employing to do this. To stretch the 'information superhighway' metaphor a little, they are concentrating on the destination and on the driving, rather than on how the ignition system works.
With such a heterogeneous environment, it is not surprising that users perceive a wide range of affordances as they interact with different aspects of the Internet. This paper proposes to group these affordances into five different classes of activity: communication, retrieval, searching, browsing and organising. This is not the only possible grouping but seems reasonably intuitive and complete. Each of these classes of activity is associated with an information space or cluster of related information spaces, in the same way that the affordance of writing is associated with the set of writing implements. But in contrast to physical implements like a pencil, these information spaces have their affordances mediated through software. Users do not interact directly with remote information but employ a range of software tools (what Norman (1993) would call 'cognitive tools') to do their bidding in cyberspace. Some of these tools and their associated technologies afford multiple operations to users and hence appear in multiple information spaces. These five categories of information space summed together define the Internet, and their shared affordances define the limits of what one can do with the Internet. What then are the characteristics of these information spaces, and how do they manifest themselves on the Internet?
Communication spaces afford communication with other users of the Internet in a variety of ways. The non-Internet analogue for this information space would be the familiar technologies of the telephone, paper mail, and physical bulletin boards. This communication can be viewed as taking place along the twin dimensions of cardinality and synchronicity (Treloar 1993), and is represented in Table 1.
Cardinality is viewed from the sender's perspective, as the sender is the initiator of the communication act. It can be classified as either one-one or one-many. One-one involves direct communication between sender and receiver alone. One-many involves direct communication between sender and multiple receivers. Many-many communication can occur over the Internet, but all the current tools assume one user per workstation at any point in time. Thus multiple users sharing a single workstation can be treated as a series of one-many interactions.
Synchronicity can be classified as either synchronous or asynchronous. With a synchronous connection the communication channel between sender and receiver operates and stays open in real time. With an asynchronous connection the channel operates with time intervals between messages, and typically only stays open long enough to deliver the message.
Both the cardinality and synchronicity dimensions have their own separate affordances, and the communications tools or technologies along these dimensions are selected by users accordingly.
Cardinality   Synchronicity   Tool/Technology
One-One       Synchronous     Talk, Phone, Chat
One-One       Asynchronous    Electronic mail*
One-Many      Synchronous     IRC, MUD/MOO/MUSH
One-Many      Asynchronous    Listservers, NetNews

*Note that while it is possible to send email to a list of people (one-many) via group addresses, or to CC it to people other than the original addressees, the primary purpose of email is to communicate one-one.
Table 1: IP Internet Communication Tools
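The two dimensions of Table 1 can also be written down as a small lookup structure. The following Python sketch is illustrative only: the names `Cardinality`, `Synchronicity`, `COMMUNICATION_TOOLS` and `tools_for` are hypothetical, and the tool assignments are simply copied from the table (primary affordance only).

```python
from enum import Enum


class Cardinality(Enum):
    ONE_ONE = "one-one"
    ONE_MANY = "one-many"


class Synchronicity(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"


# Tool assignments copied from Table 1. Each tool is listed once,
# under its primary cardinality/synchronicity combination.
COMMUNICATION_TOOLS = {
    (Cardinality.ONE_ONE, Synchronicity.SYNCHRONOUS): ["Talk", "Phone", "Chat"],
    (Cardinality.ONE_ONE, Synchronicity.ASYNCHRONOUS): ["Electronic mail"],
    (Cardinality.ONE_MANY, Synchronicity.SYNCHRONOUS): ["IRC", "MUD/MOO/MUSH"],
    (Cardinality.ONE_MANY, Synchronicity.ASYNCHRONOUS): ["Listservers", "NetNews"],
}


def tools_for(cardinality, synchronicity):
    """Return the tools Table 1 places in the given cell."""
    return list(COMMUNICATION_TOOLS[(cardinality, synchronicity)])
```

A user who wants to reach many people without requiring them to be online at the same moment would look up `tools_for(Cardinality.ONE_MANY, Synchronicity.ASYNCHRONOUS)`.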
The apparent divisions between the tools and technologies that underlie these communication spaces are not quite as rigid as the table suggests. Email is used to send commands to listservers and receive the results of requests or distributions. Email can also be used to contribute to shared Netnews discussions or to continue the discussion directly and alone. Conversations begun in a MUD may similarly be moved to the privacy of email if desired. These shifts between tools reflect the differing interaction styles afforded by other synchronicity/cardinality combinations.
Retrieval spaces afford the ability to get information from somewhere remote to somewhere local. As the information is in electronic form, this always means an implied copy, rather than a move. This is different to the non-Internet analogues of retrieval, such as borrowing a book or requesting a journal article, where copying is not necessarily involved.
In the past the type of information retrieved was mostly 7-bit ASCII, the lowest common denominator of the Internet. This was later extended to binary data such as program files and non-text data, sometimes encoded to let them move through mail gateways. More recently, standards have evolved to allow the easy movement across hardware platforms of sound, video and graphics data. Some of the retrieval tools only allow the retrieval of documents, while others allow the user to select arbitrary files, regardless of type. In any case, the distinction between document and application is increasingly becoming less clear and less important to many users. Common retrieval tools include ftp, alex, gopher, WWW and WAIS.
Ftp and alex both allow the user to navigate through directory hierarchies on a remote machine and get files to a local machine. Ftp distinguishes only between text and binary files. Binary does not necessarily refer to the file's original form - many binary files are encoded as text and stored on ftp servers. Macintosh files are a good example of this; due to their unique file format, they are usually encoded as text in BinHex format for storage on non-Macintosh servers. Anything from sound to video to graphics to application programs may be stored and retrieved as binary data. Gopher (Wiggins 1993) allows the user to retrieve files once they have been located in gopherspace via browsing menus or searching menu titles. Again, these files may be text, binary, or encoded binary. WWW (see below), while primarily a browsing tool, can also be used to retrieve files. WAIS (Kahle et al. 1992) is primarily a searching tool. However, once a document is located, it can be retrieved to a local machine. WAIS documents are usually text files, but may also be sound, video or graphics documents. WAIS servers do not usually store binary application files.
Retrieval assumes the user knows what they want to retrieve and/or how to get there. Users locate the information they want by being told about it (electronically or in print form), by browsing other information spaces (see below), or by executing an electronic search.
Searching should be clearly distinguished from retrieval. Retrieval means getting something once the location is known (although the location may be found by browsing - Gopher is particularly amenable to this). Searching spaces allow the user to ask a question and receive an answer. Searching spaces can best be considered according to the type of object that is the subject of the search. This may be people, files, servers or services.
Many users want to search for people, and by extension for some form of contact information. Here the question involves supplying some identifying information with the answer being an email address, or possibly full contact details. Tools that support this are X.500, WHOIS, and Netfind. X.500 (CCITT/ISO 1988) is a directory service which has been under development for a number of years. Ultimately it is intended to use X.500 to provide an Internet-wide distributed directory of users. WHOIS (Harrenstein et al. 1985) relies on a database of registered network names which is local to an organisation. WHOIS servers do not share a common directory with other WHOIS servers, nor do they know where to locate information about other institutions. Netfind (Schwartz and Tsirigotis 1991) attempts to locate information about Internet users based on their name and some approximate location information. Netfind does not maintain a directory but searches for people using a number of Internet services and lookup heuristics.
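The question-and-answer character of a people search is visible in the WHOIS exchange itself. As specified in RFC 954, the client connects to the server on TCP port 43, sends a single command line terminated by CR LF, and reads the reply until the server closes the connection. The Python sketch below assumes nothing beyond that; the function name `whois_query` is hypothetical, and the server to ask must be supplied by the reader, since (as noted above) each WHOIS server knows only its own organisation's database.

```python
import socket


def whois_query(server, query, port=43, timeout=10.0):
    """Send one WHOIS query line and return the server's full reply as text."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        # The whole question is a single line terminated by CR LF.
        conn.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")
```

The reply is free-form text intended for a human reader, which is one reason tools such as Netfind rely on heuristics rather than on a uniform directory format.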
Users may also want to locate files, particularly files stored at anonymous ftp archive sites. Due to the number of possible sites, and the range of files stored, searching is often the only feasible way to locate such files. Archie (Deutsch 1992) provides access to a database of anonymous ftp files stored at sites worldwide. It thus complements the ftp retrieval service discussed above. The question asked is of the form 'locate a filename containing these characters'. While it is possible to specify quite precisely the exact sequence of characters, it is only the filename (including directory path) that can be the object of the search. What is returned is a list of site names that have the file, together with the necessary directory path for each site. This means that archie is very good for locating a file once the name is known. It is much less useful for locating files in a particular subject area, such as graphics, or files containing particular words. WAIS (Wide Area Information Server) on the other hand indexes the contents of text files. A WAIS search consists of specifying words or phrases, perhaps combined with boolean operators. It is also possible to ask a WAIS server to look for documents that are 'similar' to a specified sequence of text. The answer in either case is a list of document titles. Each document can be retrieved if desired.
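The difference between the two kinds of question can be made concrete with a toy example. The sketch below is not how archie or WAIS are implemented: the archive contents, site names and function names are all hypothetical, the "archie-style" search is reduced to a plain substring match on the path, and the "WAIS-style" search is reduced to requiring every word to appear in the file contents (no relevance ranking or similarity matching, and it returns locations rather than document titles).

```python
# Hypothetical miniature archive: (site, path) -> file contents.
ARCHIVE = {
    ("ftp.site-a.example", "/pub/graphics/viewer.txt"): "a program for viewing images",
    ("ftp.site-b.example", "/mirrors/viewer.txt"): "a program for viewing images",
    ("ftp.site-a.example", "/pub/docs/readme.txt"): "notes on internet gopher servers",
}


def archie_style_search(pattern):
    """Match the pattern against the path (including filename) only."""
    return sorted((site, path) for (site, path) in ARCHIVE if pattern in path)


def wais_style_search(*words):
    """Return locations of files whose contents contain every given word."""
    wanted = [w.lower() for w in words]
    return sorted(
        (site, path)
        for (site, path), text in ARCHIVE.items()
        if all(w in text.lower().split() for w in wanted)
    )
```

Here `archie_style_search("viewer")` finds both copies of the file because the name is known, whereas `archie_style_search("images")` finds nothing: the word occurs only inside the files, where only `wais_style_search("images")` can see it.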
Users may have an idea of what they want to search for, but be unsure of where to ask the question. Two tools allow users to search for servers: veronica and WAIS. Veronica (which allegedly stands for Very Easy Rodent-Oriented Net-wide Index to Computerised Archives) is a sort of 'archie for gopher servers'. Veronica allows the user to search for particular words in the titles of gopher menus, thus facilitating the correct choice of gopher server. WAIS requires the user to specify a server or servers that will provide the answer to a particular question. It is possible to use WAIS to first search for servers dealing with a particular area before running a specific question against those servers.
Browsing refers to the ability to navigate through an information space looking for items of interest. This is somewhat analogous to browsing along the shelves of a bookshop or library. Browsing tools typically do not offer any search facilities, although some are starting to add this capability. Rather they rely on the underlying organisation of the information space to inform the user's movements. The pre-eminent browsing spaces available today are the World Wide Web (WWW), and Gopher.
World Wide Web (Berners-Lee et al. 1992) is a distributed networked hypermedia system. Users navigate by selecting links which might point to another directory on the current machine, another machine on the same campus, or a machine on the other side of the world. As in other hypertext systems some users may become disorientated or be unable to find what they are looking for. WWW can also be used to retrieve documents in some client implementations. Mosaic from the National Center for Supercomputing Applications is perhaps the best known (although not the only!) WWW client tool, and provides gateways to other services, such as WAIS, Gopher, NetNews and ftp, as well as excellent multimedia support.
Gopher (Wiggins 1993) offers access to files and interactive systems using a hierarchical menu system. Users navigate through menus to locate resources, which may include documents of various types, interactive telnet sessions, or gateways to other Internet services.
All of the above spaces (with the exception of communication) allow the user to interact with networked information in pre-determined ways. Retrieval spaces can only provide access to certain items. Searching spaces only allow particular kinds of searches for limited sets of objects. Browsing spaces only allow navigation along pre-defined links. There is no way for the user to provide their own organisation for the information spaces with which they interact. This facility is provided in an embryonic way by the World-Wide Web and Prospero.
The World-Wide Web provides two ways in which users can impose their own organisation onto networked information. The first is through the annotation facility which allows text or audio to be added to retrieved remote information. The annotations are stored locally and appear at the end of their associated remote document once retrieved. This annotation facility is being extended to provide support for group annotation. The second, and more powerful technique is to create WWW documents that contain user-defined links to other parts of the web. This requires the user to be able to edit documents in HyperText Markup Language (HTML), the native document format for the WWW, but editors are becoming readily available. In this way users can define their own web architecture. They cannot, however, change links that already exist in the overall WWW.
Prospero (Neuman 1992, Neuman and Augart 1993) is a distributed file system based on the Virtual System Model. By using the tools provided by Prospero, users can organise Internet resources and construct customised views of available resources while taking advantage of the structures imposed by others. Prospero provides a framework for accessing meta-information about Internet resources and can use this framework to link together indexing services. At present, Prospero is mostly used to query remote archie databases.
There is clearly a need to provide the sort of rich organising framework for networked information that was envisaged for Project Xanadu (Nelson, 1974) long before the Internet even existed.
The above division between information spaces and their affordances is blurred somewhat when one comes to consider the tools that access these spaces. A number of the tools, as has been revealed in the previous discussion, provide affordances across more than one information space.
The ftp technology, while designed to facilitate retrieval of remote information, can also serve as a browsing tool, although in a somewhat user-hostile way. Users can navigate up, down and across the directory hierarchy of the remote ftp server as a crude way of locating items of interest. This assumes that the hierarchy is arranged in some logical sequence and that this sequence makes sense to the user. Obviously, neither of these assumptions may be true!
Some archie clients allow the user to search a remote database of anonymous ftp files and then use ftp to retrieve particular located files. Archiea for Unix and Anarchie for the Macintosh both provide this very useful facility.
Gopher, while primarily a browsing tool, does provide two forms of search capability. Users may access the Veronica system to find particular keywords in other gopher menus, and then continue their browsing there. They may also search WAIS-compatible indexes to locate documents stored on the gopher server. Once located, these documents may be transferred to the user's machine for viewing - a form of retrieval.
The Mosaic client for the WWW allows the user to browse through the interlinked structures of the Web, retrieve documents for display or saving to disk, and also perform limited searches at particular locations in the Web.
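The overlap described in the preceding paragraphs can be summarised as a mapping from tools to the information spaces they afford. The Python sketch below is only a restatement of those paragraphs in data form; the names `TOOL_SPACES` and `tools_affording` are hypothetical, and the entries cover just the four tools discussed here.

```python
# Information spaces afforded by each tool, as described in the text above.
TOOL_SPACES = {
    "ftp": {"retrieval", "browsing"},
    "archie client with ftp": {"searching", "retrieval"},
    "Gopher": {"browsing", "searching", "retrieval"},
    "Mosaic": {"browsing", "retrieval", "searching"},
}


def tools_affording(*spaces):
    """Return the tools that afford every one of the named information spaces."""
    needed = set(spaces)
    return sorted(tool for tool, afforded in TOOL_SPACES.items() if needed <= afforded)
```

A user who wants to search and then retrieve without changing tools could ask for `tools_affording("searching", "retrieval")`; plain ftp drops out of the answer because it affords no searching.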
The combinations of affordances provided by some tools significantly enhance users' interactions with the Internet. In the same way that we appreciate a rich set of affordances in the real world, we enjoy richness in our virtual cognitive artefacts. This is probably a large part of the attraction and current hype surrounding the Mosaic WWW client. There is a caveat - we do not want tools that are so complex that they do not get used at all, or where only a subset of their functions is ever employed. No-one wants to see the Internet equivalent of programming a VCR!
The challenge for designers of Internet access tools is therefore to design information systems and their attendant tools from an affordance perspective. This user-centred approach would start with the users and
i) determine the types of information spaces they need to access;
ii) analyse the kinds of affordances these information spaces should provide;
iii) design the tools as cleanly as possible to provide transparent access to the functions the user requires, in an intuitive way.
While there has been no formal reference to this process in the literature, it is possible to see the results of its application already both in the information spaces themselves and the tools used to access them. Excellent 'second-generation' Internet tools are appearing for older (which on the Internet probably means more than two years!) information spaces. Anarchie and archiea, referred to above, combine searching of the archie database and retrieval via ftp in a very natural way. The Eudora email client provides excellent communication support, as do a range of newsreaders and MUD/MOO clients. The most recent version of the Newswatcher for the Macintosh (at the time of writing) has inbuilt intelligent support for URLs. The user can option-click on a URL in a news item, and an appropriate ftp tool will be started to retrieve the referenced resource.
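The kind of URL support described for Newswatcher amounts to reading the scheme at the front of the URL and handing the reference to a suitable helper. The following Python sketch shows the idea using the standard library's `urllib.parse.urlsplit`; it makes no claim about how Newswatcher itself works, and the `HELPERS` table, its entries and the function name `helper_for` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical table mapping URL schemes to the kind of tool that handles them.
HELPERS = {
    "ftp": "ftp client",
    "gopher": "Gopher client",
    "http": "WWW browser",
    "news": "newsreader",
    "mailto": "email client",
    "telnet": "telnet client",
    "wais": "WAIS client",
}


def helper_for(url):
    """Return the kind of tool to start for this URL, or None if unrecognised."""
    scheme = urlsplit(url).scheme.lower()
    return HELPERS.get(scheme)
```

With a table of this kind the user never has to choose the tool: `helper_for("ftp://ftp.site.example/pub/file.txt")` selects the ftp client, and text that carries no recognised scheme simply yields `None`.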
The process of improved design can also be observed in newer information spaces like Gopher and WWW. Here searching, browsing, and retrieval are inherent in the information space itself, and are well-supported by a range of client software.
The various players that comprise the Internet community have not always collaborated in the development of new information spaces. This needs to be addressed as the actual and potential user population continues to expand at a dizzying rate. With input from a range of users, the correct match between user requirements and information space, and creative and intuitive implementations of networked information tools, the Internet will continue to develop as the most exciting place to work with information today.
I wish to acknowledge the insightful and helpful comments of Dr Trevor Hales, Division of Information Technology, CSIRO, Melbourne in the preparation of this paper, and the comments of the anonymous referees.
Burke, James (1985), The Day the Universe Changed, Little, Brown, and Co., Boston.
Berners-Lee, Tim et al. (1992), "World-Wide Web: The Information Universe." Electronic Networking: Research, Applications and Policy, vol. 2, no. 1, pp. 52-58.
Caplan, Priscilla. (1993) "Cataloging Internet Resources." The Public Access Computer Systems Review vol. 4, no. 2 (1993): 61-66. To retrieve this file, send the following e-mail message to [email protected]: GET CAPLAN PRV4N2 F=MAIL.
CCITT/ISO (1988) The Directory, Part 1: Overview of Concepts, Models and Services. CCITT Draft Recommendation X.500/ISO DIS 9594-1, CCITT/ISO, Gloucester, England.
December, John (1994a), Internet-Tools. Available in Text, Compressed Postscript, and HTML.
December, John (1994b), Internet-CMC. Available in Text, Compressed Postscript, and HTML.
Deutsch, Peter (1992), "Resource Discovery in an Internet Environment-- The Archie Approach." Electronic Networking: Research, Applications and Policy, vol. 2, no. 1, pp. 45-51.
Diamond, Jared (1992), The Third Chimpanzee - the evolution and future of the human animal, HarperCollins, New York.
Foster, Jill, Brett, George, and Deutsch, Peter (1993), A Status Report on Networked Information Retrieval: Tools and Groups, Joint IETF/RARE/CNI Networked Information Retrieval - Working Group (NIR-WG)
Gaver, William (1992), "The affordances of media spaces for collaboration", Proc. CSCW 1992, ACM, New York, pp. 17 - 24.
Gaver, William (1991), "Technology Affordances", Proc. CHI 1991, ACM, New York, pp. 79 - 84.
Goody, Jack (1981), "Alphabets and Writing" in Williams, Raymond (ed.), Contact: Human Communication and its history, Thames and Hudson, London.
Harrenstein, K., Stahl, M., and Feinler, E. (1985) NICName/WHOIS. RFC 954, SRI International.
Janes, J. W. and Rosenfeld, L. B. (1994), "And Magellan Thought He Had Problems: 'Navigation' in a Network Environment", Libres: Library and Information Science Research Electronic Journal, Vol. 4, No. 1 (February 28, 1994). To retrieve this file, send the following e-mail message to [email protected]: GET LIBRE4N1 JANES.
Kahle, Brewster et al. (1992), "Wide Area Information Servers: An Executive Information System for Unstructured Files." Electronic Networking: Research, Applications and Policy , Vol. 2, no. 1, pp. 59-68.
Lynch, Clifford and Preston, Cecilia M. (1992), "Describing and Classifying Networked Information Resources", Electronic Networking, vol. 2, No. 1 (Spring 1992) 13 - 23.
Nelson, Theodor H. (1974), Computer lib : and Dream machines, self-published, Chicago.
Neuman, B. Clifford (1992), "Prospero: A Tool for Organizing Internet Resources", Electronic Networking: Research, Applications and Policy, Vol. 2, No. 1, (Spring 1992).
Neuman, B. Clifford, and Augart, Steven Seger (1993), "Prospero: A Base for Building Information Infrastructure", Proc. INET '93.
Norman, Donald (1993), Things that make us smart: defending human attributes in the age of the machine, Addison Wesley, Reading Mass.
Obraczka, K., Danzig, P. B., and Li, S. (1993), "Internet Resource Discovery Services", IEEE Computer, (September 1993), pp. 8 - 22.
Rheingold, H. (1993), The virtual community : homesteading on the electronic frontier, Addison-Wesley Pub. Co., Reading, Mass.
Rooks, Dana (1993), "The Virtual Library: Pitfalls, Promises, and Potential." The Public-Access Computer Systems Review 4, no. 5 (1993) 22-29. To retrieve this file, send the following e-mail message to [email protected]: GET ROOKS PRV4N5 F=MAIL.
Schwartz, M. F. and Tsirigotis, P. G. (1991) "Experience with a Semantically Cognizant Internet White Pages Directory Tool", J. Internetworking: Research and Experience, Vol. 2, No. 1, (March 1991), pp. 23 - 50.
Schwartz, M. F., Emtage, A., Kahle, B. and Neuman, B. C. (1992), "A Comparison of Internet Resource Discovery Approaches", Computing Systems, Vol. 5, No. 4, (Fall 1992) pp. 461 - 493.
Treloar, A. (1993), "Towards a user-centred categorisation of Internet access tools", Proceedings of Networkshop '93, Melbourne.
Wiggins, Rich. (1993), "The University of Minnesota's Internet Gopher System: A Tool for Accessing Network-Based Electronic Information." The Public-Access Computer Systems Review, vol. 4, no. 2, pp. 4-60. To retrieve this file, send the following e-mail messages to [email protected]: GET WIGGINS1 PRV4N2 F=MAIL and GET WIGGINS2 PRV4N2 F=MAIL.
Yates, Frances A. (1966), The Art of Memory, Penguin, Harmondsworth.