Andrew Treloar, School of Computing and Mathematics, Deakin
University, Rusden Campus, 662 Blackburn Road, Clayton, 3168,
Australia. Phone +61 3 9244 7461. Fax +61 3 9244 7460. Email:
[email protected].
Home Page: Andrew Treloar
This paper considers the possibilities inherent in
scholarly publishing on the World Wide Web and compares them to
traditional print publishing. The paper starts by considering
those technologies used for scholarly publishing to date. The
four main networked electronic technologies (listserv archives,
AFTP repositories, gopher servers, and the Web) are then
contrasted with traditional print publishing technologies. The
paper next considers some of the issues for electronic
scholarly publishing with particular application to the Web
environment. Finally, some tentative conclusions are drawn
about the likely direction of scholarly publishing.
The paper will not deal with the available tools for Web
publishing. This is a topic in its own right and is extensively
covered elsewhere. Nor will it consider electronic publishing
in general, or scholarly electronic publishing over other media
(such as CD-ROM).
The paper has been designed in a self-exemplary manner for
parallel print and electronic publishing. Each piece of
underlined text is a hyperlink in the electronic version. In
the print version, the footnote contains the URL for the
hyperlink. My apologies to those print purists who hoped that
underlining had gone the way of Courier and upper-case only
headings. The online version is available at
<http://www.deakin.edu.au/people/aet/vala96/>.
For the purposes of this paper, scholarly publishing will be
taken to mean the production of journal articles, both refereed
and non-refereed. The focus will be on what
Harnad (1995b) calls 'esoteric' publication (publication by
specialists for other specialists) as opposed to trade
publication. Esoteric publishing is a much more significant
publishing activity for most academics than either trade
publication or the production of monographs, and probably
represents the majority of all academics' published output.
Such publishing has traditionally taken place using the
technology of print. This is still the primary technology for
all disciplines, and is also the technology that provides the
official archival record for almost all publications. However,
print publication suffers from a number of disadvantages:
For all these reasons, as soon as the available technology
made it practicable, pioneering scholars began to use whatever
means they could to produce and distribute their writings
electronically. Such electronic publishing is sometimes
referred to as epublishing, by analogy with email (for an
excellent selective bibliography on the subject of scholarly
electronic publishing over networks, refer to Bailey
1995). In roughly chronological order, the technologies
adopted were listserv archives, AFTP repositories, gopher
servers, and the Web.
New technologies tended to be used in addition to older
technologies, rather than supplanting them. Thus, it is not
unusual to find journals that were initially distributed by
listserv, and which then added AFTP, and later perhaps gopher
or Web access.
In order to be successful, all of the above technologies
needed to provide either equivalent functionality to print, or
if this was not possible, enough alternative functionality to
compensate for any deficiencies. In practice, it turns out that
for any scholarly publishing medium to be useful, three core
sets of functions are needed:
How did these new technologies provide these functions, and
how do they compare to print?
To begin with, published information needs to be produced
and formatted in a way that the scholar can use. In all cases,
the technology chosen places constraints on what can be
represented and how.
Listserv archives are usually restricted to documents in
7-bit ASCII. This is because of the need for such documents to
pass through email gateways in transit and because no
assumptions can be made about the display device at the other
end.
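As a rough illustration of the 7-bit restriction, the sketch below checks whether a piece of text would fit within 7-bit ASCII and lists any characters that would not. The function names are hypothetical and are not drawn from any listserv software.

```python
def is_seven_bit_ascii(text: str) -> bool:
    """Return True if every character has a code point below 128."""
    return all(ord(ch) < 128 for ch in text)


def non_ascii_characters(text: str) -> list[str]:
    """List the distinct characters that fall outside 7-bit ASCII."""
    return sorted({ch for ch in text if ord(ch) >= 128})


print(is_seven_bit_ascii("lots of formulae"))   # plain ASCII spelling
print(non_ascii_characters("lots of formulæ"))  # spelling with a ligature
```

A check of this kind could be run over an article before it is posted to a list, so that problem characters are found by the author rather than by a mail gateway.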
Anonymous File Transfer Protocol (AFTP) archives can be used
to store any kind of file. In practice, most ejournals using
this technology have tended to use 7-bit ASCII text documents.
Some journals are experimenting with storing articles in richer
formats like Hypertext Markup Language (HTML)
or PostScript.
Gopher servers can also provide a range of document types,
but most ejournals mounted on gopher servers also store
documents in 7-bit ASCII text. A wider range of Multipurpose
Internet Mail Extensions (MIME) types is now supported by
available gopher clients and servers, but this facility has
seen little use for distributing documents in other formats,
probably because of the general rise in popularity of the Web.
World Wide Web documents are written in HTML. This provides
for formatted text, inline graphics, hyperlinks within
documents, links to other HTML documents, and links to
documents in other formats altogether. However, the scholar
writing for the Web needs to be aware that a wide range of
browsers will be used to access their work. Not all browsers
format HTML in the same way, and the available range of markup
tags is restricted, particularly compared to SGML. Thus a
lesser degree of control over the final appearance of the
document is inevitable, compared to the richness of print.
Adobe's Acrobat (PDF)
format is a dialect of PostScript. PDF provides for
device-independent, page-based cross-platform electronic
documents. Readers are available for most popular platforms.
Future versions of Netscape's Navigator product will include
support for page-at-a-time viewing of PDF files. PDF is a very
good solution for complex electronic documents with a high
graphical content or lots of formulæ. An example of an
ejournal using PDF is the Cajun Project
being developed by the Electronic Publishing Research
Group.
At first glance, print publishing might seem to provide few
restrictions; multiple fonts, sidebars and images are all
possible. However, hyperlinks within the one publication are
clumsy, and links (footnotes and citations) to other
publications rely on the scholar having ready access to the
publications linked to. As well, print is limited to
information that can be represented statically on paper. Audio
and video are impossible. For most publications, colour still
images are technically possible, but prohibitively
expensive.
In order to access a new scholarly publication, the scholar
needs to be notified of its existence.
In the domain of epublishing, the standard solution to the
notification problem is to use one of a number of
computer-mediated communication technologies. By far the most
popular is electronic mail, with network news a distant second.
Two distinct strategies can be employed. The first is to email
the entire text of the latest issue of an ejournal direct to a
scholar's mailbox. In this case, the notification is directly
analogous to the arrival of a print journal. An alternative
increasingly being adopted is to notify the scholar of the
publication of a new journal, include author, title and
abstract information, and provide advice on how to access
either the entire journal or particular articles of interest.
For FTP, Gopher and Web journals, this access information is
usually in the form of a Uniform Resource Locator (URL).
In the print world, notification is often limited to the
physical arrival of a new issue of a journal (often on a
semi-regular, predictable schedule). If the journal comes to a
library, the scholar has to check the shelves periodically, or
rely on some sort of alerting service. Such a service might be
provided by the library (in the form of photocopied contents
pages) or a commercial information provider like DIALOG (via
the results of an SDI search on a contents database).
Alternatively, scholars can directly search online databases of
abstracts and citations looking for relevant information, but
this requires them to take the initiative and can easily get
crowded out of a busy schedule.
Once notified, the scholar needs to be able to gain access
to the information. This includes locating the journal, and
being able to identify and read articles of interest.
Listserv archives enable scholars to access information via
email. All that is required is to email a GET command to the
listserver address requesting that a specified file be sent by
return email. As email is the lowest common denominator for
users of the Internet, this provides the widest possible
audience. As an example, consider the reference in this paper
to Harnad (1991). This article in the refereed ejournal
Public-Access Computer Systems Review (PACS-R) can
be retrieved by sending the e-mail message get harnad prv2n1
f=mail to listserv@uhupvm1. Of course, before issuing a GET
command, one needs to know that the file exists. Some journals,
including PACS-R, handle this by sending the
table of contents and abstracts to users subscribed to the
PACS-L or PACS-P mailing lists. Alternatively, it is possible
to email commands to some listservers instructing them to
search a database and return a list of articles that match the
search criteria. These articles can then be retrieved as
above.
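The retrieval request described above is nothing more than a mail message whose body is the GET command. The following sketch composes such a message with Python's standard email library. The command text and the address are copied from the example in this paper; the function name and its parameters are hypothetical, and actually sending the message (which would need a mail server) is deliberately left out.

```python
from email.message import EmailMessage


def build_get_request(filename: str, filetype: str, server: str) -> EmailMessage:
    """Build a mail message whose body is a listserv GET command."""
    message = EmailMessage()
    message["To"] = server
    # The "f=mail" option is reproduced from the paper's example command.
    message.set_content(f"get {filename} {filetype} f=mail")
    return message


request = build_get_request("harnad", "prv2n1", "listserv@uhupvm1")
print(request["To"])
print(request.get_content().strip())
```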
Scholars can access articles in anonymous FTP archives
either by using a dedicated FTP client, or by providing an FTP
URL to a Web browser like Lynx, Mosaic or Netscape. If the URL
formalism is not being used, then the FTP location of the
article will need to specify host machine, directory path and
filename. For example, the information encoded in the URL
FTP://cogsci.ecs.soton.ac.uk/pub/harnad/Harnad/harnad95.quo.vadis
can also be expanded into (more or less) plain English as 'Make
an anonymous FTP connection to cogsci.ecs.soton.ac.uk, move
into the directory pub/harnad/Harnad/ and get the file
harnad95.quo.vadis'. The URL formalism has the advantage of
being more compact as well as parseable by both humans and
machines. One example of a journal accessed by AFTP is Psycoloquy,
edited by Stevan Harnad.
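To show what 'parseable by machines' means in practice, the sketch below splits the FTP URL quoted above into the host, directory and filename that the plain English version spells out. It uses only Python's standard library; the function name and the keys of the returned dictionary are illustrative choices.

```python
from posixpath import split as split_path
from urllib.parse import urlsplit


def describe_ftp_url(url: str) -> dict[str, str]:
    """Break a URL into the pieces needed for a manual FTP session."""
    parts = urlsplit(url)
    # The path's final component is the file; the rest is the directory.
    directory, filename = split_path(parts.path)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname or "",
        "directory": directory.lstrip("/") + "/",
        "filename": filename,
    }


URL = "FTP://cogsci.ecs.soton.ac.uk/pub/harnad/Harnad/harnad95.quo.vadis"
for key, value in describe_ftp_url(URL).items():
    print(f"{key}: {value}")
```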
Gopher was initially developed to provide a basis for
mounting Campus Wide Information Systems (CWISs). It is based
around the idea of hierarchical menus, and allows the server
administrators a lot of flexibility in how they structure their
information space. One fairly standard way to mount ejournals
on a gopher server is to have a menu of possible journals. Each
journal points to a menu of issues for that journal. Each issue
points to the individual articles. Given unambiguous
information about the path to be followed, scholars can
navigate through the menus until they locate the files they
want. It is also possible to provide Gopher URLs for direct
access using a Web browser. An example of a journal available
through Gopher is the
Mathematical Physics Electronic Journal.
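The journal, issue and article arrangement described above can be modelled as a tree of nested menus. The sketch below represents only that structure, not the Gopher protocol itself, and the journal, issue and file names in it are invented for illustration.

```python
# Hypothetical menu tree: journal -> issue -> article -> file name.
MENU = {
    "Example Journal": {
        "Vol. 1, No. 1": {
            "First Article": "article1.txt",
            "Second Article": "article2.txt",
        },
    },
}


def follow_path(menu: dict, selections: list[str]):
    """Descend one menu level per selection and return what is found there."""
    node = menu
    for choice in selections:
        if not isinstance(node, dict) or choice not in node:
            raise KeyError(f"no menu item {choice!r}")
        node = node[choice]
    return node


print(follow_path(MENU, ["Example Journal", "Vol. 1, No. 1", "First Article"]))
```

Stopping part of the way down the path returns the submenu at that level, which corresponds to the scholar browsing the list of issues or articles.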
The Web, with its non-hierarchical document-based networked
hypermedia architecture, provides a much richer environment for
electronic publishing. Documents can either be reached by
following an existing link, or can be accessed directly by
entering a valid URL. Documents can in turn refer to other
documents and provide direct links to them (something that is
not possible with documents accessed using a Gopher client).
Examples of a range of scholarly journals on the
Web will be discussed below. The Web can also be used to
point to documents in PDF format.
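Because links are written into the HTML itself, a program can discover them as easily as a reader can. The sketch below, which assumes nothing beyond Python's standard HTML parser, collects the targets of the anchor tags in a fragment of HTML. The sample fragment is invented, apart from the URL of the online version of this paper.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Record the href value of every anchor tag that is fed in."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


SAMPLE = (
    '<p>See <a href="#notes">the notes</a> and '
    '<a href="http://www.deakin.edu.au/people/aet/vala96/">'
    "the online version</a>.</p>"
)
print(extract_links(SAMPLE))
```

The first link in the sample points within the same document and the second points to another document, which are the two kinds of link that distinguish the Web from a menu-based system.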
In the print world, if the journal is delivered directly to
the user, the problem of journal location is limited to finding
the journal within the context of the scholar's own personal
information management system. If the journal is delivered to
the library, it will be filed in some well-defined sequence. To
assist with locating articles within journals, the publishing
industry has developed a range of standard tools: contents
pages at the front of issues, yearly cumulative printed
indexes, and the like.
As scholarly journal publishing continues what some (Odlyzko
1995, Harnad 1995b) regard as its inevitable transition to an
electronic form, a number of issues need to be confronted. A
number of these are applicable to all forms of electronic
publishing. Others are either specific to, or have the greatest
impact on, the Web. The list below is not intended to be
exhaustive. Barry (1995) provides another perspective on some
of these issues.
Durability is a term taken from Kaufer and Carley (1993), and
refers to the length of time the article is available for
communicative transactions. Paper documents printed on paper
that is not acid-free have a durability of some 100 years
unless corrective action is taken. The durability of Web
documents is entirely unknown, but there are no technological
reasons for their life to be limited in any way, provided they
are archived in some systematic way. At present there are no
mechanisms to ensure that this will occur.
In many ways, the digital nature of all electronic
publishing can be both a strength and a weakness in the area of
durability. A strength, because digital documents can easily be
copied and replicated at multiple sites around the world. A
weakness, because destroying a digital document is far easier
than destroying a physical document. It is easy to assume that
the document will exist elsewhere on the Net and that the fate
of a single copy is irrelevant. Of course, there is no
mechanism to prevent everyone making this assumption and
causing the loss for ever of a piece of scholarship. In some
ways, the modern analogue of the single manuscript forgotten on
top of a cupboard in a monastery somewhere in the Dark Ages may
well be a forgotten directory on a rarely used hard disk
somewhere in a university. Unfortunately, it is all too easy to
delete a directory; throwing away a manuscript without realising
is somewhat harder. Given the lack of any mechanism to ensure the
archiving of print publications, it seems unlikely (although
relatively technologically simple) that anything will be done
about the situation for digital documents.
The Web allows us to dramatically expand our view of what is
possible within a scholarly publication. A Web document can
directly include colour images, something reserved for only a
very few print publications. In addition, HTML documents can
provide links to video clips and sound files, as well as access
to other programs through Web
gateways. This enables a significant enhancement to the
traditional published scholarly document. A number of
electronic scholarly journals are experimenting with the
possibilities inherent in this medium.
PostModern Culture routinely contains
hypermedia articles alongside more traditional text-only
material. As an example,
McNeilly (1995) contains links to a number of sound files
which are used to illustrate particular points in the
article.
I am not aware of any ejournals that use the gateway
facility to provide access to data sourced from other systems.
As an illustration of what might be possible, consider ERIN, the Australian Environmental
Resources Information Network. While not a scholarly
journal itself, this system does provide access to a wide range
of scholarly information. Use of a Web gateway allows the user
to generate distribution
maps for nominated species and run simulation models in
real time. Imagine the possibilities if a journal article
allowed the reader to run a simulation directly while varying
the input data and monitoring the results.
JAIR, the Journal of Artificial Intelligence Research, is
using the Web to deliver articles in PostScript or HTML format.
As an example, Schlimmer & Hermens (1993) is available in
both a PostScript
version and an
HTML version. JAIR is also experimenting with
delivering other forms of supporting information. The Schlimmer
and Hermens (1993) article comes with an
appendix containing a 1.3MB Quicktime video which
illustrates some of their research findings.
At the moment at least three things are limiting the wider
use of anything other than text in scholarly publishing:
limited bandwidth, the need to cater for non-graphical
browsers, and scholarly conservatism.
Bandwidth is widely predicted (Odlyzko 1995) to become a much
less severe limitation as scholarly
use of the Internet piggybacks on the infrastructure servicing
video on demand and similar services. Bill Gates talks about
bandwidth being 'essentially infinite' within the decade. Many
developed countries are proposing to run cable connections to
individual households that will support 10 Mbps at least.
Therefore, bandwidth seems to be a short-term problem at
worst.
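A little arithmetic shows why bandwidth matters for material like the 1.3MB QuickTime video mentioned earlier. The sketch below estimates transfer times over the 10 Mbps household link discussed above and, for contrast, over a 28.8 kbps modem. The modem speed is my assumption rather than a figure from any source cited here, a megabyte is taken as one million bytes, and protocol overhead is ignored.

```python
def transfer_seconds(size_megabytes: float, link_megabits_per_second: float) -> float:
    """Idealised transfer time, ignoring protocol overhead and congestion."""
    bits = size_megabytes * 1_000_000 * 8
    return bits / (link_megabits_per_second * 1_000_000)


VIDEO_MEGABYTES = 1.3      # size of the video appendix quoted in the paper
CABLE_MBPS = 10.0          # household cable figure quoted in the paper
MODEM_MBPS = 0.0288        # assumed 28.8 kbps dial-up modem

print(f"cable: {transfer_seconds(VIDEO_MEGABYTES, CABLE_MBPS):.2f} s")
print(f"modem: {transfer_seconds(VIDEO_MEGABYTES, MODEM_MBPS):.0f} s")
```

On these assumptions the difference is between roughly a second and several minutes, which is the gap between a reader bothering to follow a link and not bothering.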
While there will no doubt be an application for VT100 Web
browsers like Lynx for a few years, the computing world is
rapidly going graphical. Already the majority of Web browsers
run under a GUI, and this trend will continue. Having to code
for non-graphical browsers is probably another short-term
difficulty.
Scholarly conservatism may prove a more long-term
constraint, only susceptible to generational change. Many
scholars will no doubt only use the Web (if at all) to publish
what they publish already but faster and in electronic form.
The habits of centuries of print publishing (in the case of
scholars in general) and of decades of practice (in the case of
individual scholars) will take a while to change.
The Web makes it possible for authors to provide access to
extension material that supplements or complements their
primary publications. Stevan Harnad talks about 'scholarly
skywriting' (Harnad 1990) and argues for supplementing peer
review with interactive publication in the form of open peer
commentary on published and ongoing work (Harnad 1995a). In the
spirit of this suggestion, JAIR, the Journal of Artificial
Intelligence Research, has just implemented a facility to allow
readers to comment on
published articles and to review the comments of others. Harnad
himself has archived
contributions from readers to a discussion of publicly
retrievable FTP archives for esoteric science and scholarship
as an example of what is possible.
The High Energy Physics community has already moved to a
model of electronic publishing which allows for ongoing
corrections and addenda. The hep-th e-print archive which
provides this facility 'serves over 20,000 users from more than
60 countries, and processes over 30,000 messages per day' (Ginsparg 1994).
The Web's ability to link to other information makes it
possible to envisage a range of extensions to traditional
scholarly publishing. These include:
Little of this nature is happening at present, but the
possibilities are certainly wider than the few suggestions
outlined above.
Deciding how to organize a Web document depends somewhat on
whether the document is intended to be read largely on screen,
or printed out, read (perhaps annotated) and then filed. In
fact, the entire issue of the most appropriate
style for HTML documents in general is a vexed one.
Price-Wilkin (1994a) argues that "because the Web does not
include structure awareness in its protocol and because HTML
markup provides so little support for structural representation
of features, the author and the administrator are forced to
fragment documents into a sets of reasonably sized
components." This is no doubt true for large documents with
complex internal structures, but is less of an issue for the
shorter documents typical of scholarly publishing.
Tim Berners-Lee's preferred
style is for shortish (up to 5 pages) nodes linked together
in some logical sequence, preferably based on a tree structure.
On its own, this implies that the reader will have to navigate
back up branches in order to access the next section. Documents
designed using this model should provide the reader with a link
labelled "next" at the end of each node to let them move
through the document in a linear manner if desired. This style
works well for things like online reference material but seems
less appropriate for scholarly publishing. A scholarly article
is more of a single entity and should be represented as such.
If the article is a long one, it may be appropriate to split it
into sections or place a table of contents with links to
internal anchors at the beginning. The advantage of keeping the
article as an entity is that the user can easily print it out
(if required), without having to retrieve multiple segments and
ensure that they are collated in the correct order. Until a
majority of the intended audience is comfortable with reading
entirely from the screen, and has the hardware to make this
possible, the likelihood that material will be printed out has
to be kept in mind when writing scholarly Web documents.
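The single-document approach with a table of contents linking to internal anchors is simple to generate mechanically. The sketch below is one possible way of doing so and is not a description of how this paper was produced. The function names, the anchor naming scheme and the section titles are all hypothetical.

```python
from html import escape


def slug(title: str) -> str:
    """Turn a section title into a simple anchor name."""
    return "-".join(title.lower().split())


def table_of_contents(titles: list[str]) -> str:
    """Build a list of links, one per section, pointing at internal anchors."""
    items = [
        f'<li><a href="#{slug(title)}">{escape(title)}</a></li>'
        for title in titles
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def section_heading(title: str) -> str:
    """Build the heading that carries the anchor the contents list points to."""
    return f'<h2><a name="{slug(title)}">{escape(title)}</a></h2>'


SECTIONS = ["Durability", "Document Structure"]  # invented titles
print(table_of_contents(SECTIONS))
print(section_heading(SECTIONS[1]))
```

Because everything stays in one file, the reader who prefers paper can still print the whole article in a single operation.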
PDF documents are designed around a page model so the issue
of document design is less critical. It is possible to choose a
page size smaller than normal paper but then the documents will
translate less well to paper output. PDF documents can also be
designed to have additional navigation features such as live
tables of contents or thumbnails but these only just compensate
for the deficiencies of the viewing software relative to
flicking through paper.
Electronic publication of scholarly 'esoteric' material
is continuing to grow in popularity. An increasing number of
ejournals are adding Web access to their range of access
technologies. As the Web continues to grow in popularity, and
as the ratio between all potential readers and potential
readers with Web access approaches unity, I suspect that older
electronic delivery technologies will simply fade away. The
shift from text-only production to production in some
graphically richer form may well take a little longer. A number
of journals now have a Web homepage that points to articles in
7-bit ASCII, but have not yet made the change to HTML or PDF
for the articles themselves.
In the longer term, the Web is probably not the future of
scholarly publishing. It is both a part of the present, and a
pointer to the future. Other technologies will no doubt surpass
the Web in time. Hyper-G looms as a possibility, and Project
Xanadu may move from virtuality to reality before the end of
the millennium. The significance of the Web is the way in which
it enables a far more significant break from print than has
been achieved to date. It does this because it does all that
print does and then more. For scholars, exploring the
implications of that more for their publishing and
communication is sufficient challenge for the near term.
C. Bailey Jr. (1995), "Network-Based Electronic Publishing of
Scholarly Works: A Selective Bibliography", The Public-Access
Computer Systems Review, Vol. 6, No. 1.
T. Barry (1994), "Publishing on the Internet with World Wide
Web", in Proceedings of CAUSE '94 in Australia, CAUDIT/CAUL,
Melbourne.
T. Barry (1995), "Network Publishing on the Internet in
Australia", in The Virtual Information Experience -
Proceedings of Information Online and OnDisc '95,
Information Science Section, Australian Library and Information
Association, pp. 239-249.
P. Ginsparg (1994), "First Steps towards Electronic Research
Communication", Computers in Physics, August.
S. Harnad (1990), "Scholarly Skywriting and the Prepublication
Continuum of Scientific Inquiry", in Psychological Science,
Vol. 1, pp. 342-343 (reprinted in Current Contents 45: 9-13,
November 11 1991).
S. Harnad (1991), "Post-Gutenberg Galaxy: The Fourth Revolution
in the Means of Production of Knowledge", in The Public-Access
Computer Systems Review, Vol. 2, No. 1, pp. 39-53.
S. Harnad (1995a), "Implementing Peer Review on the Net:
Scientific Quality Control in Scholarly Electronic Journals",
in Peek, R. & Newby, G. (Eds.), Electronic Publishing Confronts
Academia: The Agenda for the Year 2000. Cambridge MA: MIT Press.
S. Harnad (1995b), "Electronic Scholarly Publication: Quo
Vadis?", in Serials Review, Vol. 21, No. 1, pp. 70-72.
D. S. Kaufer & K. M. Carley (1993), Communication at a
Distance - The Influence of Print on Sociocultural
Organization and Change, Lawrence Erlbaum Associates.
K. McNeilly (1995), "Ugly Beauty: John Zorn and the Politics of
Postmodern Music", in Postmodern Culture, Vol. 5, No. 2
(January).
A. Odlyzko (1995), "Tragic loss or good riddance? The impending
demise of traditional scholarly journals", in Electronic
Publishing Confronts Academia: The Agenda for the Year 2000,
Robin P. Peek and Gregory B. Newby, eds., MIT Press/ASIS
monograph, MIT Press.
J. Price-Wilkin (1994a), "Using the World-Wide Web to Deliver
Complex Electronic Documents: Implications for Libraries", in
The Public-Access Computer Systems Review, Vol. 5, No. 3,
pp. 5-21.
J. Price-Wilkin (1994b), "A Gateway Between the World-Wide Web
and PAT: Exploiting SGML Through the Web", in The Public-Access
Computer Systems Review, Vol. 5, No. 7, pp. 5-27.
D. Schauder (1994), Electronic Publishing of Professional
Articles: Attitudes of Academics and Implications for the
Scholarly Communication Industry, Unpublished Ph.D.
Dissertation, University of Melbourne.
J. C. Schlimmer & L. A. Hermens (1993), "Software Agents:
Completing Patterns and Constructing User Interfaces", Journal
of Artificial Intelligence Research, Vol. 1, pp. 61-89.