A General Model for On-line Publishing

David G. Green
Johnstone Centre,
Charles Sturt University,
PO Box 789 Albury New South Wales 2640 AUSTRALIA

Email: dgreen@csu.edu.au

Abstract

Here I describe a general model of on-line publishing that focuses on the sequence of major processes involved - from submission to delivery. It identifies submission, acquisition, quality control, production and delivery as the main steps involved. This model applies to virtually any material and provides a basis for automating many editorial and publishing functions via specialised high-level languages. The benefits include fast installation of new publications and cost-effective operation. Many applications of this approach have been implemented at Charles Sturt University.

1 Introduction

As a publishing medium, the World Wide Web has many advantages. To the user, the most obvious of these are hypertext and interactive multimedia. For authors, we can add the prospect of fast publication and a worldwide audience.

To date, most effort in on-line publishing has focused on formats, markup, HTTP browsers and other issues of concern to authors. Relatively little attention has been paid to the publishing and editing processes themselves.

A fundamental problem in on-line publishing at present is that setting up and running a publication is a major task. This is particularly the case for periodicals and other publications that involve regular editing of contributions. Establishing virtually every new publication involves installing a suite of software to handle such procedures as submission, file management and editorial control. However, this software has usually been developed piecemeal and is not easily adapted for other purposes.

Here I propose a general model of the procedures involved in on-line publication. I also show how it can serve as the basis for flexible publishing/editing systems that are easily installed and adapted. As examples, I describe some of the uses to which this approach has been applied in network publishing at Charles Sturt University.

2 A model of the publication process

As a first step, we distinguish between publishing and editing. Editing involves managing the contributions from submission to on-line delivery. In this context, publishing involves setting up working systems to implement a publication.

Our general model of publishing traces the path of an item from author to availability on-line. Any contribution goes through this same general set of editorial steps (Figure 1).

Figure 1. Stages in the publication of on-line information. The model applies to most kinds of publications and materials. As many steps as possible should be automated.

Submission
E-mail and FTP are unsatisfactory as methods of submission as they can require considerable action by the editor. A cleaner approach is to exploit file uploads, which are now supported by several Web browsers (for example, Netscape 2.0). Embedding the upload in a submission form ensures that all relevant information can be provided at one time.
Acquisition
From its source (normally an author) the item goes through a reception process. This includes such operations as creating files or directories to hold the item, registering the item in the publication records, acknowledging the submission and notifying the editor.
Quality control
Every publishable item goes through quality control of some kind. In submitting a record to a virtual library, for instance, we have to ensure that the link exists. In databases, new records can be checked for obvious typos and inconsistencies, to confirm that the data have been entered correctly, and to validate the methods used to generate the data in the first place (Saarikko & Green 1995). Scientific journals normally require peer review. Software should be checked to ensure that it works as claimed.
Production
Once an item has been cleared, it is prepared for publication. In traditional publication this means formatting, typesetting and so on. For on-line publications, however, production can be largely automated, especially if the work of initial markup is distributed amongst authors. Other tasks at this stage include assembling all new items (for example, the articles in a journal issue) and preparing announcements.
Delivery
The final stage is on-line release of the item for public access.
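
As an illustration of the acquisition stage described above, the following is a minimal sketch in Python, not the software used at CSU. It creates a directory for the item, records it in a register, and returns an acknowledgement. The function name receive_item, the register.csv file, and the directory naming scheme are all assumptions made for this example, and the numbering scheme is not safe against simultaneous submissions.

```python
import csv
import os
from datetime import datetime, timezone


def receive_item(spool_dir, author, title, payload):
    """Store a submission, register it, and return (ref_no, acknowledgement)."""
    os.makedirs(spool_dir, exist_ok=True)
    register_path = os.path.join(spool_dir, "register.csv")  # hypothetical register file

    # Next reference number = entries already registered + 1.
    ref_no = 1
    if os.path.exists(register_path):
        with open(register_path, newline="") as handle:
            ref_no = sum(1 for _ in csv.reader(handle)) + 1

    # Create a directory to hold the item and save the uploaded bytes.
    item_dir = os.path.join(spool_dir, f"item{ref_no:04d}")
    os.makedirs(item_dir)
    with open(os.path.join(item_dir, "submission.dat"), "wb") as handle:
        handle.write(payload)

    # Register the item in the publication records.
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(register_path, "a", newline="") as handle:
        csv.writer(handle).writerow([ref_no, stamp, author, title, item_dir])

    # Text of the acknowledgement; sending it to the author is left to the caller.
    ack = f"Submission {ref_no} ('{title}') received {stamp}."
    return ref_no, ack
```

Notifying the editor is deliberately left out of this sketch, since the mechanism (mail, a status page) varies between publications.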

Notice that the above model is not restricted to one specific type of publication, such as text-based periodicals. It applies to any sort of material, including (say) software, databases and images as well as text. Also it applies to many different styles of publishing, from (say) academic texts to company reports.

3 Automation

The importance of the above model is that it provides a systematic framework for automating many editorial and publishing functions. For instance, when an author submits an item for publication (using a form upload), an automatic process can store the files, record the submission, return an acknowledgment to the author and notify the editor. It could even carry out elementary quality checks, such as parsing for correct markup, ensuring that all pertinent information has been provided, or testing the validity of embedded URLs.
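
The elementary quality checks mentioned above might look something like the following Python sketch. It is only an illustration under stated assumptions: the list of required fields is hypothetical, the markup check is a strict tag-balance test (much real HTML omits closing tags that this test would flag), and the URL check is purely syntactic. Testing whether a link actually resolves would need a network request, which is omitted here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical fields that a submission form must supply.
REQUIRED_FIELDS = ("author", "email", "title")

# Elements that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class MarkupChecker(HTMLParser):
    """Tracks open tags, mismatched closing tags, and embedded URLs."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def prelim_checks(fields, markup):
    """Return a list of problems found; an empty list means the item passed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if not fields.get(name, "").strip()]
    checker = MarkupChecker()
    checker.feed(markup)
    checker.close()
    problems.extend(checker.errors)
    problems.extend(f"unclosed <{tag}>" for tag in checker.open_tags)
    for url in checker.urls:
        parts = urlparse(url)
        # Syntax only: an http or ftp URL must name a host.
        if parts.scheme in ("http", "ftp") and not parts.netloc:
            problems.append(f"malformed URL: {url}")
    return problems
```

Returning a list of problems, rather than stopping at the first one, lets the same result be mailed back to the author or shown to the editor.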

Although the model in Figure 1 is general, the details of each stage can vary considerably. For example, a general submissions tool should not be hard-wired to one particular publication. It should be able to handle submissions to many separate publications on the same site. To achieve this it is necessary to be able to feed configuration details to the software each time it is run.
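
One way to feed configuration details to a general submissions tool is a per-site configuration file with one section per publication. The Python sketch below assumes an INI-style layout; the section names, option names, addresses and paths are invented for illustration and do not describe any existing CSU configuration.

```python
import configparser

# Hypothetical site configuration: one section per publication.
SITE_CONFIG = """
[journal]
editor = editor@example.org
spool_dir = /pub/journal/incoming
allow_uploads = yes

[virtual_library]
editor = librarian@example.org
spool_dir = /pub/library/incoming
allow_uploads = no
"""


def load_publication(name, text=SITE_CONFIG):
    """Return the settings for one publication as a plain dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(name):
        raise KeyError(f"unknown publication: {name}")
    section = parser[name]
    return {
        "editor": section["editor"],
        "spool_dir": section["spool_dir"],
        "allow_uploads": section.getboolean("allow_uploads"),
    }
```

The same submissions program can then serve every publication on the site: the form only needs to pass the publication name, for example load_publication("journal").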

When an author submits a manuscript (or other item of information) for publication, several tasks must be performed immediately. These include:

Note that the above procedure is completely general. It applies to any sort of information, whether it be text, images, video or sound. In general, we must assume that a submission will require several files to be uploaded (for example an article, plus the figures). If the submission requires an elaborate directory system to be created, then the best approach is for authors to bundle the entire directory system and files into a single archive file (for example using the Unix tar facility) and submit the archive file.
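
Unpacking such an archive on the server could be sketched as follows, using Python's standard tarfile module. This is an illustrative sketch only; the function name unpack_submission is an assumption. Because the archive comes from an outside party, the sketch refuses entries that would land outside the destination directory and entries that are not plain files or directories.

```python
import os
import tarfile


def unpack_submission(archive_path, dest_dir):
    """Extract a submitted tar archive into dest_dir, refusing unsafe entries.

    Returns the sorted names of the regular files that were extracted.
    """
    dest_root = os.path.realpath(dest_dir)
    os.makedirs(dest_root, exist_ok=True)
    with tarfile.open(archive_path) as archive:
        members = archive.getmembers()
        for member in members:
            # Reject names such as "../x" or absolute paths.
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member.name}")
            # Reject links, devices and other special entries.
            if not (member.isfile() or member.isdir()):
                raise ValueError(f"unsupported entry in archive: {member.name}")
        archive.extractall(dest_root, members=members)
    return sorted(m.name for m in members if m.isfile())
```

The whole archive is validated before anything is written, so a rejected submission leaves no partial files behind.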

If circumstances do not require separate files to be uploaded as part of the submission, then the above procedure is somewhat simpler, but most of the above steps still apply. In principle, there is no difference between the mechanism for submitting (say) a conference abstract, an entry to a database or an entire book.

The entire procedure described above can be complex. It is not practical to attempt to write separate submission filters for every publication. However, by implementing filters for each of the tasks involved, we can implement the entire system via a high-level language that is interpreted by a central control program. The above procedure could then be defined by a script such as the following:

      submit(publication,editor,register)
      assign(ref_no)
      date_stamp = get_time_and_date
      location = get_location(publication,ref_no)
      extract_registration_data
      write_input_file(location)
      register(location,date_stamp)
      queue(publication,location)
      run_prelim_checks(location)
      notify(editor,publication,register)

The program that reads this script runs each of the filters in turn as defined in the script. Each of the filters is a piece of software (written in Perl, say) that carries out a simple generic function. Notice that each command is generic, so the semantics - the exact operations performed - can change. For instance 'notify(.,.,.)' might mail a message to 'editor' about 'publication', with 'register' as the body of the message. Alternatively it could enter details in an HTML page for the editor to read later.
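
To make the idea of a central control program concrete, here is a minimal sketch in Python (the text suggests Perl for the filters; Python is used here purely for illustration). The line syntax accepted, the run_script function, and the two interchangeable notify filters are all assumptions of this sketch, and the time filter is a stub that returns a fixed string.

```python
import re

# One command per line: optional "target =", a command name, optional "(args)".
LINE = re.compile(r"^\s*(?:(\w+)\s*=\s*)?(\w+)\s*(?:\(([^)]*)\))?\s*$")


def run_script(script, filters, context):
    """Run each command in turn, looking up its filter by name.

    `filters` maps command names to callables taking (context, *args).
    An argument that names a value in `context` is replaced by that value.
    """
    for raw in script.strip().splitlines():
        match = LINE.match(raw)
        if not match:
            raise SyntaxError(f"cannot parse line: {raw!r}")
        target, command, arg_text = match.groups()
        names = [a.strip() for a in (arg_text or "").split(",") if a.strip()]
        args = [context.get(name, name) for name in names]
        result = filters[command](context, *args)
        if target:
            context[target] = result
    return context


# Two interchangeable meanings for the same generic command.
def notify_by_mail(context, editor, publication, register):
    context.setdefault("log", []).append(f"mail to {editor} about {publication}")


def notify_by_page(context, editor, publication, register):
    context.setdefault("log", []).append(f"page entry for {editor} about {publication}")


SCRIPT = """
date_stamp = get_time_and_date
notify(editor,publication,register)
"""

FILTERS = {
    "get_time_and_date": lambda context: "1996-01-01 00:00",  # stub value
    "notify": notify_by_mail,  # swap in notify_by_page to change the semantics
}
```

Changing what 'notify' does for a given publication means changing one entry in the filter table; the script itself stays the same.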

This high-level language approach has several major advantages:

3.1 Editing

The approach described above for the submissions also applies to other stages of the publication process. The editor of (say) a public journal on-line should be able to call up a series of management forms that list incoming manuscripts and allow various functions to be performed. These would include viewing submissions, contacting authors, selecting and notifying referees, and moving accepted submissions on to the production stage.

3.2 Production

An essential issue in automation is to be able to build documents 'on the fly' from information in a database. In some instances, this may mean building book chapters, or articles, but more generally it is necessary to create (say) index entries, or editorial correspondence in a standard way. The approach that we have taken is the equivalent of a word-processing merge operation; that is, the document is created by entering fields from a database in a 'template'. At CSU, we have adopted this mechanism in generating, for example, index and contents pages for all of our virtual libraries - Green, Eddy & Bristow (1995); Bristow & Green (1995).
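
The merge operation described above can be illustrated with Python's standard string.Template class. This is a sketch of the general technique, not the mechanism used at CSU; the template text and the record fields (url, title, author) are assumed for the example.

```python
from html import escape
from string import Template

# Hypothetical template for one entry on an index page.
ENTRY_TEMPLATE = Template('<li><a href="$url">$title</a> ($author)</li>')


def build_index(records):
    """Merge each database record into the template and wrap the result in a list."""
    items = []
    for record in records:
        # Escape field values so that characters such as & and < stay valid in HTML.
        safe = {key: escape(str(value)) for key, value in record.items()}
        items.append(ENTRY_TEMPLATE.substitute(safe))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Because the template is plain text held apart from the code, an editor can change the layout of every index page without touching the program.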

When extended to the entire production system of an on-line publishing site, the above approach leads to the notion of a publishing database. The following are some of the components that might be included in such a database.

Documents
A series of files containing the bulk of textual material;
Multimedia store
Files containing multimedia elements, such as images, video, audio, and Java applets;
Document register
Relevant information (for example provenance, keywords) about all the documents and multimedia elements;
Bibliographies
References to publications, for citation purposes;
References
Centralised index of URLs, which can be used to provide URLs in processed documents (may be combined with bibliographic information for on-line publications);
People
Contact and other relevant information about authors, referees, and editors;
Organisations
Contact and other essential information about relevant agencies, institutions and other organisations;
Glossaries
Indexes of relevant key words and terms, with references to relevant on-line information and other resources;
DTDs
Document Type Definitions for all document types used in the system;
Scripts
A central store of scripts to handle all aspects of document processing and editing;
Indexing
Indexes to all of the above information resources.
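
A small part of such a publishing database (the document register, keywords, and people) might be laid out as in the following Python sketch, which uses an in-memory SQLite database. The table and column names are assumptions chosen for illustration; the text does not prescribe any particular schema or database system.

```python
import sqlite3

# Hypothetical schema covering three of the components listed above.
SCHEMA = """
CREATE TABLE people (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    role TEXT NOT NULL            -- author, referee or editor
);
CREATE TABLE documents (
    id         INTEGER PRIMARY KEY,
    path       TEXT NOT NULL,     -- location of the file in the document store
    provenance TEXT,
    author_id  INTEGER REFERENCES people(id)
);
CREATE TABLE keywords (
    document_id INTEGER REFERENCES documents(id),
    keyword     TEXT NOT NULL
);
"""


def open_register():
    """Create an empty in-memory register with the schema above."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def documents_for_keyword(db, keyword):
    """Return the paths of all documents registered under a keyword."""
    rows = db.execute(
        "SELECT d.path FROM documents d "
        "JOIN keywords k ON k.document_id = d.id "
        "WHERE k.keyword = ? ORDER BY d.path",
        (keyword,),
    )
    return [path for (path,) in rows]
```

Index and contents pages can then be generated from queries of this kind rather than maintained by hand.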

4 The publishing environment

Besides looking at the management of a single publication, a publishing model must also consider the context in which publication takes place. Professional and other groups are beginning to realise the potential of the World Wide Web as a medium that permits collaboration on a vast scale. In the future, we are likely to see increasing focus on subject matter rather than sites. In this knowledge web, we will see a variety of "virtual communities" and special interest networks - Green (1994), Green (1995a), Green (1995b) - in which many different publishing sites pool their activities to create a complete information environment on particular topics or activities.

The above collaborative frameworks have several implications for the publishing process, including:

5 Examples - network publishing at Charles Sturt University

The above ideas are being put into practice at Charles Sturt University in a wide range of projects. Much attention has been devoted to developing software in a coherent and flexible manner. Take for instance the submissions procedure. A single facility - the Auto-reply program written by Paul Bristow - is currently being used to handle submission forms for about 30 different functions, including virtual libraries, site registers, bibliographies, conference registrations, and student assignments. An adapted version, which permits file uploads, is used to handle submissions for several academic publications.

The development and implementation of a suite of tools to support on-line editing and publishing has led to several major initiatives at Charles Sturt University. One is the development of a network publishing house - 'CSU On-line' - which will undertake a range of academic and commercial publishing. However, perhaps the most significant initiative is the development of the New South Wales Higher School Certificate On-line, which will be publicly available from the start of 1997. This cooperative project will publish relevant material and information to support students and teachers in the NSW HSC. Each 'subject node' will be treated as a separate publication. The entire site has been developed with both the above model and automation firmly in mind.

Examples of automated applications at CSU include the following: