

Software documentation with WWW
Using an HTML workbench

Andreas Vogel

CRC for Distributed Systems Technology - DSTC Pty. Ltd.
Level 7, Gehrmann Laboratories
The University of Queensland , 4072
Australia

andreas@dstc.edu.au
http://www.dstc.edu.au/public/staff/andreas-vogel.html


Abstract
This paper presents an approach to software documentation using the World Wide Web. The software documentation kit is designed and implemented on top of an HTML workbench, a set of filters which each perform a simple transformation. These filters can be piped together to provide more complex, task-oriented functions.

Additionally, the paper presents an approach to embedding software documentation in an overall project management document hierarchy. The paper also describes our experience with the approach when it was applied in a number of projects.

Keywords
Software documentation, WWW, project management.

Introduction

Background

Looking back at the last three decades of programming language and software engineering research, a desire for structure in programming can be recognised. Two main milestones are the introduction of structured programming in the late 1960s and early 1970s, represented e.g. by ALGOL and PASCAL, and of object orientation in the 1980s, represented e.g. by Smalltalk and C++. Where this desire is driven by the programmer, it aims to keep code readable and understandable, which are primary requirements for development shared among several people and for code re-use.

A similar request comes from the user, who needs documentation that is consistent with the source code. Nearly everyone working with software products will have experienced minor or major inconsistencies between a software application or its user interface and the documentation.

Donald Knuth took on this combined task when he developed his first WEB system in the early 1980s. The idea behind this approach is to combine source code and documentation in one well structured document. From this document two sub-documents can be extracted (using corresponding tools): a documentation source, e.g. in TeX format, and (programming language) source code, e.g. in C. These two documents are then further processed by the usual tools such as TeX and a C compiler.

This approach is known as literate programming and is introduced and explained in Knuth's book of the same title [KnA92]. The most recent version of the WEB system is CWEB, version 3 [KnA94].

Another example of combining documentation with, in this case, formal specifications is Z [SpA89]. Z is a formal description technique based on set theory which embeds formal expressions, so-called schemas, into explanatory text.

Motivation

The general motivations explained in the previous section can be summarised by:

The CRC for Distributed Systems Technology - DSTC is, as the name suggests, a research institute devoted to distributed systems. Our software development is heavily based on middleware platforms.

A middleware platform is software which sits on top of operating systems and programming languages and provides distribution-transparent interaction between the various components of a system. The best known middleware systems are the Open Software Foundation's Distributed Computing Environment - DCE [XOA93] and the Object Management Group's (OMG's) Common Object Request Broker Architecture [OMA91].

Middleware platforms provide Interface Definition Languages - IDL to specify the interfaces of components. There seems to be a trend towards using IDL specifications as an integral part of component documentation. This is certainly done by the OMG in its Common Object Services Specification [OMA95].

In our own software development we are following OMG's approach and are consequently particularly interested in:

The DSTC is distributed over a number of sites in the Brisbane area and other states. This results in the requirement for:

The documentation should be available on-line as well as printed. The printed documentation should be customisable to different styles and formats, e.g. DSTC Technical Reports or OMG submissions.

From a project management point of view, one would like one entry point which leads to all project related information including:

Overview

The remainder of the paper is organised as follows. First we introduce and justify our main design decision, using the World Wide Web - WWW. This is followed by the presentation of various tools combined in an HTML workbench. The following section describes how the software documentation kit is designed and implemented with the HTML workbench. After that we describe the embedding in a project management document hierarchy and present our experience in applying the kit to various projects. Finally, we describe some related work we have done in the context of this paper, outline future work and conclude the paper.

Fundamental design decision - using the WWW

The fundamental design decision in our approach was to use the WWW. Using the WWW, we gain the following advantages, which already satisfy a number of the requirements stated above.

Hypertext Linking
Allows the embedding of management documents, e.g. milestones or meeting minutes, and development documents into one document hierarchy.

Accessibility
The combination of HTML and HTTP provides the user with distribution transparency when browsing HTML documents. Additionally, HTTP solves to a large extent the interoperability problems between various kinds of computer architectures and operating systems.

Text processing
WWW browsers, e.g. Netscape or Mosaic, provide the user with text processing capabilities. Text can be formatted according to the common structural concepts of modern text processing systems, and images and graphics can be inserted.

However, HTML does not provide means to determine the page layout. The layout is done by the HTML browser depending on the size of the widget which displays the HTML text. This adaptive method is quite satisfactory for browsing documents; however, it becomes a weakness when a printed version of an HTML document is required in some (predefined) layout.

Customisation
The comparatively small number of HTML commands, the human readability of HTML documents and the Common Gateway Interface - CGI ease the tasks of converting documents from or to HTML format, extracting information from HTML documents, creating HTML documents, etc.

Graphical user interface
HTML includes basic facilities to build graphical user interfaces. Combined with the CGI these enable the rapid prototyping of project demonstrations available on virtually any platform.

HTML workbench

The HTML workbench follows the idea of the UNIX workbench [KeA84]: to provide a number of tools which each perform a relatively simple function but can be piped together for more complex, goal-oriented tasks. In fact, we have implemented the HTML workbench following the same approach, using UNIX tools. For PC users, these tools are publicly available for DOS/Windows environments as well.
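
The piping idea can be sketched in a few lines. The following Python fragment is an illustration only and not part of the workbench, which is built from UNIX tools; the filter names strip_tags and squeeze_blank_lines and the helper pipeline are hypothetical.

```python
import re
from functools import reduce

def strip_tags(text):
    """Remove anything that looks like an HTML tag (a crude regular
    expression, not a real HTML parser)."""
    return re.sub(r"<[^>]*>", "", text)

def squeeze_blank_lines(text):
    """Collapse runs of blank lines into a single blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)

def pipeline(*filters):
    """Compose filters left to right, like `a | b | c` in a shell."""
    return lambda text: reduce(lambda acc, f: f(acc), filters, text)

# Two simple filters combined into one task-oriented function.
to_plain_text = pipeline(strip_tags, squeeze_blank_lines)
```

Each filter only maps text to text, so new filters can be added to a pipeline without changing the existing ones.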

Currently, the workbench provides the following tools:

doc
encloses an HTML document in between a header and a footer which can be customised, e.g. for a particular project.

html2ms
converts an HTML document into a troff (MS macros) document. Troff is one of the earliest text processors. There is a GNU (GNU's Not Unix) implementation available for most major platforms. Troff [KeA78] documents are formatted similarly to HTML documents, with commands. These commands have the form
 .xy
Macros can be defined on top of the basic commands, and there are predefined macro packages such as MS, ME or MAN. The MAN package is traditionally used for UNIX man pages.

Due to the similarity in the style of HTML and troff, and to the fact that HTML provides only a subset of troff's text processing functionality, the mapping is straightforward.

html2ps
is an extension of html2ms. It puts a troff header file on top of the output of html2ms to determine the page layout and to customise the transformation of HTML. The page layout includes page length, page indent, line length, font definitions, page headers and footers. The customisation of the transformation includes, e.g., the handling of preformatted text (<pre>), which could be displayed indented, in a special font, in a box, etc.

dhtml
removes all HTML commands from a document.

source2html
is a filter which takes source code, e.g. C or OMG IDL, and converts it into HTML. The two transformation rules are quite simple: code is enclosed in (<pre>) and (</pre>) commands, and comments appear as normal HTML text.

mail2html
converts a mail folder into two HTML files: one contains all the mail headers, each of which is linked to the corresponding mail message in the other file.

href
is a filter which allows the use of a bibliographic database. We have chosen to support refer style databases (refer is a troff preprocessor), however, other bibliographic databases such as TeX's bibtex could be supported in a similar manner.

The filter converts lines of the form

	[[
	key
	]]
into [label], which points to the reference, which is added to the end of the document. The label can be just a number, e.g. [1], or can be constructed (using the refer feature) from any field in the bibliographic entry. This paper was produced using href and might serve as an example: according to the author's guidelines, the label has been constructed from the first two characters of the last name of the first author, a capital letter for disambiguation and the last two digits of the year component in the date field (see References). The label for this paper would be [VoA95].
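
The label construction and the substitution performed by href can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual filter: the bibliography is a hypothetical in-memory dictionary instead of a refer database, the keys are invented, the disambiguating letter is assigned in order of first citation, and the generated HTML markup is our own choice.

```python
import re
import string

# Hypothetical stand-in for a refer database: key -> fields.
BIBLIOGRAPHY = {
    "literate": {"last_name": "Knuth", "year": 1992,
                 "text": 'D. E. Knuth, "Literate Programming", (1992).'},
    "cos": {"last_name": "OMG", "year": 1995,
            "text": 'OMG, "CORBAservices: Common Object Services Specification", (1995).'},
    "rfp5": {"last_name": "OMG", "year": 1995,
             "text": 'OMG, "Request for Proposals 5", (1995).'},
}

def make_label(last_name, year, used):
    """First two characters of the name, a capital letter for
    disambiguation and the last two digits of the year."""
    prefix = last_name[:2]
    suffix = "%02d" % (year % 100)
    for letter in string.ascii_uppercase:
        label = prefix + letter + suffix
        if label not in used:
            used.add(label)
            return label
    raise ValueError("too many entries share this prefix and year")

def href(text, bibliography):
    """Replace [[key]] by a linked [label] and append the references."""
    labels = {}   # key -> label
    used = set()  # labels handed out so far

    def cite(match):
        key = match.group(1)
        if key not in labels:
            entry = bibliography[key]  # an unknown key raises KeyError
            labels[key] = make_label(entry["last_name"], entry["year"], used)
        return '<a href="#%s">[%s]</a>' % (labels[key], labels[key])

    # The key may share a line with the brackets or sit on its own line.
    body = re.sub(r"\[\[\s*(\S+?)\s*\]\]", cite, text)

    refs = ["<h2>References</h2>", "<dl>"]
    for key, label in sorted(labels.items(), key=lambda item: item[1]):
        refs.append('<dt><a name="%s">[%s]</a></dt>' % (label, label))
        refs.append("<dd>%s</dd>" % bibliography[key]["text"])
    refs.append("</dl>")
    return body + "\n" + "\n".join(refs) + "\n"
```

With this sample data, citing the keys literate, cos and rfp5 in that order yields the labels KnA92, OMA95 and OMB95.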

Building the software documentation tools

When comparing the original requirements with the benefits gained by using the WWW the following problems remain:

To satisfy these requirements we have developed the following approach.

We keep one original source version. These are the files which are used for compiling. Explanatory information is kept in comments. These annotations can also provide structuring of the document. They can include any kind of HTML commands, the most useful of which are headers, font changes and hypertext links to related documents. To distinguish such annotated sources from ordinary sources we will call them enriched source code. Figure 1 shows an example of enriched source code which is genuine OMG IDL code.
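
The two transformation rules applied to enriched source code can be illustrated with a short Python sketch. It is not the source2html filter itself and the sample interface is hypothetical (it is not the code of Figure 1). The sketch assumes that only C-style block comments carry annotations, and it escapes markup characters in the code, which is our own addition.

```python
import html
import re

# Assumption: only /* ... */ comments are recognised; comment delimiters
# inside string literals would be misread by this simple pattern.
COMMENT = re.compile(r"/\*(.*?)\*/", re.DOTALL)

def _emit_code(parts, code):
    """Append a <pre> block for a stretch of code, skipping empty ones."""
    if code.strip():
        escaped = html.escape(code.strip("\n"), quote=False)
        parts.append("<pre>\n" + escaped + "\n</pre>")

def source2html(source):
    """Code goes into <pre> blocks; comment text becomes normal HTML."""
    parts = []
    pos = 0
    for match in COMMENT.finditer(source):
        _emit_code(parts, source[pos:match.start()])
        parts.append(match.group(1).strip())  # may contain HTML commands
        pos = match.end()
    _emit_code(parts, source[pos:])
    return "\n".join(parts) + "\n"

# Hypothetical enriched OMG IDL, used only for this illustration.
ENRICHED = """/* <h2>Interface Counter</h2>
   Returns the values handed out so far. */
interface Counter {
    typedef sequence<long> History;
    History history();
};
"""
```

Calling source2html(ENRICHED) yields the header and the explanatory sentence as ordinary HTML text, followed by the interface definition inside a <pre> block. The file itself stays valid IDL because all annotations live in comments.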

To convert enriched source code into on-line and printed documentation we use tools from the HTML workbench described above. Figure 2 illustrates the use of various filters to create different forms of documentation.

Figure 2: Software documentation kit architecture

On-line documentation to be displayed on WWW browsers is produced using the filter source2html. The filter dhtml creates a more traditional ASCII text version. From these HTML documents, we can produce customised postscript using the filter html2ps.
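
The mapping from HTML to troff on which the printed version rests can be sketched for a handful of commands. The following Python fragment is a rough illustration, not the html2ms or html2ps filter: the choice of MS macros (.SH, .LP, .PP, .DS, .DE) for each HTML command is an assumption, most HTML commands are ignored, and adjacent paragraph macros are not merged.

```python
import re
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
FONTS = {"b": "\\fB", "strong": "\\fB", "i": "\\fI", "em": "\\fI"}

class Html2Ms(HTMLParser):
    """Translate a small subset of HTML into troff with MS macros."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out_lines = []   # finished troff lines
        self.pending = ""     # text collected since the last macro
        self.in_pre = False

    def _flush(self):
        text = self.pending.strip("\n") if self.in_pre else self.pending.strip()
        self.pending = ""
        if not text:
            return
        for line in text.split("\n"):
            # A leading "." or "'" would be read as a troff command.
            if line.startswith((".", "'")):
                line = "\\&" + line
            self.out_lines.append(line)

    def _request(self, macro):
        self._flush()
        self.out_lines.append(macro)

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._request(".SH")
        elif tag == "p":
            self._request(".PP")
        elif tag == "pre":
            self._request(".DS")
            self.in_pre = True
        elif tag in FONTS:
            self.pending += FONTS[tag]

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self._request(".LP")
        elif tag == "pre":
            self._request(".DE")
            self.in_pre = False
        elif tag in FONTS:
            self.pending += "\\fP"

    def handle_data(self, data):
        data = data.replace("\\", "\\e")      # printable backslash
        if not self.in_pre:
            data = re.sub(r"\s+", " ", data)  # troff fills the lines
        self.pending += data

def html2ms(html_text):
    parser = Html2Ms()
    parser.feed(html_text)
    parser.close()
    parser._flush()
    return "\n".join(parser.out_lines) + "\n"
```

A header file that sets page length, line length and fonts could then be prepended to this output, in the spirit of html2ps.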

To increase user acceptance, we have modified the original source2html and combined it with the filter doc. The modified source2html creates some additional HTML code providing links to other representations. These are the purified ASCII text, a postscript preview and the printed version, i.e. sending the postscript to a printer. The filter doc adds project defined headers and footers to the HTML converted enriched source code.

Figure 3 illustrates the HTML based on-line documentation of the OMG IDL interface specification shown in Figure 1.

Project management

As stated in the introduction, there is also a motivation to embed the software documentation into an overall project management document hierarchy. Technically, this is a straightforward process since HTML's hypertext capabilities provide the necessary features.

Figure 4 shows an example of a project's home page providing links to the project charter, project and individual milestones, an archive of project related mail (mail2html), minutes from project meetings and the current state of the development represented by the software described above.

In addition, we found it useful to provide graphical user interfaces to demonstrations of the current version of the developed code. This issue is discussed in more detail in the Section Add-ons.

Applications

The approach introduced in the previous sections has so far been applied to three projects.

One is a purely in-house development: DCE Type Manager - Next Generation [BrA95]. The enriched source code is DCE IDL [XOA93]. The documented software contains a number of hierarchically structured interface specifications representing various components of the type management system.

Another project, whose results will be available soon (Nov. 1995), is DSTC's submission to OMG's RFP 5 on the Common Object Trading Service [OMB95]. For this project we have developed a rich project presentation structure, which will be publicly available, and internal project management documents. The project presentation contains links to related OMG documents, to detailed documentation of interface specifications and a demonstration and manual of the implemented prototype.

The third project is an international project led by the DSTC - the Interworking Trader Project [VoA95]. The emphasis of this project is on the distribution of the project partners, who are spread out over four continents. The documented software contains DCE IDL as well as OMG IDL.

Add-ons

Rapid prototyping of graphical user interfaces
The HTML form feature allows rapid prototyping of simple graphical user interfaces available on any platform. We have used this capability to provide demonstrations of the current software development. We have used perl [WaA91] and shell scripts behind the CGI to get the parameters, to invoke the demonstrator program and to convert the results into HTML. A more detailed presentation of our work on bridging middleware and WWW is given in [BeA95].
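
The three steps of such a script (get the parameters, invoke the demonstrator, convert the results into HTML) can be sketched as follows. The sketch is in Python rather than perl or shell and is not one of our scripts; the form field name, the demonstrator command demonstrator.py and the page layout are hypothetical.

```python
import html
import subprocess
import sys
from urllib.parse import parse_qs

# Hypothetical demonstrator program; replace with the real command line.
DEMONSTRATOR = [sys.executable, "demonstrator.py"]

def run_demo(query_string, command=DEMONSTRATOR):
    """Build a complete CGI response for one form submission."""
    # 1. Get the parameters: decode the form data passed by the server.
    params = parse_qs(query_string)
    name = params.get("name", [""])[0]  # "name" is an assumed form field

    # 2. Invoke the demonstrator. The value is passed as one argument
    #    and is never interpolated into a shell command line.
    result = subprocess.run(list(command) + [name],
                            capture_output=True, text=True, timeout=10)

    # 3. Convert the results into HTML, escaping markup characters.
    body = html.escape(result.stdout)
    return ("Content-Type: text/html\r\n\r\n"
            "<html><body><h1>Demonstration result</h1>\n"
            "<pre>" + body + "</pre>\n</body></html>\n")
```

For a form submitted with the GET method, a CGI script would write the result of run_demo, called with the value of the QUERY_STRING environment variable, to its standard output.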

Firewalls, which are usually built up between public WWW servers and internal development machines, create a major barrier when trying to provide such demonstrations to external users.

Distributed Editing
We have used HTML to write papers whose authors were distributed over different locations.

There are no means for locking the documents, and there was very little demand for such a mechanism. Conflicts occurred only once or twice during the whole writing and editing process and were easily resolved by e-mail.

However, the immediate and convenient availability of the current version of the paper in a very readable form has proven very useful. Also, the production of the postscript versions with the workbench tools was very convenient; the header was customised to meet the publisher's style requirements. The task of writing was also eased by the filter href, which handled all the citations.

Future work

Robustness and extensions of tools
If the technology and tools introduced in this paper find wider acceptance then there will be a need to make them more robust, e.g. add proper error treatment, provide a better user interface for customising etc. We can also envisage various extensions of the tools and further tools, e.g. a tool which sets up an initial structure for new projects and templates for various files.

Version control
An interesting extension of the software documentation kit would be its combination with a version control system.

Security
We see a useful application, as explained in the previous section, for secure, task-oriented communication through firewalls, e.g. to bridge between external WWW servers and internal development machines.

Conclusion

We have shown a new approach to automatic software documentation. Our approach is based on the World Wide Web.

We have taken on the idea of the UNIX workbench and have implemented a number of relatively simple tools combined in an HTML workbench. The software documentation kit is built by piping these tools together.

Furthermore, we have extended the idea of pure software documentation by embedding it in a project management document structure. This is particularly well supported by the HTML hypertext capabilities.

The technology and the tools have been successfully applied to three projects.

Acknowledgements

The work reported in this paper has been funded in part by the Cooperative Research Centres Program through the Department of the Prime Minister and Cabinet of Australia.

We would like to thank our colleagues and friends at the DSTC, in particular Kerry Raymond, for their comments, help and acceptance. Special thanks to David Jackson, whose paper reviews are always appreciated.

References

[BeA95]
A. Beitz, R. Iannella, Z. Yang, A. Vogel, and T. Woo, "Integrating WWW and Middleware", Proceedings of Ausweb'95, (1995, in press).
[BrA95]
W. Brookes, J. Indulska, and A. Vogel, "A Type Management System Supporting Interoperability of Distributed Applications", DSTC Symposium, DSTC Pty. Ltd., Brisbane, (1995).
[KeA78]
B. Kernighan, "A Troff tutorial", UNX 4.3.3, University of California, Computing Services, Berkeley, CA, (1978).
[KeA84]
B. W. Kernighan and R. Pike, "The UNIX programming environment", Prentice-Hall software series, Prentice-Hall, Englewood Cliffs, N.J., (1984).
[KnA92]
D. E. Knuth, "Literate Programming", CSLI 27, Center for the Study of Language and Information, Stanford, CA, (1992).
[KnA94]
D. E. Knuth and S. Levy, "The CWEB system of structured documentation: version 3.0", Addison Wesley, Reading, Mass., (1994).
[OMA91]
OMG, "The Common Object Request Broker: Architecture and Specification", OMG TC Document 92.11.1, Object Management Group and X/Open, (1991).
[OMA95]
OMG, "CORBAservices: Common Object Services Specification", OMG Document Number 95-3-31, Object Management Group, Framingham, MA, (1995).
[OMB95]
OMG, "Request for Proposals 5", OMG Document Number 95-6-18, Object Management Group, Framingham, MA, (1995).
[SpA89]
J. M. Spivey, "The Z Notation Reference Manual", Prentice-Hall International, (1989).
[VoA95]
A. Vogel, M. Bearman, and A. Beitz, "Enabling Interworking of Traders", Open Distributed Processing III, ed. K. Raymond and E. Armstrong, Chapman & Hall, London, (1995, in press).
[WaA91]
L. Wall and R. L. Schwartz, "Programming perl", O'Reilly & Associates, Inc., Sebastopol, CA, (1991).
[XOA93]
X/Open Company Ltd., "X/Open Preliminary Specification X/Open DCE: Remote Procedure Call", (1993).

COPYRIGHT © 1995 by AUUG95 and APWWW95 Charles Sturt University. ALL RIGHTS RESERVED. ISBN 1 875781 43 9