New Approaches to Custom WWW Interfaces
Steve Ball
PASTIME Project
Cooperative Research Centre for Advanced Computational Systems,
Australian National University,
ACTON 0200 ACT AUSTRALIA
Steve.Ball@pastime.anu.edu.au http://pastime.anu.edu.au/steve/
Abstract
The PASTIME Project [pastime95] seeks to investigate issues concerning the construction of extremely large hypermedia datasets. The project uses the common Web browser Mosaic for client access and has identified major shortcomings in this browser, as well as in Netscape. WAIS is used by the project for searching the dataset, and both of these browsers have the ability to directly query WAIS databases. Another approach is to use a forms interface. However, both of these methods are quite crude and present neither a flexible nor a very sophisticated interface to the user. In particular, it is difficult to specify arbitrarily complex query expressions, and neither method allows iterative query refinement. In order to provide an interface with the required sophistication we have prototyped a "helper application" for Mosaic (v2.6): SearchTool. SearchTool uses Mosaic's CCI protocol to interact with the browser and establishes its own connection to WAIS databases. However, SearchTool is not a general solution. What is needed is a method for an information provider to supply user interface semantics, as well as hypertext data, to a WWW browser.
Parliamentary data has a video clip associated with each Hansard speech (the speeches are the hyperdocuments of the dataset). The Mosaic and Netscape models for playing video data have poor real-time characteristics and so are unacceptable. Instead, PASTIME uses an external application which manages the data transfer while simultaneously playing the video. It is desirable to have a fine-grained interaction between the browser and the external application, for example to allow a transcription of a video to be scrolled while the video plays. Neither Mosaic nor Netscape supports this granularity of interaction.
To overcome these difficulties a new Web browser has been prototyped using the Tk toolkit [Ouster94]. It is able to execute downloaded Tk scripts. This general solution also allows active message content and "hypertools". Active message content may be used to create many new applications, as well as improving current Web features such as forms. Interactive forms could be created that can check the details supplied by a user as they are being entered without having to perform a submission operation, or could create or select subforms dynamically, based upon previous selections. An example hypertool will be a video player which will interact with the browser to allow scrolling.
Keywords
active message content, Tcl, Tk, Mosaic, CCI, browser
Introduction
Some hyperbases have user interface requirements that are well catered for by the standard technologies offered by the World Wide Web. However, other more extensive or special purpose hyperbases may possess capabilities or have access requirements that call for new and innovative approaches to providing an interface to their unique functions.
Any large hyperbase will incorporate a large amount of textual material. Such hyperbases will often include a free text retrieval system to allow the user to search for documents relevant to their task. Most common World Wide Web browsers, such as Mosaic and Netscape, already provide standard facilities for accessing search indexes. However, the users of an information system for a particular organisation may demand a more customised user interface which these browsers are unable to provide. The PASTIME Project is developing an information system whose intended users have requirements that may be characterised in this fashion. This paper discusses new applications which have been prototyped in an effort to create a hyperbase that has a user interface which provides all of the features needed by its intended users. The technology developed for the implementation of these prototype applications is also applicable to other information systems for the World Wide Web.
The PASTIME Project
The Department of the Parliamentary Reporting Staff oversees the various departments within the Australian Federal Parliament that record and archive the activities of the Parliament. Hansard has online text holdings which date back to 1970 and amount to over 2 gigabytes of text data. The Sound And Vision Office records on videotape the proceedings of both Houses of Parliament, and of the various committees. Each year, approximately 1600 hours of video are recorded. If this were to be stored using "Motion JPEG" technology it would require approximately 3.5 terabytes of storage.
The PArliament Sound Text and IMage Environment (PASTIME) [Thistle95] demonstrator project of the Cooperative Research Centre for Advanced Computational Systems seeks to demonstrate the applicability of advanced computational techniques in the area of hypermedia and to explore issues involved in the creation and maintenance of large hyperbases by constructing an example of such a hyperbase. The Australian Federal Parliament, by virtue of its data holdings as shown above, is the subject of the Project's initial research. The hyperbase that has been developed for the Parliament employs World Wide Web technology for storing, delivering and presenting information to users of the system. The popular Web browsers Mosaic and Netscape may be used to browse the hyperbase. A subset of the hyperbase, in terms of data holdings and functionality, developed by the PASTIME project is available on the Internet through the Parliament Internet Trial [pit95].
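As a rough check on the video storage figure quoted earlier in this section, the following sketch works out the average data rate that the estimate implies. It assumes that "terabyte" means 10^12 bytes, which the text does not state; the function name is purely illustrative.

```python
# Back-of-envelope check of the video storage estimate.
# Assumption: one terabyte is taken as 10**12 bytes.
HOURS_PER_YEAR = 1600      # hours of video recorded per year (from the text)
STORAGE_BYTES = 3.5e12     # 3.5 terabytes of Motion JPEG (from the text)


def implied_rate_bytes_per_second(storage_bytes: float, hours: float) -> float:
    """Average data rate needed to fill storage_bytes in the given hours."""
    return storage_bytes / (hours * 3600)


rate = implied_rate_bytes_per_second(STORAGE_BYTES, HOURS_PER_YEAR)
print(f"{rate / 1e6:.2f} MB/s, {rate * 8 / 1e6:.2f} Mbit/s")
# prints "0.61 MB/s, 4.86 Mbit/s"
```

Under that assumption the estimate corresponds to roughly 0.6 megabytes per second of video, or a little under 5 megabits per second.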
The text holdings of the Parliament are partitioned into small units, each corresponding to a speech, a question and answer, a report, and so on. These small units become the basic unit of retrieval, i.e. the hyperdocuments in the hyperbase. Hansard articles also contain other information apart from the text of the speeches themselves, such as the name of the speaker(s) involved, the date and time, the type of speech (Question Without Notice, Matter Of Public Importance), and so on. In a similar fashion video, which is recorded on three hour tapes, is partitioned into a separate video clip for each speech. Each article is then cross-referenced with the corresponding video clip (via a hyperlink).
The PASTIME hyperbase makes extensive use of CGI scripts to insert structural and referential hyperlinks in the articles at the point of delivery to the client browser. This technique allows the system to deal effectively with a very dynamic hyperbase, and to cope with the maintenance problems imposed by a large organization. For example, the hyperlink for the video clip is only inserted into the article if the clip is currently accessible online. This method, however, does require that the hyperdocuments are accessed via the HTTP protocol.
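The delivery-time insertion of the video hyperlink can be illustrated with a minimal sketch. This is not the PASTIME CGI code: the function name, the link text and the way clip availability is passed in are all assumptions made for the example.

```python
from html import escape


def insert_video_link(article_html: str, clip_url: str, clip_online: bool) -> str:
    """Return the article, with a video hyperlink appended only if the clip is online.

    Hypothetical helper: a real script would look up clip availability itself.
    """
    if not clip_online:
        # The clip is not accessible, so the article is delivered unchanged.
        return article_html
    link = f'<p><a href="{escape(clip_url, quote=True)}">Video of this speech</a></p>'
    return article_html + "\n" + link


article = "<h1>Question Without Notice</h1><p>Text of the speech.</p>"
print(insert_video_link(article, "http://example.org/clips/1234", clip_online=True))
```

Because the decision is made each time the article is served, no stored document needs to be edited when a clip goes online or offline.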
Free Text Retrieval
An important aspect of the hyperbase is the inclusion of a free text retrieval system to give users the ability to search the textbase to satisfy "conceptual links". The query language for the system must allow questions where the user wants to know "all articles in which Keating spoke about Mabo or native title issues between 1 March 1993 and 23 September 1994 during question time in which Downer interjected". Such a query implies that the language must have relational capabilities in addition to boolean operators. The question of whether a document is relevant or not to a query is not strictly boolean; relevance ranking is used to distinguish matching documents.
Parliamentary researchers using the PASTIME information system will pose much more sophisticated queries than the relatively simple example given above. The researchers need to be able to specify arbitrarily complex queries, and so require a user interface to support the expression of such queries. The application which provides the user interface to the PASTIME system will then need to translate such complex queries into the query language of the free text retrieval subsystem. The users may also desire the usual "user-friendly" functionality, such as being able to save and restore queries or to use a complex query as a sub-query of another search.
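One way to picture the translation step is to hold the query as a small expression tree and render it to a query string. The sketch below is only an illustration: the field names and the output syntax are invented for the example and are not the query language of the PASTIME retrieval subsystem.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Term:
    field: str
    value: str


@dataclass(frozen=True)
class And:
    parts: tuple


@dataclass(frozen=True)
class Or:
    parts: tuple


Query = Union[Term, And, Or]


def render(q: Query) -> str:
    """Render a query tree in a hypothetical textual query syntax."""
    if isinstance(q, Term):
        return f'{q.field}="{q.value}"'
    op = " AND " if isinstance(q, And) else " OR "
    return "(" + op.join(render(p) for p in q.parts) + ")"


# A saved sub-query, reused inside a larger query.
topic = Or((Term("text", "Mabo"), Term("text", "native title")))
query = And((Term("speaker", "Keating"), topic, Term("type", "Question Time")))
print(render(query))
# prints (speaker="Keating" AND (text="Mabo" OR text="native title") AND type="Question Time")
```

Keeping the query as a tree rather than a string is what makes it straightforward to save a query, restore it, or embed it as a sub-query of another search.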
It is the very nature of the use of a free text retrieval system that the user does not know what they are looking for! The general objective of the user will be to find a set of documents that as closely as possible provide the information that the user is seeking. When a user sends a query to the free text retrieval system the search engine will respond with a list of documents that may possibly be related to the user's query. If the query was too broad then hundreds or even thousands of documents may be found to be relevant to the query. For example the query "all articles in which Keating spoke" will result in a great number of articles being listed for the period in which Keating was a Member Of Parliament. On the other hand the query may have been too restrictive, with very few documents being listed as relevant, or possibly none at all. In these cases the user will need to modify their query in some way and then resubmit the new query to the free text retrieval system. This process is known as "query iteration" and may occur several times as a researcher carefully refines their query to achieve a good set of matching documents for their purpose. This is another function that the application will need to support.
Typically the user will need to view several of the documents that have been listed as the result of performing a search in order to judge which of the documents are actually of interest to them. The user may use this information to reiterate the query, or simply select those documents to be processed in some fashion (for example, saved to local disk or printed). For this reason the user interface of the system should make the list of matching documents readily accessible, even when viewing other documents. It is up to the user to decide when the list is no longer needed.
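Query iteration, together with the requirement that the list of matches stays available, can be sketched as a session object that remembers each query and its result list. The class, its methods and the toy in-memory search function are assumptions made for illustration; they do not describe the PASTIME application.

```python
# Toy corpus and search function standing in for the retrieval system.
DOCS = {
    "doc1": "keating mabo question time",
    "doc2": "keating budget",
    "doc3": "downer mabo",
}


def toy_search(terms):
    """Return the ids of documents containing every term."""
    return sorted(d for d, text in DOCS.items() if all(t in text.split() for t in terms))


class SearchSession:
    """Keeps every iteration of a query so that earlier result lists stay available."""

    def __init__(self, search):
        self._search = search
        self.history = []  # list of (terms, results) pairs, oldest first

    def run(self, terms):
        results = self._search(terms)
        self.history.append((list(terms), results))
        return results

    def narrow(self, extra_term):
        """Re-run the latest query with one more required term."""
        terms, _ = self.history[-1]
        return self.run(terms + [extra_term])

    def back(self):
        """Discard the latest iteration and return the previous result list."""
        if len(self.history) > 1:
            self.history.pop()
        return self.history[-1][1]


session = SearchSession(toy_search)
print(session.run(["keating"]))  # too broad: ['doc1', 'doc2']
print(session.narrow("mabo"))    # refined:   ['doc1']
print(session.back())            # previous list again: ['doc1', 'doc2']
```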
The PASTIME system uses the WAIS protocol [wais88] to provide access to a free text retrieval system. However, when the user has chosen to view a matching document then the document must be retrieved via HTTP, for reasons described above.
The design and implementation of an application to meet the functionality described in this section is a research topic in its own right, and will not be discussed. What is of concern in this paper is that the PASTIME system provides a system architecture with sufficient tools to be able to implement such an application.
Search Facility User Interface
World Wide Web browsers include standard user interfaces that may be used to access search indexes. These include the <ISINDEX> HTML header element, Direct-WAIS capabilities and, more generally, form processing.
Using <ISINDEX>
One approach to providing access to the PASTIME system's text searching feature would be to include the HTML <ISINDEX> header element in every document of the hyperbase which the user may then use to specify a search query. However, this is highly redundant since the text searching facility of the system is not related to any one particular document, but rather is an attribute of the hyperbase as a whole.
Therefore, it would be preferable to use an application for accessing the search engine with a user interface that is permanently available to the user whenever they are viewing a document which belongs to the PASTIME hyperbase. The interface should be removed when the user views a document that does not belong to the hyperbase, since that functionality is not applicable to documents outside of that document domain.
Using Direct-WAIS
Many popular browsers for the World Wide Web, including Mosaic and Netscape, provide the ability to access the WAIS protocol either directly or via a WWW-WAIS gateway. However, this simple and obvious approach is not suitable for the PASTIME system for several reasons as outlined below:
- Documents must be retrieved using the HTTP protocol so that they may be manipulated by CGI scripts.
- The list of documents relevant to the search criterion is presented as a hyperpage by the browser. The user may then select a document to examine, but then must navigate back to the page containing the list of relevant documents in order to inspect other listed documents.
- The direct use of WAIS by a user is only suitable for expressing very simple search queries with few terms.
- There is no support for query iteration.
Using Forms
Another approach for providing access to the search capability of the PASTIME system is to use a forms-based interface via a forms-capable browser. The use of a forms-based interface allows the PASTIME hyperbase to meet some of the user interface and system objectives given above, and indeed the Parliament Internet Trial currently offers a search capability using this technique, see figure 1. A hyperpage has been designed to allow the user to express a search query, as well as specifying which of the many WAIS databases available are to be searched. The hyperpage that is produced in response to the submission of a search query may be generated in an arbitrary fashion, and so correct URLs are given for the documents that are relevant to the query. It may also be possible to support query iteration by including an alternative submit element which augments the previous query with new search terms or operators given by the user.
Figure 1: Search Hyperpage of the Parliamentary Internet Trial
However, using forms does not solve all of the problems in attempting to satisfy the user interface requirements imposed by the PASTIME system. While some progress may be made in supporting the expression of complex search queries, it is still cumbersome for the user to construct very lengthy and involved queries. Also, the user is still inconvenienced by having to navigate back to the search results hyperpage in order to explore other relevant documents.
Prototype Application: SearchTool
In order to satisfy the user interface requirements of the PASTIME system a new application, SearchTool, has been developed, see figure 2. SearchTool functions as a "helper application" for Mosaic 2.6 (and also newer versions). It uses a separate top-level window on the display so that the search capability is always available to the user. Figure 3 gives the architecture for the user interface subsystem. Internally, SearchTool uses the WAIS protocol to directly query the PASTIME WAIS server in order to find which databases are available for searching and to send search queries to the WAIS search engine. When searching against multiple databases the searches may be performed in parallel. The list of relevant documents from each database searched is then presented to the user in another separate window on the display. In this way the list of documents relevant to that search is also readily available so that the user may immediately view listed documents. When the user selects a document from the search results window, SearchTool translates the WAIS URL for the document into an HTTP URL and uses Mosaic's Client Communication Interface (CCI) to instruct Mosaic to load the document for viewing. The normal Mosaic browser functions are unchanged so the user is free to navigate to linked documents, or navigate back to previously viewed documents, and so on.
Figure 2: The SearchTool Application, with Mosaic in the Background
Figure 3: SearchTool Architecture
SearchTool does not currently support the expression of arbitrarily complex queries, nor does it support query iteration. However, it is a stand-alone application and it will simply be a matter of further engineering work to provide those functions.
SearchTool has been implemented using the Tcl/Tk language and toolkit. In order to access the WAIS and CCI protocols from the Tcl language, extensions have been implemented which provide a Tcl programming interface for those protocols: TclCCI [1] and TclWAIS [Ball94].
TclCCI
With version 2.5, NCSA Mosaic introduced the Client Communication Interface protocol to allow external applications to interact with a Mosaic application. CCI allows the Mosaic process to become a server for the CCI protocol and for other applications to become CCI clients of a Mosaic server. The CCI protocol specifies various functions which a client may request the Mosaic application to perform, such as loading a new document, informing the client when source anchors are activated, sending the data for given MIME-types to the client (instead of the Mosaic application rendering the data itself), and so on.
TclCCI is an extension for the Tcl language which makes the functionality of the CCI protocol available for a Tcl-based application. The extension only allows the Tcl application to be a CCI client, not a CCI server. SearchTool uses TclCCI to instruct the Mosaic application to load documents that the user has selected from a list of documents that matched a search query.
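The client side of such an exchange can be sketched as a helper that connects to the browser and sends a single line-oriented request asking for a document to be loaded. This paper does not give the CCI wire syntax, so the request format below is an assumption and should be read as a placeholder; the function names are hypothetical and this is not the TclCCI interface.

```python
import socket


def build_load_request(url: str) -> bytes:
    """Build a 'load this document' request line.

    Assumption: the command format is a placeholder for illustration and has
    not been checked against the CCI specification.
    """
    if "\r" in url or "\n" in url:
        raise ValueError("URL must not contain line breaks")
    return f"GET URL <{url}> OUTPUT CURRENT\r\n".encode("ascii")


def request_load(host: str, port: int, url: str, timeout: float = 5.0) -> None:
    """Connect to a browser listening on host:port and send the load request."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_load_request(url))


print(build_load_request("http://example.org/article/42"))
```

The helper application stays a separate process; the browser only needs to accept the connection and act on the request.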
TclWAIS
TclWAIS is an extension for the Tcl language which provides a Tcl interface to the Z39.50 (WAIS) protocol. It is based upon the freeWAIS distribution. The TclWAIS extension allows a Tcl-based application to send search queries to a WAIS server and to receive the results. Any of the documents listed in the response to the search query may then be retrieved, also via the WAIS protocol.
SearchTool uses TclWAIS to receive a list of the databases available for searching from the PASTIME WAIS server. When the user specifies a search query, a WAIS search request is sent to the WAIS search engine for each database that is to be searched and the documents specified in the WAIS response as being relevant are displayed on the screen. However, when the user selects a document to be retrieved the WAIS document identifier for that document is translated into an HTTP URL and the Mosaic application is instructed to load the document using TclCCI.
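The two steps described above, searching several databases and then mapping a selected result to an HTTP URL, can be sketched as follows. The search function is passed in as a parameter because the real WAIS calls are not shown in this paper; the URL layout, host name and parameter names are invented for the example and are not the PASTIME translation algorithm.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode


def search_all(databases, query, search_one):
    """Run search_one(db, query) for every database in parallel.

    Returns a dict mapping each database name to its list of results,
    in the same order as the databases argument.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(databases))) as pool:
        futures = [pool.submit(search_one, db, query) for db in databases]
        return {db: f.result() for db, f in zip(databases, futures)}


def wais_id_to_http_url(database, doc_id,
                        base="http://hyperbase.example/cgi-bin/article"):
    """Map a (database, document id) pair to an HTTP URL (hypothetical scheme)."""
    return f"{base}?{urlencode({'db': database, 'id': doc_id})}"


def fake_search(db, query):
    # Stand-in for a real search request; returns made-up document ids.
    return [f"{db}-{query}-1", f"{db}-{query}-2"]


results = search_all(["reps", "senate"], "mabo", fake_search)
print(results)
print(wais_id_to_http_url("reps", "42"))
# second line prints http://hyperbase.example/cgi-bin/article?db=reps&id=42
```

Routing the final retrieval through HTTP rather than WAIS is what lets the server-side scripts add hyperlinks to the document before it reaches the browser.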
Inadequacies Of SearchTool
While SearchTool effectively deals with all of the user interface issues that have been raised by the PASTIME Project it is not a general solution for the World Wide Web at large. In particular, the algorithm for translating a WAIS document identifier into an HTTP-based URL is specific to the PASTIME system. Also, there is the problem of software distribution; any casual users who may wish to make use of the extended facilities of the PASTIME information service must have the SearchTool application installed on their computer system. This makes the SearchTool functionality attractive only for in-house use or for clients who have a pre-existing interest in the Parliamentary system.
Other Applications - Video
Another important issue for the PASTIME project is the hypermedia aspects of the Parliament's data holdings. Neither Mosaic nor Netscape handles audio/video data in a way that produces acceptable results. They either first copy the data to the local computer system before playing it, the latency for which is too high, or the data is transmitted using a TCP-based network protocol, which does not deal with the real-time characteristics of audio/video data. For this reason the PASTIME system uses an external "helper application" to play video clips. We are currently experimenting with the Continuous Media Toolkit [Rowe92], which is a Tcl-based system for the playback of audio and video clips across a local area network with adjustable quality-of-service.
Hansard records are close to being transcripts for the video clips associated with each speech. Parliamentary staff have expressed the desire to have the browser scroll the Hansard text as the video clip is played, such that the word being spoken in the video clip appears in the browser's window. This allows the speech to be read as the video is playing. Other hypermedia information systems also have this characteristic: that the transcription of a video clip is available and should be displayed with that clip. Examples include news broadcasts for which Teletext is available, or movies for which the script of the movie is available online. In these cases it is also desirable for the user to be able to select a word or passage in the text and then for the video to be played from that point.
These user interface requirements of hypermedia require a very tight integration between the Web browser and the video playback application. One solution is to have these two applications tightly coupled, i.e. to be included in the one program. However, a better solution may be to have them loosely coupled since the network- and system-architectures of the two applications are very different. It is proposed that the Tk toolkit be used to achieve tight integration between the two loosely coupled applications by use of its "send" command.
Active Message Content
The advanced user interface to the search function of the PASTIME system may be considered to be a "capability" of the hyperbase. Ideally, when a client browser requests a document from the PASTIME system the enhanced search capability would be sent along with the document data. The ability to transmit semantic details along with (syntactic) data is known as "Active Message Content". More informally, the "document" that is loaded from a server includes program code as well as HTML data.
Several systems are emerging for providing the possibility of Active Message Content. The Java language and HotJava Web browser [Gosling95] are one such example. However, the windowing toolkit provided by the Java environment is currently not mature enough to deal with the user interface requirements of the PASTIME hyperbase. Instead, a new World Wide Web browser is being developed based on the Tcl language and Tk toolkit. This browser is known as "SurfIt!", see figure 4. Tcl is well suited to active message content since Tcl/Tk scripts are plain ASCII text.
Figure 4: The SurfIt! WWW Browser
SurfIt! [Ball95] is being developed into a fully-featured World Wide Web browser, with all of the functionality found in most popular Web browsers. It currently parses and renders HTML 2.0 compliant documents, including forms. HTML 3.0 tables have also been implemented, and all other HTML 3.0 elements will be supported in the near future. Inline images are supported and the graphics formats GIF, PPM and X Bitmap are all accepted. JPEG encoded images will also be supported.
Tcl/Tk programs may be downloaded over the network as documents with the MIME type "application/x-tcl" in much the same way as any other textual data, for example using HTTP. Such programs are termed "applets". Since applets may originate from an untrustworthy source they are run using the Safe-Tcl extension. Safe-Tcl allows applet code to be evaluated inside a "safe" interpreter which is separate from the main application interpreter. All "dangerous" commands are removed from the safe interpreter, and the applet's use of other potentially dangerous commands is carefully restricted. There is no limit on the number of applets that may be running in the browser at any time, and every applet is independent of other applets.
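The idea of a separate interpreter from which dangerous commands have been removed can be shown with a toy model. The sketch below is neither Safe-Tcl nor a real security sandbox: the one-command-per-line script format, the class and the command names are all assumptions chosen to keep the example small.

```python
class Interp:
    """A toy command interpreter: one command per line, words split on spaces."""

    def __init__(self, commands):
        self._commands = dict(commands)  # private copy, one per interpreter

    def hide(self, name):
        """Remove a command so that scripts can no longer call it."""
        self._commands.pop(name, None)

    def eval(self, script):
        """Run each line of the script and return the last command's result."""
        result = None
        for line in script.splitlines():
            words = line.split()
            if not words:
                continue
            name, args = words[0], words[1:]
            if name not in self._commands:
                raise PermissionError(f"command not available: {name}")
            result = self._commands[name](*args)
        return result


def make_safe_interp(all_commands, dangerous):
    """Create a fresh interpreter for one applet, without the dangerous commands."""
    interp = Interp(all_commands)
    for name in dangerous:
        interp.hide(name)
    return interp


COMMANDS = {
    "add": lambda a, b: int(a) + int(b),
    "exec": lambda *args: "would run: " + " ".join(args),  # stands in for a risky command
}

applet_interp = make_safe_interp(COMMANDS, dangerous=["exec"])
print(applet_interp.eval("add 2 3"))  # prints 5
```

Because each applet receives its own interpreter with its own command table, one applet cannot alter what another applet is allowed to do.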
The enhanced search interface for the PASTIME system has been implemented as an applet for the SurfIt! browser. The algorithms for translating WAIS search responses into HTTP URLs are still hard-coded into the applet, but the program code is always maintained by the information provider, PASTIME, and so there is no software maintenance burden placed upon the user. Other information systems can use a similar approach to provide custom user interfaces to their unique capabilities. Because almost all of the functionality of the Tk toolkit is available to the applet there is no restriction on the implementation of the PASTIME search applet; in fact the applet uses almost the same code as the SearchTool application and behaves in almost the same manner.
SurfIt! includes an API which allows an applet to interact (with some restrictions) with the browser's configuration, such as finding which hyperpage the applet was loaded from so that it may manipulate that window. Using this feature allows an applet to achieve special effects, such as animating a hyperdocument. The API also defines various "meta-events" and allows event handlers to be declared for them. One such meta-event is the loading of a new hyperdocument. The PASTIME applet uses this facility to check whether the hyperdocument being loaded belongs to the PASTIME hyperbase. If it does not, the applet is withdrawn from the screen. If the applet is in a withdrawn state and a hyperdocument is loaded that does belong to the PASTIME hyperbase then the applet remaps itself onto the screen. Thus the special capabilities of the PASTIME system are only made available when it is appropriate to do so.
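The withdraw and remap behaviour driven by the document-loading meta-event can be sketched with a small event binding model. The class names, the event name and the use of a URL prefix to decide membership of the hyperbase are assumptions for illustration; they are not the SurfIt! API.

```python
class Browser:
    """Minimal stand-in for a browser that announces document loads."""

    def __init__(self):
        self._handlers = {}

    def bind(self, event, handler):
        """Register a handler for a named meta-event."""
        self._handlers.setdefault(event, []).append(handler)

    def load(self, url):
        """Pretend to load a document, then notify every registered handler."""
        for handler in self._handlers.get("document-loaded", []):
            handler(url)


class SearchApplet:
    """Shows itself only while documents from its own hyperbase are viewed."""

    def __init__(self, hyperbase_prefix):
        self.prefix = hyperbase_prefix
        self.visible = True

    def on_document_loaded(self, url):
        # Withdraw for foreign documents, remap for documents of the hyperbase.
        self.visible = url.startswith(self.prefix)


browser = Browser()
applet = SearchApplet("http://hyperbase.example/")
browser.bind("document-loaded", applet.on_document_loaded)

browser.load("http://elsewhere.example/index.html")
print(applet.visible)  # prints False
browser.load("http://hyperbase.example/hansard/1234")
print(applet.visible)  # prints True
```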
Hypertools
Since SurfIt! is a Tk application it may use the Tk "send" command to achieve a tight integration with other Tk applications, and in this way "hypertools" may be implemented. The same API that is made available to applets is also available for hypertools. Since hypertools are trusted applications they may perform more functions than applets, but the burden is upon the user to install and maintain the application.
It is planned to modify several existing Tk applications to work in conjunction with SurfIt! as hypertools. Initial applications shall be exmh (a mail user agent) and nn-tk (a network news reader). These applications will perform functions on behalf of the SurfIt! application in response to user actions, and similarly SurfIt! may do the same when the user requests certain functions in the other application. For example, if a source anchor is activated which uses the mailto: protocol in its URL then the destination address may be passed to exmh to have the mail composed and sent. If a source anchor is activated which uses the nntp: or news: protocols in its URL then the specified newsgroup may be passed to nn-tk which will view that particular newsgroup. exmh and nn-tk have the ability to identify URLs embedded in messages, so if the user activates such a source anchor then the URL may be passed to SurfIt! for the hyperdocument to be loaded. The intention is that these applications will not be developed into what might be called "kitchen sink" applications, but rather that they focus on their relevant task.
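The routing of an activated anchor to the appropriate hypertool can be sketched as a lookup on the URL scheme. The mapping below simply restates the examples in the text; the function name and the fallback label "browser" are assumptions, and the actual hand-off between applications (via Tk "send") is not modelled.

```python
from urllib.parse import urlsplit

# Which application should handle which URL scheme (from the examples in the text).
HYPERTOOLS = {
    "mailto": "exmh",
    "news": "nn-tk",
    "nntp": "nn-tk",
}


def route(url: str, tools=HYPERTOOLS) -> str:
    """Return the name of the application that should handle the URL."""
    scheme = urlsplit(url).scheme.lower()
    # Anything without a registered hypertool stays with the browser itself.
    return tools.get(scheme, "browser")


print(route("mailto:someone@example.org"))     # prints exmh
print(route("news:comp.lang.tcl"))             # prints nn-tk
print(route("http://example.org/index.html"))  # prints browser
```

Keeping the table separate from the routing function means a new hypertool can be added by registering one more scheme, without turning any single program into a "kitchen sink" application.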
Another proposed hypertool will be the Berkeley CMT audio/video playback tool cmplayer. CMT defines a new WWW protocol, cmtp:, for URLs referring to audio/video material. When a URL specifying the cmtp: protocol is activated, SurfIt! will send the URL to the cmplayer application. The CMT system will then arrange for the audio/video clip to be played. Further interaction between SurfIt! and cmplayer will then be possible. For example, if the hyperdocument rendered by SurfIt! is a transcription of the movie being played by cmplayer (as is the case with Parliament speeches in the PASTIME system), then the CMT application will be able to command SurfIt! to scroll the text as the movie is played. Of course, this functionality requires either a cross-indexing of words in the text to frames in the movie or for some heuristics to be used to try to determine when words are being spoken. Similarly, when a word is selected in the SurfIt! browser cmplayer will be sent a command to play the video from the point where that word is being spoken. It will also be possible for SurfIt! to provide inline video in conjunction with the CMT toolkit.
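The cross-indexing option mentioned above can be sketched as a sorted table of cue points that is searched in both directions: from playback time to the word to scroll to, and from a selected word to the time to play from. The cue values are made up for the example and the function names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical cross-index: (start time in seconds, index of the first word
# spoken from that time). Both columns are in increasing order.
CUES = [(0.0, 0), (4.2, 11), (9.8, 25), (15.1, 40)]
_TIMES = [t for t, _ in CUES]
_WORDS = [w for _, w in CUES]


def word_at(time_s: float) -> int:
    """Word index the browser should scroll to at the given playback time."""
    i = bisect_right(_TIMES, time_s) - 1
    return _WORDS[max(i, 0)]


def time_for_word(word_index: int) -> float:
    """Playback time to seek to when the user selects the given word."""
    i = bisect_right(_WORDS, word_index) - 1
    return _TIMES[max(i, 0)]


print(word_at(5.0))       # prints 11
print(time_for_word(30))  # prints 9.8
```

With cue points only at intervals, both lookups resolve to the nearest preceding cue, so playback from a selected word begins slightly before that word is spoken.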
Conclusion
The current technology available for implementing user interfaces for World Wide Web applications fails to meet the requirements of the PASTIME system. The SearchTool application has been developed, using experimental technology introduced by NCSA Mosaic, which satisfies most of the PASTIME requirements but this solution does not generalise to information systems for the World Wide Web at large.
A new Web browser, SurfIt!, has been developed which supports active message content using the Tcl/Tk language and toolkit. Because Tk is a fully featured, mature windowing toolkit a user interface may be developed that satisfies the user interface requirements of the PASTIME system. Further, active message content, as supported by the SurfIt! browser, may be used by any Web developer to provide customised user interfaces to unique hyperbase functions or innovative applications for the World Wide Web.
Footnotes
- [1] My implementation of TclCCI has been superseded by ccitcl by Stan Letovsky.
References
- [pastime95] Paul Thistlewaite: Hypermedia in the Australian Parliament, AUUG Sixth Annual Canberra Conference and Workshops, 15 Feb, 1995. http://pastime.anu.edu.au/pbt/hypermedia.html
- [Thistle95] Paul Thistlewaite: Managing Large Hypermedia Information Bases: a case study involving the Australian Parliament, AusWeb'95 Conference, Byron Bay, April 1995. http://pastime.anu.edu.au/pbt/AusWeb95.html
- [Ouster94] John Ousterhout: Tcl: A Universal Scripting Language, Invited Talk at the USENIX Symposium on Very High Level Languages, Santa Fe, NM, October 26, 1994. http://www.smli.com/~ouster/vhll.ps
- [pit95] Parliament Internet Trial. http://www.aph.gov.au
- [wais88] National Information Standards Organization (Z39): Z39.50-1988: Information Retrieval Service Definition and Protocol Specification for Library Applications, National Information Standards Organization, P.O. Box 1056, Bethesda, MD 20817. +1 301 975 2814. Available from Document Center, Belmont, CA. Telephone +1 415 591 7600.
- [Ball94] Steve Ball: Tcl Interface to WAIS, http://pastime.anu.edu.au/steve/products.html
- [Rowe92] Larry Rowe, et al: Plateau Continuous Media Player, Proceedings of the 3rd Intl Workshop on Network and OS Support for Digital Audio and Video, San Diego, CA, Nov 92. http://s2k-ftp.CS.Berkeley.EDU:8000/multimedia/papers/CMPlayer.ps.Z
- [Gosling95] James Gosling, Henry McGilton: The Java Language Environment: A White Paper, http://java.sun.com/whitePaper/javawhitepaper_1.html
- [Ball95] Steve Ball: SurfIt! World Wide Web Browser, http://pastime.anu.edu.au/steve/surfit.html
COPYRIGHT © 1995 by AUUG95 and APWWW95 Charles Sturt University. ALL RIGHTS RESERVED.