I have not changed my view over the intervening two-and-a-half years, because the growth of the Web has not changed its basic complexion. The Web now has some excellent sites, but the majority are trivial, poorly maintained, often self-indulgent and low in nourishment value. I have aimed to make ArtServe interesting both to the student of Art History, and to the general viewer interested in art and architecture in the Mediterranean.
ArtServe is accessed from all over the world some 36,000 times per day on average (that is, well over 3 gigabytes of material per week). Access statistics are available at http://rubens/current.html .
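As a rough check on these figures, the sketch below works out the average volume delivered per access. It assumes decimal gigabytes and treats the "well over 3 gigabytes" as a lower bound; the function name is illustrative only.

```python
def average_bytes_per_access(accesses_per_day, bytes_per_week):
    """Average number of bytes delivered per access over a seven-day week."""
    accesses_per_week = accesses_per_day * 7
    return bytes_per_week / accesses_per_week


# Figures quoted above; 3 gigabytes is taken as 3 * 10**9 bytes (an assumption).
per_access = average_bytes_per_access(36_000, 3 * 10**9)
print(f"at least {per_access / 1000:.1f} KB per access")
```

On these assumptions the average access is small, which is consistent with most requests being for thumbnails and HTML pages rather than for full-size images.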
The server has some notable deficiencies, many of which are to do with a dearth of staff to help in its upkeep, others of which have to do with the nature of Web servers. Staffing problems underline a basic difficulty with serious servers - namely that, the bigger they get, the more complicated they get, in a progression which is far from linear. This problem is highlighted by the manifest deficiencies of current server technology. Briefly, any organisation (and its concomitant, searching) must be arranged using external programs: it is not imposed - as it should be - by the server. Maintenance of large sites - and ArtServe currently holds some 19,000 images - therefore becomes more difficult than it need be, and users do not get the full benefit from them because they cannot necessarily find what they seek.
'Content' for an Art Historian means images, mapped at the very least by a datafile, and preferably with a searchable forms interface to make life easier for the user. This paper discusses the equipment used to take the server from the initial 2,800 images to the current 19,000 - well over five gigabytes of data - and the programs used to regiment the images to make them smoothly accessible to the server. It is a straightforward account, with illustrations to pretty things up a bit, and an Appendix which details the various programs written to process the images and make them available in a couth fashion to the server.
There are various possibilities along the way. The laserdisk provides a convenient half-way house, storing analog frames at video resolution, which the Sony PHV-A7E Photo Video camera presents to it as a video or S-VHS stream for capture on disk. The DEC Alpha computer has a digital capture card in it, but direct presentation of the frames to it would be very tedious. Using the two devices described here, a 'hit' rate of well over 200 images per hour can be attained, and these are then digitised in batch (into JPEG images) whilst the operator does something else.
Direct digital devices such as the Nikon E2 digital camera and the Nikon scanner present opportunities in that they offer four times video resolution (in the case of the camera) and much higher resolutions in the case of the scanner. Such resolutions present problems at the same time because the speed of the Web, and of people's connections, means that such large images cannot conveniently be viewed without custom-built software, called zoom, which we are currently developing. As for speed, the Nikon E2 (in my cheaper configuration) manages an image every couple of seconds (a sports variety, with more memory, can do continuous frames); written to PCMCIA cards (typically 30 images per 10Mb card, depending on quality desired), the images are of excellent quality. The camera is certainly capable of bulk-processing, but I do not use it on a copystand, preferring the Sony device detailed below. As for the scanner, the hopper holds about 50 (thin) mounted slides, and the speed of processing depends on the memory the machine has available.
A very flexible device is the Sony video camera with framebuffer, which can be used not only for photographs and prints (for example), but also for negatives and slides, with light transmitted through a lightbox. The lens (a Fujinon S16x6.7 BERM-18) allows such close-focusing that one-third the area of a 35mm slide can be captured as a full frame.
Imaging hardware has evolved over the past three years. Beginning with an S-VHS video camera feeding into a 24-bit board on an Amiga, the suite is now as follows:
This is for imaging 35mm slides. The device pumps out composite or
S-VHS. This can be fed directly into the video card on the Alpha
but is generally used with a laserdisk; the majority of the early
images on ArtServe (for example, those dealing with Classical Sites in Turkey)
have been digitised with this device.
This device holds 36,250 frames per side,
at video resolution, and routines which came with the Alpha allow
the digitising of any quantity of frames into JPEGs or GIFs. This
device has been used for all the student images (which they
typically access using machines which have small screens) and for
over half of the images on rubens. The procedure is exactly the same
when using a commercial or privately pressed laserdisk: for example,
Dr Clive Ruggles' Survey
of Prehistoric Ritual Monuments in the British Isles was prepared
directly from the laserdisk he himself prepared.
Laptops often come equipped with PCMCIA readers, but separate devices are available:
This device is not cheap (circa $20,000 with a lens, flash, etc), but the results are excellent; a sample image (1.2 megapixels) is available at http://rubens.anu.edu.au/laserdisk/0214/21485.JPG.
My setup is the 'old' model; the current one, called the CatsEye, writes the same size images without interpolation, and costs about $25,000. With a close-up lens, this is an excellent copystand camera. It can talk RGB (or S-VHS or composite) straight into the laserdisk recorder as well, and has been used not only for copying photographs (most recently the Borobudur images), but also 35mm slides. A great advantage is its speed, especially now we have it talking to a Linux machine, for it writes JPEGs in under 20 seconds; a sample image (1.6 megapixels) is available at http://rubens.anu.edu.au/boro.project/basement/2.JPG .
The same camera is extremely versatile and was used in closeup to provide detailed images of computer cables and connectors.
This new acquisition is to experiment with much larger images (up to about six megapixels), given that a principle of digitising is to do it at the best resolution one can achieve and then scale down for current use. Equipped with a hopper, this manages about ten slides per hour, and can be left working overnight; it is attached to a W95 box with a mere 16Mb of memory - and no doubt throughput speed could be increased were more memory to be added; a sample image (about 5 megapixels) is available at http://rubens.anu.edu.au/imageserve/sample.JPG.
Although we naturally use tape to back up our systems, writing to CDROM now costs under $1,500 for the hardware and software, and under $10 for a 650Mb blank. The process is very useful for 'belt and braces' backups, and for providing demonstration material from the server when a network is not available.
The basic image-processing utilities available under flavours of Unix - xv, xli, ImageMagick, and the pbmtools - all proved most useful. The software written in-house is described in an Appendix to this paper.
Three years ago, any digital image appeared a marvel. Today, when a reasonable-quality digital camera (offering video resolution) costs $1,000, expectations have risen so that bigger and better images are making their appearance. Three years ago, my site was devoted to images exclusively of video resolution. Now, however, with the purchase of devices such as the Nikon E2 digital camera, and the Sony DXC-930P (with framebuffer), images of four-times and five-times video resolution respectively can be generated.
Storing such images in JPEG format is little problem given the cost of hard disks, but serving them presents problems because their size is usually still too large for most people. Users like big images, but they need ways of sampling sections and of having the image delivered to them (not necessarily involving a Web browser). Hence the development of our zoom program - see below, in the Appendix.
But 'bigger and better', whilst representing the progression in imaging, has not been reflected in Web technology, and herein lies the rub for the development of large sites. The current generation of servers and clients is easy to install and generally works first time. But the technology was never designed to deal with large quantities of material, perhaps in different media (text, sound, images, postscript files, video) so that we have seen the development of a host of add-ons which attempt to mitigate perceived deficiencies. Programs like wwwstat and pwebstats look after the reporting of log files, whilst Harvest/Glimpse allows us to pretend that the heaps of material underneath the Home Page really are in something approaching order, rather than chaos. CGI scripts help to impose various layers of order as well.
Professor Maurer and his team at the University of Graz have developed HyperG (now to be called HyperWave) as a 'second generation' server and set of clients (for various platforms).
Apart from the obvious problem with URLs (which should obviously become Uniform Resource Names), they point out problems in the current generation with scalability, with searching and with a basic lack of structure. HyperWave deals with these matters and also offers bidirectional links (hence link consistency - no more dead links), a read-write database (their description of the server), a full-text server with inverted indexes, a document cache server, and our old friend from databases - namely, referential integrity.
Appendix:
Software for Image-Processing in Bulk
Throughout the work with ArtServe, we have kept in mind the need to provide users in Art History (and any other discipline which uses a lot of images backed by a database) with effective, robust and reasonably simple software - although it should be made clear that, as the Web changes, so the features and type of the programs needed to populate it will also change.
The guiding principles adopted for our software were as follows:
The programs divide into several groups:
Writing Image Datafiles
When dealing with large numbers of images, writing the datafile which describes them can be a bugbear. All the datafiles used on the ArtServe server are in plain ASCII format - there is no traditional 'database handler'.
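A minimal sketch of reading such a plain-text datafile without a database handler follows. The layout is an assumption made for illustration (one record per line, fields separated by a vertical bar), as are the field names and the sample records; the actual ArtServe datafile layout is not described here.

```python
def parse_datafile(text, field_names, delimiter="|"):
    """Turn plain-text records (one per line) into a list of dictionaries."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        values = [value.strip() for value in line.split(delimiter)]
        # Pad short records so that every field name has a value.
        values += [""] * (len(field_names) - len(values))
        records.append(dict(zip(field_names, values)))
    return records


# Hypothetical sample records, invented for this example.
sample = """\
101|Chartres|Cathedral, West facade
102|Chartres|Cathedral, West facade, detail
103|Chartres
"""
fields = ["image", "place", "subject"]
records = parse_datafile(sample, fields)
```

Keeping the records as plain text means they can be edited with any editor and processed with the usual Unix tools, at the price of doing any validation oneself.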
The simplest method for writing such files is to do it directly into a text editor (normally vi), and cut and paste when (as is often the case) the bulk of records needs to be repeated - as, for example, when doing a series of records of the West facade of Chartres. The disadvantage of this method is that it is not particularly fast, and other arrangements need to be made for viewing the images to be catalogued, in their correct sequence.
This matter has now been addressed with a program using a Web interface, and with the (temporary) name of Bus-Queue which soon got changed to Image-Queue. First, the images are captured, and processed to provide thumbnail GIFs to sit alongside the larger JPEGs. Under my system, all images are filed in numerical directories, which makes various kinds of processing much easier.
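The numerical filing scheme can be sketched as follows. The sample URLs in this paper (for example laserdisk/0214/21485.JPG) suggest that image 21485 sits in directory 0214, that is, one hundred images per directory. That reading, the four-digit padding, and the naming of the thumbnail GIF are assumptions made for illustration, not a description of the actual scripts.

```python
def image_paths(image_number, root="laserdisk", per_directory=100):
    """Return the assumed locations of the large JPEG and its thumbnail GIF."""
    # Assumed rule: directory name = image number divided by 100, zero-padded.
    directory = f"{image_number // per_directory:04d}"
    return {
        "jpeg": f"{root}/{directory}/{image_number}.JPG",
        "thumbnail": f"{root}/{directory}/{image_number}.GIF",
    }


paths = image_paths(21485)
print(paths["jpeg"])
```

Because the directory can be computed from the image number alone, batch jobs can locate any image without consulting an index.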
The database records are entered using a forms interface to the Web browser, and this can be tailored using setup screens. Two types of entry are possible - namely, type-in boxes and pull-down menus. The labels and lengths of the former are set up in advance, as are the pull-down menus, the latter being particularly useful when a restricted set of values is required, rather than any whimsy from the person doing the entering. No parameters are fixed in the rock, and all may be changed when (as is most likely) one realises the need to increase a type-in length here or make an addition to a menu there. An additional setup feature allows the user to specify how to treat the next record, obvious possibilities being to retain the parameters of the previous record and, in the case of some numerical fields, up the number by one.
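The 'next record' setup option might work along the following lines. This is a sketch only: the function, its arguments and the field names are hypothetical, and fields are assumed to be held as strings.

```python
def next_record(previous, carry_over, increment):
    """Build the starting values for the next record from the previous one."""
    record = {name: "" for name in previous}  # start with every field blank
    for name in carry_over:
        record[name] = previous[name]  # retain the previous value
    for name in increment:
        record[name] = str(int(previous[name]) + 1)  # up the number by one
    return record


previous = {"image": "101", "place": "Chartres", "subject": "West facade", "notes": "detail"}
following = next_record(previous, carry_over=["place", "subject"], increment=["image"])
```

Here the image number advances to 102, place and subject are carried over, and the notes field is left blank for fresh input.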
But the program is called Image-Queue because of the way it presents the thumbnail images so the user can see what is going on. The image to be databased appears in the middle of the window, with the next four across the bottom of the window. (To examine the image in detail, click on it to bring up the large JPEG.) Once databased, the tally augments by one, a new record appears, and the first thumbnail goes top left on the window. The next one joins it, and so on. Thus, after completing four new records, there are nine thumbnail images visible - the four already completed, the one currently being done, and the next four in the sequence. They shuffle from bottom right to top left - hence the analogy with a bus queue.
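The queue arrangement amounts to splitting the image sequence into three groups. A minimal sketch follows, with hypothetical names and file names; the real program builds a Web page from these groups, which is not shown.

```python
def queue_view(images, completed, lookahead=4):
    """Split the sequence into completed images, the current one, and the next few."""
    done = images[:completed]
    current = images[completed] if completed < len(images) else None
    upcoming = images[completed + 1:completed + 1 + lookahead]
    return done, current, upcoming


# Twenty hypothetical thumbnails; four records have already been completed.
images = [f"{n}.GIF" for n in range(1, 21)]
done, current, upcoming = queue_view(images, completed=4)
visible = len(done) + (1 if current is not None else 0) + len(upcoming)
```

With four records completed, the three groups together hold the nine thumbnails described above.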
Datafiles and the Display of Databased Images
salami is highly configurable, and this is necessary not just to suit individual taste, but also different machine and (especially) monitor sizes. The user can select headers, the number of thumbnail images across the page, the particular fields from the datafile to print underneath them, and the field to hotspot. For use in 'slow network' setups, the program can display simply records, with no inline thumbnails (although this somewhat defeats the point of 'visual' displays).
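The configurable layout can be illustrated with a short sketch. It is not the salami source: the function, the record keys ("thumbnail", "jpeg") and the sample records are assumptions, and the hotspot field is assumed to link to the large JPEG.

```python
from html import escape


def thumbnail_table(records, across, fields, hotspot, inline_images=True):
    """Lay records out as an HTML table, `across` records per row."""
    rows = []
    for start in range(0, len(records), across):
        cells = []
        for rec in records[start:start + across]:
            lines = []
            for name in fields:
                text = escape(rec[name])
                if name == hotspot:
                    # The hotspot field becomes the link to the large JPEG.
                    text = f'<a href="{escape(rec["jpeg"])}">{text}</a>'
                lines.append(text)
            image = f'<img src="{escape(rec["thumbnail"])}"><br>' if inline_images else ""
            cells.append("<td>" + image + "<br>".join(lines) + "</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


# Hypothetical records, invented for this example.
records = [
    {"thumbnail": "1.GIF", "jpeg": "1.JPG", "place": "Chartres", "subject": "West facade"},
    {"thumbnail": "2.GIF", "jpeg": "2.JPG", "place": "Chartres", "subject": "Royal Portal"},
    {"thumbnail": "3.GIF", "jpeg": "3.JPG", "place": "Chartres", "subject": "Nave"},
]
page = thumbnail_table(records, across=2, fields=["place", "subject"], hotspot="subject")
text_only = thumbnail_table(records, across=2, fields=["place"], hotspot="place", inline_images=False)
```

Passing inline_images=False gives the records-only display intended for slow networks.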
Text Files
encyc therefore cuts an encyclopaedia (or other long, ordered text) in the correct HTML format into smaller files with a certain amount of indexing. The program depends on terms that are highlighted in some way. Everything following the term, up to the next term, is the definition. For every input file there will be a corresponding directory created (in the current directory) and in that directory will be placed one file for each term/definition pair, and a file called index.html containing only a list of hotlinks to all the terms. In the current directory, index.html will be created containing links to all the subdirectories created, each with the text above the first term from its corresponding source file, if that isn't too long. There is a stop-list of common words. A highlighted item consisting solely of one word from the stop-list is not accepted as a term.
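The splitting step can be sketched as below. This is not the encyc source: it assumes that terms are highlighted with bold tags, uses a tiny illustrative stop-list, and folds a rejected stop-word back into the preceding text, which is only one possible treatment. Writing the per-term files and index.html is omitted.

```python
import re

STOP_LIST = {"the", "a", "an", "and", "of", "in"}  # illustrative only


def split_terms(text, stop_list=STOP_LIST):
    """Return (preamble, [(term, definition), ...]) for one source text."""
    # With one capturing group, re.split yields
    # [preamble, term1, body1, term2, body2, ...].
    parts = re.split(r"<b>(.*?)</b>", text)
    preamble = parts[0]
    entries = []  # each entry is [term, raw definition text]
    for term, body in zip(parts[1::2], parts[2::2]):
        if term.strip().lower() in stop_list:
            # A lone stop-word is not a term: keep it as ordinary text.
            if entries:
                entries[-1][1] += term + body
            else:
                preamble += term + body
            continue
        entries.append([term.strip(), body])
    return preamble.strip(), [(t, d.strip()) for t, d in entries]


source = "Intro. <b>Apse</b> A recess. <b>the</b> end. <b>Nave</b> Main body."
preamble, entries = split_terms(source)
```

Each (term, definition) pair would then become one file in the directory created for the input file, with the preamble going to the top-level index.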
Image Processing
Database Access
The traditional way of working with images is to use 35mm slides, presented (often two by two, using two projectors) in lecture or tutorial room. The 35mm slide replaced the 6x6 'lantern slide' gradually from the late 1950s, arguably with some loss in quality of the image, albeit with the addition of colour. 35mm slides are compact and portable, but of limited life-span (roughly, the more they are projected, the quicker they degrade) and expensive to purchase, repair and refile.
The University of British Columbia tried to get away from the disadvantages of slides in the mid-1980s, by 'enregistering' all student images onto a laserdisk. This gave images of video quality (roughly 760 by 525 pixels), and each display station needed a laserdisk player and a monitor in order to function.
A move toward computer storage of images and their display over the network is probably inevitable because it gets round the essential stupidity and wastefulness, in this networked age, of each group of users having a collection of images which duplicates everyone else's collection all around the world.
This is not to say that such procedures are necessarily cheaper than 'manual' work with 35mm slides; just as laserdisk 'automation' requires a player for each workstation, so the display of digitised images across the network requires the use of a video projector, and of a sufficiently speedy computer (and network) to feed each projector. The three-gun projectors offer the best quality, but are very expensive.
In order for work with digital images to be a practical proposition, programs need developing which will allow lecturers and students to manipulate and regiment images for presentations, private study and revision. It is essential that such tasks require the minimum of computer knowledge; hence, using the World Wide Web seemed the logical vehicle, given its workability on all kinds of computers. And because we have developed a system of using small thumbnail images (stand-ins for their larger brothers, which are a mouse-click away), such work can comfortably be done over a modem (14.4K or faster) if necessary.
The programs involved display complete records of the images in certain of our units; allow the interrogation of our databases of student images (some 10,000 images in all); aid the construction of visual presentations resulting from such interrogations; and include a quiz program of varying difficulty to allow student self-testing of image material.
Although the Web and HTML have some limitations, the generality of the technology is a great advantage, as is the ease with which users can save the HTML pages that every program produces to file or floppy disk, or print them out (including images).
The various programs we have developed, thanks to the programming skills of David Blackman, are as follows:
The finished quiz program throws up 5 (default) or more images, with scrolling lists adjacent containing the parameters on which the student wishes to be tested (perhaps architect, name of building, date). The student highlights the 'correct' parameters, and submits the quiz. The 'answers' page gives CORRECT or WRONG for each parameter selected, and prints out the complete record for each image in the quiz. For both the QUESTIONS and for the ANSWERS page, clicking on the thumbnail will bring up a full-size JPEG image. The answer page can be printed out by the student if so desired, or can be written to floppy disk if such a facility is available. Linked to this program is a facility for having students enter their student number so that we can check that each student tests on at least 20 images per week (we take note of the fact of self-testing, not of the results). The program has several elements:
The Home Page for the Modern Architecture Quiz is available at http://rubens.anu.edu.au/quiz/new/modarch.html.
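The marking step of such a quiz can be sketched as follows. The function, the parameter names and the two sample records are illustrative and are not taken from the quiz program or its database.

```python
def mark_quiz(answers, records, parameters):
    """Return {image: {parameter: 'CORRECT' or 'WRONG'}} for one submission."""
    results = {}
    for image, chosen in answers.items():
        record = records[image]
        results[image] = {
            name: "CORRECT" if chosen.get(name) == record[name] else "WRONG"
            for name in parameters
        }
    return results


# Hypothetical database records, keyed by image number.
records = {
    "201": {"architect": "Le Corbusier", "building": "Villa Savoye"},
    "202": {"architect": "Frank Lloyd Wright", "building": "Fallingwater"},
}
# What the student highlighted in the scrolling lists.
answers = {
    "201": {"architect": "Le Corbusier", "building": "Fallingwater"},
    "202": {"architect": "Frank Lloyd Wright", "building": "Fallingwater"},
}
results = mark_quiz(answers, records, parameters=["architect", "building"])
```

The answers page would then print each verdict alongside the complete record for the image concerned.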
Firstly, a database is chosen (the ArtSurf Database, or the Modern Architecture Database) and interrogated according to the required criteria. Pages displaying thumbnail images of the hits attained are then displayed, and the user can then select the images required for the presentation by simply depressing radio buttons. Sorting the images is easily accomplished by renumbering the order in which they will appear in the finished list. Getting rid of images you now realise are not required is also easy (by calling up the SIEVE routine and cancelling the requisite radio buttons), as is returning to the database (via the MORE_DB button) to make a further selection. The FORMAT button places the resultant images two-by-two (default: you can have one, three or four across as well) on an HTML page, with the record details formatted underneath. The finished file can be used as it is to make a lecture or tutorial presentation (clicking on the thumbnail image brings up a larger JPEG), or it can be printed out to paper, or downloaded to floppy. The finished file can also be dispatched to the user by email, and in this case the SEND button is depressed.
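The selection and renumbering steps reduce to a filter followed by a sort. A minimal sketch, with hypothetical names and file names; images that the user has not numbered are assumed to stay at the end in their original order.

```python
def build_presentation(hits, selected, order):
    """Keep the selected hits and sort them by their user-assigned numbers."""
    chosen = [hit for hit in hits if hit in selected]
    # sorted() is stable, so unnumbered images keep their database order.
    return sorted(chosen, key=lambda hit: order.get(hit, len(hits) + 1))


hits = ["11.GIF", "12.GIF", "13.GIF", "14.GIF", "15.GIF"]
selected = {"11.GIF", "13.GIF", "14.GIF", "15.GIF"}
order = {"14.GIF": 1, "11.GIF": 2}
presentation = build_presentation(hits, selected, order)
```

Removing an image later (the SIEVE step) is the same operation with a smaller selection.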
Such preparation of presentations might be done over several days. With Light-Table, you can leave the program and come back to it later, by saving the HTML page to disk and reloading it later: the FORMAT page will be displayed with the required navigation buttons displayed and still functional; that is, all the operations described above can still be performed, including returning to the database for further searches. Everything subsequently found is added to the saved collection and can in its turn be saved.
A sample of a Web page produced with Light-Table is available at http://vandyck/teach96/arch/sample.html.
One snag inherent in the technology - namely the amount of memory required by having perhaps a two-hour lecture on one HTML page - has been solved by giving the user the choice of the number of images across the page (from one to four) and the number of rows of images on each page (3, 4, 6 or 10). In field testing, nine images per page - three across, three down - works well, even when the large images retrieved are of 500Kb size. Each image now receives a number (from 1 to N-1) so that students (and lecturers) know where they are in a suite of Web pages.
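Splitting a long presentation into pages of a chosen shape can be sketched as below. The names are hypothetical, and the running number is assumed to start at 1.

```python
def paginate(images, across=3, rows=3):
    """Split images into pages of `rows` rows with `across` numbered images each."""
    per_page = across * rows
    numbered = list(enumerate(images, start=1))  # (running number, image)
    pages = []
    for start in range(0, len(numbered), per_page):
        chunk = numbered[start:start + per_page]
        pages.append([chunk[i:i + across] for i in range(0, len(chunk), across)])
    return pages


# Twenty hypothetical images at three across and three down.
images = [f"{n}.GIF" for n in range(1, 21)]
pages = paginate(images, across=3, rows=3)
```

Twenty images at nine per page give three pages, the last holding only the two remaining images, and the running numbers continue across page boundaries.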
Tutorials
An excellent way of purveying information to students is by means of electronic tutorials, which are Web pages enlivened by inline images, produced as follows:
A sample screen prepared with this approach is illustrated at http://vandyck.anu.edu.au/jo2/sample.html.