The User Interface of URNs and URCs

Renato Iannella
Research Data Network Cooperative Research Centre,
Resource Discovery Unit, DSTC Pty Ltd,
Level 7, Gehrmann Laboratories,
The University of Queensland, 4072, AUSTRALIA

Phone: +61 7 3365 4310
Fax:    +61 7 3365 4311
Email: renato@dstc.edu.au
URL: http://www.dstc.edu.au/RDU/staff/ri

Abstract

The rapid growth of the World-Wide Web has seen the Uniform Resource Locator (URL) scheme used as the de facto naming system for the Internet. The next generation of naming will, however, provide more flexibility - with the development of Uniform Resource Names (URN) and Uniform Resource Characteristics (URC). This paper gives an overview of the current state of the art for URNs and URCs and highlights the user interface issues with their use and management on the Internet.
Keywords: Uniform Resource Names, Uniform Resource Characteristics, User Interface, Metadata, World-Wide Web, Internet Services.

1 Introduction

The Internet Engineering Task Force (IETF) has recognised the importance of globally unique resource naming and metadata (resource descriptions) by supporting two working groups charged with developing solutions for their Internet information architecture. The naming problem is to be addressed using Uniform Resource Names (URNs) and the metadata problem with Uniform Resource Characteristics (URCs) (Berners-Lee 1994a).

This report gives an overview of the URN and URC issues currently being investigated, with an emphasis on how an Internet user should expect to use and interact with these new technologies. A sample scenario is presented to demonstrate this interaction, and migration issues that still need to be addressed are identified.

2 Background of URLs

Uniform Resource Locators (URLs) are the primary scheme used by Internet users to access and retrieve a resource (Berners-Lee 1994b). The syntax of a URL consists of a scheme (protocol), a host name, an optional port number, and a path to the resource.

For example:

http://www.acme.edu.au:8080/projects/oil.html
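
As an illustration only (it is not part of the original report), the components of the example URL can be separated with Python's standard urllib.parse module. The function name split_url is hypothetical.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the scheme, host, port and path of a URL as a dict."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # the access protocol
        "host": parts.hostname,   # the machine serving the resource
        "port": parts.port,       # None when no port is given
        "path": parts.path,       # directory and filename on that host
    }

components = split_url("http://www.acme.edu.au:8080/projects/oil.html")
print(components)
```

Each value returned is tied to a particular protocol, machine or file layout, which is why a change to any of them invalidates the URL.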

As can be seen, the syntax is strongly coupled to a number of system-dependent factors. Although URLs were designed to specify the location of a resource, they have become the de facto standard for identifying resources. This has led to many problems as URLs frequently change and become invalid. Problems with the user interface to URLs have been reported by Hoffman (1995) and include:

The syntax of URLs is also commonly used to give some indication as to what the resource is about (Dempsey 1994); that is, the directory and filename often allude to its contents.

It is important that we learn from these experiences when we design and implement the next generation of naming schemes if we wish to maximise the success and minimise the frustrations.

3 Uniform Resource Names and Characteristics

The IETF working groups have developed standards for the requirements and encoding of URNs and URCs. Much discussion has taken place on the syntax to support these two issues.

The main idea behind the use of URNs and URCs is that the combination of

will enable resources to be published and effectively located and retrieved. A 'resolution' service will bind the URN to a URC. Technically, a client will pass a URN to a resolution service which will use a directory service to contact an appropriate server and request the URC. Once the URC has been retrieved, the client can then return the information to the user or process the metadata depending on the user's request.
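
The resolution step described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the working groups' design: the directory service and the URC server are hypothetical in-memory dictionaries, the server name urc-server.example is invented, and only URNs whose namespace specific string has the form authority:string are handled.

```python
# Hypothetical stand-in for a directory service:
# (namespace identifier, naming authority) -> server holding the URCs.
DIRECTORY = {
    ("inet", "dstc.edu.au"): "urc-server.example",
}

# Hypothetical stand-in for the URC servers themselves.
URC_SERVERS = {
    "urc-server.example": {
        "urn:inet:dstc.edu.au:tr0088": {
            "Title": "The User Interface of URNs and URCs",
            "Author": "Renato Iannella",
        },
    },
}

def resolve(urn):
    """Resolve a URN to its URC (a dict of metadata), or None if unknown."""
    fields = urn.split(":", 3)
    if len(fields) != 4 or fields[0].lower() != "urn":
        raise ValueError("unsupported URN form: " + urn)
    _, nid, authority, _ = fields
    # Step 1: ask the directory which server is responsible.
    server = DIRECTORY.get((nid.lower(), authority.lower()))
    if server is None:
        return None
    # Step 2: request the URC for this URN from that server.
    return URC_SERVERS[server].get(urn)

urc = resolve("urn:inet:dstc.edu.au:tr0088")
print(urc["Title"])
```

A real client would replace the two dictionary lookups with network requests, then either display the returned metadata or process it further.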

3.1 URN Requirements

The URN Working Group has identified a set of functional requirements that any proposed URN scheme should address (Sollins & Masinter 1994). These requirements are:

3.2 URN Encoding

In addition, there are encoding requirements for URNs including:

From an end-user's perspective, these encoding requirements are the most crucial as they represent the user interface to the URN and, hence, the primary access to the desired resource.

3.3 URN Syntax

The syntax of URNs is an area that has received much attention. Ideally, the syntax must not have any technical dependencies and yet at the same time support current legacy systems. After much discussion, the current URN syntax is:

URN:NID:NSS
where NID is the Namespace Identifier and NSS is the Namespace Specific String. The NID categorises the URN with respect to an existing naming scheme, so new schemes can easily be supported. The NSS depends on, and is governed by, the rules of the NID.

For example:

urn:isbn:123456789X
urn:inet:dstc.edu.au:tr0088
urn:telecom:61733654310
The first URN shows support for the legacy ISBN scheme used by the publishing industry. The second URN is used by Internet servers; the NSS in this case is a hostname and a string to resolve at that host. The third URN shows a (hypothetical) example of an international telecommunications carrier and a particular individual's telephone number.
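
A minimal sketch of how a client might split a URN of the form URN:NID:NSS into its two parts is shown here. The function name parse_urn is hypothetical, and treating the "urn" prefix and the NID as case-insensitive is an assumption made for the example.

```python
def parse_urn(urn):
    """Split a URN of the form URN:NID:NSS into (nid, nss)."""
    prefix, sep1, rest = urn.partition(":")
    nid, sep2, nss = rest.partition(":")
    if prefix.lower() != "urn" or not sep1 or not sep2 or not nid or not nss:
        raise ValueError("not a URN of the form URN:NID:NSS: " + urn)
    # The NID is lower-cased (an assumption); the NSS is returned untouched
    # because its interpretation is governed by the rules of the NID.
    return nid.lower(), nss

for example in ("urn:isbn:123456789X",
                "urn:inet:dstc.edu.au:tr0088",
                "urn:telecom:61733654310"):
    print(parse_urn(example))
```

Only the first two colons are significant to this parser, so an NSS such as dstc.edu.au:tr0088 is passed on whole to whatever service handles the inet namespace.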

3.4 URN Resolution

The URN Architecture Group (of which DSTC is a member) is currently working on proposals for the resolution process for URNs. Currently, HTTP will be supported as an example of one protocol. Modifications to DNS have been proposed to support the selection of services during the URN resolution process. The group is also establishing guidelines on the assignment of NIDs and associated responsibilities.

3.5 Uniform Resource Characteristics

A URC describes a resource via metadata. The URC working group has broadly outlined requirements for URCs through a series of scenarios (Daniel & Mealling 1995a).

Metadata is 'information about information'; that is, it effectively describes a resource for a particular purpose. For example, if we were to describe the metadata for this paper, we may come up with the following description:

Title: The User Interface of URNs and URCs
Author: Renato Iannella
Subject: Uniform Resource Names and Characteristics,...
Identifier: http://www.dstc.edu.au/RDU/reports/AUUG96WWW/index.html
This metadata could be the URC for the URN:

urn:inet:dstc.edu.au:tr0088
Currently, there is no agreement on any metadata set that can be used for documents. However, there is growing support for the Dublin Core metadata set (Weibel et al. 1995) and enhancements made to it. The Dublin Core metadata set comprises the following fields:

Clearly, there will be no single metadata set that will cover all resources. However, URCs also need to describe the metadata they are using (that is, administrative metadata). In the above example, the metadata set could be expanded to include:

URC-Type: DublinCore
URC-Date-Created: 199501011200Z
URC-Date-Modified: 199501011230Z
URC-Created-By: Mary Smith (smith@dstc.edu.au)
This extra administrative metadata used by a URC will allow a resolution service to apply semantics to the URC; that is, a client can parse a URC, identify the metadata set used, then use the attribute/value pairs in some knowledgeable fashion. A standard set of attributes for administrative metadata that can be used across all URCs would at least enable a primitive level of interoperability. This would allow support for an unlimited number of metadata sets that are tailored for the resources they describe.
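
The following sketch shows one way a client could separate the descriptive attributes from the administrative ones and check the metadata set before using the values. It assumes a plain "Name: value" line format like the listings above, which is an illustrative serialisation rather than an agreed URC syntax; parse_urc is a hypothetical helper.

```python
URC_TEXT = """\
Title: The User Interface of URNs and URCs
Author: Renato Iannella
Identifier: http://www.dstc.edu.au/RDU/reports/AUUG96WWW/index.html
URC-Type: DublinCore
URC-Date-Created: 199501011200Z
URC-Created-By: Mary Smith (smith@dstc.edu.au)
"""

def parse_urc(text):
    """Split 'Name: value' lines into descriptive and administrative dicts."""
    descriptive, administrative = {}, {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first colon only, so URLs in the value stay intact.
        name, _, value = line.partition(":")
        name = name.strip()
        target = administrative if name.startswith("URC-") else descriptive
        # An attribute may be repeated, so each name maps to a list of values.
        target.setdefault(name, []).append(value.strip())
    return descriptive, administrative

descriptive, administrative = parse_urc(URC_TEXT)
if administrative.get("URC-Type") == ["DublinCore"]:
    print("Dublin Core URC for:", descriptive["Title"][0])
```

A client that does not recognise the URC-Type value could still display the raw attribute/value pairs, which is the primitive level of interoperability mentioned above.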

Another benefit from using URCs is that WWW indexers (such as Lycos and Alta-Vista) can then begin to index the metadata and not the whole document itself. This will improve the quality of the WWW indexed data and improve the retrieval of information via resource discovery tools.

Current work on URCs includes the standardisation of a syntax protocol (Burnard et al. 1996) and a transport infrastructure for metadata packages. Most of this work is the result of the Warwick Dublin Core Workshop (Lagoze 1996).

4 URN/URC Scenario

As an example of URN and URC usage, assume that an organisation called DSTC wishes to make its technical reports available to the public. By issuing URNs to the public, the DSTC then has the freedom to move the actual locations (URLs) of the technical reports to suit its organisational needs. The URCs are updated with this information, leaving the URNs unchanged.

The DSTC decides to use the Dublin Core metadata set to describe their resources. Entries for each technical report are stored in a database. The DSTC also decides on some administrative metadata that captures information about the creation and management of the URC.

The URN resolver is able to resolve any requests for the INET namespace identifier and use the naming authority 'dstc.edu.au' in the namespace specific string. The DSTC sets guidelines on the allocation of the namespace specific string (in this case the Technical Report number). Requests to resolve the above URN will return all the Dublin Core attributes, including the administrative metadata.

The requesting client can verify that it understands the metadata returned by examining the value of the URC-Type attribute. This will then enable a client who can support, in this case, the Dublin Core metadata set, to extract the values from the attributes and process or display them appropriately. In many cases, the main aim of resolution is to get to the location of the resource. In this case, the URLs in the metadata will be used by the client to retrieve the resource.

Figure 1 shows an HTML document listing a number of technical reports published by the DSTC. Each report title is actually a link to a URN. For example, the bottom of the window displays the URN:

urn:inet:dstc.edu.au:rdu/tr007
which relates to the report titled 'Internet Resource Discovery Issues' (see third to last entry in the list).

Figure 1: Technical Report List

When the user selects this URN link on the HTML document, the URC information is then displayed (see Figure 2).

Figure 2: URC for Technical Report

The window shows the metadata describing the resource (in the top part of the window). The bottom section of the window shows the administrative metadata used. As can be seen in the Identifier attribute, there are two URLs for this conference paper, indicating that both an HTML and a PostScript version exist. The user can then choose which version of the paper they prefer and click on the link.

In future versions of WWW browsers, the URC information may not be seen directly by the user. For example, the user may customise their browser to indicate that they prefer to see the HTML version of any document received. Hence, in the above example, the user may have gone directly to the HTML version of the document.
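
Such a preference could be applied automatically along the following lines. This is a sketch only: the URLs are invented placeholders rather than the ones shown in Figure 2, and judging the format from the file extension is a simplifying assumption (a URC could instead state the format explicitly).

```python
import posixpath
from urllib.parse import urlsplit

def pick_version(urls, preferred_extensions):
    """Return the URL matching the earliest entry in the preference order."""
    by_extension = {}
    for url in urls:
        ext = posixpath.splitext(urlsplit(url).path)[1].lower()
        by_extension.setdefault(ext, url)
    for ext in preferred_extensions:
        if ext in by_extension:
            return by_extension[ext]
    # No preference matched: fall back to the first URL, if there is one.
    return urls[0] if urls else None

# Hypothetical Identifier values from a URC.
identifiers = [
    "http://www.example.org/reports/tr007.ps",
    "http://www.example.org/reports/tr007.html",
]
chosen = pick_version(identifiers, [".html", ".ps"])
print(chosen)
```

With the preference order shown, the browser would fetch the HTML version without displaying the URC to the user.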

The work described here has been undertaken as part of The URN Interoperability Project.

5 Migration to URNs and URCs

A number of issues need to be addressed to support the early deployment and use of URN and URC services:

One proposal, particularly for WWW-based resources, is to combine the metadata with the resource. An HTML document, for instance, can easily be marked up with elements at the beginning of the document that hold the metadata description. For example:

<head>
<title>Dublin Core Syntax - DSTC Comments</title>
<meta name="DC.title" content="Dublin Core Syntax - DSTC Comments">
<meta name="DC.author" content="Renato Iannella">
<meta name="DC.subject" content="Metadata, Dublin Core, URN, URC">
<meta name="DC.date" scheme="ISO" content="1996-05-21">
<meta name="DC.identifier" scheme="URI" content="http://www.dstc.edu.au/RDU/reports/dc-imp.html">
<meta name="DC.form" scheme="IMT" content="text/html">
</head>
<body>
...
</body>
The metadata can then be automatically extracted on demand, or systematically to create the URC database. Although limited, it may be a solution for current legacy HTML documents. Pitkow & Jones (1995) describe a similar system. Clearly, more sophisticated systems for URN assignment and URC management will need to be employed.
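
As a rough illustration of such automatic extraction, the sketch below collects the DC.* meta elements from a document using Python's standard html.parser module. The class and function names are hypothetical, and the sample is a shortened copy of the markup above; this is not the extraction tool used in the project.

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect <meta name="DC.*" content="..."> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("DC."):
            # Keep a list per attribute, since an element may be repeated.
            self.metadata.setdefault(name, []).append(attributes.get("content"))

def extract_metadata(html_text):
    """Return a dict mapping each DC.* name to its list of content values."""
    parser = MetaExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.metadata

SAMPLE = """<head>
<title>Dublin Core Syntax - DSTC Comments</title>
<meta name="DC.title" content="Dublin Core Syntax - DSTC Comments">
<meta name="DC.author" content="Renato Iannella">
<meta name="DC.date" scheme="ISO" content="1996-05-21">
</head>"""

print(extract_metadata(SAMPLE))
```

The resulting dictionary could be stored as one entry in the URC database; the scheme attribute is ignored here for brevity.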

Two issues of key importance are that the window of opportunity is closing on URN deployment and that competing systems are being developed. Large commercial companies with browser products are in need of URNs - and could do it their own way if the IETF working groups are not ready to demonstrate deployable solutions. Other naming systems, like Persistent URLs, are making claims for widespread adoption as a naming standard.

6 Conclusion

Effective and flexible schemes for URNs and URCs will be paramount for their early deployment. This paper has given an overview of the current activities in these two key areas of the Internet architecture. Naming and metadata are areas receiving increasing attention as these issues are fundamental to the capabilities of many systems in distributed networking.

However, there are still many areas that need to be addressed. One key point is that we need to learn from the experiences with URLs and overcome their limitations in any future systems that will be deployed. URNs and URCs will become the foundation for many new and innovative services on the Internet; hence the need to fully support effective URN and URC management systems.

Acknowledgements

The work reported in this paper has been funded in part by the Cooperative Research Centres Program through the Department of the Prime Minister and Cabinet of Australia.

Bibliography

1
Beckett, D. (1995) IAFA Templates in use as Internet Metadata in Proceedings of the 4th International WWW Conference. Boston, Massachusetts, USA.
URL: http://www.w3.org/pub/Conferences/WWW4/Papers/52/

2
Berners-Lee, T. (1994a) Universal Resource Identifiers in WWW: RFC1630, IETF Network Working Group.

2
Berners-Lee, T., Masinter, L. & McCahill, M. (1994b) Uniform Resource Locators (URL): RFC1738, IETF Network Working Group.
URL: http://ds.internic.net/rfc/rfc1738.txt

3
Burnard, L., Miller, E., Quin, L. & Sperberg-McQueen, C.M. (1996) A Syntax for Dublin Core Metadata in Recommendations from the Second Metadata Workshop
URL: http://info.ox.ac.uk/~lou/wip/metadata.syntax.html

4
Daniel, R. (1995) An SGML-based URC Service. Internet Drafts.
URL: ftp://ietf.cnri.reston.va.us/internet-drafts/draft-ietf-uri-urc-sgml-00.txt

5
Dempsey, L. (1994) Network Resource Discovery: A European Library Perspective in Smith, N. (ed.) Libraries, Networks and Europe: A European Networking Study. British Library Research & Development Department, London.

6
Vizine-Goetz, D., Godby, J. & Bendig, M. (1995) Spectrum: a Web-based tool for describing electronic resources in Computer Networks and ISDN Systems, 27:985-1000.

7
Hoffman, P. (1995) The User Interface of URLs in Proceedings of INET95 - Annual Meeting of the Internet Society, Hawaii, 1:123-126.

8
Lagoze, C. (1996) The Warwick Framework - A Container Architecture for Aggregating Metadata Objects.
URL: http://cs-tr.cs.cornell.edu/~lagoze/warwick.htm

9
Lycos Catalogue of the Internet (1995).
URL: http://www.lycos.com/

10
Persistent Uniform Resource Locator (1996).
URL: http://purl.oclc.org/

11
Pitkow, J. & Jones, R. (1995) Towards an intelligent publishing environment in Computer Networks and ISDN Systems, 27:729-737.

12
Sollins, K. & Masinter, L. (1994) Functional Requirements for Uniform Resource Names: RFC1737, IETF Network Working Group.

13
The URN Interoperability Project (1995).
URL: http://www.dstc.edu.au/RDU/TURNIP/

14
Alta Vista WWW Index (1995).
URL: http://www.altavista.digital.com/

15
W3C Reference Library Code (1995)
URL: http://www.w3.org/pub/WWW/Library/

16
Weibel, S., Godby, J., Miller, E. & Daniel, R. (1995) OCLC/NCSA Metadata Workshop Report, Dublin, Ohio, USA.
URL: http://www.oclc.org:5047/oclc/research/conferences/metadata/dublin_core_report.html


Organised by: AUUG'96 & CSU