The program incorporated into a page can reason with other pages as part of its behaviour. These other Web pages are treated as logic programming modules, termed LogicWeb modules. Special operators are available for the retrieval of LogicWeb modules and to invoke goals within them, as well as to combine modules.
The use of logic programming has a number of advantages, including the ease of specifying searching, the availability of knowledge-based reasoning, and the ability to define semantics for all the extensions. Logic programming also permits the meta-level manipulation of modules.
We outline three LogicWeb application areas: rule-based search tools, lightweight deductive databases, and distributed software engineering.
Keywords: logic programming, rule-based reasoning, World Wide Web
These features make logic programming ideal for applications requiring search over possible solutions, symbol manipulation, and flexible manipulation of databases (Lazarev 1989). For example, it is used in AI problem-solving, knowledge representation and expert systems (Prolog, Lazarev 1989). It is also used in the field of deductive databases, where rules extend relational databases with greater modelling capabilities. It is also favourable from a software engineering viewpoint, since it allows executable specifications to be written (Lazarev 1989). Recent work on modular extensions to logic programming (Bugliesi et al. 1994, Brogi 1993) provides mechanisms for structuring larger Prolog software.
The World Wide Web is growing rapidly, particularly as an information dissemination tool. There is already extensive work on using the Web to transport not only HTML (Hypertext Markup Language) text but also program code (termed 'mobile code'), and on using the Web infrastructure for distributed applications (The World Wide Web Consortium). For example, the object-oriented language Java is enormously popular for enhancing the Web with interactive applications. More recently, the integration of logic programming and Web technology has been explored more extensively, in the form of Prolog libraries to access Web pages, expert systems on the Web, mobile Prolog code, and MOOs (Multi-User Domains, Object Oriented) (Davison 1996).
In this paper, we explore the use of logic programming in Web-based applications in the context of current work on LogicWeb. In particular, we shall look at the following application areas: declarative formulation of information searches, deductive databases on the Web, and distributed software engineering using the Web as the communication medium.
LogicWeb (Loke & Davison 1996) is an application of logic programming ideas to the Web, which treats Web pages as logic programming modules. The Web page becomes a live information entity that uses its rules to respond to user queries. Also, LogicWeb modules are treated as first-class objects within a logic program, enabling the rules to reason with other pages and define relationships between pages. Special operators are available for the retrieval of LogicWeb modules, to invoke goals within them, and to combine them.
The use of logic programming has a number of advantages, including the ease of specifying searching, the availability of knowledge-based reasoning, and the ability to define semantics for all the extensions. Logic programming also permits the meta-level manipulation of modules.
In the following sections, we describe LogicWeb and outline the above-mentioned applications. We then discuss LogicWeb as a mobile code system, and conclude with a brief description of the current implementation and directions for future work. We assume an acquaintance with Prolog.
<A HREF="http://www.cs.mu.oz.au/~swloke/logicweb.html">LogicWeb</A> is represented as the following fact:
link("LogicWeb","http://www.cs.mu.oz.au/~swloke/logicweb.html").
This provides a layer of abstraction beyond the text of a page.
A LogicWeb module can contain a logic program written in Prolog extended with new operators. There are currently four operators (Loke et al. 1996a), two of which are:
m_id(<URL>)#>Goal
This states that the goal Goal is to be proven in the module m_id(<URL>) (the module corresponding to the Web page with the URL <URL>) rather than in the current module. Referencing the module via the #> operator results in the corresponding Web page being implicitly fetched (if it has not already been) and loaded on the client system. In other words, transparent retrieval and loading of the module is carried out prior to goal evaluation.
lw_union(ListOfModules)#>Goal
This goal evaluates Goal in the union of the modules in ListOfModules. This operator can be used to combine databases in LogicWeb modules.
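The fetch-once behaviour of #> and the union behaviour of lw_union can be approximated in plain Prolog. The following is a minimal sketch, assuming SWI-Prolog, and is not the LogicWeb implementation: the operator priority, the predicates remote_page/2, load_once/1, loaded/1 and module_fact/2, and the example.org URLs are all hypothetical. Modules are restricted to facts, no network access takes place, and atoms are used in place of double-quoted strings so that the sketch behaves the same across Prolog systems.

```prolog
:- use_module(library(lists)).   % member/2
:- op(700, xfx, #>).             % assumed priority and type for the operator
:- dynamic loaded/1, module_fact/2.

% remote_page(URL, Facts): hypothetical stand-in for fetching a page and
% extracting its facts. A real system would retrieve the page over HTTP.
remote_page('http://example.org/a.html',
            [link('LogicWeb', 'http://example.org/lw.html')]).
remote_page('http://example.org/b.html',
            [link('Prolog', 'http://example.org/prolog.html')]).

% load_once(URL): load the facts of a page unless it has been loaded already.
load_once(URL) :-
    loaded(URL), !.
load_once(URL) :-
    remote_page(URL, Facts),
    forall(member(Fact, Facts), assertz(module_fact(URL, Fact))),
    assertz(loaded(URL)).

% m_id(URL)#>Goal: prove Goal against the facts of one module,
% loading the module first if necessary.
m_id(URL) #> Goal :-
    load_once(URL),
    module_fact(URL, Goal).

% lw_union(Modules)#>Goal: prove Goal against the facts of any listed module.
lw_union(Modules) #> Goal :-
    member(Module, Modules),
    Module #> Goal.
```

With these definitions, the query lw_union([m_id('http://example.org/a.html'), m_id('http://example.org/b.html')]) #> link(Label, Target) enumerates the link facts of both pages on backtracking, and each page is loaded at most once.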
relevant_link(Keyword,URL,RelevantURL) :-
    m_id(URL)#>link(Label,RelevantURL),
    contains(RelevantURL,".html"),
    related(Keyword,Keyword1),
    contains(Label,Keyword1),
    m_id(RelevantURL)#>h_text(Source),
    contains(Source,Keyword).
The goal m_id(URL)#>link(Label,RelevantURL) retrieves link information from the link/2 facts of the page at URL. Subsequent links in the page are retrieved on backtracking. contains/2 determines whether its second argument is a substring of its first, and related/2 facts relate keywords. A goal such as
relevant_link("logic", "http://www.cs.mu.oz.au/~swloke/resources.html",
RelevantURL).
retrieves links concerning 'logic' from the resource page.
This simple example illustrates how users can write rules to automatically search Web sites based on heuristic knowledge.
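To experiment with the rule without a LogicWeb system, the pages can be simulated by local facts. The sketch below assumes SWI-Prolog (or another system with sub_atom/5); the page contents, the example.org URLs, the related/2 facts and the definitions of #> and contains/2 are hypothetical stand-ins, and atoms replace double-quoted strings for portability.

```prolog
:- op(700, xfx, #>).   % assumed priority and type for the operator

% page_fact(URL, Fact): hypothetical contents of already-fetched pages.
page_fact('http://example.org/resources.html',
          link('Prolog tutorials', 'http://example.org/prolog.html')).
page_fact('http://example.org/resources.html',
          link('Cooking', 'http://example.org/cooking.html')).
page_fact('http://example.org/prolog.html',
          h_text('An introduction to logic programming in Prolog.')).
page_fact('http://example.org/cooking.html',
          h_text('Recipes for the weekend.')).

% Hypothetical heuristic knowledge relating keywords.
related(logic, 'Prolog').
related(logic, logic).

% Simulated module access: look the goal up among the facts of the page.
m_id(URL) #> Goal :-
    page_fact(URL, Goal).

% contains(Text, Sub): Sub occurs somewhere in Text (both atoms).
contains(Text, Sub) :-
    once(sub_atom(Text, _, _, _, Sub)).

% The search rule from the text, unchanged apart from layout.
relevant_link(Keyword, URL, RelevantURL) :-
    m_id(URL) #> link(Label, RelevantURL),
    contains(RelevantURL, '.html'),
    related(Keyword, Keyword1),
    contains(Label, Keyword1),
    m_id(RelevantURL) #> h_text(Source),
    contains(Source, Keyword).
```

Under these facts, the goal relevant_link(logic, 'http://example.org/resources.html', R) binds R to 'http://example.org/prolog.html' and has no further solutions, because the label of the second link matches none of the related keywords.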
Based on the above ideas, we have implemented a rule-based tool (Loke et al. 1996b) that searches several Web sites for a given citation, given an author's name and title keywords. Search engines, such as Lycos, are used to provide starting points for searching.
Logic programming is ideal for specifying different kinds of graph searches and for encoding heuristic knowledge to guide the search. It is also useful for parsing and for rule-based analysis of documents. A simple example is a rule that states that a document is relevant if it contains a set of related keywords.
In a recently proposed logic-based Web query language called WebLog (Lakshmanan et al. 1996), rules refer to pages (and their components) using URLs, treated as first-class objects. It allows queries over pages, like the above example, to be formulated. However, that work is based on a database query language and does not view pages as modules. Our view of pages as modules allows us to use module composition as an abstraction for gleaning information from different sources.
LogicWeb provides operators to dynamically combine modules containing databases in queries. For example, assume that three institutions have their own databases of academics and their research interests (represented as facts in LogicWeb modules) but all conforming to the following schema:
interested(staff_name,list_of_research_topics).
Suppose we would like to find out which academics from the institutions are interested in a given topic (for example, databases). The query is expressed in the form of a goal evaluated against the union of the modules from the institutions, as shown in the following rule:
academic_interest(DatabaseModules,Topic,Name) :-
    lw_union(DatabaseModules)#>interested(Name,ResearchTopics),
    member(Topic,ResearchTopics).
This rule is invoked with the goal:
academic_interest([m_id("http://institution1/research_db1.html"),
m_id("http://institution2/research_db2.html"),
m_id("http://institution3/research_db3.html")],
databases,Name).
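The union query can be tried out locally by standing in for the three institutional pages with facts. This is a sketch under stated assumptions rather than the LogicWeb implementation: it assumes SWI-Prolog, the staff names and research topics are invented for illustration, page_fact/2 and the definitions of #> are hypothetical, and atoms replace the double-quoted URLs.

```prolog
:- use_module(library(lists)).   % member/2
:- op(700, xfx, #>).             % assumed priority and type for the operator

% page_fact(URL, Fact): hypothetical database contents of each institution.
page_fact('http://institution1/research_db1.html',
          interested(alice, [databases, logic_programming])).
page_fact('http://institution1/research_db1.html',
          interested(bob, [graphics])).
page_fact('http://institution2/research_db2.html',
          interested(carol, [databases])).
page_fact('http://institution3/research_db3.html',
          interested(dave, [networks])).

% Simulated module access and module union over facts only.
m_id(URL) #> Goal :-
    page_fact(URL, Goal).
lw_union(Modules) #> Goal :-
    member(Module, Modules),
    Module #> Goal.

% The rule from the text, unchanged apart from layout.
academic_interest(DatabaseModules, Topic, Name) :-
    lw_union(DatabaseModules) #> interested(Name, ResearchTopics),
    member(Topic, ResearchTopics).

% interested_staff(Topic, Names): collect all answers for the three modules.
interested_staff(Topic, Names) :-
    Modules = [m_id('http://institution1/research_db1.html'),
               m_id('http://institution2/research_db2.html'),
               m_id('http://institution3/research_db3.html')],
    findall(Name, academic_interest(Modules, Topic, Name), Names).
```

With these facts, interested_staff(databases, Names) yields Names = [alice, carol]. The answers follow the order of the module list, which is a property of this simulation and not necessarily of LogicWeb.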
The locations of the modules containing the databases can themselves be obtained from databases; that is, we can build databases about databases. We can also specify relationships between databases or build a semantic net whose nodes are collections of related databases. Such a semantic net provides an alternative structure for the Web. Besides the databases, knowledge bases can be built to process user queries.
Lightweight deductive databases allow users to express information, and how it can be manipulated, in a more structured and formal way than ordinary HTML text. The LogicWeb framework also allows interfaces to be built over these databases. For instance, a LogicWeb module can have clauses that process user queries by retrieving the modules containing the databases, performing the matching, and displaying the results.
Search rules can be used to search the Web for the modules with the databases relevant to a given query. Hence, LogicWeb provides a single language for specifying both the search for the modules and the retrieval of the information from within modules.
Logic programming has advantages for software engineering. Modular abstractions are needed for programming-in-the-large (Bugliesi et al. 1994, Brogi 1993). LogicWeb provides these and also allows a module to retrieve (and use) other modules on the fly during goal evaluation.
Sedlock et al. recommend writing Prolog programs in the literate programming style, together with their documentation and specifications, in HTML files. The programs are extracted when needed. This benefits not only the programmer but also other team members, who can then browse the code. They describe an implementation in Prolog of a software project management system where all the code is written in HTML files. However, all the source code resides on a single server. We take their recommendation further and suggest LogicWeb as a distributed programming platform.
Mobile code languages must deal with the issues of security (since foreign code is executed locally), architecture-independence and resource control. Hence, a main feature of these languages is that they are interpreted, which enables control over the instructions executed. LogicWeb modules are executed locally and hence must not be allowed to run damaging code. Fortunately, interpreters for logic programs can easily be written as logic programs themselves. A meta-interpreter has control over every goal that is executed and hence can refuse to run unsafe goals. Automatic memory management in Prolog also means that users do not deal with pointers, which could potentially access arbitrary memory locations.
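The control that a meta-interpreter offers can be illustrated with a small variant of the classic "vanilla" interpreter that only runs built-ins from a whitelist. This is a minimal sketch, not the LogicWeb interpreter: the predicates solve/1, safe_builtin/1 and foreign_clause/2, the choice of whitelisted built-ins, and the sample foreign program are all hypothetical.

```prolog
% safe_builtin(Goal): hypothetical whitelist of built-ins that foreign
% code is allowed to call directly.
safe_builtin(_ = _).
safe_builtin(_ is _).
safe_builtin(_ < _).
safe_builtin(_ > _).

% foreign_clause(Head, Body): hypothetical downloaded program, stored as
% data so that it is never executed except through solve/1.
foreign_clause(double(X, Y), Y is X * 2).
foreign_clause(positive_double(X, Y), (X > 0, double(X, Y))).
foreign_clause(shutdown, halt).              % an attempt to call halt/0

% solve(Goal): prove Goal using only foreign clauses and safe built-ins.
solve(Goal) :-
    var(Goal), !, fail.                      % refuse unbound goals
solve(true) :- !.
solve((A, B)) :- !,
    solve(A),
    solve(B).
solve(Goal) :-
    safe_builtin(Goal), !,
    call(Goal).
solve(Goal) :-
    foreign_clause(Goal, Body),              % anything else must be foreign code
    solve(Body).
```

Here solve(positive_double(3, Y)) succeeds with Y = 6, while solve(shutdown) simply fails: halt is neither whitelisted nor defined by a foreign clause, so it is never passed to call/1. A fuller interpreter would also need to handle disjunction, negation, cut, and limits on resource usage.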
We have outlined how LogicWeb can be used in three broad application areas. LogicWeb enables information providers to set up downloadable programs which help users search for information in their repositories. These programs become interfaces to their sites and can take the form of browsing aids, queryable knowledge bases, or information on related sites. We intend to build specific applications utilising the above techniques and to explore new application areas.
The Web is a dynamic information resource. We would like to examine operators with a temporal notion such as latest(ModuleId) and refresh(ModuleId,TimeInterval) which can be used to retrieve the latest version of a module and to periodically update a module, respectively.