A Vision for Open Hypermedia Systems

Peter J. Nürnberg, John J. Leggett

Hypermedia Research Laboratory
Center for the Study of Digital Libraries
Texas A&M University
College Station, TX, 77843-3112, USA

{pnuern, leggett}@csdl.tamu.edu

phone: 1(409)845-0298     fax: 1(409)847-8578


Abstract

Currently, the Open Hypermedia Systems (OHS) Working Group claims three main areas of interest: scenarios, reference architectures, and protocols. The discussions over scenarios of OHS use are supposed to inform the work on OHS reference architectures, which in turn is supposed to enable the development of an Open Hypermedia Protocol (OHP) that will allow clients of one OHP-compliant OHS to use services of other OHP-compliant OHS's.

In this paper, we start from existing proposals for an OHS reference architecture and an OHP. We then present a number of scenarios that motivate modifications to these existing proposals. These modifications primarily include adding the notion of an open structure processing layer to the reference architecture and adding a fixed minimal set of guaranteed services to the protocol.

We then present our resultant reference architecture and protocol proposals. Our proposals are based on current working group proposals, but incorporate the modifications suggested by our scenarios.

Finally, we conclude with some comments on the process we used to derive our proposals, an evaluation of current progress of the OHS Working Group, and suggestions for future directions.

Key words: open hypermedia systems (OHS), open hypermedia protocol (OHP), structural computing

Word count: 15858

Media: text/html, image/gif

Hypertext form (suitable for browsing) available at http://jodi.ecs.soton.ac.uk/Articles/v01/i02/Nurnberg/


1. Introduction

The Open Hypermedia Systems (OHS) Working Group (http://www.csdl.tamu.edu/ohs/) was formed after the 2nd OHS Workshop (http://www.daimi.aau.dk/~kock/OHS-HT96/), held in conjunction with Hypertext 96. Currently, the OHS Working Group (OHSWG) claims three main areas of interest: scenarios, reference architectures, and protocols. The discussions over scenarios of OHS use are supposed to inform the work on OHS reference architectures, which in turn is supposed to enable the development of an Open Hypermedia Protocol (OHP) that will allow clients of one OHP-compliant OHS to use services of other OHP-compliant OHS's.

Before the scenario, architecture, and protocol work can begin, however, it is important to state the problem being addressed by the OHSWG. Opinions on this subject are necessarily "first principles" that ground further discussion, and as such tend on the one hand to be somewhat arbitrary and on the other hand to be treated as inviolate axioms from which resultant work must not be allowed to stray too far. To date, there has been neither consensus on these principles nor even a truly concerted effort to derive them. Like other aspects of the group's work, any problem definition on which the scenarios, protocol proposals, and architecture proposals will rest will ultimately be derived from group discussion on different proposals. We begin this paper by presenting our proposal for a definition of the scope of the group's work.

In the next section, several different scenarios are discussed. Synopses of the points of the scenarios relevant to the paper and links to the full scenarios on the OHS scenarios WWW site (http://www.csdl.tamu.edu/ohs/scenarios/) are given. Four major implications of these scenarios for the architecture and protocol work are then discussed.

We then provide a proposal for an OHS reference architecture, based on a synthesis of previous proposals (e.g., [Goose et al. 1997] (http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/goose.html) and [Grønbæk and Wiil 1997] (http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/gronbak.html)), with the modifications suggested by our scenario analysis and OHSWG problem scope definition. We offer this architecture primarily as a way to identify which conceptual entities must communicate with one another to effect hypermedia services for system clients. However, we feel the architecture has generative properties as well, suggesting ways in which future functionality may be added to our initial designs. Brief outlines of how the actual architectures of several OHS's represented in the OHSWG map to our reference architecture are also provided.

We then present our proposal for an OHP, based on the current proposal by Davis et al. [1996] (http://diana.ecs.soton.ac.uk/~hcd/protweb.htm), the critique offered by Anderson [1997] (http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/anderson.html), the discussion on the OHS mailing list (http://www.csdl.tamu.edu/ohs/archive/index.html#start), our reference architecture proposal, scenario analyses, and problem definitions. We feel the protocol itself, or at least its presentation, is somewhat simpler than the current proposal. We intentionally ignore several "hot issues" regarding details of protocol implementation. Where this is done, however, we defend our choice based largely on our reference architecture and discussions on where certain knowledge must reside. Conversely, we elaborate certain aspects of the current protocol work in ways we feel will speed more widespread adoption of the protocol by the OHSWG members.

The primary aim of this paper is to provide the OHSWG with scenarios and with architecture and protocol proposals. However, this paper has a secondary aim: an evaluation of the process by which the OHSWG has chosen to perform its work. The first several sections of this paper can be seen as a microcosm of the OHSWG effort, since they contain representatives of all of the elements of the group's product. The last two sections of the paper consider the benefits and drawbacks of this process, evaluate the work of the group against this model, and offer suggestions for future progress.

2. Scope of the OHSWG

Simply put, the OHSWG work concerns open hypermedia systems (OHS's). In this section, we examine what the term "OHS" has meant in different contexts. Then, we outline what we believe to be certain minimal requirements for any product that the OHSWG ultimately produces.

2.1 Open Hypermedia Systems

Open. The term open has been used to mean many things in system work. The consensus of the OHSWG has been that open hypermedia systems allow an open set of clients of the hypermedia services provided by the system. No assumptions about the clients (such as data types handled, etc.) are made.

Hypermedia. The term hypermedia (somewhat surprisingly) has not been well-defined by the group. Furthermore, it is not clear that the group's members agree on what it constitutes. Proposals to the group touching on this subject have varied widely, from a "minimalist" approach, as argued in the current protocol proposal [Davis et al. 1996] (http://diana.ecs.soton.ac.uk/~hcd/protweb.htm), to a somewhat broader view, as argued in [Trigg and Grønbæk 1997] (http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/trigg.html). This range of opinions may stem from the fact that the systems represented in the group have widely varying notions of those services that should be provided by a hypermedia system. DHM [Grønbæk and Trigg 1996] (http://www.daimi.aau.dk/~kgronbak/DHM/DHMHome.html) includes composite services. Microcosm [Davis et al. 1992] (http://wwwcosm.ecs.soton.ac.uk/) includes information retrieval services. HOSS [Nürnberg et al. 1996] (http://www.csdl.tamu.edu/hoss/) provides taxonomic, spatial, and other structural services. The exact types of services to be provided by an OHS must be resolved before any real progress can be made on the architecture and protocol proposals. We present our view on this subject below.

System. Even this most innocuous term can cause a bit of confusion. Wiil and Leggett [1996] described a division of OHS's into two categories. Open link services (e.g. Chimera (http://www.ics.uci.edu/pub/chimera/), Microcosm (http://wwwcosm.ecs.soton.ac.uk/), Multicard (http://ourworld.compuserve.com/homepages/Euroclid/)) are systems that concern themselves primarily with the provision of hypermedia functionality (whatever that may be) to third party applications. Open hyperbases (e.g. DHM (http://www.daimi.aau.dk/~kgronbak/DHM/DHMHome.html), HyperDisco (http://www.daimi.aau.dk/~kock/Publications/HyperDisco/), HOSS (http://www.csdl.tamu.edu/hoss/)) provide open link services and some notion of hypermedia storage. Obviously, the OHSWG must make decisions regarding the presence or absence of hypermedia-based storage in any OHS reference architecture. Additionally, some systems provide a much higher degree of distribution than others, and thus must account for "secondary" services such as location services, naming, etc. The questions of which of these kinds of secondary services should be included in an OHS also must be addressed. We present our views on this subject below as well.

2.2 Minimal Requirements

Even after the group decides what hypermedia and system mean, there still remains the question of how ambitious to be when designing the architecture and especially the protocol. There are conflicting goals involved in this matter. Less ambitious plans are likely to be implemented more quickly and easily, while more ambitious plans are likely to be more broadly applicable and useful. This has recently been a topic of some debate on the OHS mailing list (http://www.csdl.tamu.edu/ohs/archive/index.html#start), with various positions from less (as per Whitehead's 10 June 1997 post (http://www.csdl.tamu.edu/ohs/archive/0049.html)) to somewhat more (as per Grønbæk's 10 June 1997 post (http://www.csdl.tamu.edu/ohs/archive/0041.html)) ambitious.

Clearly, as a minimal requirement, the OHSWG must have a product with functionality either difficult or impossible to duplicate with other systems. This is hardly likely to be a source of much debate, since not to do so would result in a late solution to a non-problem. It also seems relatively uncontroversial to propose that our product be able to be implemented or adopted in stages, with an easy entry level and progressively more interesting levels to be added "later." There seem to be three questions that must be answered taking this as a starting point.

  1. What should the final functionality be? That is, what functionality do we see in our "final" level of development?

  2. What is the starting level? That is, what functionality do we see in our initial level of development?

  3. How many levels should there be? This can be thought of as describing the "deltas" between levels.

Final functionality. Before we spend a great deal of time answering this question, we should decide if we even need an answer right now, or if one can even ever be given. It is a good thing to have an endpoint in mind when starting a task. However, we should consider the possibility that our final product contains some open sets of entities that prevent us from truly talking about the functionality of the "final" version. In this paper, we take the view that this is indeed the case, and as such, have no proposal for a "final" level specification. The arguments for this view and the implications of it are discussed below.

Starting functionality. This is a considerably more difficult issue than final functionality, mostly because it must be resolved quickly. Less ambitious starting points are easy to implement. However, realistically, they may fail to capture the imagination of the group members and the larger hypertext research community. The fact of the matter is that, as of the OHS 3 meeting (April 1997) (http://www.daimi.aau.dk/~kock/OHS-HT97/), no group has OHSWG work as its top priority and it may be that none ever will. Without a tangible "value-added" to subscribing to and implementing OHSWG standards, they may languish unobserved. In any event, there is clearly a fine line between specifying too much work and generating insufficient interest. Although these concerns must be balanced, we favor a more ambitious startpoint specification, in the hope that its potential rewards for compliant systems will be sufficient motivation. We provide more specifics below.

Number of levels. Although this seems like a reasonable question to ask, it may in fact be somewhat misguided. Several posts in previous discussions have been made about "levels" of protocol compliance, but a looser (and more useful) interpretation of the underlying concept at work here is some set of "components," the members of which either may or may not be supported by a particular system. With a notion of levels comes the implication that a system supporting some level must support all "lower" levels as well. We should avoid making compliant systems support functionality that the designers of those systems do not see as necessary for their goals. We believe, in fact, that the notion of an open set of components provides a flexible model that system designers can use to their advantage. With respect to the issue of "delta" between levels (or its component analog) it should be the case that modifications to a system to adopt/integrate a new component should be well encapsulated. More details are given below.

3. Scenarios

This section considers several different scenarios submitted to the OHS WWW site (http://www.csdl.tamu.edu/ohs/). They can be found in the scenarios section (http://www.csdl.tamu.edu/ohs/scenarios/) of that site. Firstly, a brief synopsis of the aspects of each scenario that are relevant to the points made in this paper is provided. (Links to the full scenarios are provided as well.) Of course, these synopses do not capture all of the intricacies of the full scenarios, but the extra complications can be ignored for the implications drawn here. Secondly, four ramifications of the scenarios for the OHSWG work are discussed. Three of these directly impact our reference architecture proposal, while two concern our protocol proposal.

3.1 Scenario Synopses

Taxonomic (full scenario at http://www.csdl.tamu.edu/ohs/scenarios/tax1/). Botanical taxonomists build taxonomies over samples of plants. Taxonomies consist of taxa and specimens. In this case, a specimen is a plant sample and some data about the sample, such as collection date, collection location, accession number, etc. A taxon is an abstraction that groups together like specimens and/or like taxa. It has a name, a set of characteristics, and a group of child specimens and/or taxa. Taxa are singly parented.
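The taxonomic abstractions just described can be made concrete with a minimal sketch. It is only an illustration under our own assumptions: the class and field names are hypothetical and are not taken from the full scenario or from any OHS implementation. It shows a specimen with its collection data, a taxon with a name, characteristics, and children, and the single-parent constraint.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(eq=False)  # compare by identity, so parent/child cycles are harmless
class Specimen:
    """A plant sample plus the data recorded about it (hypothetical fields)."""
    accession_number: str
    collection_date: str
    collection_location: str
    parent: Optional["Taxon"] = field(default=None, repr=False)


@dataclass(eq=False)
class Taxon:
    """Groups like specimens and/or like taxa; has at most one parent."""
    name: str
    characteristics: List[str] = field(default_factory=list)
    children: List[Union["Taxon", Specimen]] = field(default_factory=list)
    parent: Optional["Taxon"] = field(default=None, repr=False)

    def add_child(self, child: Union["Taxon", Specimen]) -> None:
        # Enforce single parenting: detach the child from any previous parent.
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)
```

Moving a specimen from one taxon to another with add_child models a reclassification; two taxonomists who disagree would simply hold two different trees over the same specimens.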

Many difficulties exist in managing these botanical taxonomies. Firstly, not all taxonomists agree on the same set of taxa over a given set of specimens. Secondly, even if two taxonomists agree on what taxa should be constructed, they may disagree on the grouping criteria for some subset of the taxa. Thirdly, even if they agree on these two issues, they may disagree on the attributes of some subset of the taxa, such as name. Fourthly, even if two taxonomists agree on all of these issues at some point in time, the taxonomies they create may change over time, as new data are added, new interpretations are made, etc.

In addition to the creation and manipulation of taxonomic structure, the characters also want to be able to add and manipulate navigational hypertext structures over their specimens, taxa, and taxonomies.

The scenario considers various problems encountered by botanical taxonomists as they manipulate their taxonomies on-line.

Spatial (full scenario at http://www.csdl.tamu.edu/ohs/scenarios/space1/). People faced with the task of organizing pieces of information may be able to use spatial hypertext tools to help them perform this task. Spatial hypertext systems take advantage of people's ability to organize information spatially. In these systems, a piece of information (datum) is represented by an object with certain visual characteristics (shape, color, etc.) that help determine the "kind" of information it is. A given datum may have its visual characteristics changed by a user. Additionally, the image of the datum may contain text and/or other information.

Users are expected to place like data into spatial structures such as stacks, vertical lists, etc. These structures may themselves be placed into structures, and so forth. For example, several vertical lists may be placed side by side, forming a group of lists.

The spatial hypertext system should be able to recognize the spatial structures generated by users and allow users to treat all objects in such a structure as a unit for certain operations, such as movement or deletion. It recognizes these structures by conducting a "spatial parse" of the space in which the data reside. An important aspect of the structure generated by a spatial parse is that it is dynamic. Repositioning one or more data may invalidate the results of a previous spatial parse. Furthermore, it is difficult to modify the results of a previous parse to account for data movement. However, the parse itself is relatively inexpensive (time-wise) to perform. This means that in this scenario, there is no reason to keep the results of a parse after data are moved. Instead, new structures can be generated by the parser as requested.
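As a rough illustration of why re-parsing on demand is reasonable, the following sketch recognizes only one kind of structure, the vertical list, by grouping objects whose x coordinates nearly coincide. It is a deliberately simplified stand-in for a real spatial parser; the function name, the tolerance parameter, and the rule that a list needs at least two members are our assumptions.

```python
from typing import Dict, List, Tuple

Position = Tuple[float, float]  # (x, y) of an object in the space


def spatial_parse(positions: Dict[str, Position],
                  tolerance: float = 5.0) -> List[List[str]]:
    """Return vertical lists: groups of objects with nearly equal x."""
    columns: List[Tuple[float, List[str]]] = []
    # Sweep left to right; an object joins the current column if it lies
    # within `tolerance` of that column's leftmost member.
    for name, (x, _y) in sorted(positions.items(), key=lambda item: item[1][0]):
        if columns and abs(x - columns[-1][0]) <= tolerance:
            columns[-1][1].append(name)
        else:
            columns.append((x, [name]))
    # Only columns with two or more members count as lists; order top to bottom.
    return [sorted(members, key=lambda n: positions[n][1])
            for _x, members in columns if len(members) >= 2]
```

Nothing is cached here: after an object is moved, the caller simply invokes spatial_parse again on the new positions, which matches the scenario's observation that old parse results need not be kept.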

The scenario concerns a character who uses a spatial hypertext system to organize his thoughts about a paper he is writing.

External (full scenario at http://www.csdl.tamu.edu/ohs/scenarios/external1/). Irrespective of the final forms of the OHS reference architecture and protocol, users will always want to use tools that are not integrated with OHS services. This may be because the appropriate wrapper has not been built for a particular tool, the tool provides OHS-like services in a way not compliant with OHSWG standards, or the tool acts as an "extra-systemic" client or server to programs that do comply.

In this scenario, Bob uses several tools that are not integrated into any OHSWG system. He uses these tools alongside his OHSWG tools.

Auto-indexer (full scenario at http://www.csdl.tamu.edu/ohs/scenarios/index1/). There are many WWW search engines that work in the following way. Some automatic indexer program visits a WWW page and indexes its content against its URL. It then follows the links from that page to other pages, where it performs the same process. This obviously takes a long time. (Consider that when the AltaVista search engine was being prototyped in December 1995, DEC estimated that the WWW contained over 50 million indexable documents [DEC 1997].) This index cannot be built by hand. Some autonomous program, once started, must traverse the WWW structure automatically.
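The traversal described above can be sketched as a breadth-first walk that indexes each page's words against its URL. This is a minimal sketch, not the design of any actual search engine: the fetch callable is a hypothetical stand-in for retrieving a page and extracting its links, and whitespace splitting stands in for real content indexing.

```python
from collections import deque
from typing import Callable, Dict, List, Set, Tuple

Page = Tuple[str, List[str]]  # (text content, outgoing link URLs)


def build_index(start: str, fetch: Callable[[str], Page]) -> Dict[str, Set[str]]:
    """Index every word of every reachable page against the page's URL."""
    index: Dict[str, Set[str]] = {}
    seen = {start}            # pages already queued, so each is visited once
    queue = deque([start])
    while queue:
        url = queue.popleft()
        content, links = fetch(url)
        for word in content.lower().split():
            index.setdefault(word, set()).add(url)
        for link in links:    # follow links without any human intervention
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

The important property for the scenario is that the loop runs autonomously: the program must know, without asking a user, how to obtain the links of a page and how to follow them.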

In this scenario, Homer wants to build just such a search engine. However, he wants to build indexes of documents in the OHSWG space instead of the WWW space.

3.2 Scenario Implications

The scenarios reviewed above have many implications for OHSWG work. Here, we review four of these. Their ramifications for architecture and protocol work are also provided.

3.2.1 Open Structure Abstractions

The structure abstractions that comprise hypertext (i.e., the essence of the definition of hypertext) are the cause of some disagreement, as was discussed in Section 2.1 (scope_ohs.html). Nearly all OHS's provide some notion of link that connects one or more locations. Links express relationships between objects or parts of objects. The terminology for these links, objects, object parts, etc., varies widely, but this is not the real cause of difficulty. If we could all agree on the abstractions, the names would be largely irrelevant.

The more difficult complications come in deciding the right set of abstractions. Sufficiency is an insufficient criterion for judging whether or not a particular set of abstractions should be adopted to model something (a process, an object, etc.). Our community has recognized that convenience and efficiency are two other important criteria.

The convenience criterion can be illustrated by the observation that most programming languages have the same modeling power (i.e., they are Turing complete), yet programmers have not all simply adopted one of these languages for all uses. Different languages are better for different tasks, because they may provide more natural abstractions for the particular task. An example of a convenience (or naturalness) based argument for a set of abstractions is that made in [Trigg and Grønbæk 1997] (http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/trigg.html) in favor of composites. Although it is clear that composites can be modeled by links, Trigg and Grønbæk discuss important reasons why this may not be natural.

The efficiency criterion can be illustrated by the observation that it is sometimes easier to describe the method for generating items than to enumerate the items themselves. For example, the set of odd integers can be described by (2*i)+1 (where i is an integer), which is simpler (more efficient with respect to space, time, etc.) to communicate than an enumeration of odd integers. An example of such abstractions in an OHS is the Microcosm generic link [Davis et al. 1992], which computes structures (connections between data) when needed. Although possible, it would be inefficient (in the extreme) to store all possible generic links as actual link objects in a hyperbase.
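The efficiency argument can be sketched in a few lines. The first function communicates the odd integers as the rule (2*i)+1 rather than as a list. The second applies the same idea to links: a rule maps a phrase to a destination, and concrete link instances are computed only when a document is examined. This is our own simplified illustration of computed links, with hypothetical names, and should not be read as Microcosm's actual mechanism.

```python
from typing import Dict, Iterator, Tuple


def odd_integers(count: int) -> Iterator[int]:
    """Generate the first `count` non-negative odd integers from the rule 2*i + 1."""
    for i in range(count):
        yield 2 * i + 1


def resolve_computed_links(text: str,
                           rules: Dict[str, str]) -> Iterator[Tuple[int, str, str]]:
    """Yield (offset, phrase, destination) for each occurrence of a rule phrase.

    No link objects are stored; each one is derived on demand from the rules.
    """
    for phrase, destination in rules.items():
        start = text.find(phrase)
        while start != -1:
            yield (start, phrase, destination)
            start = text.find(phrase, start + 1)
```

A single rule here stands for as many link instances as there are occurrences of the phrase across all documents, which is the space saving the efficiency criterion refers to.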

If we accept that we cannot simply provide a sufficiently powerful set of abstractions to OHS clients, the question arises as to what are convenient and/or natural structural abstractions. It seems that all OHSWG members would agree that links are such. At least some agree that composites and generic links are such. The taxonomic and spatial scenarios summarized in Section 3.1 (scen_synop.html) show that there are even more structural abstractions that people have found useful in our own research community. Although links can model taxonomic structure, and (dynamic) composites can model spatial hypertext structure, these points, for the reasons described above, are insufficient. Taxonomic and spatial hypertext require structural abstractions tailored to these domains. This argument is easily extended to show that any closed set of abstractions cannot be guaranteed to be useful in a practical sense (i.e., convenient and efficient) for all possible applications. Only an open set of structure abstractions can meet this requirement.

One could argue that it is not the domain of the OHSWG to support "odd" hypertext systems that manipulate taxonomic or spatial hypertext. There are two reasons we believe that such an objection would be misguided.

Firstly, it may in fact be no more difficult to support an open set of abstractions than a closed one. Suppose that some closed set of abstractions is adopted by the OHSWG. There are two approaches to the management of these abstractions. If they are all handled independently by the system, then functionality like version control, notification control, concurrency control, etc., must be re-implemented for each abstraction. If, on the other hand, all abstractions served to clients are "translated" into some common structure format by the OHSWG server, such functionality need only be implemented once for the common format. (Note that this translation to a common format does not violate the convenience and efficiency arguments made above, since these arguments only pertain to clients of the system and are independent of storage and other issues internal to the server.) This "common structure" approach is evidenced most prominently in systems like HyperDisco (http://www.daimi.aau.dk/~kock/Publications/HyperDisco/) (in which functionality is implemented in base classes and different structure abstractions are then subclassed from these base classes) and HOSS (http://www.csdl.tamu.edu/hoss/) (in which the Structure Base provides common structure object facilities to individual Sprocs, which then tailor these objects to abstractions suited to particular domains). Allowing an open set of structure abstractions does not require that all possible structure abstractions be implemented (of course). In practice, it calls for a "common" structure format to be handled internally by the OHSWG server, which then should provide an open set of ways in which to tailor and extend this common format to the particular needs of clients (such as taxonomic, spatial, or navigational applications).
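The "common structure" approach can be illustrated with a small sketch in the base-class style described above. It is not code from HyperDisco or HOSS; the class names and the very simple version history are assumptions chosen to show that a shared service is written once and inherited by each tailored abstraction.

```python
class StructureObject:
    """Common structure format: an ordered set of member ids plus shared services."""

    def __init__(self, members=()):
        self._members = tuple(members)
        self._history = []  # version control, implemented once for all abstractions

    @property
    def members(self):
        return self._members

    def update(self, members):
        # Record the old state before changing it.
        self._history.append(self._members)
        self._members = tuple(members)

    def version_count(self):
        return len(self._history) + 1


class Link(StructureObject):
    """Navigational tailoring: members are read as endpoints."""

    def endpoints(self):
        return self.members


class Taxon(StructureObject):
    """Taxonomic tailoring: members are read as child specimens or taxa."""

    def __init__(self, name, children=()):
        super().__init__(children)
        self.name = name

    def children(self):
        return self.members
```

Adding a further abstraction (for example a spatial list) would mean adding one more subclass; the versioning code would not need to be touched.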

Secondly, providing more general support for "odd" forms of hypertext will expand our base of potential users significantly. For example, consider the number of systems presented at the last few ACM Hypertext conferences that require only links and perhaps generic links and composites. If that number were truly large, we would witness more people using OHS's to develop hypertext applications. Instead, nearly all of the "literati" works, as well as other applications that receive a non-trivial amount of attention at these conferences (such as spatial hypertext), require more than just these basic abstractions which the systems community in general (and the OHSWG in particular) spends so much time discussing. In a very real sense, OHS developers design hypermedia infrastructure. We should at least aim to make this infrastructure usable by the majority of our own research community.

Our architecture proposal is based on the premise that we should deliver an open set of structural abstractions to OHSWG clients. We believe that this does not add any requirements to systems that wish to be OHSWG compliant, while allowing a natural way for systems to provide extensions for supporting additional structural abstractions. Our protocol proposal reflects the open nature of the set of structural abstractions served to OHSWG clients by specifying multiple layers of protocol to describe client-server interactions. The lower layers provide a common syntactic model for these messages, while the upper (open) layer allows tailoring of messages to fit the particular structure abstractions served by a given server.

3.2.2 External Entities

Whatever the form the OHS reference architecture takes, it will be the case that there will be entities outside of the architecture. Several kinds of such "outside" programs are illustrated in the external scenario summarized in Section 3.1 (scen_synop.html). What are the implications of this for our architecture work?

Essentially, we must not try to model things that occur outside the OHS in our architecture. It is our contention that this is exactly what is done in the current proposals by Grønbæk and Wiil [1997] (http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/gronbak.html) and Goose et al. [1997] (http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/goose.html) when entities such as document management systems (DMS's) are included. To take the DMS's as an example, consider that there are two ways in which these systems are considered in the current proposals: either clients or servers may interact with DMS's.

If clients do so, this is clearly not our concern, since clients may perform open, arbitrary computation. This is not relevant to our task, which is to specify the client's interactions (or their form) with the OHSWG servers.

Servers may interact with DMS's in two ways. A DMS may be a client, in which case there is no reason to explicitly name the type of client or differentiate it in any way from other clients. It may also be a "store" acting as a place in which the server stores hypermedia or other data. In this case, again, it must be asked why this is differentiated from any other store. Either it is "wrapped" per the HyperDisco [Wiil and Leggett 1996] method, and thus looks like any other store, or it is not, in which case this interaction is too specific to be considered in the reference architecture.

Of course, a single instance of a DMS could act as both client and store to a server, or as a store for both clients and servers. Is this problematic? Perhaps, but again, this is true for a broad class of stores. There seems to be, in general, no support for treating DMS's separately from other kinds of programs, whether they be clients, stores, or combinations.

Our architecture proposal is based on the premise that there is no particular reason to enumerate or differentiate entities external to the issue at hand of defining a reference architecture for OHS's. They should be excluded from consideration or generalized to a broader, more applicable class of entity.

3.2.3 Semantics of Operations

There has been some controversy in the OHSWG concerning whether or not the hypermedia services should be differentiated semantically from one another in the protocol. The current proposal assigns semantics to some operations (e.g., LaunchDocument), but many services are implemented through the RequestService call. The basic idea is that clients request a list of available services from an OHSWG server. The server returns two parallel arrays of strings. One array contains service names as understood by the server, and one is to be used as the client sees fit, presumably for display on a menu of hypermedia services available.

The heart of the controversy seems to be that this model requires human interaction to effect the hypermedia services from the client, since choosing the correct service to perform an action requires a human to interpret the strings in the second array. This model of service definition is obviously inspired by the Microcosm approach to service definition. At first glance, this seems the "least common denominator" approach to this issue, since systems that normally provide semantically meaningful service definitions can simply remove these semantics from their definitions to comply with the protocol. However, the key problem with this approach is that it still relies on human interaction at the client side. The fact that OHS's other than Microcosm can define their interfaces in this semantics-free way does not address this reliance.

We believe the key point in this issue is whether or not reliance on human interaction and interpretation of service names is reasonable in every case. The automatic indexer scenario summarized in Section 3.1 (scen_synop.html) shows this not to be the case. For the automatic indexer to work, it must work autonomously. It must be able to determine how to "follow a link" without human intervention. That is, it must know how to request this service. Simply allowing that arbitrary services have arbitrary names makes it impossible for a program to know what name is associated with a particular service (e.g., Follow Link). The current protocol's proposed method of semantics-free service definition cannot address this situation. Thus, we feel that the protocol must include semantics.
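The difficulty can be shown with a small sketch. Everything in it is hypothetical: the service names, the labels, and the agreed operation name are invented for illustration and are not taken from any OHP draft. The point is only that a program holding two parallel arrays of arbitrary strings has nothing to match against, whereas an agreed, semantically defined name can be looked up mechanically.

```python
from typing import List, Optional, Tuple

# Hypothetical agreed-upon operation name with defined semantics.
FOLLOW_LINK = "FollowLink"


def list_services_semantics_free() -> Tuple[List[str], List[str]]:
    """Return two parallel arrays: server-side names and display labels."""
    names = ["svc17", "svc42"]                      # opaque to any client program
    labels = ["Go to destination", "Show links"]    # meaningful only to a human
    return names, labels


def find_service(names: List[str], wanted: str) -> Optional[str]:
    """Return the service name to request, or None if it cannot be identified."""
    return wanted if wanted in names else None
```

With the semantics-free listing, find_service(names, FOLLOW_LINK) returns None even though the first service may well follow a link; a human reading the label could tell, but an autonomous indexer cannot.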

One objection previously raised to this approach is that assigning semantics to services seemed to close the set of services that could be offered. We do not believe this to be the case. Both our architecture and protocol proposals have layers of open structure abstractions, and by extension necessarily admit open sets of operations. However, with the definition of a structure abstraction to be added to the set served by an OHSWG server, it is necessary that semantically well-defined operations also be defined.

3.2.4 Naming

In any distributed system, naming is an important issue. There has been little discussion of naming objects in the OHSWG, and essentially no discussion of naming servers. The former is somewhat surprising, while the latter is not unexpected, since current architecture proposals do not have the notion of multiple types of servers.

Before naming is discussed any further, it is important to note why this is even an issue that we should consider. The automatic indexer scenario summarized in Section 3.1 (scen_synop.html) describes a program that indexes node content against the node identifier for all nodes served by a set of servers. If nodes cannot be uniquely identified, such an index cannot be built, or rather, even if it were, it would not be useful. If the group sees the automatic indexer scenario as a reasonable use of an OHSWG system, the issue of object naming must be addressed.

Naming objects is crucially important in a distributed system. It is beyond the scope of this paper to describe fully the intricacies or review the applicable work in this area (e.g., distributed file systems [Gifford et al. 1992; Reiher et al. 1994; Sun 1995; Többicke 1994], distributed operating systems [Walker et al. 1992; Welch 1994], and many other WWW or internet based initiatives [IETF 1997; OCLC 1997]). We wish to raise this issue, however, and outline the way in which one of the systems represented in the OHSWG (HOSS) is addressing this issue in order to provide a starting point for discussion.

HOSS object identifiers consist of two parts: a global identifier and a local identifier. The global identifier uniquely specifies a hyperbase. Hyperbases must be uniquely named. The method for generating globally unique hyperbase names may be handled in various ways. HOSS uses a completely distributed naming authority scheme which does not guarantee uniqueness. However, the cost of name generation is very low and the likelihood of collision can be made arbitrarily small. Theoretically, when a hyperbase is created, it generates a "random" 64-bit string which it uses as its identifier. In practice, the hyperbase creation script appends the four least significant bytes of the system time on the machine on which the hyperbase is being created to the IP address of that machine. The actual method of generating these 64 bits (or even the number of bits in this global identifier) is not relevant to the idea. A "DNS-like" naming authority scheme that guarantees uniqueness [Albitz and Liu 1996], a global registry of available names, or other methods could be used as well.
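
The practical scheme described above can be sketched as follows. This is a minimal illustration, not the HOSS creation script: the function name make_hyperbase_id, the use of an IPv4 address, seconds-resolution time, and big-endian byte order are all assumptions made for the example.

```python
import ipaddress
import struct
import time
from typing import Optional


def make_hyperbase_id(ip: str, now: Optional[int] = None) -> bytes:
    """Return a 64-bit hyperbase identifier (hypothetical helper).

    Assumed layout: 4 bytes of IPv4 address followed by the four
    least significant bytes of the system time, both big-endian.
    """
    ip_bytes = ipaddress.IPv4Address(ip).packed  # always 4 bytes
    seconds = int(time.time()) if now is None else now
    time_bytes = struct.pack(">I", seconds & 0xFFFFFFFF)  # keep low 32 bits
    return ip_bytes + time_bytes
```

Under these assumptions, two hyperbases created on the same machine within the same clock tick would receive the same identifier, which is the sense in which such a scheme does not guarantee uniqueness.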

The local part of an object identifier is assigned by the hyperbase management system (HBMS) itself. It may be any size, so the number of objects in a hyperbase, the way in which an HBMS names objects, etc. are unconstrained. The HBMS must be able to guarantee uniqueness over the local identifiers it generates for a particular hyperbase.
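
As a rough sketch of the two-part identifier, assuming a counter-based HBMS (the class names ObjectId and Hbms and the counter strategy are illustrative assumptions, not part of HOSS):

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectId:
    """Two-part identifier: hyperbase (global) part plus local part."""
    global_id: bytes  # names the hyperbase
    local_id: bytes   # assigned by the HBMS; may be any size


class Hbms:
    """Toy HBMS that hands out local identifiers from a counter."""

    def __init__(self, global_id: bytes) -> None:
        self.global_id = global_id
        self._counter = itertools.count(1)

    def new_object_id(self) -> ObjectId:
        n = next(self._counter)
        # Use exactly as many bytes as the counter value needs, so the
        # local part grows with the number of objects in the hyperbase.
        local = n.to_bytes((n.bit_length() + 7) // 8, "big")
        return ObjectId(self.global_id, local)
```

Only the HBMS needs to know how local parts are produced; other programs can treat the pair as opaque.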

Whether or not the OHSWG adopts this object naming scheme, another well-established scheme, or defines its own, this issue should be addressed. In this paper, we do not propose any particular object naming procedure. We simply note that the above method seems sufficient and may provide a starting point for discussion among group members.

With respect to server naming, this issue only arises when multiple servers are defined. Our reference architecture proposal describes an open set of structure servers ("link server", "composite server", "spatial hypertext server", etc.). Each of these servers must have a unique name.

In HOSS, server naming authority is completely distributed and works on a global registry basis. Servers register any name (or set of names) they wish with a global name registry, along with ports on which the server is listening for service requests. Programs may query this registry to locate particular servers. (This incidentally brings up the issue of locating servers in a distributed environment as well. This issue, although important, is not addressed in detail here.) The global registry is effected as a distributed set of name information managers (called SIM's, see Section 4.1.1 (refarch_ent_sim.html)) that propagate information among themselves. SIM's are arranged hierarchically. Name requests that cannot be resolved at one SIM may be propagated up the hierarchy and to the appropriate SIM in a way analogous to the resolution of machine names to addresses in DNS [Albitz and Liu 1996].
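
The registry behavior described above might be sketched as follows. This is a simplified illustration under stated assumptions: the class name Sim, the (host, port) representation of a location, and the restriction to upward propagation only are choices made for the example, and the host name used is hypothetical.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[str, int]  # (host, port); an assumed representation


class Sim:
    """Toy server information manager with an optional parent."""

    def __init__(self, parent: Optional["Sim"] = None) -> None:
        self.parent = parent
        self._names: Dict[str, Location] = {}

    def register(self, name: str, host: str, port: int) -> None:
        """Record where the server called `name` listens for requests."""
        self._names[name] = (host, port)

    def resolve(self, name: str) -> Optional[Location]:
        """Answer locally if possible, otherwise ask the parent SIM."""
        if name in self._names:
            return self._names[name]
        if self.parent is not None:
            return self.parent.resolve(name)
        return None


root = Sim()
leaf = Sim(parent=root)
root.register("link server", "hb.example.org", 4000)
```

A client that asks the leaf SIM for "link server" is answered by the root, much as a DNS resolver forwards a query it cannot answer itself.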

As with object naming, it is outside of the scope of this paper to propose a specific method for doing server naming. However, if the OHSWG adopts an open set of servers in its reference architecture, this will have to be addressed. The above method seems adequate, and should provide a starting point for further discussions.

4. An OHS Reference Architecture Proposal

In Section 2 (scope.html), we outlined what we believe to be the scope of the OHSWG. In Section 3 (scen.html), we provided synopses of and pointers to four scenarios and four implications of these scenarios for OHSWG work. In this section, we describe our reference architecture proposal, based on previous proposals to the group and our analysis. This description is divided into two parts. The first describes each entity of the architecture. Analogs of these entities to parts of OHS systems represented in the OHSWG are provided. The second section describes the protocols needed in such an architecture.

Before we begin the description, here is a diagram providing an overview of the architecture. It is reproduced in each section below.

A reference architecture box diagram
Figure 1. An OHS Reference Architecture Proposal.
All reference architecture figures in this paper, including this one, are Netscape client-side imagemaps. Parts of the figures are links to the appropriate sections.

4.1 Architecture Entities

This section describes the four entities in our reference architecture proposal. Some entities are discussed in little detail, since they do not impact the design or implementation of the OHP protocols. These entities should receive attention from the OHSWG, but are only outlined here.

4.1.1 SIM
A SIM (Server Information Manager) collects server name and location information from servers and distributes it to clients. As described in Section 3.2.4 (scen_impl_name.html), there are many ways in which this can be accomplished. The method that the HOSS OHS uses was described above. The previous section also provided the name for this entity, since servers of server information in HOSS are also called SIM's. While the details of this entity should be addressed by the group, we only note that such an entity should exist in any OHS reference architecture.

A reference architecture box diagram
Figure 2. SIM's.

4.1.2 Storage Engine
A storage engine entity in our architecture should be interpreted as an entity that provides persistent storage to clients. This is in contrast to previous proposals that discuss the kinds of abstractions these storage engines might or should serve (data objects, links, etc.). We see no benefit in distinguishing among hypermedia storage engines (HBMS's, hypermedia databases, or other terms applied to such entities), data storage engines, document storage engines, etc., especially from the point of view of defining the OHP. Thus, we try to sidestep the division mentioned above by Wiil and Leggett [1996] between open link servers and open hyperbases. It will be the case in some OHS's that Sprocs talk to "structure-aware" storage engines, while in others, they talk to databases, file systems, or other engines that provide pure data abstractions. Additionally, we intentionally ignore the issue of storage engine "wrappers" as suggested by Wiil and Whitehead [1997] and elsewhere. This is not because we feel the subject to be unimportant or uninteresting, but because it does not seem relevant to the goal of defining an OHP. We suggest that this issue be taken up by the group outside of the OHP discussions.

Analogs to our storage engine entity can be found in any OHS, although as stated above, the issue of structure-awareness varies among systems. Additionally, both current reference architecture proposals have clear analogs to this entity.

A reference architecture box diagram
Figure 3. Storage Engines.

4.1.3 Client
A client in our architecture should be interpreted as the entity that interacts with Sprocs. Although it is tempting to draw an analogy between this client entity and viewers in the OHRA proposal or content handlers in the Grønbæk and Wiil proposal, this is not entirely correct. Although in many instances viewers or content handlers may interact directly with Sprocs, one can also imagine various other programs such as session managers, shims, wrappers, tool integrators, etc. interacting with Sprocs on behalf of viewers or content handlers. From the point of view of the Sprocs, this distinction is irrelevant. We feel it is most appropriate not to distinguish these cases from one another. Whether in such cases the proxy alone is considered the client, or some conceptual "collapsing" of the proxy and the "end-client" is so considered, is a question that also seems unimportant for the task of this particular group.

Client entity analogs in various OHS's and the reference architecture were cited above. In addition, the automatic indexer described in one of the scenarios abstracted above would also be considered a client entity.

A reference architecture box diagram
Figure 4. Clients.

4.1.4 Sproc
A Sproc (i.e., structure processor [Nürnberg et al. 1996]) is a server that provides various structural abstractions to clients. It is the generalized analog to a link server or hypermedia server in other systems. Because the set of structural abstractions that should be provided by an OHS is open (as discussed above), this layer should also be open.

All OHS's represented in the OHSWG have at least one Sproc that serves navigational hypertext abstractions such as link and/or anchor. We claim that several OHS's conceptually provide more than one Sproc to their users, even if these different Sprocs are implemented as one process in these systems. HOSS provides an explicit open Sproc layer in its architecture, so it is clear that it provides multiple Sprocs, both conceptually and in implementation. OHS's with a notion of composite object (e.g., DHM, HyperDisco) might be thought of as having a conceptual "composition" Sproc. Systems with information retrieval facilities (e.g., Microcosm) might be thought of as having a conceptual "IR" Sproc. When non-OHS hypertext systems (VIKI [Marshall and Shipman 1995], StorySpace [Joyce 1991]) are considered, more examples of conceptual Sprocs can be found.

What is the benefit of an open Sproc layer? If the set of structural abstractions served by a system is open, the set of entities serving them should be as well. The case for an open set of abstractions was made in Section 3.2.1 (scen_impl_open.html). One could argue that even if an open set of abstractions were to be served, one could accomplish this with a closed set of servers if these servers could be modified or extended over time. We feel this is an argument grounded in implementation. Whether or not the addition of new abstractions is effected through the addition of new servers or the modifications of existing ones does not change the fact that the reference architecture should provide a way to express this addition at a conceptual level.

What is the benefit of identifying multiple Sprocs in systems such as DHM or Microcosm when this obviously does not mirror the implementations of these systems? Why shouldn't the DHM hypermedia server or the Microcosm filter manager be seen as one Sproc? Clearly, they can be so conceptualized. In one sense, this is an issue that does not affect the argument for an open Sproc layer. However, something on this point clearly must be said if the architecture is to be in any way descriptive. This question is simply a particular instantiation of the issue of what constitutes a Sproc.

Above, we stated that a Sproc serves a set of structural abstractions. More descriptively (but correspondingly less precisely), a Sproc serves a coherent set of structural abstractions. Coherence of a set of abstractions does not seem amenable to definition, but informally, we can understand it to mean that a Sproc's set of abstractions is coherent if its members constitute a particular structure model. For example, at a basic level, navigational hypertext has a model with links connecting anchors that are themselves "connected" to (or associated with) locations in nodes or nodes themselves. (Whether or not this is anyone's exact model of navigational hypertext is for the moment not relevant to the discussion at hand. It will be addressed below in more detail.) All OHS's have a similar model of navigational hypertext, and so we can, as a group, decide that anchor, link, node and other related abstractions can be coherently grouped into a set served by one Sproc. Furthermore, we believe that few people would naturally group the abstractions of spatial hypertext into such a Sproc, since they seem to be of a fundamentally different variety. Composition, IR, taxonomic, and other structures might also be agreed by the group to constitute individual Sprocs.

One could raise an objection to the notion that there should (or could) be only one model of navigational or any other kind of hypertext. There are two answers to this objection. Firstly, the reference architecture certainly admits cases in which systems provide many largely similar but slightly different Sprocs. That is, this objection does not apply to the reference architecture, but to a particular application of it, that being the application of standardizing the meaning of certain hypertext models. This leads to the second answer to this objection. At a fundamental level, the OHSWG is concerned precisely with the task of standardizing the notions of what constitutes hypertext. If we as a group decide that a link server (navigational Sproc) must understand a message such as GetAnchorTable in the current protocol proposal, we as a group have (partially) standardized the notion of (navigational) hypertext. To do so does not in any way speak to specific implementations of such an abstraction, but it does speak to what the client can reasonably expect the interface to any implementation to look like. This is the definition of standardization. Furthermore, to say that we have standardized the notion of hypertext does not speak to the target audience of such standardization. That is, we do not claim that our work need be adopted by anyone outside the group to call this work "standardization". Particular efforts directed toward the consideration and/or adoption of our standards by the W3C, IETF, or other bodies are not discussed here.

In Section 2.1 (scope_ohs.html), we said that the group must (at least implicitly) define the notion of hypermedia. We now would like to rephrase this requirement in terms of the discussion here on open structural abstractions. The group must define both the list of types of hypermedia (or structural computing [Nürnberg et al. 1997]) in which it is interested, and what abstractions constitute these types. Our perspective is that abstractions served by the current OHS's (excepting HOSS for the moment) essentially fall into three categories: navigational (served by all), compositional (DHM, HyperDisco), and IR (Microcosm). We feel that these three types of structural computing constitute a good starting point for the group. Different types of structural computing can be addressed at a later time.

One benefit of this division of abstractions into three sets is that each OHS can subscribe to the OHSWG standards in terms of which of these sets it supports. All OHS's support a form of navigation (although the exact definition of navigational hypertext is not provided here). Some systems currently support composition or IR, while others do not. The view taken in this group to this point is that one way for systems lacking certain functionality to be OHSWG compliant is simply to ignore (gracefully) service requests concerning this functionality. This is a result of an overly coarse-grained approach to service definition. With the finer-grained Sproc approach to service definition, clients wishing to use, for example, composition services need not poll OHSWG servers until they find a composition-aware server. They simply locate (through a SIM) a composition server. This server may in fact be the same process as that which is serving navigational abstractions to this same client, but conceptually, these services should remain separated.

A reference architecture box diagram
Figure 5. Sprocs.

4.2 Architecture Protocols

4.2.1 SIM Protocol
The SIM protocol is used by all other entities in the architecture to locate one another. In its simplest form, it must allow processes to publish service identifiers and locations (e.g., TCP ports) at which they receive service requests and to query for these locations. Options for propagation of this information to other SIM's, types of protocols understood at various locations, etc., can also be part of this protocol.

A reference architecture box diagram
Figure 6. SIM Protocol.

4.2.2 Store Protocol
Clients and Sprocs use one of an open set of storage protocols to make service requests of storage engines. These protocols will be dependent on the engines present in any given system and are left unspecified in this paper. Work such as that in HyperDisco that considers storage engine wrappers would allow for standardization of a storage protocol between these wrapped (or naturally compliant) storage engines and other entities. These and other details about this protocol, as well as related work (e.g., ODMA [AIIM 1997]), should be addressed at a later time by the OHSWG.


Figure 7. Store Protocol.

4.2.3 OHP
This set of protocols is of primary importance to the OHSWG. We present a proposal for one member of this family in Section 5 (proto.html). All that need be said here is that an open Sproc layer in the architecture implies an open set of protocols between clients and Sprocs. Each member of this family will provide service definitions that are specific to the Sproc that understands that particular protocol.

In some sense, a Sproc implies a protocol and vice-versa. This can be considered analogous to the notion in object-oriented distributed middleware systems such as CORBA [OMG 1997] that an object publishes an interface. The "protocol" that carries the particular method invocations to and from the object need not be specified explicitly. Only the method of how to map an interface definition to a protocol need be specified, and even this is transparent to client programmers. We believe that the OHSWG can also adopt such an approach, specifying only a "transport" protocol [Hebrawi 1993] on top of which specific Sproc protocols can be placed.


Figure 8. OHP.

5. An OHP Proposal

In Section 4 (refarch.html), we described the OHP as a family of protocols, one per Sproc. In this section, we present a proposal for how these protocols can be generated and how they should be defined. Of course, because the set of Sproc protocols is open, all members of the set cannot be specified. Here, we address one particular Sproc protocol - one that handles navigation. Composition and IR protocols are not defined here, but should be the focus of the OHSWG in the short term.

The OSI seven layer model [Hebrawi 1993] describes connections between programs. It allows one to concentrate on one aspect of connectivity by focusing on a particular layer in the model. The lowest layer concerns physical connectivity, while the highest layer describes application-level protocols.

We find it useful to think about OHP protocols in an analogous layered way. Figure 9 illustrates a layered model approach to OHP protocols. The OSI Transport Layer (OSI Layer 4) guarantees reliable, ordered byte transport. It seems clear that we should consider building our protocols on top of such a layer, since we should not be concerned with error checking, message ordering, etc. The other three layers are considered below.

A protocol stack box diagram
Figure 9. An OHP Protocol Stack.

5.1 Generic Encoding Layer

The Generic Encoding (GE) Layer uses the services of the OSI Transport Layer, and thus may assume reliable ordered byte transport. This layer is responsible for mapping the abstractions of the Sproc Generic Layer to byte streams for the Transport Layer facilities.

For reasons that are explained in detail in Section 5.2 (proto_gen.html), the Sproc Generic Layer uses structure object abstractions to send messages. A structure object has arbitrary content, arbitrary attribute/value pairs, and arbitrary endset/endpoint pairs. Attributes have an arbitrary number of NULL-terminated ASCII strings as values, while endsets have an arbitrary number of object identifiers as values. (The form of an object identifier is ignored here, as this is part of the naming issue discussed briefly in Section 3.2.4 (scen_impl_name.html). For the purposes of this discussion, we treat identifiers as arbitrary byte blocks of arbitrary size.)
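
A structure object of this kind might be modeled as in the following sketch. The class name StructureObject, the field names, and the example attribute and endset names ("kind", "source", "destination") are assumptions made for illustration; the wire representation is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StructureObject:
    """Generic structure object: content, attributes, and endsets."""
    content: bytes = b""
    # attribute name -> any number of ASCII string values
    attributes: Dict[str, List[str]] = field(default_factory=dict)
    # endset name -> any number of opaque object identifiers
    endsets: Dict[str, List[bytes]] = field(default_factory=dict)

    def validate(self) -> None:
        """Reject attribute values that could not be sent as
        NULL-terminated ASCII strings."""
        for name, values in self.attributes.items():
            for value in values:
                if not value.isascii() or "\x00" in value:
                    raise ValueError(f"bad value for attribute {name!r}")


# Hypothetical example: a link whose endsets name two anchors.
link = StructureObject()
link.attributes["kind"] = ["link"]
link.endsets["source"] = [b"\x00\x01"]
link.endsets["destination"] = [b"\x00\x02"]
link.validate()
```

Identifiers are kept as raw bytes here, in line with treating them as arbitrary byte blocks of arbitrary size.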

The central issue of this layer is how to encode or serialize these structural objects into byte streams. The current protocol proposal discusses analogous issues in Section 5.1 under "Form of the Protocol". It proposes that we encode messages into ASCII clear text. Clearly, other choices could be made at this level. For example, messages (or structural objects or whatever abstractions are used at the Sproc Generic Layer) could be mapped into arbitrary byte streams, compressed, encrypted, etc. The primary advantage of clear text seems to be that human beings can read and interpret it, making debugging or monitoring easier. The primary advantage of byte streams is that the amount of data sent may be smaller, since no encoding of arbitrary binary data into clear text need occur. Because of the range of choices available, this layer is represented as open in the protocol stack figure.

The choice here does not seem to have implications at either the higher or lower levels of the protocol. Furthermore, the differences between the various choices are not substantial. Clearly, they are equivalent in expressive power, and just as clearly, programmers will not normally need to concern themselves with the exact mechanics of this layer. We propose that the group adopt the clear text approach, as this is consonant with the current proposal. Alternatives include the current various binary packing approaches and BER standards [ISO 1990] for encoding to and from byte streams.

Because data translated into byte streams loses its structure, a header including the total number of bytes (not including the header) should be prepended to the byte stream given to the Transport Layer. Details concerning this size header and our proposal for encoding structural objects into byte streams are provided in Appendix A (appa.html).
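
The role of the size header can be illustrated with a small framing sketch. The header format actually proposed is the one in Appendix A; the 4-byte big-endian length used here, and the function names frame and unframe, are assumptions made only for this example.

```python
import struct

HEADER = struct.Struct(">I")  # assumed: 4-byte big-endian payload length


def frame(payload: bytes) -> bytes:
    """Prepend the payload size (header excluded) to the payload."""
    return HEADER.pack(len(payload)) + payload


def unframe(stream: bytes):
    """Split one framed message off the front of `stream`.

    Returns (payload, rest), or (None, stream) if the stream does not
    yet hold a complete message.
    """
    if len(stream) < HEADER.size:
        return None, stream
    (size,) = HEADER.unpack_from(stream)
    end = HEADER.size + size
    if len(stream) < end:
        return None, stream
    return stream[HEADER.size:end], stream[end:]
```

Because the receiver knows how many bytes belong to each message, it can recover message boundaries from an otherwise unstructured byte stream.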

A protocol stack box diagram
Figure 10. Generic Encoding Layer of the OHP Protocol Stack.

5.2 Sproc Generic Layer

The Sproc Generic (SG) Layer uses the services of the Generic Encoding Layer, and thus may assume reliable ordered structure object transport. This layer is responsible for mapping the abstractions of the Sproc Specific Layer to generic structural objects.

One complication in this layer is that the set of abstractions that the Sproc Specific Layer presents to its clients is open. It cannot be the case that the SG layer be expected to be able to translate arbitrary structural abstractions into generic structure objects. To remedy this, each instance of a Sproc Specific layer protocol is expected to be expressed in terms of generic structure objects when the services of the SG Layer are invoked.

Since the Sproc Specific Layer must already express its abstractions in terms of generic structure objects, the SG Layer need not do any translation. This raises the question of why the SG layer exists at all. It exists because some aspects of OHP messages are common to all Sproc Specific Layers. This common functionality is encapsulated in the SG layer. This functionality involves tagging messages with standard "header" type information. This information has been the subject of several posts to the OHS mailing list, including Nürnberg's 8 June post (http://www.csdl.tamu.edu/ohs/archive/0033.html) and Reich's 9 June post (http://www.csdl.tamu.edu/ohs/archive/0034.html).

What information should be considered "header" information? Basically, the criterion is that it be applicable to OHP messages independent of the particular Sproc and structural abstractions handled by the messages. Working from the previous OHS list posts, we propose the following header information:

Details as to the data types and representations of these types are provided in Appendix B (appb.html).

A protocol stack box diagram
Figure 11. Sproc Generic Layer of the OHP Protocol Stack.

5.3 Sproc Specific Layer

The Sproc Specific (SS) Layer uses the services of the Sproc Generic Layer, and thus may assume reliable ordered structure object transport and automatic standard header tagging. This layer is responsible for mapping the abstractions of a particular Sproc to generic structure objects.

Each Sproc has its own set of abstractions and services it provides to clients. These abstractions may take arbitrary forms. However, they must be expressible in terms of generic structure objects, since the Sproc Generic Layer expects these objects as input. There are two questions that must be answered with respect to this requirement. Firstly, does the generic structure object provide a sufficiently powerful means of expression? Secondly, does the requirement to map all members of the set of open abstractions to one particular abstraction contradict the discussion in favor of an open set of abstractions provided in Section 3.2.1 (scen_impl_open.html)?

With respect to the question of expressive power, it seems clear that generic structure objects are sufficient. Aspects of messages that do not neatly fit into the attribute or endset categories can be modeled as arbitrary object content. However, we feel that a stronger claim can be made with respect to representation of messages as structure objects. It is the case that this representation is reasonably natural for many of the abstractions needed by Sprocs. Object endset functionality allows the manipulation of structure. Object attributes allow modeling attributes of structure objects such as presentation specifics. Content can be used to hold scripts or node contents. A proxy object can be added to a set of objects that contains things like message name. Because aspects of a message like the message name are modeled as structure objects, layers below the SS Layer have a simple job - they need only understand structure objects. Message name, protocol name, and other data are "folded" into this common representation. From a data type perspective, everything is stored in a common data type (structure object) with fields of predefined types (named arrays of strings for attributes and named arrays of arbitrary bytes for endsets).

This leads to the second question. In Section 3.2.1 (scen_impl_open.html), we argued against the notion that sufficiency of expressive power be used as a criterion for expressing abstractions. However, that argument applied to the abstractions that clients manipulate, not to those that servers manipulate. There is always a layer in any protocol at which all abstractions take a common form, whether that be byte streams at the Transport Layer or generic structure objects at the Sproc Generic Layer. The important issue here is that this mapping be hidden from clients. These concerns should be transparent to clients.

Since the SS Layer is open, it is impossible to define all the mappings from all of the structural abstractions and services provided by this layer to generic structure objects. This can be handled in two ways. Firstly, one could define mappings as new abstractions and services are added. However, this greatly hinders standardization. Secondly, one can define an algorithm for mapping arbitrary structural abstractions and services to the generic format. This approach is analogous to the way in which CORBA interface specifications are used both to define the programmatic interface to an object and to generate IPC libraries to convey messages concerning this interface. We have chosen this second approach. Appendix C (appc.html) provides an algorithm to map the parameters of a given message to a set of generic structure objects. Essentially, simple parameters are mapped to endset/endpoint or attribute/value pairs. Large parameters are mapped to their own objects.
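
The flavor of such a mapping can be conveyed with a toy sketch. It is not the algorithm of Appendix C: the size threshold LARGE, the dictionary representation of structure objects, the attribute names "message" and "parameter", and the rule that strings become attributes while byte blocks become endsets are all assumptions made for illustration.

```python
from typing import Any, Dict, List

LARGE = 64  # assumed size threshold in bytes; not taken from the paper


def new_object() -> Dict[str, Any]:
    """Return an empty generic structure object (dict-based stand-in)."""
    return {"content": b"", "attributes": {}, "endsets": {}}


def encode_message(name: str, params: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Fold a message name and its parameters into structure objects."""
    proxy = new_object()
    proxy["attributes"]["message"] = [name]  # message name lives on the proxy
    objects = [proxy]
    for key, value in params.items():
        if isinstance(value, str):
            proxy["attributes"][key] = [value]   # simple string parameter
        elif isinstance(value, bytes) and len(value) <= LARGE:
            proxy["endsets"][key] = [value]      # simple identifier parameter
        elif isinstance(value, bytes):
            big = new_object()                   # large parameter: own object
            big["content"] = value
            big["attributes"]["parameter"] = [key]
            objects.append(big)
        else:
            raise TypeError(f"unsupported parameter type for {key!r}")
    return objects
```

The layers below then see only a list of structure objects, whatever message was sent.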

Given an algorithm to translate service definitions into sets of generic structure objects, only the service definitions of Sprocs must be specified. There is an open set of these Sprocs, but we feel three Sproc interfaces should be specified as a starting point for OHSWG work. As mentioned in Section 4.1.4 (refarch_ent_sproc.html), these Sprocs should address navigation, composition, and IR. A proposal for the navigation Sproc interface is given below. Composition and IR Sproc interfaces should be addressed in the short term.

A protocol stack box diagram
Figure 12. Sproc Specific Layer of the OHP Protocol Stack.

5.3.1 Navigation Sproc Interface Definition
This proposal for a navigation Sproc interface is based on the current protocol proposal and critiques and various posts on the OHS mailing list (http://www.csdl.tamu.edu/ohs/archive/index.html#start). It is written in the Sproc Interface Format defined in Appendix D (appd.html). For comparison, the current protocol proposal is rendered in this format in Appendix E (appe.html).

Since IR and composition structure management have been abstracted into other Sprocs, this navigation specification is less powerful than the current proposal. For example, it is not possible to model Microcosm generic links with this interface. The advantages of this finer-grained approach to service definitions over the current "union of functionality" approach are discussed above. This Sproc is designed to handle the task of navigation between explicit static links.

In general, one change that has been made to nearly all services is that they now concern multiple objects (e.g., the open document service can specify multiple documents). IPC is a very expensive part of any distributed operation, so messages should be combined and sent together whenever it is possible and natural to do so.

Many services can be requested by either the client or the server. When this is true, the message is listed only once, and the different semantics (if any) of the different directions are explained. This is in contrast to the current proposal, in which if there is one conceptual operation that either the client or server may request, two messages are defined.

Targets. There are four classes of entities in this protocol. The "all" class includes messages anyone can send and/or receive. The "server" class is the Sproc. The "browser" class is a client that only browses structures. The "author" class is a client that may modify structures.

TARGET  all
TARGET  server
TARGET  browser
TARGET  author

Types. With respect to anchor identifiers, link identifiers, node identifiers, location specifiers, and presentation specifiers, the protocol will treat all of these as opaque byte block identifiers. The type id_t is predefined, and will be used throughout for all identifier types.

Scripts can be defined in various ways, but we propose handling them as NULL-terminated ASCII strings.

TYPE  script_t   ALIAS FOR  NTstring;

We follow the current proposal's definition of an anchor as a binding between a locspec and a document (component) id. We also add attributes for a presentation specification, an (optional) direction, and a boolean indicating whether or not the anchor belongs to a link. Also, some systems (e.g., Chimera and the HOSS LSM) make anchors relative to not only a component and persistent selection, but also an application. Some systems "imply" an application in an anchor through the document type. Provisions for both methods must be made.

TYPE  anc_t   STRUCTURE (
   doc_id    : id_t;
   locspec   : id_t;
   direction : NTstring;
   pspec     : id_t;
   in_link   : boolean;
   app_id    : id_t;
   doc_type  : NTstring;
);

Document services. A Sproc can request a client to open and display a document. This message is known as "LaunchDocument" in the current proposal. Our "LaunchDocument" message makes the following changes. Firstly, the document may be identified by an identifier - some global (possibly non-ASCII) "name". Secondly, the data callback tag is removed. This is conceptually a client/storage engine interaction, even if in practice in some cases, the Sproc may be acting as a Storage Engine as well. Thirdly, since some systems do not launch applications to display destinations based on document type, an application identifier has been added. Finally, we change the name to "OpenDocuments" to imply some relationship to the next message.

Also, there is the notion of a client having independently loaded a document (i.e., not having been requested through an "OpenDocuments" message) and wishing to register this fact with the Sproc. This is analogous to the "CreateNode" in the current proposal. If the browser initiates this sequence, it receives a return code in reply.

SEQUENCE  OpenDocuments  server, browser
   MESSAGE  OpenDocuments  server->browser, browser->server
   {
      num_docs      : cardinal;
      doc_ids       : id_1d[num_docs];
      read_onlys    : boolean_1d[num_docs];
      doc_nicknames : NTstring_1d[num_docs];
      app_ids       : id_1d[num_docs];
      doc_types     : NTstring_1d[num_docs];
   };
   MESSAGE  OpenDocumentsRespond  server->browser
   {
      rc : rc_t;
   };
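The OpenDocuments message carries its per-document data as parallel arrays, each of length num_docs. A minimal Python sketch of how a sender might build such a message from per-document records, and how a receiver might sanity-check it, is given below. The DocRequest type, the dictionary representation, and both function names are hypothetical.

```python
from typing import Dict, List, NamedTuple


class DocRequest(NamedTuple):
    """One document to open (hypothetical helper type)."""
    doc_id: bytes
    read_only: bool
    nickname: str
    app_id: bytes
    doc_type: str


def make_open_documents(docs: List[DocRequest]) -> Dict[str, object]:
    """Transpose per-document records into the parallel arrays of OpenDocuments."""
    return {
        "num_docs": len(docs),
        "doc_ids": [d.doc_id for d in docs],
        "read_onlys": [d.read_only for d in docs],
        "doc_nicknames": [d.nickname for d in docs],
        "app_ids": [d.app_id for d in docs],
        "doc_types": [d.doc_type for d in docs],
    }


def check_open_documents(msg: Dict[str, object]) -> bool:
    """Return True if every array has exactly num_docs entries."""
    arrays = ("doc_ids", "read_onlys", "doc_nicknames", "app_ids", "doc_types")
    return all(len(msg[name]) == msg["num_docs"] for name in arrays)
```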

The Sproc can request that a client close a document. This is analogous to the current "CloseNode". The current form of this message includes a flag that optionally indicates to the client that it should flush its copy of the anchor table for this document. However, it seems that there is no other facility for asking the client to flush its anchor table cache. This functionality should be effected in another message.

A client should also be able to unregister a document. This is analogous to the "Closing" message in the current protocol. If the browser initiates this sequence, it receives a return code in reply.

SEQUENCE  CloseDocuments  server, browser
   MESSAGE  CloseDocuments  server->browser, browser->server
   {
      num_docs : cardinal;
      doc_ids  : id_1d[num_docs];
   };
   MESSAGE  CloseDocumentsRespond  server->browser
   {
      rc : rc_t;
   };

Display services. A Sproc must be able to request a client to display an anchor. (That is, the client should make the anchor visible on the screen since it is the destination of a newly followed link.) This message is analogous to the current protocol's "DisplayAnchor".

SEQUENCE  DisplayAnchors  server
   MESSAGE  DisplayAnchors  server->browser
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
   };

Anchor services. Sprocs may need to inform clients that anchors have been either created, deleted, or modified in documents managed by the client. Likewise, the client may wish to create, delete, or modify anchors. Basically, these services are implemented with the "RequestService" message of the current protocol. The current protocol's message "UpdateAnchors" is related to the "ModifyAnchors" below. If the author initiates these sequences, the server returns a return code and the anchors (and/or their identifiers) that were successfully created, deleted, or modified. If the author cannot allocate anchor identifiers, this parameter should be left empty. The server will supply anchor identifiers in this case. Either the server or author may request a fresh list of the anchors handled by the other.

SEQUENCE  CreateAnchors  server, author
   MESSAGE  CreateAnchors  server->author, author->server
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
      ancs     : anc_1d[num_ancs];
   };
   MESSAGE  CreateAnchorsRespond  server->author
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
      ancs     : anc_1d[num_ancs];
      rc       : rc_t;
   };
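The rule that the server supplies identifiers when the author leaves anc_ids empty, and that the reply lists only the anchors actually created, might be realized on the server side roughly as follows. This is a sketch under stated assumptions: the AnchorStore class, the "anc-N" identifier scheme, and the three return code values are invented for the example, since rc_t values are not defined here.

```python
import itertools
from typing import Dict, List, Tuple

# Hypothetical return codes; the proposal does not define rc_t values.
RC_OK = 0
RC_PARTIAL = 1
RC_BAD_REQUEST = 2


class AnchorStore:
    """Toy server-side anchor table (hypothetical)."""

    def __init__(self) -> None:
        self._anchors: Dict[bytes, object] = {}
        self._counter = itertools.count(1)

    def _new_id(self) -> bytes:
        # Allocate an identifier that is not already in use.
        while True:
            candidate = b"anc-%d" % next(self._counter)
            if candidate not in self._anchors:
                return candidate

    def create_anchors(self, anc_ids: List[bytes],
                       ancs: List[object]) -> Tuple[List[bytes], List[object], int]:
        """Handle CreateAnchors and return the CreateAnchorsRespond fields."""
        if not anc_ids:
            # The author could not allocate identifiers: the server supplies them.
            anc_ids = [self._new_id() for _ in ancs]
        elif len(anc_ids) != len(ancs):
            return [], [], RC_BAD_REQUEST
        created_ids: List[bytes] = []
        created: List[object] = []
        for anc_id, anc in zip(anc_ids, ancs):
            if anc_id in self._anchors:
                continue  # identifier already taken: skip this anchor
            self._anchors[anc_id] = anc
            created_ids.append(anc_id)
            created.append(anc)
        rc = RC_OK if len(created) == len(ancs) else RC_PARTIAL
        return created_ids, created, rc
```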

SEQUENCE  DeleteAnchors  server, author
   MESSAGE  DeleteAnchors  server->author, author->server
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
   };
   MESSAGE  DeleteAnchorsRespond  server->author
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
      rc       : rc_t;
   };

SEQUENCE  ModifyAnchors  server, author
   MESSAGE  ModifyAnchors  server->author, author->server
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
      ancs     : anc_1d[num_ancs];
   };
   MESSAGE  ModifyAnchorsRespond  server->author
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
      ancs     : anc_1d[num_ancs];
      rc       : rc_t;
   };

SEQUENCE  GetAnchors  server, author
   MESSAGE  GetAnchors  server->author, author->server
   {
   };
   MESSAGE  GetAnchorsRespond  server->author, author->server
   {
      num_ancs : cardinal;
      anc_ids  : id_1d[num_ancs];
      ancs     : anc_1d[num_ancs];
      rc       : rc_t;
   };

Link services. Sprocs may need to inform clients that links have been either created, deleted, or modified in documents managed by the client. Likewise, the client may wish to follow, create, delete, or modify links. For the follow operation, there needs to be an alternate form that indicates to the server that all destinations should be returned directly to the requesting client, regardless of how the destinations would normally be displayed. This is to allow automatic indexers to index an OHSWG structure space. These services are implemented with the "RequestService" message of the current protocol. If the author initiates these sequences, the server returns a return code. Either the server or author may request a fresh list of the links handled by the other.

SEQUENCE  FollowLinks  browser
   MESSAGE  FollowLinks  browser->server
   {
      num_srcs : cardinal;
      src_ids  : id_1d[num_srcs];
   };
   MESSAGE  FollowLinksRespond  server->browser
   {
      rc : rc_t;
   };

SEQUENCE  GetDests  browser
   MESSAGE  GetDests  browser->server
   {
      num_srcs : cardinal;
      src_ids  : id_1d[num_srcs];
   };
   MESSAGE  GetDestsRespond  server->browser
   {
      num_dests : cardinal;
      dest_ids  : id_1d[num_dests];
      dests     : anc_1d[num_dests];
      rc        : rc_t;
   };
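To suggest how an automatic indexer might use GetDests, the following Python sketch walks a structure space breadth-first. It assumes a caller-supplied get_dests function that wraps the GetDests sequence and returns destination anchor identifiers, and it simplifies matters by treating each destination anchor as the next source; a real indexer would more likely move from a destination to the other anchors of its document. The function names are hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl(seeds: Iterable[bytes],
          get_dests: Callable[[List[bytes]], List[bytes]]) -> List[bytes]:
    """Visit every anchor reachable from the seeds, each exactly once."""
    seen = set()
    queue = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)
    order: List[bytes] = []
    while queue:
        src = queue.popleft()
        order.append(src)
        # One GetDests exchange per source anchor; nothing is displayed.
        for dest in get_dests([src]):
            if dest not in seen:
                seen.add(dest)
                queue.append(dest)
    return order
```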

SEQUENCE  CreateLinks  server, author
   MESSAGE  CreateLinks  server->author, author->server
   {
      num_lnks      : cardinal;
      num_to_ancs   : cardinal[num_lnks];
      to_ancs       : id_1d[num_lnks][num_to_ancs()];
      num_from_ancs : cardinal[num_lnks];
      from_ancs     : id_1d[num_lnks][num_from_ancs()];
   };
   MESSAGE  CreateLinksRespond  server->author
   {
      rc : rc_t;
   };
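The link messages use ragged two-dimensional arrays: one row per link, with each row sized by a per-link count. One possible way to flatten and recover such arrays is sketched below in Python. The flat list representation and the function names are assumptions for illustration; the sketch sizes the rows of from_ancs by the per-link from-anchor counts.

```python
from typing import Dict, List, Sequence, Tuple

Link = Tuple[List[bytes], List[bytes]]  # (to_ancs, from_ancs) for one link


def pack_links(links: Sequence[Link]) -> Dict[str, object]:
    """Flatten per-link anchor lists into counts plus flat identifier arrays."""
    return {
        "num_lnks": len(links),
        "num_to_ancs": [len(to) for to, _ in links],
        "to_ancs": [a for to, _ in links for a in to],
        "num_from_ancs": [len(frm) for _, frm in links],
        "from_ancs": [a for _, frm in links for a in frm],
    }


def _split(flat: Sequence[bytes], counts: Sequence[int]) -> List[List[bytes]]:
    """Cut a flat array back into rows of the given lengths."""
    rows, pos = [], 0
    for n in counts:
        rows.append(list(flat[pos:pos + n]))
        pos += n
    if pos != len(flat):
        raise ValueError("counts do not match the number of identifiers")
    return rows


def unpack_links(msg: Dict[str, object]) -> List[Link]:
    """Rebuild the per-link anchor lists from a packed message."""
    if (len(msg["num_to_ancs"]) != msg["num_lnks"]
            or len(msg["num_from_ancs"]) != msg["num_lnks"]):
        raise ValueError("one count per link is required")
    tos = _split(msg["to_ancs"], msg["num_to_ancs"])
    froms = _split(msg["from_ancs"], msg["num_from_ancs"])
    return list(zip(tos, froms))
```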

SEQUENCE  DeleteLinks  server, author
   MESSAGE  DeleteLinks  server->author, author->server
   {
      num_lnks      : cardinal;
      num_to_ancs   : cardinal[num_lnks];
      to_ancs       : id_1d[num_lnks][num_to_ancs()];
      num_from_ancs : cardinal[num_lnks];
      from_ancs     : id_1d[num_lnks][num_from_ancs()];
   };
   MESSAGE  DeleteLinksRespond  server->author
   {
      rc : rc_t;
   };

SEQUENCE  AddAnchorsToLinks  server, author
   MESSAGE  AddAnchorsToLinks  server->author, author->server
   {
      num_lnks      : cardinal;
      num_to_ancs   : cardinal[num_lnks];
      to_ancs       : id_1d[num_lnks][num_to_ancs()];
      num_from_ancs : cardinal[num_lnks];
      from_ancs     : id_1d[num_lnks][num_from_ancs()];
   };
   MESSAGE  AddAnchorsToLinksRespond  server->author
   {
      rc : rc_t;
   };

SEQUENCE  RemoveAnchorsFromLinks  server, author
   MESSAGE  RemoveAnchorsFromLinks  server->author, author->server
   {
      num_lnks      : cardinal;
      num_to_ancs   : cardinal[num_lnks];
      to_ancs       : id_1d[num_lnks][num_to_ancs()];
      num_from_ancs : cardinal[num_lnks];
      from_ancs     : id_1d[num_lnks][num_from_ancs()];
   };
   MESSAGE  RemoveAnchorsFromLinksRespond  server->author
   {
      rc : rc_t;
   };

SEQUENCE  GetLinks  server, author
   MESSAGE  GetLinks  server->author, author->server
   {
   };
   MESSAGE  GetLinksRespond  server->author, author->server
   {
      num_lnks      : cardinal;
      num_to_ancs   : cardinal[num_lnks];
      to_ancs       : id_1d[num_lnks][num_to_ancs()];
      num_from_ancs : cardinal[num_lnks];
      from_ancs     : id_1d[num_lnks][num_from_ancs()];
   };

Context services. A client may wish to open, close, create, delete, or modify contexts. These services are implemented with the "RequestService" message of the current protocol. A browser may request the identifiers of all opened and closed contexts. In this case, the browser supplies a list of context ids. The state for these contexts is returned. If the number of contexts in the set the browser initially supplies is negative, the array is ignored, and the state of all contexts is returned.

SEQUENCE  OpenContexts  browser
   MESSAGE  OpenContexts  browser->server
   {
      num_ctxs : cardinal;
      ctx_ids  : id_1d[num_ctxs];
   };
   MESSAGE  OpenContextsRespond  server->browser
   {
      num_ctxs_opened : cardinal;
      opened_ctx_ids  : id_1d[num_ctxs_opened];
      rc              : rc_t;
   };

SEQUENCE  CloseContexts  browser
   MESSAGE  CloseContexts  browser->server
   {
      num_ctxs : cardinal;
      ctx_ids  : id_1d[num_ctxs];
   };
   MESSAGE  CloseContextsRespond  server->browser
   {
      num_ctxs_closed : cardinal;
      closed_ctx_ids  : id_1d[num_ctxs_closed];
      rc              : rc_t;
   };

SEQUENCE  GetContexts  browser
   MESSAGE  GetContexts  browser->server
   {
      num_ctxs : integer;
      ctx_ids  : id_1d[num_ctxs];
   };
   MESSAGE  GetContextsRespond  server->browser
   {
      num_open_ctxs   : cardinal;
      open_ctx_ids    : id_1d[num_open_ctxs];
      num_closed_ctxs : cardinal;
      closed_ctx_ids  : id_1d[num_closed_ctxs];
      rc              : rc_t;
   };
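The GetContexts convention, in which a negative num_ctxs means "ignore the array and report every context", could be handled on the server side along the following lines. The state table, the function name, and the return code values are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical return codes; the proposal does not define rc_t values.
RC_OK = 0
RC_UNKNOWN_CONTEXT = 1


def get_contexts(state: Dict[bytes, bool], num_ctxs: int,
                 ctx_ids: List[bytes]) -> Tuple[List[bytes], List[bytes], int]:
    """Return (open_ctx_ids, closed_ctx_ids, rc) for a GetContexts request.

    state maps each known context id to True (open) or False (closed).
    """
    if num_ctxs < 0:
        wanted = list(state)          # negative count: report all contexts
    else:
        wanted = ctx_ids[:num_ctxs]   # otherwise only the contexts asked about
    open_ids = [c for c in wanted if state.get(c) is True]
    closed_ids = [c for c in wanted if state.get(c) is False]
    known = len(open_ids) + len(closed_ids)
    rc = RC_OK if known == len(wanted) else RC_UNKNOWN_CONTEXT
    return open_ids, closed_ids, rc
```

Declaring num_ctxs as an integer rather than a cardinal in the request is what leaves room for the negative value.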

SEQUENCE  CreateContexts  author
   MESSAGE  CreateContexts  author->server
   {
      num_ctxs : cardinal;
      num_lnks : cardinal[num_ctxs];
      lnks     : id_1d[num_ctxs][num_lnks()];
   };
   MESSAGE  CreateContextsRespond  server->author
   {
      rc : rc_t;
   };

SEQUENCE  DeleteContexts  author
   MESSAGE  DeleteContexts  author->server
   {
      num_ctxs : cardinal;
      num_lnks : cardinal[num_ctxs];
      lnks     : id_1d[num_ctxs][num_lnks()];
   };
   MESSAGE  DeleteContextsRespond  server->author
   {
      rc : rc_t;
   };

SEQUENCE  AddLinksToContexts  author
   MESSAGE  AddLinksToContexts  author->server
   {
      num_ctxs : cardinal;
      num_lnks : cardinal[num_ctxs];
      lnks     : id_1d[num_ctxs][num_lnks()];
   };
   MESSAGE  AddLinksToContextsRespond  server->author
   {
      rc : rc_t;
   };

SEQUENCE  RemoveLinksFromContexts  author
   MESSAGE  RemoveLinksFromContexts  author->server
   {
      num_ctxs : cardinal;
      num_lnks : cardinal[num_ctxs];
      lnks     : id_1d[num_ctxs][num_lnks()];
   };
   MESSAGE  RemoveLinksFromContextsRespond  server->author
   {
      rc : rc_t;
   };

Computation services. This message models client and server side computation. It is analogous to the "Interpret" message of the current protocol. Instead of the ability only to send scripts, a "program identifier" can also be sent, indicating that the recipient of the message should retrieve the program from the appropriate storage engine and execute it.

SEQUENCE  Execute  all
   MESSAGE  Execute  all->all
   {
      pgm_id   : id_t;
      script   : script_t;
      lang     : NTstring;
      num_tags : cardinal;
      tags     : NTstring_1d[num_tags];
      tag_vals : NTstring_1d[num_tags];
   };
   MESSAGE  ExecuteRespond  all->all
   {
      rc : rc_t;
   };
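A recipient of Execute has to decide whether to run the inline script or to retrieve a stored program. The Python sketch below shows one plausible reading, in which a non-empty pgm_id takes precedence over the script. That precedence rule, the fetch_program callback standing in for a storage engine request, and the function name are assumptions; the proposal does not fix them.

```python
from typing import Callable, Dict, Tuple


def prepare_execution(msg: Dict[str, object],
                      fetch_program: Callable[[bytes], str]) -> Tuple[str, str, Dict[str, str]]:
    """Return (source, lang, environment) for an Execute message."""
    if (len(msg["tags"]) != msg["num_tags"]
            or len(msg["tag_vals"]) != msg["num_tags"]):
        raise ValueError("tags and tag_vals must each have num_tags entries")
    if msg["pgm_id"]:
        # Assumed precedence: a program identifier overrides the inline script.
        source = fetch_program(msg["pgm_id"])
    else:
        source = msg["script"]
    # Pair each tag with its value.
    env = dict(zip(msg["tags"], msg["tag_vals"]))
    return source, msg["lang"], env
```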

Bidirectional "services". For completeness, and to address the bi-directional messages in the current proposal, these messages are included. However, because they are so general, the group should decide if they are necessary, or if these cases can better be handled otherwise (e.g., return codes for the "Error" message and new Sproc definitions for the "Other" message).

SEQUENCE  Error  all
   MESSAGE  Error  all->all
   {
      error_name : NTstring;
      error_text : NTstring;
      num_tags   : cardinal;
      tags       : NTstring_1d[num_tags];
      tag_vals   : NTstring_1d[num_tags];
   };

SEQUENCE  Other  all
   MESSAGE  Other  all->all
   {
      num_tags : cardinal;
      tags     : NTstring_1d[num_tags];
      tag_vals : NTstring_1d[num_tags];
   };

6. Metacommentary

This section evaluates the process by which we derived our architecture and protocol proposals. The OHSWG has chosen to use a scenario-based design approach for work on the architecture and protocols. The architecture work was proposed mainly as a method for understanding the protocol requirements. Have these ideas been helpful?

Scenarios. Certainly, in this paper, we found that grounding our reasoning in concrete examples of use of systems (strictly hypertext systems or not) led to the identification of major issues for the consideration of the group. Even the few implications we derived from the analysis of the four scenarios discussed played a major role in our proposals. In short, we found the scenario-based approach helpful in organizing our thinking.

The scenarios play another role in this paper as well. They help provide motivation for the implications we drew. Thus not only were they helpful to us, but hopefully are helpful to others who consider the implications we identified. Whether or not a particular person finds the conclusions justified, the scenarios provide a basis for discussion that is easier to understand than abstract discussion on the issues of naming, open structural abstractions, etc.

The group so far has been only partially committed to carrying out requirements analyses based on these scenarios. Several position papers at the last two group meetings described or at least summarized scenarios for the purposes of defending a particular feature of a proposal or identifying important issues. However, other discussions, especially on the OHS mailing list, have tended to shy away from using scenarios to discuss issues. Informally, we have noted that those issues raised in position papers with scenarios have either led to useful discussions or have been generally accepted by the group as being important. Conversely, similar issues raised on the mailing list have resulted in exchanges of assertions that do not seem well-supported and do not lead to productive discussion. It is not entirely clear that the scenarios comprise the whole difference between such discussions, but we suspect that the usefulness of a discussion and its basis in practice are causally related.

Architecture. The issue of how helpful the architecture work has been in considering the protocol requirements is more difficult to answer. It seems that in large part, the architecture work is being pursued because people find it interesting in its own right, rather than for purposes of protocol design. There is nothing wrong with this pursuit - on the contrary, it will undoubtedly heighten the group's collective understanding of OHS's. However, we should consider decoupling these two efforts, at least partially.

Most of the architecture is not relevant for designing a protocol that allows application interoperability. In our protocol proposal, we considered only the link server (Sproc) layer and those entities that communicate with it. Clearly, many of the entities can be further described. They may in fact be quite complex. Consider that clients and storage engines may be wrapped or not, there may be collaboration support, tool integrators, session managers, notification control systems, and a host of other functionality in an OHS environment. Nonetheless, from the point of view of the protocol, only those entities that communicate with the server(s) are relevant, and then only to some degree.

We believe that the architecture work should continue, but that protocol design should be decoupled from it. Scenarios can inform both the protocol and architecture efforts, and the architecture work can suggest future protocols that the group may want to standardize. "Radical" architecture proposals such as those discussed at OHS 3 (e.g., combining the client and storage engine entities) might impact the way in which protocol work is carried out, but arguments over the details can probably be safely ignored by the protocol designers.

Current Implementations. Unfortunately, the group to date has few real success stories. No fully-compliant implementations of the current OHP have been announced on the OHS mailing list. As of OHS 3, no systems were interoperating using OHP. If the group is to remain viable and vibrant, this must change soon. The main problems to date seem to have been the level of uncertainty in the specification and the lack of available effort on the part of the group members.

Until the group endorses a preliminary draft of the protocol, it is unlikely that group members will implement any proposal. However, the group is hesitant to endorse any particular proposal until it feels comfortable that no major problems exist with the proposal. Obviously, these are conflicting interests. In our experience, people in general and academics especially err on the side of wanting a "perfect" solution. As a group, we should try to resist this urge to make the protocol specification perfect before releasing it. We believe that it is imperative that the group fully endorse a proposal and a method for modifying the proposal by the end of 1997. The method for modifying the standard should require widespread consent of the group membership before changes are endorsed and take place within a framework that ensures a high degree of backward compatibility.

7. Conclusions

Summary of the work. We presented synopses of four scenarios. Our analysis of these scenarios allowed us to derive four implications for the OHSWG reference architecture and protocol work. We concluded that open structural abstractions and semantically meaningful operations were required, reference architecture proposals designed to inform protocol work should exclude external entities, and that naming is a difficult and important issue for the group's work.

Next, we presented a reference architecture proposal based on previous proposals to the group and on our scenario analyses. The key differentiating features of our proposal are the exclusion of entities that do not interact with OHSWG servers and an open layer of link server peers called Sprocs.

We also discussed a proposal for an OHP protocol. In light of the open Sproc layer in our reference architecture proposal, we reconceptualized the OHP as a family of protocols, one per Sproc. This allows a finer-grained notion of OHSWG services. We abstracted purely syntactic issues (how to encode a message into a byte stream) from the issues central to particular Sprocs (what services to provide). Finally, we provided a proposal for the interface to a Sproc that handles simple navigation hypertext. We believe composition and IR Sproc interfaces should also receive attention in the short term from the group.

Finally, we made some comments on the method chosen by the OHSWG to accomplish its goals. We found the scenario based design approach worthwhile. We believe, however, that further architecture work should be decoupled from protocol work.

Observations. We feel that the work of the OHSWG can be characterized as the design of middle-ware for distributed open structure systems. As such, we are faced with the decision of how to proceed. Large-scale attempts by industry-led consortiums are designing and implementing similar middle-ware. CORBA [OMG 1997] and JavaBeans [JavaSoft 1997] are two such examples. Clearly, we want to avoid re-inventing the wheel. To what degree can we use the facilities of existing middle-ware projects? More importantly, what is the true value of OHSWG work when compared to these other projects?

We do not feel it is time to adopt CORBA or JavaBeans as our development platform, despite the similarities in the general aims of the projects. There are problems with these other projects that we will inherit if we adopt these standards. For example, CORBA is fairly heavy-weight and expensive. JavaBeans is still immature. The choice of which of these or other similar component frameworks to use, or whether to develop our own, should be a matter for discussion by the group.

In any event, there is still much to be learned from these projects. The conceptualization of the OHP not so much as a protocol but as a family of interfaces to distributed objects, beans, entities, Sprocs, or general servers is useful. There are many ideas we can borrow concerning issues such as naming, location services, and other important functionality. We should attempt to do only as much work as we need, building on existing applicable work when appropriate, and proposing new solutions when required.

Several existing OHS's provide general development tools for the construction of servers and clients. Oftentimes, however, these tools are not the focus of our academic papers or discussions about our systems. We should make a concerted effort to exchange information about these aspects of our work, considering the potentially large impact these tools could have on the group's work. For example, the HOSS Protocol Definition Compiler described in Section 5 can greatly reduce the time it takes to develop IPC libraries for different protocols. The group could choose to adopt this program as a starting point, concentrating on service definition instead of debating issues of encoding or building IPC libraries by hand. Other systems will have similar tools and methods that will make the group's work easier. We will likely find that after pooling the resources and experiences of the group membership, the design and implementation tasks before us will seem more manageable.

Future Work. The immediate term (1997) goals for the OHSWG should be to endorse a protocol specification for navigation services and a process by which the specification can be altered. Within the next 6-12 months, the group should also agree on specifications for composition and IR services.

There are many possibilities for interesting work in the longer term. We feel that the group should explore the possibility of defining service standards for spatial, taxonomic, literary, and other types of hypertext with their unique forms of structure. We should attempt to transfer this technology to the broader hypertext community as well. Standards concerning common locspec and pspec formats, common naming schemes, and other details not addressed in detail in this paper hold the promise of greater interoperability between the OHSWG representative systems.

References

AIIM. 1997. Open Document Management API. Available via http://www.aiim.org/odma/odma.htm.

Albitz, P. and Liu, C. 1996. DNS and BIND, 2nd Edition. O'Reilly and Associates, Sebastopol, CA.

Anderson, K. 1997. A critique of the open hypermedia protocol. OHS 3 position statement. Available via http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/anderson.html.

Davis, H., Lewis, A., and Rizk, A. 1996. OHP: A draft proposal for a standard open hypermedia protocol (levels 0 and 1: Revision 1.2 - 13th March 1996). Proceedings of the 2nd Workshop on Open Hypermedia Systems (Washington DC, Mar), UCI-ICS Technical Report 96-10. pp. 27-53. Available via http://diana.ecs.soton.ac.uk/~hcd/protweb.htm.

Davis, H., Hall, W., Heath, I., and Wilkins, R. 1992. Towards an integrated information environment with open hypermedia systems. Proceedings of the Fourth ACM Conference on Hypertext (ECHT 92) (Milan, Italy, Dec). ACM Press, New York.

Digital Equipment Corporation. 1997. About AltaVista Search. Available via http://altavista.digital.com/av/content/about_our_story_3.htm.

Gifford, D., Needham, R., and Schroeder, M. 1992. The Cedar file system. Distributed Computing Systems: Concepts and Structures. A. Ananda and B. Srinivasan, Eds. IEEE Computer Society Press, Los Alamitos, CA.

Goose, S., Lewis, A., and Davis, H. 1997. OHRA: Towards an open hypermedia reference architecture and a migration path for existing systems. OHS 3 position statement. Available via http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/goose.html.

Grønbæk, K. 1997. Post to the OHS mailing list on 10 June. Available via http://www.csdl.tamu.edu/ohs/archive/0041.html.

Grønbæk, K. and Wiil, U. 1997. Towards a reference architecture for open hypermedia. OHS 3 position statement. Available via http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/gronbak.html.

Grønbæk, K. and Trigg, R.H. 1996. Toward a Dexter-based model for open hypermedia: Unifying embedded references and link objects. Proceedings of Hypertext '96 (Washington, DC, Mar 16-20).

Hebrawi, B. 1993. OSI Upper Layer Standards and Practices. McGraw-Hill, New York.

IETF. 1997. Uniform Resource Names (urn) Charter. Available via http://www.ietf.org/html.charters/urn-charter.html.

ISO. 1990. ISO/IEC 8825 (1990): Specification of Basic Encoding Rules (BER) for ASN.1. Transaction Publishers, New Brunswick, NJ.

JavaSoft. 1997. JavaBeans Home Page. Available via http://splash.javasoft.com/beans/.

Joyce, M. 1991. StorySpace as a hypertext system for writers and readers of varying ability. Proceedings of the Third ACM Conference on Hypertext (HT '91) (San Antonio, TX, Dec). ACM Press, New York, 381-388.

Marshall, C. and Shipman, F. 1995. Spatial hypertext: designing for change. Communications of the ACM 38, 8 (Aug), 88-97.

Nürnberg, P. 1997. Post to the OHS mailing list on 8 June. Available via http://www.csdl.tamu.edu/ohs/archive/0033.html.

Nürnberg, P., Leggett, J., and Schneider, E. 1997. As we should have thought. Proceedings of the Eighth ACM Conference on Hypertext (HT '97) (Southampton, UK, Apr). ACM Press, New York.

Nürnberg, P. 1996. Hypermedia operating systems: A new paradigm for computing. Proceedings of the Seventh ACM Conference on Hypertext (HT '96) (Washington, DC, Mar). ACM Press, New York.

Online Computer Library Center, Inc. 1997. Persistent URL Home Page. Available via http://purl.org/.

Object Management Group. 1997. OMG Home Page. Available via http://www.omg.org/.

Reich, S. 1997. Post to the OHS mailing list on 9 June. Available via http://www.csdl.tamu.edu/ohs/archive/0034.html.

Reiher, P., Heidemann, J., Ratner, D., Skinner, G., and Popek, G. 1994. Resolving file conflicts in the Ficus file system. Proceedings of the Summer 1994 USENIX Conference (Boston, MA, Jun). IEEE Computer Society Press, Los Alamitos, CA, 183-195.

Sun Microsystems, Inc. 1995. The NFS Distributed File Service: NFS White Paper - March 1995. Available via http://www.sun.com/sunsoft/solaris/desktop/nfs.html, Mar.

Trigg, R. and Grønbæk, K. 1997. Heterogeneity, structure, and CSCW: Three challenges for open hypermedia. OHS 3 position statement. Available via http://www.daimi.aau.dk/~kock/OHS-HT97/Papers/trigg.html.

Többicke, R. 1994. Distributed file systems: Focus on Andrew File System/Distributed File System. Proceedings of the IEEE Symposium on Mass Storage Systems (Annecy, France, Jun). IEEE Computer Society Press, Los Alamitos, CA, 23-26.

Walker, B., Popek, G., English, R., Kline, C., and Thiel, G. 1992. The Locus distributed operating system. Distributed Computing Systems: Concepts and Structures. A. Ananda and B. Srinivasan, Eds. IEEE Computer Society Press, Los Alamitos, CA, 145-164.

Welch, B. 1994. A comparison of three distributed file system architectures: Vnode, Sprite, and Plan 9. Computing Systems 7, 2 (Spring), 175-199.

Whitehead, J. 1997. Post to the OHS mailing list on 10 June. Available via http://www.csdl.tamu.edu/ohs/archive/0049.html.

Wiil, U. and Whitehead, J. 1997. Interoperability and Open Hypermedia Systems. In Proceedings of the 3rd Workshop on Open Hypermedia Systems, U. K. Wiil, Ed. Scientific Report 97-01, The Danish National Centre for IT Research.

Wiil, U. and Leggett, J. 1996. The HyperDisco approach to open hypermedia systems. Proceedings of Hypertext '96 (Washington, DC, Mar). ACM Press, New York, 140-148.

Resources

OHSWG Home Page at http://www.csdl.tamu.edu/ohs/.

OHS mailing list archive at http://www.csdl.tamu.edu/ohs/archive/index.html#start.

OHS 1 Workshop at http://www.daimi.aau.dk/~kock/OHS-ECHT94/.

OHS 2 Workshop at http://www.daimi.aau.dk/~kock/OHS-HT96/.

OHS 3 Workshop at http://www.daimi.aau.dk/~kock/OHS-HT97/.

Chimera Home Page at http://www.ics.uci.edu/pub/chimera/.

DHM Home Page at http://www.daimi.aau.dk/~kgronbak/DHM/DHMHome.html.

Euroclid Home Page at http://ourworld.compuserve.com/homepages/Euroclid/.

HOSS Home Page at http://www.csdl.tamu.edu/hoss/.

HyperDisco Home Page at http://www.daimi.aau.dk/~kock/Publications/HyperDisco/.

Microcosm Home Page at http://wwwcosm.ecs.soton.ac.uk/.

Appendix A. Generic Encoding Layer Rules

The following is an algorithm to map a set of generic structure objects to a byte stream suitable for transport with an OSI Layer 4 transport mechanism. The corresponding algorithm to map a byte stream to an object is not provided, but is easily derived from this algorithm.

Let:

Also, in the algorithm, let:

Given {obji}, do:

The byte stream that is written to the Transport Layer is the 8-byte integer "size" followed by the "size"-byte block "block".

Ignored here are issues about integer representation (e.g., 1's complement, 2's complement, etc.) and byte-order. These issues must be addressed by the OHSWG.
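The final framing step, an 8-byte size followed by a block of that many bytes, can be sketched in Python as follows. Because integer representation and byte order are left open above, the choice of an unsigned big-endian size field here is purely an assumption for the example, as are the function names.

```python
import struct
from typing import Tuple

SIZE_FORMAT = ">Q"  # assumption: unsigned 64-bit integer, big-endian
SIZE_LENGTH = struct.calcsize(SIZE_FORMAT)  # 8 bytes


def frame(block: bytes) -> bytes:
    """Prefix a byte block with its length as an 8-byte integer."""
    return struct.pack(SIZE_FORMAT, len(block)) + block


def unframe(stream: bytes) -> Tuple[bytes, bytes]:
    """Split one framed block off the front of a byte stream.

    Returns (block, rest), where rest holds any bytes after the block.
    """
    if len(stream) < SIZE_LENGTH:
        raise ValueError("stream too short to hold a size field")
    (size,) = struct.unpack(SIZE_FORMAT, stream[:SIZE_LENGTH])
    end = SIZE_LENGTH + size
    if len(stream) < end:
        raise ValueError("stream ends before the announced block does")
    return stream[SIZE_LENGTH:end], stream[end:]
```

Returning the remainder lets a reader pull successive blocks from a single stream.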

Appendix B. Sproc Generic Layer Rules

The Sproc Generic Layer receives a set of objects {obji} from the Sproc Specific Layer. One of these objects will be tagged with an attribute named "message_name". This is known as the "head" object. The following four attributes should be added to the head object.

Ignored here are issues about time representation (e.g., seconds since 1/1/70 0:00 GMT, etc.). These issues should be addressed by the OHSWG.

Appendix C. Sproc Specific Layer Rules

The Sproc Specific Layer sends sets of generic structural objects to the Sproc Generic Layer. They must map arbitrary service requests with arbitrary parameters to the generic structure object format. This appendix explains how this can be done.

Assume that a message definition is given in Sproc Interface Format, defined in Appendix D. The algorithm below handles only one message, but is modified in a straightforward way to accommodate multiple messages. Alias types are not explicitly handled below.

Let:

Also, in the algorithm, let:

Given msg, do:

where add_param (num, param_name, value, type) is defined as:

Appendix D. Sproc Interface Format

This appendix describes one possibility for how to specify interfaces to Sprocs. Alternative formats, such as CORBA Interface Definition Language (IDL) [OMG 1997] or other standards should also be considered. This format is based on a subset of the PDCL specifications of the HOSS system. A HOSS program named the Protocol Definition Compiler (PDC) maps PDCL specifications into IPC libraries. Currently, the target libraries generated use HOSS specific IPC calls. With modification, this compiler could generate code that includes the Sproc Generic and Generic Encoding Layer functionality as well, mapping directly from specifications to byte blocks suitable for transport via an OSI Layer 4 medium.

The semantics of the grammar below are not specified in detail here, but a brief description follows. Basically, specifications consist of a protocol (interface) identifier, a list of legal "targets" or classes of processes in the protocol (often "client" and "server"), a list of type definitions, and a list of services. There are several built-in types including cardinal, integer, byte, NULL-terminated strings, and identifier (opaque byte block), as well as arrays of these types. The type declaration facilities allow other types to be composed into arrays, structures, and linked lists.

Each service definition (sequence) consists of a name, an initiator target, and an arbitrary number of parts (messages). Each part consists of a name, a source target, a destination target, and an arbitrary number of parameters.

Currently, the PDC generates two "basic" libraries per protocol and one library per target per protocol. The basic libraries contain code to initialize the source process identifier and to encode and decode parameters of protocol-specific types. The target libraries contain functions to send and receive messages for which the target was specified as the source or destination, respectively. Also, for sequences that specify the target as the initiator, a function that composes the appropriate sending and receiving functions is generated. (This provides an "RPC-like" synchronous blocking interface to services for initiators.) The examples provided in Section 5 and the Appendix E specification of the current Davis et al. proposal in this format should provide clarifying examples.
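To give a feel for the "RPC-like" interface, the following Python sketch composes a send function and a receive function for the CloseDocuments sequence into a single blocking call. It is not output of the PDC, which targets HOSS-specific IPC calls; the LoopbackChannel class, the dictionary message representation, and all function names are hypothetical stand-ins.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Message = Dict[str, object]


class LoopbackChannel:
    """Stand-in for an IPC connection: a local handler produces the replies."""

    def __init__(self, handler: Callable[[Message], Message]) -> None:
        self._handler = handler
        self._inbox: Deque[Message] = deque()

    def send(self, msg: Message) -> None:
        self._inbox.append(self._handler(msg))

    def receive(self) -> Message:
        return self._inbox.popleft()


def send_close_documents(chan: LoopbackChannel, doc_ids: List[bytes]) -> None:
    """Per-message send function for CloseDocuments."""
    chan.send({"message_name": "CloseDocuments",
               "num_docs": len(doc_ids),
               "doc_ids": list(doc_ids)})


def receive_close_documents_respond(chan: LoopbackChannel) -> object:
    """Per-message receive function for CloseDocumentsRespond."""
    reply = chan.receive()
    if reply.get("message_name") != "CloseDocumentsRespond":
        raise ValueError("unexpected reply")
    return reply["rc"]


def close_documents(chan: LoopbackChannel, doc_ids: List[bytes]) -> object:
    """Composed call for the initiator: send, then wait for the reply."""
    send_close_documents(chan, doc_ids)
    return receive_close_documents_respond(chan)
```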

Character sequence definitions. The following are shorthand for sequences of characters referred to in this specification. They are specified as UNIX regular expressions. Although not shown here, the regular expression for the ID sequence should be understood to exclude the keywords listed below.

WS        [ \t\n]
COMMENT   #.*\n
ID        [a-zA-Z]+[a-zA-Z0-9_]*
DECNUM    [0-9]+
HEXNUM    0x[a-fA-F0-9]+
STRING    \"[^\"\n]*\"

Keyword list.

ALIAS       FOR        SEQUENCE
ARRAY       GENERATE   SIZE
DECLARED    LIST       STATIC
DECODE_FN   MESSAGE    STRUCTURE
DIMENSION   NEXT       TARGET
DYNAMIC     OF         TO
ENCODE_FN   POINTER    TYPE
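A minimal tokenizer built from the character sequences and keyword list above might look like the following sketch. It is not the PDC's lexer. It assumes that the ID class consists of letters, digits, and underscore, that a comment may end at end of input without a newline, and that "->" may be treated as a single token; the PUNCT group and the function name tokenize are likewise illustrative.

```python
import re

KEYWORDS = {
    "ALIAS", "ARRAY", "DECLARED", "DECODE_FN", "DIMENSION", "DYNAMIC",
    "ENCODE_FN", "FOR", "GENERATE", "LIST", "MESSAGE", "NEXT", "OF",
    "POINTER", "SEQUENCE", "SIZE", "STATIC", "STRUCTURE", "TARGET",
    "TO", "TYPE",
}

# HEXNUM is tried before DECNUM so that "0x1F" is not split at the "0".
TOKEN_RE = re.compile(r"""
    (?P<WS>[ \t\n]+)
  | (?P<COMMENT>\#.*\n?)
  | (?P<HEXNUM>0x[a-fA-F0-9]+)
  | (?P<DECNUM>[0-9]+)
  | (?P<ID>[a-zA-Z]+[a-zA-Z0-9_]*)
  | (?P<STRING>"[^"\n]*")
  | (?P<PUNCT>->|[-:;,.{}\[\]()>=])
""", re.VERBOSE)

def tokenize(text):
    """Yield (kind, lexeme) pairs, skipping white space and comments."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(
                "unexpected character %r at offset %d" % (text[pos], pos))
        pos = match.end()
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("WS", "COMMENT"):
            continue
        # The ID sequence excludes keywords, so reclassify them here.
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        yield kind, lexeme

tokens = list(tokenize("# targets\nTARGET server\nTYPE x SIZE 0x1F;"))
```

Reclassifying keywords after matching ID is one simple way to honor the rule that the ID sequence excludes the keywords; an alternative would be a negative lookahead in the ID pattern itself.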

Grammar specification. The grammar consists of special token sequences and keywords. Tokens are delimited by white space as defined in the WS sequence. Additionally, the grammar refers to characters in the set {':', ';', ',', '.', '{', '}', '[', ']', '(', ')', '-', '>', '='}, which are rendered in bold to distinguish them from the grammar specification symbols. Comments are ignored.

PDCLFile:: TargetDecl* TypeDecl* SeqDecl*
TargetDecl:: TARGET Id
TypeDecl:: TypeDeclHeader Id TypeDeclRest ;
TypeDeclHeader:: TYPE [ ( Loc [ , Loc ]* ) ]
Loc:: ( DECLARED | ENCODE_FN | DECODE_FN ) = ( GENERATE | STRING )
TypeDeclRest:: ( SimpleTypeDeclRest | AliasTypeDeclRest | StaticArrayTypeDeclRest | DynArrayTypeDeclRest | StructTypeDeclRest | ListTypeDeclRest | PointerTypeDeclRest )
SimpleTypeDeclRest:: SIZE STRING
AliasTypeDeclRest:: ALIAS FOR Id
StaticArrayTypeDeclRest:: STATIC ARRAY [ SimpleDim [ , SimpleDim ]* ] OF Id
SimpleDim:: ( Id | STRING | NUMBER )
DynArrayTypeDeclRest:: DYNAMIC NUMBER DIMENSION ARRAY OF Id
StructTypeDeclRest:: STRUCTURE ( FieldDecl* )
FieldDecl:: Id : [ [ Index [ , Index ]* ] ] ;
Index:: Id [ ( ) ]*
ListTypeDeclRest:: LIST ( FieldDecl* ) NEXT = Id
PointerTypeDeclRest:: POINTER TO Id
SeqDecl:: SEQUENCE Id Id MessageDecl*
MessageDecl:: MESSAGE Id Id -> Id ( ParamDecl* ) ;
ParamDecl:: Id : Id [ [ Index [ , Index ]* ] ] ;
Id:: ID | ALIAS | ARRAY | DECLARED | DECODE_FN | DIMENSION | DYNAMIC | ENCODE_FN | FOR | GENERATE | LIST | MESSAGE | NEXT | OF | POINTER | SEQUENCE | SIZE | STATIC | STRUCTURE | TARGET | TO | TYPE

Appendix E. Current Protocol Proposal in Sproc Interface Format

This is a rendering of the current OHP proposal by Davis et al. in the Sproc Interface Format defined in Appendix D. The changes suggested in Anderson are not included, even though some were widely agreed upon. This specification is intended largely to provide an example interface.

# These are the target classes.  A program may belong to more than one class.
TARGET  server
TARGET  client
TARGET  all

# These are the types not provided intrinsically
TYPE  locspec_t  STRUCTURE (
   ContentType : NTstring;
   Content : NTstring;
   Count : NTstring;
   ReverseCount : NTstring;
   Name : NTstring;
   Script : NTstring;
);

TYPE  locspec_1d  DYNAMIC 1 DIMENSION ARRAY OF  locspec_t;

# The services

# These are the server-initiated calls
SEQUENCE  LaunchDocument  server
   MESSAGE  LaunchDocument  server -> client
   {
      DocumentName : NTstring;
      ReadOnly : boolean;
      DocumentNickname : NTstring;
      DocumentType : NTstring;
      DataCallback : boolean;
      Channel : integer;
   };

SEQUENCE  DisplayAnchor  server
   MESSAGE  DisplayAnchor  server -> client
   {
      AnchorId : id_t;
      Presentation : NTstring;
      Channel : integer;
   };

SEQUENCE  DisplayLocSpec  server
   MESSAGE  DisplayLocSpec  server -> client
   {
      LocSpec : locspec_t;
      Presentation : NTstring;
      Channel : integer;
   };

SEQUENCE  Interpret  server
   MESSAGE  Interpret  server -> client
   {
      ScriptType : NTstring;
      Data : NTstring;
      Channel : integer;
   };

SEQUENCE  CloseNode  server
   MESSAGE  CloseNode  server -> client
   {
      UpdateNode : boolean;
      Channel : integer;
   };
   MESSAGE  UpdateAnchors  client -> server
   {
      num_ancs : integer;
      AnchorIds_ids : id_1d[num_ancs];
      LocSpecs : locspec_1d[num_ancs];
      Directions : NTstring_1d[num_ancs];
      Presentations : NTstring_1d[num_ancs];
      Services : NTstring_1d[num_ancs];      
      Channel : integer;
   };

# These are client-initiated calls
SEQUENCE  CouldntFindThisMessage  client
   MESSAGE  CouldntFindThisMessage  client -> server
   {
      Channel : integer;
   };
   MESSAGE  HeresNewAnchor  server -> client
   {
      AnchorId : id_t;
      LocSpec : locspec_t;
      Direction : NTstring;
      Presentation : NTstring;
      Service : NTstring;
      Channel : integer;
   };

SEQUENCE  GetNode  client
   MESSAGE  GetNode  client -> server
   {
      DocumentName : NTstring;
      Channel : integer;
   };
   MESSAGE  HeresDocument  server -> client
   {
      Data : NTstring;
      Channel : integer;
   };

SEQUENCE  GetServices  client
   MESSAGE  GetServices  client -> server
   {
      Channel : integer;
   };
   MESSAGE  HeresServices  server -> client
   {
      num_items : integer;
      Menuitems : NTstring_1d[num_items];
      Services : NTstring_1d[num_items];
      Channel : integer;
   };

SEQUENCE  GetAnchorTable  client
   MESSAGE  GetAnchorTable  client -> server
   {
      Channel : integer;
   };
   MESSAGE  HeresAnchorTable  server -> client
   {
      num_items : integer;
      LocSpecs : locspec_1d[num_items];
      Directions : NTstring_1d[num_items];
      Presentations : NTstring_1d[num_items];
      Services : NTstring_1d[num_items];
      Channel : integer;
   };

SEQUENCE  CreateNode  client
   MESSAGE  CreateNode  client -> server
   {
      DocumentName : NTstring;
      DocumentType : NTstring;
   };
   MESSAGE  HeresNewChannel  server -> client
   {
      SendDocument : boolean;
      DocumentNickname : NTstring;
      Channel : integer;
   };

SEQUENCE  GetAnchorTable  client
   MESSAGE  GetAnchorTable  client -> server
   {
      num_items : integer;
      LocSpecs : locspec_1d[num_items];
      Directions : NTstring_1d[num_items];
      Presentations : NTstring_1d[num_items];
      Services : NTstring_1d[num_items];
      Channel : integer;
   };

SEQUENCE  UpdateNode  client
   MESSAGE  UpdateNode  client -> server
   {
      DocumentType : NTstring;
      Data : NTstring;
      Channel : integer;
   };

SEQUENCE  Closing  client
   MESSAGE  Closing  client -> server
   {
      Channel : integer;
   };

SEQUENCE  RequestService  client
   MESSAGE  RequestService  client -> server
   {
      Service : NTstring;
      AnchorId : id_t;
      LocSpec : locspec_t;
      Presentation : NTstring;
      Channel : integer;
   };

# These are the "bi-directional" messages
SEQUENCE  Error  all
   MESSAGE  Error  all -> all
   {
      Subject : NTstring;
      Message : NTstring;
      Channel : integer;
   };

SEQUENCE  Other  all
   MESSAGE  Other  all -> all
   {
      num_tags : integer;
      tag_names : NTstring_1d[num_tags];
      tag_vals : NTstring_1d[num_tags];
      Channel : integer;
   };


original page URL: http://jodi.ecs.soton.ac.uk/Articles/v01/i02/Nurnberg/jodi.html