PROJECT PROPOSAL

A Scalable Internet Resource Service

M.R. van Steen, A.S. Tanenbaum

September 25, 1998

1 Problem Statement

1.1 Overloaded Resource Servers

The current internet is built around a client/server model in which clients fetch information from resource servers. In this proposal we restrict ourselves to resources built from files and ignore, for example, dynamically generated information. Resource servers typically provide access to a file system through different high-level protocols. At present, many clients access resource servers in a uniform way by means of Web browsers. Modern browsers support the traditional transfer protocols, such as HTTP and FTP, but also support indirect access to servers through local proxies.
A major problem with many resource servers is that they may easily be swamped with requests for information. When requests are more or less uniformly distributed across the entire set of files available at the server, it makes sense to replicate the entire server to balance the load.
When replicas can share a common address, it is relatively easy to balance requests evenly using a round-robin strategy. This approach has been shown to work when using DNS domain names as addresses in combination with BIND's round-robin feature.
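As a minimal illustration of the round-robin strategy, the following Java sketch rotates requests across a fixed set of replica addresses. It is not how BIND implements its feature; the class name RoundRobinSelector and the addresses (taken from the documentation range 192.0.2.0/24) are assumptions made for this example only.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: rotates through replica addresses that share one name.
final class RoundRobinSelector {
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinSelector(List<String> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("at least one replica is required");
        }
        this.replicas = List.copyOf(replicas);
    }

    // Returns the next replica in rotation; may be called from several threads.
    String pick() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector =
                new RoundRobinSelector(List.of("192.0.2.1", "192.0.2.2", "192.0.2.3"));
        for (int request = 0; request < 4; request++) {
            System.out.println("request " + request + " -> " + selector.pick());
        }
    }
}
```

Each replica receives the same share of requests regardless of its actual load, which is why this strategy only suits the case of uniformly distributed requests.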
When sharing a common address is not possible, replicas are called mirror sites. Each mirror site has to be made publicly known under a different name. In practice, this approach works well if a client can always efficiently access a main site, and be redirected to a mirror when bulk data needs to be transferred.

1.2 Scaling through Distribution and Replication

Replicating an entire server can easily be overkill. In many cases, access problems are caused by a small subset of the files available at the server. A typical example is a new software distribution, which may lead to a temporary access problem while clients are downloading it. Another example is a Web server that supports multimedia files, that is, files that are large and whose content is best transferred by streams. Although the server may actually be capable of simultaneously servicing many clients, congestion may occur because a few users are transferring audio and video through long-lived streams.
What is needed in this case is a solution that will allow us to selectively distribute and replicate files. Each file should have a location-independent name that will allow us to distribute incoming client requests across the servers that manage that file. In this way, an internet resource server becomes a truly distributed resource service.
Scaling resource services requires not only that we can select particular files to distribute and replicate, but also that we can support different replication policies. Consider the following examples.

Software distributions. A software distribution is often constructed as a collection of archive files. The characteristic feature is that as soon as updates are announced, a server will get a large number of requests during a relatively short time, which gradually converges to a stable number of requests per time unit. A suitable replication strategy would be to replicate the updates first to a number of core servers, and then announce their availability. Clients would then contact the nearest core server.
News articles. News articles, including news flashes, announcements of timely events, etc., could also benefit from a push strategy, just like updating software distributions. However, in this case it may be worthwhile to have a mechanism by which clients subscribe to certain subjects. In this way, the locations to which files need to be replicated can be controlled much better.
Scientific papers. There are also many examples in which replication may actually be a waste of resources. For example, scientific papers that have been made electronically available are generally read by a relatively small group of people, and simultaneous access to them is not an issue. In such cases, replication should not be done.
Internet RFCs. As a last example, consider files that would benefit from replication, but for which immediate replication is not necessary. Internet RFCs are an example. In such cases, we could successfully follow a lazy scheme in which replicas are pulled in on demand at another server. In other words, only after a client actually wants to read an RFC will a server go to the original source and store a copy of the RFC before handing it to the client.
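The lazy scheme of the last example can be sketched as a pull-through store. The Java fragment below is only an illustration under assumed names (LazyReplica, origin); it keeps copies in memory and ignores consistency, expiry, and failures, all of which a real server would have to handle.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical lazy replica: pulls a file from the original source on the
// first request and serves the stored copy on every later request.
final class LazyReplica {
    private final Map<String, byte[]> store = new ConcurrentHashMap<>();
    private final Function<String, byte[]> origin; // assumed fetch-from-source hook

    LazyReplica(Function<String, byte[]> origin) {
        this.origin = origin;
    }

    // Returns the file contents, contacting the origin only if no copy exists yet.
    byte[] get(String name) {
        return store.computeIfAbsent(name, origin);
    }

    // Number of files currently replicated at this server.
    int localCopies() {
        return store.size();
    }

    public static void main(String[] args) {
        LazyReplica replica = new LazyReplica(
                name -> ("contents of " + name).getBytes(StandardCharsets.UTF_8));
        replica.get("rfc2141.txt"); // first request: pulled from the origin
        replica.get("rfc2141.txt"); // second request: served from the local copy
        System.out.println("local copies: " + replica.localCopies());
    }
}
```

The push strategies of the first two examples would instead fill the store before any client asks, which is the main difference between the policies.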

In this project, we are initially seeking a simple but expandable solution that will allow us to selectively distribute and replicate files across several servers. Such a solution is described below.

2 A Globe-Based Solution

At the Vrije Universiteit, Amsterdam, we are developing a wide-area distributed system called Globe [5]. The main idea of Globe is that it provides physically distributed objects. These kinds of objects allow an object's state to be distributed and replicated across multiple machines. How this is done, and how replicas are kept consistent, is decided and implemented on a per-object basis. In other words, Globe allows fine-tuning of distribution and replication policies as is most convenient for a specific object. Documentation on Globe can be found through our Web site at http://www.cs.vu.nl/globe/.
The Globe approach can be followed to develop scalable internet resource services. The main idea is to encapsulate a resource (i.e., one or more files) into a Globe distributed object, and to subsequently associate the most appropriate replication and distribution strategy with that object. This approach requires adaptations on servers and clients, which we describe next.

2.1 Server Support

A Globe distributed object generally resides at multiple machines at the same time, as shown in Figure 1. Each local object, that is, the implementation of the distributed object in a specific address space, is constructed using standardized components. These include components for communication between local objects, components for locally implementing a specific replication strategy, and components that encapsulate the actual state of a resource.

(Figure 1: The general organization of a Globe distributed object.)

To take full advantage of Globe's capabilities, a distributed resource would be implemented as a Globe distributed object. This is done by creating several processes, each running on a different machine and each maintaining copies of the resource's files. To the outside world, however, the location of the processes, and thus the copies, is completely transparent. A client merely gets to see an interface to the distributed object. In addition, how the copies are kept consistent can be entirely resource-specific.
We propose to develop a simple mechanism that will allow a resource server to create and execute a process capable of supporting Globe distributed objects. The files will remain accessible in the usual way.

2.2 Client Support

In Globe, we assume resources are identified through a specific naming scheme, which we refer to as a Globe Uniform Resource Name (Globe URN). A Globe URN is a human-readable name that adheres to the general URI syntax as defined in RFC 1630 [1], having globe as its scheme identifier. (The precise syntax of Globe URNs is currently subject to research, but will adhere to the rules described in RFC 2141 [4].)
For this project, we propose to support three naming conventions for resources. Consider a resource res.txt which has been replicated at two different servers, and which can be located using either one of the following normal URLs: ftp://ftp.cs.vu.nl/pub/steen/res.txt or http://www.cs.tudelft.nl/ikuz/project.txt. We assume clients access and transfer resources through a Web browser. The three naming conventions we wish to support are as follows:

The resource is accessible and transferable using either URL. In that case, the FTP server at ftp.cs.vu.nl or the HTTP server at www.cs.tudelft.nl, respectively, will be directly requested to transfer the file to the client.
The resource will also be assigned a Globe URN, such as globe:/projects/descr.txt. This name can be resolved to one of the original URLs, after which normal transfer can take place. Actual name resolution is described below.
The resource is also accessible through the embedded Globe URN http://HOST/projects/descr.txt, which can be readily processed by existing Web browsers. HOST is a pre-defined host on the internet.
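The relation between the second and third naming conventions can be illustrated as follows. Since the precise syntax of Globe URNs is still subject to research, this Java sketch merely assumes that an embedded Globe URN is obtained by replacing the globe: scheme with http://HOST; the class GlobeNames and its methods are hypothetical, and HOST remains a placeholder.

```java
// Hypothetical conversion between a normal Globe URN and its embedded form.
// Assumes the embedded form is "http://" + host + path, as in the examples above.
final class GlobeNames {
    private static final String SCHEME = "globe:";

    // globe:/projects/descr.txt -> http://HOST/projects/descr.txt
    static String toEmbedded(String globeUrn, String host) {
        if (!globeUrn.startsWith(SCHEME + "/")) {
            throw new IllegalArgumentException("not a Globe URN: " + globeUrn);
        }
        return "http://" + host + globeUrn.substring(SCHEME.length());
    }

    // http://HOST/projects/descr.txt -> globe:/projects/descr.txt
    static String fromEmbedded(String url, String host) {
        String prefix = "http://" + host + "/";
        if (!url.startsWith(prefix)) {
            throw new IllegalArgumentException("not an embedded Globe URN: " + url);
        }
        // Keep the leading slash of the path.
        return SCHEME + url.substring(prefix.length() - 1);
    }

    public static void main(String[] args) {
        String embedded = toEmbedded("globe:/projects/descr.txt", "HOST");
        System.out.println(embedded);
        System.out.println(fromEmbedded(embedded, "HOST"));
    }
}
```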

Name resolution, resource access, and transfer of files take place by means of a Globe proxy which is installed at a client's site. A Globe proxy has the following capabilities:

It can handle normal URLs, by passing them to appropriate (local) proxies, or directly to servers.
It can resolve (normal and embedded) Globe URNs to normal URLs associated with the named resource. When all normal URLs have been found for that resource, the proxy selects "the best one," and contacts the associated server. In this way, load balancing across the servers where the resource is replicated is established.
If a (normal or embedded) Globe URN refers to a Globe distributed object, that is, a resource that has been encapsulated into a distributed object, the proxy can bind to that object (see [5]), and transfer the object's state to the client's site in an object-specific way.
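The second capability, resolving a Globe URN and selecting one of the resulting URLs, can be sketched in Java as follows. What counts as "the best one" is left open above; this sketch assumes a caller-supplied cost per URL (for example a measured round-trip time) and picks the lowest. The class ProxyResolver, the in-memory name table, and the cost values are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.ToLongFunction;

// Hypothetical resolver inside a Globe proxy: maps a Globe URN to the normal
// URLs of its replicas and selects the one with the lowest assumed cost.
final class ProxyResolver {
    private final Map<String, List<String>> nameTable; // Globe URN -> normal URLs
    private final ToLongFunction<String> cost;         // assumed metric, lower is better

    ProxyResolver(Map<String, List<String>> nameTable, ToLongFunction<String> cost) {
        this.nameTable = nameTable;
        this.cost = cost;
    }

    // Returns the cheapest URL for the URN, or empty if the name is unknown.
    Optional<String> resolve(String globeUrn) {
        return nameTable.getOrDefault(globeUrn, List.of()).stream()
                .min(Comparator.comparingLong(cost));
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = Map.of(
                "globe:/projects/descr.txt",
                List.of("ftp://ftp.cs.vu.nl/pub/steen/res.txt",
                        "http://www.cs.tudelft.nl/ikuz/project.txt"));
        // Made-up costs: pretend the FTP replica is farther away.
        ProxyResolver resolver =
                new ProxyResolver(table, url -> url.startsWith("ftp:") ? 50L : 20L);
        System.out.println(resolver.resolve("globe:/projects/descr.txt").orElse("unresolved"));
    }
}
```

Because every proxy applies its own cost estimate, different clients tend to select different replicas, which is how the load is spread across servers.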

Clients that do not have a Globe proxy installed will pass an embedded Globe URN to the server identified in that URN. This server will then resolve the embedded Globe URN to a normal URL and return the latter as an HTTP redirection [2] to the client.
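The fallback for clients without a proxy can be illustrated by the response such a server might send. The Java sketch below only builds the raw HTTP response text and does not open any sockets; the class RedirectingNameServer, its in-memory table, and the choice of status code 302 are assumptions, since the proposal specifies no more than "an HTTP redirection".

```java
import java.util.Map;

// Hypothetical server-side resolution of embedded Globe URNs for clients
// that have no Globe proxy: answers with a redirect to a normal URL.
final class RedirectingNameServer {
    private final Map<String, String> table; // request path -> normal URL

    RedirectingNameServer(Map<String, String> table) {
        this.table = Map.copyOf(table);
    }

    // Builds the response for a request path such as "/projects/descr.txt".
    String respond(String path) {
        String target = table.get(path);
        if (target == null) {
            return "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n";
        }
        // The Location header tells the browser where to fetch the file instead.
        return "HTTP/1.1 302 Found\r\nLocation: " + target
                + "\r\nContent-Length: 0\r\n\r\n";
    }

    public static void main(String[] args) {
        RedirectingNameServer server = new RedirectingNameServer(
                Map.of("/projects/descr.txt", "ftp://ftp.cs.vu.nl/pub/steen/res.txt"));
        System.out.print(server.respond("/projects/descr.txt"));
    }
}
```

An existing browser follows the Location header on its own, so no client-side changes are needed for this path, at the price of always contacting HOST first.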

3 Plan of Work

All implementations will be done in Java using only standardized components where possible. In this way, software is guaranteed to be highly portable, enabling a wide dissemination. All software will be made publicly available. Furthermore, we propose to initially develop prototype implementations to assess the feasibility of our approach.

3.1 Results

The project will provide the following:

A server process that is capable of supporting Globe distributed objects, in particular for distributed resources as explained above.
A client proxy capable of supporting Globe distributed objects, and capable of resolving resource names as described above.
A simple, nondistributed name server that resolves (embedded and normal) Globe URNs to normal URLs, as described above.

3.2 Schedule

Development of the nondistributed name server (6 months)
Development of the client proxy (6 months)
Development of the server process and accompanying tools (12 months)

We propose to split the project into two phases. The first phase has a duration of 12 months in which the name server and client proxy are developed. The resulting software will enable a simple solution to load balancing requests for distributed resource services. It will not yet support Globe distributed objects.
Only if the first phase has been positively evaluated will we continue with a second phase, in which the server process is developed. After the second phase, we will have a prototype implementation for Globe distributed objects that can be used for constructing truly distributed resource services.

3.3 Costs and Resources

We are asking for a full-time systems programmer for a total duration of two years.
Supervision is organized along the same lines as in our other development projects, as follows. A small team is formed consisting of the programmer, a PhD student whose research is directly related to the developments, and one other programmer. This team is supervised by M. van Steen, and meets on a weekly basis. There are also off-line discussions with the PhD student, who acts as sparring partner.
Practice indicates that team meetings on average take 1 hour per week, and off-line discussions approximately 2 hours per week. Additional overall management, supervision, and review by staff of the development efforts is also approximately 1 hour per week.

3.4 Software Management

Software is developed in Java using standard, publicly available Java development tools. Care is taken to ensure portability across different platforms, although we initially concentrate on Unix only. Ports of all our software to Windows NT should eventually take place.
No special measures have yet been taken with respect to software distribution and maintenance. Depending on the success of Globe distributions, the VU is committed to ensuring that these aspects are taken care of, as is demonstrated by former and current projects (the ACK compiler kit, Amoeba, and Minix).

4 The Globe Context

This project will be carried out in the context of the Globe project. As mentioned, Globe aims at developing a wide-area distributed system that can support one billion users worldwide, each having thousands of objects. To meet these ambitious goals, a significant research, development, and experimentation effort is required. Unfortunately, not much is really known regarding truly worldwide scalable solutions. Although it is easy to imagine that specific solutions will not scale, it is much harder to design those that do scale. As a project, Globe can be seen as a systematic attempt to obtain more detailed insight into scalability problems and their solutions.
An important aspect of our approach is that we validate our ideas through implementations that are actually used. In fact, we believe that obtaining insight into scaling aspects can be done only through extensive experimentation. For worldwide scalable systems, this means that our solutions should be demonstrated to work on the internet, with millions of people involved. In the following, we briefly take a look at each of the three main activities in Globe: research, development, and experimentation.

4.1 Research in Globe

Globe has now reached a point at which an initial architectural design has been finished. The architectural design will be outlined in IEEE Concurrency [5], whereas details are currently being written down in a Ph.D. thesis [3]. The main research issues we are currently addressing are the following.

Security. Although some initial work on security has been done, we are now taking a much closer look at how Globe's distributed objects can be made secure. In particular, we are searching for an extension of our framework that will allow us to easily incorporate existing and future security algorithms and implement very different policies. At present, we have a PhD student working full-time on this subject.

Object composition. To build large-scale applications, we need appropriate mechanisms to combine our distributed objects into larger distributed objects that fit into the same framework. Object composition is relatively simple in a client/server environment where scaling through replication is not really an issue. When dealing with highly replicated objects, object composition becomes much harder. For example, when a replicated object invokes a method of another replicated object, we have to ensure that the invoked method is executed only once. Global or centralized coordination is out of the question in a wide-area system. Other problems of a similar nature arise solely because we seek worldwide scalable solutions. Object composition is the main subject of another PhD student.

Naming and locating objects. A distributed system needs a naming service to locate the current address of an object. In a wide-area system, where objects may persist over decades, may be highly replicated, and above all, may change their location as fast as the network allows, finding objects is troublesome. One of our PhD students is currently investigating a novel approach to locating (possibly rapidly moving) objects worldwide.

A Web-based Globe. Related to our development and experimentation efforts is the building of a simple version of Globe that treats Web documents as Globe distributed objects. Simplicity is obtained by ignoring object composition and keeping security to a minimum. A Web document is a collection of HTML pages, together with icons, images, applets, etc. The main goals are to investigate scalable replication strategies on a per-document basis, and to assist developers and users in identifying the best strategy for their document. This work is closely related to the current project proposal. The research is being carried out by a PhD student, in collaboration with the TU Delft.

4.2 Development in Globe

Validating ideas in an area where research has only recently started is crucial. Therefore, a significant part of the Globe effort should be spent on development of the actual system. However, development efforts should be clearly separated from research, as their goals may conflict. Where research is targeted at identifying and solving problems, our development efforts are aimed at building prototype implementations that can be used across the internet by a community other than the Globe researchers. Consequently, prototyping is to be taken as a serious engineering effort. For this reason, detailed design and implementation efforts are done by a separate group of systems programmers.
Each programmer is a member of the Globe team, and works in close collaboration with our researchers. The main role of PhD students with respect to development is to provide an initial design of the relevant software components, and to act as sparring partner to the programmer when it comes to details and implementation. Research and development are supervised by staff (van Steen and Tanenbaum).
So far, we have developed an Interface Definition Language (IDL) that allows us to specify the interfaces of Globe's distributed objects. There is currently an IDL-to-C and an IDL-to-Java compiler available. Our current development efforts concentrate on building an initial location service (funded by Océ R&D), and building a Globe Web proxy. The latter is a simplified version of the client proxy mentioned in this project proposal. For example, it provides no support for resolving resource names, and will have only a barebones interface to existing browsers. All our implementations are currently done in Java.
We currently have two systems programmers working full-time on building prototype implementations.

4.3 Experimentation in Globe

The ultimate goal of our research and development is to conduct experiments on the internet with real users in the loop. To come to that point, it is absolutely vital that our implementations have been thoroughly engineered and tested, and are easily available.
For these reasons, we seek professional development support (as explained above), and aim at making our software publicly available at distribution costs only.

4.4 Role of "Stichting NLnet"

This project proposal fits into our approach of seeking external funding for developing and maintaining the Globe software, rather than allocating money to development efforts from our research budgets. Moreover, it allows us to bind development to external, nonacademic projects, which we believe will contribute positively to the quality of our implementations.
In this sense, a possible role of Stichting NLnet is to support our development efforts, and the dissemination of Globe on the internet. Effectively, this would make the Stichting an important partner in the Globe project, complementary to the expertise available. The primary role of the Vrije Universiteit is to do research in wide-area distributed systems, do the actual development and experimentation, and ensure expertise for maintenance is available. The primary role of the Stichting could be to financially support development and maintenance, and to bring in their expertise on disseminating and maintaining public domain internet software.
This project proposal could therefore be seen as the starting-point of a long-term cooperation. Roughly, we plan to develop an initial version of Globe along the following lines:

Build a simple Web-based version of Globe as explained above. We are currently developing a Globe Web proxy to meet this goal. This project proposal contributes to enhancing that proxy. A more significant contribution is the Globe server. The latter will allow us to install object replicas across the internet.
Build a simple distribution scheme for (dynamically) downloading implementations of Globe distributed objects. This scheme requires the development of implementation repositories, which we envisage to be existing file systems that are accessible through standard internet protocols such as FTP or HTTP. Globe itself can be distributed through standard distribution channels for public domain software.
Build a fault-tolerant scalable location service that will allow us to register Globe objects that can subsequently be looked up. Basically, the location service returns the address where an object can be contacted, and identifies the implementation repository from which the client should download the necessary implementation for that object. Building the location service is done in parallel with the previous two activities.

At that point, a basic system for supporting Globe distributed objects is available. This system is extensible in the sense that users can dynamically enhance the capabilities of a Globe distributed object by simply downloading implementations of strategies for replication, migration, security, etc. as they become available. This approach is somewhat comparable to the use of plug-ins in Web browsers.
Once the basic system is finished, development continues along two lines. First, we expect that research will lead to adaptations in the Globe architecture, subsequently leading to new versions and releases of the system itself. Second, much of our research is targeted towards finding scalable algorithms for replication, security, etc. These algorithms will be implemented and made available through implementation repositories. The architecture of Globe has been designed to allow these implementations to be dynamically downloaded into existing objects as they become available. (Effectively, this means that if third parties develop software that fits into our architecture, that software can, in principle, be made available as well.)

References

[1] T. Berners-Lee. "Universal Resource Identifiers in WWW." RFC 1630, June 1994.
[2] R. Fielding et al. "Hypertext Transfer Protocol - HTTP/1.1." Internet draft, Aug. 1998.
[3] P. Homburg. The Architecture of a Worldwide Distributed System. PhD thesis, Vrije Universiteit, Department of Mathematics and Computer Science. In preparation.
[4] R. Moats. "URN Syntax." RFC 2141, May 1997.
[5] M. van Steen, P. Homburg, and A. Tanenbaum. "The Architectural Design of Globe: A Wide-Area Distributed System." IEEE Concurrency, 7(1), Jan. 1999. Scheduled for publication.
