
Quick Start

Here are three simple Python scripts that demonstrate how to use OpenDHT: put.py, get.py, and rm.py.

Here is an example usage scenario:

$ ./put.py colors red
Success
$ ./get.py colors
red
$ ./put.py --secret donttell colors blue
Success
$ ./get.py colors
red
blue
$ ./rm.py colors blue donttell 
Success
$ ./get.py colors
red

For details, please read on.

How It Works

OpenDHT runs on a collection of 200-300 nodes on PlanetLab. Each of these PlanetLab nodes is a Linux host that runs the Bamboo DHT implementation. Each node in the OpenDHT deployment holds a portion of the DHT's total storage on its local disk, answers put, get, and rm requests for that portion of the DHT's total storage, and routes messages to other DHT member nodes.

Each node also serves as a gateway into the DHT for clients. A gateway accepts RPC operations from clients, forwards those messages into the DHT, and forwards the corresponding responses to those operations from the DHT to the appropriate client.

You can find a list of all active OpenDHT servers at http://opendht.org/servers.txt. This list is dynamically updated to reflect the state of the OpenDHT deployment. You will generally experience the shortest latency for completion of OpenDHT RPCs if you choose a gateway that is topologically near you on the Internet. You can find a nearby OpenDHT gateway using OASIS by looking up the DNS name opendht.nyuld.net, or by using Barath Raghavan's Python script find-gateway.py.

There are two interfaces by which clients access OpenDHT: one using XML RPC over HTTP on port 5851 and one using Sun RPC over TCP on port 5852. The interfaces are semantically identical; only the "on-the-wire" protocol differs between them. Below we give an overview of the protocol, present the precise semantics, and then present the details particular to the Sun RPC and XML RPC interfaces.

Because this is a new service, things are still changing from time to time. For this reason, we strongly encourage users of OpenDHT to subscribe to the OpenDHT-Users mailing list. Announcements concerning the state of the OpenDHT deployment, changes to the RPC interface, and other useful information are sent to this list.

Protocol Overview

OpenDHT supports a narrow put-get-rm interface. The put RPC writes a key-value pair into the DHT, the get RPC retrieves one or more values previously put under a key, and the rm RPC removes values from the DHT. The current OpenDHT deployment limits a single value to 1 kilobyte in size.

As the total storage in the system is limited, OpenDHT cannot offer infinite persistence; a buggy or malicious client could easily consume all storage on a DHT node permanently if puts persisted forever. Instead, OpenDHT treats stored values as soft state by expiring them after a time-to-live (TTL) interval. The client who puts a key-value pair specifies the pair's TTL. The OpenDHT service guarantees not to expire the key-value pair before that TTL passes, barring catastrophic failure of the DHT service.

No notification is sent to clients when the TTL expires; clients must manage their storage to achieve the degree of persistence they desire. Should a client wish to cause its key-value pair to persist beyond its original TTL, the client may refresh the pair by issuing another put. The TTL in the latest put will take effect for that value.

In the current OpenDHT deployment, the longest TTL permitted is one week.
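The two limits stated above (1 kilobyte per value, one week per TTL) can be checked on the client before a put is sent. The following is a minimal sketch, assuming that "1 kilobyte" means 1024 bytes and that a TTL must be positive; the names MAX_VALUE_BYTES, MAX_TTL_SEC, and check_put_limits are hypothetical and not part of OpenDHT.

```python
# Client-side sanity check for the limits described above.
# Assumptions: "1 kilobyte" is taken to mean 1024 bytes, and a TTL must be > 0.
MAX_VALUE_BYTES = 1024
MAX_TTL_SEC = 7 * 24 * 60 * 60  # one week, in seconds


def check_put_limits(value: bytes, ttl_sec: int) -> None:
    """Raise ValueError if a put would exceed the documented limits."""
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError(f"value is {len(value)} bytes; limit is {MAX_VALUE_BYTES}")
    if not 0 < ttl_sec <= MAX_TTL_SEC:
        raise ValueError(f"ttl_sec must be in 1..{MAX_TTL_SEC}, got {ttl_sec}")


check_put_limits(b"red", 3600)  # within both limits, so no exception is raised
```

A check like this only saves a round trip; the gateway remains the authority on what it accepts.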

At times, an application may need to remove a value before its TTL expires. If the value was put into the DHT using a secret hash y such that y=SHA(x), then any client that knows the secret x can remove the value by issuing a remove request with the same key as the put, the SHA-1 hash of the value it wishes to remove, and the secret x. Because this request can be observed on the network (using tcpdump, for example), the secret can be learned by an adversary. For this reason, an application should use a different secret for each value put into the DHT. To insert a value that cannot be removed, a null secret hash may be used.
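The hashes involved in a removable put and its later remove can be computed with Python's standard hashlib module. This is a sketch only: the function names are hypothetical, and generating a fresh random secret per value with secrets.token_bytes is one possible way to follow the advice above, not something OpenDHT prescribes.

```python
import hashlib
import secrets


def secret_hash_for_put(secret: bytes) -> bytes:
    """y = SHA(x): the secret hash supplied when the value is put."""
    return hashlib.sha1(secret).digest()


def value_hash_for_rm(value: bytes) -> bytes:
    """SHA-1 hash of the value, supplied in the remove request."""
    return hashlib.sha1(value).digest()


secret = secrets.token_bytes(20)   # x: a fresh secret for this one value
y = secret_hash_for_put(secret)    # goes into the put
h = value_hash_for_rm(b"blue")     # goes into the rm, together with x itself
print(len(y), len(h))              # SHA-1 digests are 20 bytes each
```

The client must remember the secret x for as long as it may want to remove the value, since only the hash y is stored in the DHT.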

Precise Semantics

A put in OpenDHT takes the following five arguments:

It returns an integer equal to one of three values:

The application and client_library fields are for logging purposes, like the "User-Agent" header in HTTP. Set application to the name of your application, and client_library to the name of the library you're using to access OpenDHT (more about that below). If you really don't want to give us that information, that's fine, but it will make our job harder. Also, if you do tell us and we see that 90% of the clients accessing OpenDHT are using your application, you'll probably get a lot more technical support and other favors from us in the future, so it's in your interest to set it, too.

The key and value are the key and value you want to put into the DHT. They can be arbitrary byte arrays. The ttl_sec field specifies how long you want OpenDHT to store the value in seconds.

If non-null, the secret_hash field can later be used to remove the value from the DHT.

OpenDHT uniquely identifies values as key-value-secret_hash triples. If two users put the same key and value with different secret hashes, both values will be stored and returned by gets. If a client re-puts a key-value-secret_hash triple previously put into the DHT by any client (including itself), only one copy of the triple will be stored, and the expiration time will be extended to

max(time_1 + TTL_1, time_2 + TTL_2)
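To make the triple and expiration rules concrete, here is a toy in-memory model. It is not how OpenDHT is implemented; it only mirrors the rules stated above, and the names store, put, and get are local to this sketch. Times are plain numbers in seconds.

```python
# Toy model of the storage semantics described above (NOT OpenDHT's implementation).
store = {}  # (key, value, secret_hash) -> expiration time, in seconds


def put(key: bytes, value: bytes, secret_hash: bytes, ttl_sec: int, now: float) -> None:
    triple = (key, value, secret_hash)
    new_expiry = now + ttl_sec
    # A re-put never shortens the lifetime: keep the later expiration time.
    store[triple] = max(store.get(triple, new_expiry), new_expiry)


def get(key: bytes, now: float) -> list:
    """Return the values of all unexpired triples stored under key."""
    return sorted(v for (k, v, _), exp in store.items() if k == key and exp > now)


put(b"colors", b"red", b"", ttl_sec=100, now=0)     # expires at t=100
put(b"colors", b"red", b"", ttl_sec=10, now=50)     # 60 < 100, so still t=100
put(b"colors", b"red", b"h1", ttl_sec=10, now=50)   # different secret hash: a second triple
print(len(store), store[(b"colors", b"red", b"")])  # prints: 2 100
```

Note that the second put did not shorten the first triple's lifetime, and that the third put created a separate entry even though its key and value match.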

A put that returns 0 was successful. A return value of 1 indicates that the client is using more than its fair share of storage. A return value of 2 indicates that a temporary condition prevented the DHT from accepting the put and that the client should try the put again.
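One way a client might act on these result codes is to retry only when the put returns 2. The sketch below assumes a caller-supplied function that performs a single put and returns its integer result; the constant names, the retry count, and the delay are illustrative choices, not OpenDHT recommendations.

```python
import time

# Result codes as described above; the constant names are hypothetical.
PUT_OK, PUT_OVER_CAPACITY, PUT_TRY_AGAIN = 0, 1, 2


def put_with_retry(do_put, attempts: int = 3, delay_sec: float = 1.0) -> int:
    """Call do_put() until it returns something other than 2, at most `attempts` times.

    do_put is any zero-argument callable that performs one put and returns
    its integer result code.
    """
    result = PUT_TRY_AGAIN
    for attempt in range(attempts):
        result = do_put()
        if result != PUT_TRY_AGAIN:
            break
        if attempt + 1 < attempts:
            time.sleep(delay_sec)  # wait before the next attempt
    return result


# Stand-in for a real put: reports "try again" once, then succeeds.
results = iter([PUT_TRY_AGAIN, PUT_OK])
print(put_with_retry(lambda: next(results), delay_sec=0))  # prints 0
```

A result of 1 is returned to the caller immediately, since retrying does not help a client that is over its fair share of storage.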

A get in OpenDHT takes the following five arguments:

It returns the following:

The application, client_library, and key fields are the same as in a put.

As noted above, OpenDHT allows puts to have the same key but different values and secret hashes. It also allows the same key, value, and secret_hash to be put from different clients. In all such cases, OpenDHT stores all unique key-value-secret_hash triples, and all unique triples are returned on a get, although not necessarily all at one time.

The maxvals field specifies how many values should be returned by a single get, and the placemark field is used as an iterator to retrieve additional values with subsequent gets. In an initial get, a client should set placemark to be an empty byte array. If the get returns a non-empty placemark, there are additional values available. A subsequent get that uses the same key and the returned placemark will retrieve these values.

In other words, to get all the values available in the DHT under a particular key, do something like this:

values = ();
placemark = ();
do {
   (new_values, new_placemark) = get (app, lib, key, maxvals, placemark);
   placemark = new_placemark;
   values += new_values;
}
while (placemark != ());
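
The same loop can be written in Python. This is a sketch under stated assumptions: the get callable, its three-argument signature (the application and client library arguments are left out for brevity), and the stand-in fake_get are all hypothetical. Real placemarks are opaque byte strings returned by the DHT; the offset encoding below exists only so the example runs without a gateway.

```python
def get_all(get, key: bytes, maxvals: int = 10) -> list:
    """Collect every value under key by following placemarks.

    get(key, maxvals, placemark) must return (values, new_placemark).
    """
    values = []
    placemark = b""  # an empty placemark marks the initial get
    while True:
        new_values, placemark = get(key, maxvals, placemark)
        values.extend(new_values)
        if not placemark:  # empty placemark: no more values available
            return values


# Stand-in gateway holding three values; its placemark is just a list offset.
DATA = [b"red", b"green", b"blue"]


def fake_get(key, maxvals, placemark):
    start = int(placemark or b"0")
    end = start + maxvals
    return DATA[start:end], (str(end).encode() if end < len(DATA) else b"")


print(get_all(fake_get, b"colors", maxvals=2))  # [b'red', b'green', b'blue']
```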

An rm in OpenDHT takes the following five arguments:

Due to the nature of OpenDHT's protocols, remove operations must be stored persistently by the DHT. For this reason, the TTL of a remove must be longer than the TTL remaining for the corresponding put. Otherwise, the previous value might re-appear when the remove expires. (For simplicity, just set the TTL of the remove to the same value as specified in the put.) Also for this reason, removes return the same result codes as a put:

XML RPC Syntax

XML RPC is a protocol supported by many popular scripting languages. It uses HTTP as its transport, so it works from behind many firewalls. OpenDHT accepts XML RPC commands on port 5851. The path is ignored, so you can use "/RPC2" or just "/".

Because HTTP includes a "User-Agent" header, the client_library field in puts and gets is implicit when using XML RPC; you don't need to specify it.

For historical reasons, there are two XML RPC procedures for put. One is named "put"; its parameters, in order, are: the key (<base64>), the value (<base64>), the TTL in seconds (<int>), and the application name (<string>). The return value is an <int>. The second put procedure is named "put_removable". It takes the same arguments as put, except that after the key and value parameters come two others: hash_type and secret_hash. hash_type should either be "SHA", in which case secret_hash (<base64>) should be the SHA-1 hash of the secret used for removes, or hash_type should be the empty string (""), in which case secret_hash should be an empty byte array. "put_removable" returns the same result codes as "put".
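
The parameter order of the two put procedures can be illustrated with Python's standard xmlrpc.client module, which marshals byte strings as <base64> when they are wrapped in Binary. This is a sketch: the helper names are hypothetical, the key is passed as raw bytes (whether a gateway expects keys in some particular form is not covered here), and GATEWAY_HOST in the trailing comment is a placeholder.

```python
import xmlrpc.client


def build_put_request(key: bytes, value: bytes, ttl_sec: int, application: str) -> str:
    """Marshal a "put" call: key, value, TTL in seconds, application name."""
    params = (xmlrpc.client.Binary(key), xmlrpc.client.Binary(value), ttl_sec, application)
    return xmlrpc.client.dumps(params, methodname="put")


def build_put_removable_request(key: bytes, value: bytes, secret_hash: bytes,
                                ttl_sec: int, application: str) -> str:
    """Marshal "put_removable": hash_type and secret_hash follow key and value."""
    hash_type = "SHA" if secret_hash else ""  # empty hash: value cannot be removed
    params = (xmlrpc.client.Binary(key), xmlrpc.client.Binary(value),
              hash_type, xmlrpc.client.Binary(secret_hash), ttl_sec, application)
    return xmlrpc.client.dumps(params, methodname="put_removable")


print(build_put_request(b"colors", b"red", 120, "XmlRpcTest"))

# Sending the call to a gateway (not executed here; the host is a placeholder):
#   proxy = xmlrpc.client.ServerProxy("http://GATEWAY_HOST:5851/")
#   result = proxy.put(xmlrpc.client.Binary(b"colors"),
#                      xmlrpc.client.Binary(b"red"), 120, "XmlRpcTest")
```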

Here's a sample put request:

<?xml version="1.0" encoding="ISO-8859-1"?>
<methodCall>
    <methodName>put</methodName>
    <params>
        <param><value><base64>c9Uau9icuBlvDvtokvlNaPzMLDU=</base64></value></param>
        <param><value><base64>8LhGCeXxLFXdhauo1dm+92gI87Vy5ZABErgZJ7pbtfZ+G9ootASb8OSu142xXXvy/Aw06amd5O87wrF8gTetZQ==</base64></value></param>
        <param><value><int>120</int></value></param>
        <param><value>XmlRpcTest</value></param>
    </params>
</methodCall>

And here's the DHT's response:

<?xml version="1.0" encoding="ISO-8859-1"?>
<methodResponse>
    <params>
        <param><value><int>0</int></value></param>
    </params>
</methodResponse>

As with put, there are two get procedures. The first, "get", takes the following parameters, in order: the key (<base64>), the maximum number of values to return (<int>), a placemark from a previous get or an empty byte string if this is the first get on this key (<base64>), and the application name (<string>). The return value is an <array>. The first element of this array is another <array>, each element of which is one of the values (<base64>) for the given key. The second element of the top array is the placemark, or an empty byte array if all the values have been returned. The second get procedure, "get_details", takes the same arguments as "get", but each element of the returned array of values is itself an <array> containing the value (<base64>), the TTL remaining (<int>), the hash type (<string>), and the secret hash (<base64>).
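
The array layout returned by "get" can be unpacked with Python's standard xmlrpc.client module. The sketch below assumes the response XML is already in hand as a string; parse_get_response is a hypothetical helper, and the sample response is generated locally so that the code runs without a gateway.

```python
import xmlrpc.client


def parse_get_response(xml_text: str):
    """Return (values, placemark) from the XML of a "get" response."""
    # use_builtin_types=True decodes <base64> elements to plain bytes objects.
    (result,), _method = xmlrpc.client.loads(xml_text, use_builtin_types=True)
    values, placemark = result  # [array of values, placemark]
    return values, placemark


# Build a response locally: two values and an empty placemark.
sample = xmlrpc.client.dumps(
    ([[xmlrpc.client.Binary(b"red"), xmlrpc.client.Binary(b"blue")],
      xmlrpc.client.Binary(b"")],),
    methodresponse=True,
)
values, placemark = parse_get_response(sample)
print(values, placemark)  # [b'red', b'blue'] b''
```

An empty placemark, as here, means all values have been returned; a non-empty one would be passed back unchanged in the next get.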

Here's a get request:

<?xml version="1.0" encoding="ISO-8859-1"?>
<methodCall>
    <methodName>get</methodName>
    <params>
        <param><value><base64>c9Uau9icuBlvDvtokvlNaPzMLDU=</base64></value></param>
        <param><value><int>1</int></value></param>
        <param><value><base64></base64></value></param>
        <param><value>XmlRpcTest</value></param>
    </params>
</methodCall>

And here's the DHT's response:

<?xml version="1.0" encoding="ISO-8859-1"?>
<methodResponse>
    <params>
        <param><value><array><data>
            <value><array><data>
                <value><base64>8LhGCeXxLFXdhauo1dm+92gI87Vy5ZABErgZJ7pbtfZ+G9ootASb8OSu142xXXvy/Aw06amd5O87wrF8gTetZQ==</base64></value>
            </data></array></value>
            <value><base64>AAPub+SbJ7AAAAB4c9Uau9icuBlvDvtokvlNaPzMLDW9mjT/gSAUXQFBomDTS2VurrSRkAF/AAAB</base64></value>
        </data></array></value></param>
    </params>
</methodResponse>

In the Quick Start section, above, we introduced three Python scripts for using the XML RPC interface: put.py, get.py, and rm.py. Here are their complete arguments, the meanings of which should be clear now:

$ ./put.py 
usage: put.py [options] <key> <value>
options:
  -h, --help           show this help message and exit
  -gGW, --gateway=GW   gateway URI, list at http://opendht.org/servers.txt
  -tTTL, --ttl=TTL     how long (in seconds) to store the value
  -sSEC, --secret=SEC  can be used to remove the value later

$ ./get.py 
usage: get.py [options] <key>
options:
  -h, --help            show this help message and exit
  -gGW, --gateway=GW    gateway URI, list at http://opendht.org/servers.txt
  -d, --details         print secret hash and TTL remaining for each value
  -mCNT, --maxvals=CNT  how many values to return

$ ./rm.py
usage: rm.py [options] <key> <value> <secret>
options:
  -h, --help          show this help message and exit
  -gGW, --gateway=GW  gateway URI, list at http://opendht.org/servers.txt
  -tTTL, --ttl=TTL    must be longer than TTL remaining for value

Sun RPC Syntax

The format of Sun RPC calls is specified in a language called XDR. OpenDHT's Sun RPC interface is defined in gateway_prot.x. There are two versions of the interface in gateway_prot.x; version 3 is the current one, though version 2 is still supported for backwards compatibility. To use this interface, you compile the gateway_prot.x file using an XDR compiler (rpcgen on UNIX); this process generates a client stub that you can then access using standard function calls. Here are some sample programs that use the Sun RPC interface:

gateway_test.c is a simple example illustrating the use of the OpenDHT put and get RPCs in C. To compile and run this program from scratch, do this on a Unix machine:

$ rpcgen -h gateway_prot.x > gateway_prot.h
$ rpcgen -l gateway_prot.x > gateway_prot_clnt.c
$ rpcgen -c gateway_prot.x > gateway_prot_xdr.c
$ gcc -c gateway_test.c
$ gcc -c gateway_prot_clnt.c
$ gcc -c gateway_prot_xdr.c
$ gcc -o gateway_test gateway_test.o gateway_prot_clnt.o gateway_prot_xdr.o
$ ./gateway_test 
usage: ./gateway_test server_host server_port
$ ./gateway_test planetlab8.millennium.berkeley.edu 5852
Doing a null call.
Null call successful.
Doing a put
Put successful
...
(I don't know how to do it on a Windows machine.)

We've also built a simple unicast instant messaging application, written in C++. Clients rendezvous by their usernames; mappings from username to IP/port are stored in OpenDHT. This code requires libarpc and libasync, libraries for asynchronous RPC and asynchronous I/O in C++, both of which are part of the SFS distribution. These libraries make it easy to write non-blocking, event-driven OpenDHT client code in C++.

For users programming in Java, we have a client-side gateway stub that presents a non-blocking put-get interface and automatically switches between gateways when one fails. You can get it in the latest Bamboo snapshot; the stub is in bamboo.dht.GatewayClient, and a sample usage is shown in bamboo.dht.PutGetTest.

Perl

In addition to the Python and Java interfaces, there is also a Perl interface to OpenDHT by Leon Brocard:

ReDiR

There is an old Java implementation of the ReDiR algorithm described in our IPTPS paper in the latest Bamboo CVS snapshot. A newer C++ implementation of the improved algorithm described in our SIGCOMM paper is available below along with an example of how to use it in conjunction with the STL thanks to Chenfeng Vincent Zhou:


Last modified 2006/02/19 15:51:18 GMT.