Here is an example usage scenario:
$ ./put.py colors red
Success
$ ./get.py colors
red
$ ./put.py --secret donttell colors blue
Success
$ ./get.py colors
red
blue
$ ./rm.py colors blue donttell
Success
$ ./get.py colors
red
For details, please read on.
OpenDHT runs on a collection of 200 to 300 nodes on PlanetLab. Each of these PlanetLab nodes is a Linux host that runs the Bamboo DHT implementation. Each node in the OpenDHT deployment holds a portion of the DHT's total storage on its local disk, answers put, get, and rm requests for that portion, and routes messages to other DHT member nodes.
Each node also serves as a gateway into the DHT for clients. A gateway accepts RPC operations from clients, forwards those messages into the DHT, and forwards the corresponding responses to those operations from the DHT to the appropriate client.
You can find a list of all active OpenDHT servers here. This list is dynamically updated to reflect the state of the OpenDHT deployment. You will generally experience the shortest latency for completion of OpenDHT RPCs if you choose a gateway that is topologically near you on the Internet. You can find a nearby OpenDHT gateway using OASIS by looking up the DNS name opendht.nyuld.net, or by using Barath Raghavan's Python script find-gateway.py.
There are two interfaces by which clients access OpenDHT: one using XML RPC over HTTP on port 5851 and one using Sun RPC over TCP on port 5852. The interfaces are semantically identical; only the "on-the-wire" protocol differs between them. Below we give an overview of the protocol, present the precise semantics, and then present the details particular to the Sun RPC and XML RPC interfaces.
Because this is a new service, things are still changing from time to time. For this reason, we strongly encourage users of OpenDHT to subscribe to the OpenDHT-Users mailing list. Announcements concerning the state of the OpenDHT deployment, changes to the RPC interface, and other useful information are sent to this list.
OpenDHT supports a narrow put-get-rm interface. The put RPC writes a key-value pair into the DHT, the get RPC retrieves one or more values previously put under a key, and the rm RPC removes values from the DHT. The current OpenDHT deployment limits a single value to 1 kilobyte in size.
As the total storage in the system is limited, OpenDHT cannot offer infinite persistence; a buggy or malicious client could easily consume all storage on a DHT node permanently if puts persisted forever. Instead, OpenDHT treats stored values as soft state by expiring them after a time-to-live (TTL) interval. The client who puts a key-value pair specifies the pair's TTL. The OpenDHT service guarantees not to expire the key-value pair before that TTL passes, barring catastrophic failure of the DHT service.
No notification is sent to clients when the TTL expires; clients must manage their storage to achieve the degree of persistence they desire. Should a client wish to cause its key-value pair to persist beyond its original TTL, the client may refresh the pair by issuing another put. The TTL in the latest put will take effect for that value.
In the current OpenDHT deployment, the longest TTL permitted is one week.
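A client can check both limits locally before issuing a put. The following is a minimal sketch, assuming that "1 kilobyte" means 1024 bytes and that one week is 604800 seconds; the name check_put_limits is hypothetical and is not part of any OpenDHT client.

```python
MAX_VALUE_BYTES = 1024            # assumption: "1 kilobyte" read as 1024 bytes
MAX_TTL_SEC = 7 * 24 * 60 * 60    # one week, in seconds


def check_put_limits(value: bytes, ttl_sec: int) -> None:
    """Raise ValueError if a put would exceed the deployment limits described above."""
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError(f"value is {len(value)} bytes; limit is {MAX_VALUE_BYTES}")
    if not 0 < ttl_sec <= MAX_TTL_SEC:
        raise ValueError(f"ttl_sec must be in 1..{MAX_TTL_SEC}, got {ttl_sec}")
```

Checking on the client side avoids a round trip to the gateway for a put that cannot succeed. The gateway remains the authority on what it accepts.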
At times, an application may need to remove a value before its TTL expires.
If the value was put into the DHT using a secret hash y such that
y=SHA(x), then any client that knows the secret x
can remove the value by issuing a remove request with the same key as the put,
the SHA-1 hash of the value it wishes to remove, and the secret x.
Because this request can be observed on the network (using
tcpdump, for example), the secret can be learned by an adversary.
For this reason, an application should use a different secret for each value
put into the DHT. To insert a value that cannot be removed, a null secret
hash may be used.
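The quantities involved in a removable put and its later removal can be sketched with Python's standard hashlib module. The helper names below are hypothetical; the only facts taken from the text are that the secret hash is the SHA-1 digest of the secret, and that a remove request carries the key, the SHA-1 hash of the value, and the secret itself.

```python
import hashlib
import os


def new_secret(num_bytes: int = 20) -> bytes:
    """Return a fresh random secret; use a different one for each value put."""
    return os.urandom(num_bytes)


def secret_hash_for_put(secret: bytes) -> bytes:
    """y = SHA(x): the 20-byte SHA-1 digest supplied with the put."""
    return hashlib.sha1(secret).digest()


def remove_fields(value: bytes, secret: bytes) -> tuple[bytes, bytes]:
    """Return (value_hash, secret), which a remove request sends along with the key."""
    return hashlib.sha1(value).digest(), secret
```

The 20-byte secret length is an arbitrary choice for illustration; the text does not prescribe one.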
The key and value are the key and value you want to put into the DHT. They can be arbitrary byte arrays. The ttl_sec field specifies how long you want OpenDHT to store the value in seconds.
If non-null, the secret_hash field can later be used to remove the value from the DHT.
OpenDHT uniquely identifies values as key-value-secret_hash triples. If two users put the same key and value with different secret hashes, both values will be stored and returned by gets. If a client re-puts a key-value-secret_hash triple previously put into the DHT by any client (including itself), only one copy of the triple will be stored, and the expiration time will be extended to
max(time_1 + TTL_1, time_2 + TTL_2)
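As a worked example of this formula, read literally, the sketch below computes the expiration time after a re-put. The function name is hypothetical, and times are plain seconds chosen for illustration.

```python
def merged_expiration(time_1: int, ttl_1: int, time_2: int, ttl_2: int) -> int:
    """Expiration time of a triple put at time_1 and re-put at time_2."""
    return max(time_1 + ttl_1, time_2 + ttl_2)


# First put at t=0 with a 3600 s TTL, re-put at t=600 with a 120 s TTL:
# the shorter re-put does not bring the expiration forward.
print(merged_expiration(0, 3600, 600, 120))    # 3600

# Re-put at t=3000 with a 3600 s TTL: the expiration moves out to t=6600.
print(merged_expiration(0, 3600, 3000, 3600))  # 6600
```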
A put that returns 0 was successful. A return value of 1 indicates that the client is using more than its fair share of storage. A return value of 2 indicates that a temporary condition prevented the DHT from accepting the put and that the client should try the put again.
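One way a client might act on these return codes is to retry only when the put returns 2. This is a sketch under stated assumptions: the constant names, the retry count, and the fixed delay are all hypothetical, and do_put stands for any zero-argument callable that performs the put and returns its integer result.

```python
import time

PUT_OK, PUT_OVER_CAPACITY, PUT_TRY_AGAIN = 0, 1, 2


def put_with_retry(do_put, attempts=3, delay_sec=1.0, sleep=time.sleep):
    """Call do_put() until it stops returning 2, at most `attempts` times."""
    code = PUT_TRY_AGAIN
    for attempt in range(attempts):
        code = do_put()
        if code != PUT_TRY_AGAIN:
            break                      # 0 (success) or 1 (over fair share)
        if attempt + 1 < attempts:
            sleep(delay_sec)           # wait before the next try
    return code
```

A return value of 1 is passed straight back to the caller, since retrying immediately would not reduce the client's storage use.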
A get in OpenDHT takes the following five arguments:
As noted above, OpenDHT allows puts to have the same key but different values and secret hashes. It also allows the same key, value, and secret_hash to be put from different clients. In all such cases, OpenDHT stores all unique key-value-secret_hash triples, and all unique triples are returned on a get, although not necessarily all at one time.
The maxvals field specifies how many values should be returned by a single get, and the placemark field is used as an iterator to retrieve additional values with subsequent gets. In an initial get, a client should set placemark to be an empty byte array. If the get returns a non-empty placemark, there are additional values available. A subsequent get that uses the same key and the returned placemark will retrieve these values.
In other words, to get all the values available in the DHT under a particular key, do something like this:
values = ();
placemark = ();
do {
    (new_values, new_placemark) = get (app, lib, key, maxvals, placemark);
    placemark = new_placemark;
    values += new_values;
} while (placemark != ());
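The same loop can be written in Python. In this sketch, get_all is a hypothetical helper, and its get argument is any callable that takes (key, maxvals, placemark) and returns a (values, new_placemark) pair; how that callable reaches a gateway is left open.

```python
def get_all(get, key: bytes, maxvals: int = 10) -> list:
    """Collect every value stored under key by following placemarks."""
    values = []
    placemark = b""            # empty byte array on the initial get
    while True:
        new_values, placemark = get(key, maxvals, placemark)
        values.extend(new_values)
        if not placemark:      # empty placemark: no more values to fetch
            return values
```

Passing the transport in as a callable keeps the iteration logic independent of whether the XML RPC or the Sun RPC interface is used.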
A rm in OpenDHT takes the following five arguments:
XML RPC is a protocol supported by many popular scripting languages. It uses HTTP as its transport, so it works from behind many firewalls. OpenDHT accepts XML RPC commands on port 5851. The path is ignored, so you can use "/RPC2" or just "/".
Because HTTP includes a "User-Agent" header, the client_library field in puts and gets is implicit when using XML RPC; you don't need to specify it.
For historical reasons, there are two XML RPC procedures for put. One is named "put"; its parameters, in order, are: the key (<base64>), the value (<base64>), the TTL in seconds (<int>), and the application name (<string>). The return value is an <int>. The second put procedure is named "put_removable". It takes the same arguments as put, except that after the key and value parameters come two others: hash_type and secret_hash. hash_type should either be "SHA", in which case secret_hash (<base64>) should be the SHA-1 hash of the secret used for removes, or hash_type should be the empty string (""), in which case secret_hash should be an empty byte array. "put_removable" returns the same result codes as "put".
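The parameter order for both procedures can be illustrated with Python 3's standard xmlrpc.client module, which serializes a call without contacting any server. This is a sketch, not the code of the put.py script; the function names are hypothetical.

```python
import hashlib
import xmlrpc.client
from xmlrpc.client import Binary


def build_put_request(key: bytes, value: bytes, ttl_sec: int, application: str) -> str:
    """Serialize a "put" call: key, value, TTL in seconds, application name."""
    params = (Binary(key), Binary(value), ttl_sec, application)
    return xmlrpc.client.dumps(params, methodname="put")


def build_put_removable_request(key, value, secret, ttl_sec, application) -> str:
    """Serialize a "put_removable" call; pass secret=None for a null secret hash."""
    if secret is None:
        hash_type, secret_hash = "", b""           # value cannot be removed
    else:
        hash_type, secret_hash = "SHA", hashlib.sha1(secret).digest()
    # hash_type and secret_hash come right after the key and value.
    params = (Binary(key), Binary(value), hash_type, Binary(secret_hash),
              ttl_sec, application)
    return xmlrpc.client.dumps(params, methodname="put_removable")
```

To send a call rather than only serialize it, xmlrpc.client.ServerProxy can be pointed at a gateway URI on port 5851 and the procedure invoked as a method with the same parameters.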
Here's a sample put request:
<?xml version="1.0" encoding="ISO-8859-1"?>
<methodCall>
<methodName>put</methodName>
<params>
<param><value><base64>c9Uau9icuBlvDvtokvlNaPzMLDU=</base64></value></param>
<param><value><base64>8LhGCeXxLFXdhauo1dm+92gI87Vy5ZABErgZJ7pbtfZ+G9ootASb8OSu142xXXvy/Aw06amd5O87wrF8gTetZQ==</base64></value></param>
<param><value><int>120</int></value></param>
<param><value>XmlRpcTest</value></param>
</params>
</methodCall>
And here's the DHT's response:
<?xml version="1.0" encoding="ISO-8859-1"?>
<methodResponse>
<params>
<param><value><int>0</int></value></param>
</params>
</methodResponse>
As with put, there are two get procedures. The first, "get", takes the
following parameters, in order: the key (<base64>), the maximum number
of values to return (<int>), a placemark from a previous get or an empty
byte string if this is the first get on this key (<base64>), and the
application name (<string>). The return value is an <array>. The
first element of this array is another <array>, each element of which is
one of the values (<base64>) for the given key. The second element of
the top array is the placemark, or an empty byte array if all the values have
been returned. The second get procedure, "get_details", takes the same
arguments as "get", but each element of the returned values array is itself
an <array> containing the value (<base64>), the TTL remaining (<int>),
the hash type (<string>), and the secret hash (<base64>).
Here's a get request:
<?xml version="1.0" encoding="ISO-8859-1"?>
<methodCall>
<methodName>get</methodName>
<params>
<param><value><base64>c9Uau9icuBlvDvtokvlNaPzMLDU=</base64></value></param>
<param><value><int>1</int></value></param>
<param><value><base64></base64></value></param>
<param><value>XmlRpcTest</value></param>
</params>
</methodCall>
And here's the DHT's response:
<?xml version="1.0" encoding="ISO-8859-1"?>
<methodResponse>
<params>
<param><value><array><data>
<value><array><data>
<value><base64>8LhGCeXxLFXdhauo1dm+92gI87Vy5ZABErgZJ7pbtfZ+G9ootASb8OSu142xXXvy/Aw06amd5O87wrF8gTetZQ==</base64></value>
</data></array></value>
<value><base64>AAPub+SbJ7AAAAB4c9Uau9icuBlvDvtokvlNaPzMLDW9mjT/gSAUXQFBomDTS2VurrSRkAF/AAAB</base64></value>
</data></array></value></param>
</params>
</methodResponse>
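To show how a client might unpack such a response, the sketch below decodes the sample above with Python 3's standard xmlrpc.client module. The XML declaration is left out only so that the text can be embedded in a Python string, and parse_get_response is a hypothetical helper name.

```python
import xmlrpc.client

SAMPLE_GET_RESPONSE = """<methodResponse>
<params>
<param><value><array><data>
<value><array><data>
<value><base64>8LhGCeXxLFXdhauo1dm+92gI87Vy5ZABErgZJ7pbtfZ+G9ootASb8OSu142xXXvy/Aw06amd5O87wrF8gTetZQ==</base64></value>
</data></array></value>
<value><base64>AAPub+SbJ7AAAAB4c9Uau9icuBlvDvtokvlNaPzMLDW9mjT/gSAUXQFBomDTS2VurrSRkAF/AAAB</base64></value>
</data></array></value></param>
</params>
</methodResponse>"""


def parse_get_response(xml_text: str):
    """Return (values, placemark) from a "get" response, both as bytes."""
    (result,), _method = xmlrpc.client.loads(xml_text, use_builtin_types=True)
    values, placemark = result   # first element: list of values; second: placemark
    return values, placemark


values, placemark = parse_get_response(SAMPLE_GET_RESPONSE)
print(len(values), "value(s); more available:", placemark != b"")
```

Because the request asked for at most one value and the returned placemark is not empty, a client that wants the remaining values would issue another get with this placemark.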
In the Quick Start section, above, we introduced three Python scripts for
using the XML RPC interface: put.py,
get.py, and rm.py. Here are their
complete arguments, the meanings of which should be clear now:
$ ./put.py
usage: put.py [options] <key> <value>
options:
  -h, --help            show this help message and exit
  -gGW, --gateway=GW    gateway URI, list at http://opendht.org/servers.txt
  -tTTL, --ttl=TTL      how long (in seconds) to store the value
  -sSEC, --secret=SEC   can be used to remove the value later

$ ./get.py
usage: get.py [options] <key>
options:
  -h, --help            show this help message and exit
  -gGW, --gateway=GW    gateway URI, list at http://opendht.org/servers.txt
  -d, --details         print secret hash and TTL remaining for each value
  -mCNT, --maxvals=CNT  how many values to return

$ ./rm.py
usage: rm.py [options] <key> <value> <secret>
options:
  -h, --help            show this help message and exit
  -gGW, --gateway=GW    gateway URI, list at http://opendht.org/servers.txt
  -tTTL, --ttl=TTL      must be longer than TTL remaining for value
The format of Sun RPC calls is specified in a language called XDR. OpenDHT's Sun RPC interface is defined in gateway_prot.x. There are two versions of the interface in gateway_prot.x; version 3 is the current one, though version 2 is still supported for backwards compatibility. To use this interface, you compile the gateway_prot.x file using an XDR compiler (rpcgen on UNIX); this process generates a client stub that you can then access using standard function calls. Here are some sample programs that use the Sun RPC interface:
A simple example, illustrating the use of the OpenDHT put and get RPCs in C. To compile and run this program from scratch, do this on a Unix machine:
$ rpcgen -h gateway_prot.x > gateway_prot.h
$ rpcgen -l gateway_prot.x > gateway_prot_clnt.c
$ rpcgen -c gateway_prot.x > gateway_prot_xdr.c
$ gcc -c gateway_test.c
$ gcc -c gateway_prot_clnt.c
$ gcc -c gateway_prot_xdr.c
$ gcc -o gateway_test gateway_test.o gateway_prot_clnt.o gateway_prot_xdr.o
$ ./gateway_test
usage: ./gateway_test server_host server_port
$ ./gateway_test planetlab8.millennium.berkeley.edu 5852
Doing a null call.
Null call successful.
Doing a put
Put successful
...
(I don't know how to do it on a Windows machine.)
We've also built a simple unicast instant messaging application, written in C++. Clients rendezvous by their usernames; mappings from username to IP/port are stored in OpenDHT. This code requires libarpc and libasync, libraries for asynchronous RPC and asynchronous I/O in C++, both of which are part of the SFS distribution. These libraries make it easy to write non-blocking, event-driven OpenDHT client code in C++.
For users programming in Java, we have a client-side gateway stub that presents a non-blocking put-get interface and automatically switches between gateways when one fails. You can get it in the latest Bamboo snapshot; the stub is in bamboo.dht.GatewayClient, and a sample usage is shown in bamboo.dht.PutGetTest.
In addition to the Python and Java interfaces, there is also a Perl interface to OpenDHT by Leon Brocard:
There is an old Java implementation of the ReDiR algorithm described in our IPTPS paper in the latest Bamboo CVS snapshot. A newer C++ implementation of the improved algorithm described in our SIGCOMM paper is available below along with an example of how to use it in conjunction with the STL thanks to Chenfeng Vincent Zhou:
Last modified 2006/02/19 15:51:18 GMT.