Distributed hash table

A Distributed Hash Table (DHT) is a distributed and often decentralized mechanism for associating hash values (keys) with some kind of content. Participants in the DHT each store a small section of the contents of the hashtable.

The main advantage of DHTs are their scalability: a typical search on a DHT requires only O(log(n)) network traffic where n is the number of participants in the DHT. This compares very favourably to gnutella-style search, which requires O(n) traffic, and superpeer style networks, which require about O(sqrt(n)) traffic to perform a complete search.

DHTs are a very general mechanism for distributed location of resources of almost any kind, for example:

  • File sharing: store the hash of the file in the DHT, and hashes of keywords relevant to the file. People looking for the file can then search by keyword, or if they know the exact hash of the file, search for that hash.
  • Finding people (eg in a chat program): store the hash of their public key, and hashes of their names to the DHT.
  • Providing a service (eg distributed backup): store a particular hash to the DHT in order to indicate that you are willing to provide a particular service (perhaps the hash of a short description of that service).

Secure hashes such as MD5 and SHA1 are commonly used. This provides a key space that is large enough to make collisions extremely unlikely (but not impossible). It also provides a method for verifying that a file is the correct file when sharing files (helps prevent the spread of viruses and broken files over the Web and peer-to-peer networks). Note: CRC32 is also commonly used, but collisions are far higher likely when one uses CRC32.

See also: Distribution | Hash

Related Links

TakeDown.NET -> “Distributed-hash-table