diff --git a/router/doc/net.png b/router/doc/net.png
deleted file mode 100644
index 6eac378424..0000000000
Binary files a/router/doc/net.png and /dev/null differ
diff --git a/router/doc/techintro.html b/router/doc/techintro.html
deleted file mode 100644
index 1574f5fa29..0000000000
--- a/router/doc/techintro.html
+++ /dev/null
@@ -1,994 +0,0 @@
-Introducing I2P - a scalable framework for anonymous communication
-Introducing I2P
-a scalable framework for anonymous communication
-$Id: techintro.html,v 1.8.2.1 2006/02/13 07:13:35 jrandom Exp $ -
-
- - - - - -
-
-* Introduction
-* Operation
-  * Overview
-  * Tunnels
-  * Network Database
-  * Transport protocols
-  * Cryptography
-
-
-
-* Future
-  * Restricted routes
-  * Variable latency
-  * Open questions
-
-
-
-* Similar systems
-  * Tor
-  * Freenet
-* Appendix A: Application layer
-
-
-
- -
- -

Introduction

-

-I2P is a scalable, self organizing, resilient packet switched anonymous network layer, -upon which any number of different anonymity or security conscious applications -can operate. Each of these applications may make their own anonymity, latency, and -throughput tradeoffs without worrying about the proper implementation of a free -route mixnet, allowing them to blend their activity with the larger anonymity set of -users already running on top of I2P. Applications available already provide the full -range of typical Internet activities - anonymous web browsing, anonymous web hosting, -anonymous blogging and content syndication (with Syndie), -anonymous chat (via IRC or Jabber), anonymous swarming file transfers (with i2p-bt, I2PSnark, and -Azureus), anonymous file sharing (with -I2Phex), anonymous email (with I2Pmail -and susimail), anonymous newsgroups, as well as several -other applications under development. Unlike web sites hosted within content -distribution networks like Freenet or -GNUnet, the services hosted on I2P are fully -interactive - there are traditional web-style search engines, bulletin boards, blogs -you can comment on, database driven sites, and bridges to query static systems like -Freenet without needing to install it locally. -

- -

-With all of these anonymity enabled applications, I2P takes on the role of the message -oriented middleware - applications say that they want to send some data to a cryptographic -identifier (a "destination") and I2P takes care of making sure it gets there securely -and anonymously. I2P also bundles a simple streaming library -to allow I2P's anonymous best-effort messages to transfer as reliable, in-order streams, -transparently offering a TCP based congestion control algorithm tuned for the high -bandwidth delay product of the network. While there have been several simple SOCKS -proxies available to tie existing applications into the network, their value has been -limited as nearly every application routinely exposes what, in an anonymous context, -is sensitive information. The only safe way to go is to fully audit an application to -ensure proper operation, and to assist in that we provide a series of APIs in various -languages which can be used to make the most out of the network. -

- - - - -

-I2P is not a research project - academic, commercial, or governmental, but is instead -an engineering effort aimed at doing whatever is necessary to provide a sufficient -level of anonymity to those who need it. It has been in active development since -early 2003 with one full time developer and a dedicated group of part time contributors -from all over the world. All of the work done on I2P is open source and -freely available on the website, with the majority -of the code released outright into the public domain, though making use of a few -cryptographic routines under BSD-style licenses. The people working on I2P do not -control what people release client applications under, and there are several GPL'ed -applications available (I2PTunnel, -susimail, I2PSnark, Azureus, -I2Phex). Funding -for I2P comes entirely from donations, and does not receive any tax breaks in any -jurisdiction at this time, as many of the developers are themselves anonymous. -

- -

Operation

-

Overview

- -

-To understand I2P's operation, it is essential to understand a few key concepts. -First, I2P makes a strict separation between the software participating -in the network (a "router") and the anonymous endpoints ("destinations") associated -with individual applications. The fact that someone is running I2P is not usually -a secret. What is hidden is information on what the user is doing, if anything at -all, as well as what router a particular destination is connected to. End users -will typically have several local destinations on their router - for instance, one -proxying in to IRC servers, another supporting the user's anonymous webserver ("eepsite"), -another for an I2Phex instance, another for torrents, etc. -

- -

-Another critical concept to understand is the "tunnel" - a directed path through -an explicitly selected set of routers, making use of layered encryption so that -the messages sent in the tunnel's "gateway" appear entirely random at each hop -along the path until it reaches the tunnel's "endpoint". These unidirectional -tunnels can be seen as either "inbound" tunnels or "outbound" tunnels, referring -to whether they are bringing messages to the tunnel's creator or away from them, -respectively. The gateway of an inbound tunnel can receive messages from any -peer and will forward them down through the tunnel until it reaches the (anonymous) -endpoint (the creator). On the other hand, the gateway of an outbound tunnel is -the tunnel's creator, and messages sent through that tunnel are encoded so that -when they reach the outbound tunnel's endpoint, that router has the instructions -necessary to forward the message on to the appropriate location. -

- -

-A third critical concept to understand is I2P's "network database" (or "netDb") -- a pair of algorithms used to share network metadata. The two types of metadata -carried are "routerInfo" and "leaseSets" - the routerInfo gives routers the data -necessary for contacting a particular router (their public keys, transport -addresses, etc), while the leaseSet gives routers the information necessary for -contacting a particular destination. Within each leaseSet, there are any number -of "leases", each of which specifies the gateway for one of that destination's -inbound tunnels as well as when that tunnel will expire. The leaseSet also -contains a pair of public keys which can be used for layered garlic encryption. -

- - - -

-When Alice wants to send a message to Bob, she first does a lookup in the -netDb to find Bob's leaseSet, giving her his current inbound tunnel gateways. -She then picks one of her outbound tunnels and sends the message -down it with instructions for the outbound tunnel's endpoint to forward the -message on to one of Bob's inbound tunnel gateways. When the outbound -tunnel endpoint receives those instructions, it forwards the message as -requested, and when Bob's inbound tunnel gateway receives it, it is -forwarded down the tunnel to Bob's router. If Alice wants Bob to be able -to reply to the message, she needs to transmit her own destination explicitly -as part of the message itself (taken care of transparently in the -streaming library). Alice may also cut down on -the response time by bundling her most recent leaseSet with the message so -that Bob doesn't need to do a netDb lookup for it when he wants to reply, but this -is optional. -

- -

-While the tunnels themselves have layered encryption to prevent unauthorized -disclosure to peers inside the network (as the transport layer itself does to -prevent unauthorized disclosure to peers outside the network), it is necessary -to add an additional end to end layer of encryption to hide the message from the -outbound tunnel endpoint and the inbound tunnel gateway. This -"garlic encryption" lets Alice's router wrap up multiple -messages into a single "garlic message", encrypted to a particular public key -so that intermediary peers cannot determine either how many messages are within -the garlic, what those messages say, or where those individual cloves are -destined. For typical end to end communication between Alice and Bob, the -garlic will be encrypted to the public key published in Bob's leaseSet, -allowing the message to be encrypted without giving out the public key to Bob's -own router. -

- -

-Another important fact to keep in mind is that I2P is entirely message based -and that some messages may be lost along the way. Applications using I2P -can use the message oriented interfaces and take care of their own congestion -control and reliability needs, but most would be best served by reusing the -provided streaming library to view I2P as a streams -based network. -

- -

Tunnels

- -

-Both inbound and outbound tunnels work along similar principles - the tunnel -gateway accumulates a number of tunnel messages, eventually preprocessing them -into something for tunnel delivery. Next, the gateway encrypts that preprocessed -data and forwards it to the first hop. That peer and subsequent tunnel -participants add on a layer of encryption after verifying that it isn't a -duplicate before forwarding it on to the next peer. Eventually, the -message arrives at the endpoint where the messages are split out again and -forwarded on as requested. The difference arises in what -the tunnel's creator does - for inbound tunnels, the creator is the endpoint -and they simply decrypt all of the layers added, while for outbound tunnels, -the creator is the gateway and they pre-decrypt all of the layers so that after -all of the layers of per-hop encryption are added, the message arrives in the -clear at the tunnel endpoint. -
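
The asymmetry between inbound and outbound tunnels can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "cipher" is byte-wise addition of a SHA-256 derived keystream, standing in for the real AES256/CBC layers, and the function names are hypothetical. Unlike real layered encryption, this toy cipher is commutative, so it only shows the bookkeeping (who adds and who removes layers), not the security properties.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode (illustration only, not secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def add_layer(key: bytes, data: bytes) -> bytes:
    """What each tunnel participant does: add one layer of 'encryption'."""
    ks = keystream(key, len(data))
    return bytes((d + k) % 256 for d, k in zip(data, ks))


def remove_layer(key: bytes, data: bytes) -> bytes:
    """Inverse of add_layer; only the tunnel creator knows every hop key."""
    ks = keystream(key, len(data))
    return bytes((d - k) % 256 for d, k in zip(data, ks))


def pass_through_hops(hop_keys, data: bytes) -> bytes:
    """Each hop along the tunnel adds its own layer in turn."""
    for key in hop_keys:
        data = add_layer(key, data)
    return data


def outbound_pre_decrypt(hop_keys, message: bytes) -> bytes:
    """Outbound: the creator is the gateway and strips all layers in advance."""
    for key in reversed(hop_keys):
        message = remove_layer(key, message)
    return message


def inbound_decrypt(hop_keys, data: bytes) -> bytes:
    """Inbound: the creator is the endpoint and strips the layers afterwards."""
    for key in reversed(hop_keys):
        data = remove_layer(key, data)
    return data
```

With three hop keys, `pass_through_hops(keys, outbound_pre_decrypt(keys, msg))` returns `msg` in the clear at the outbound endpoint, while `inbound_decrypt(keys, pass_through_hops(keys, msg))` recovers `msg` at the creator.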

- -

-The choice of specific peers to pass on messages as well as their particular -ordering is important to understanding both I2P's anonymity and performance -characteristics. While the network database (below) has its own criteria for -picking what peers to query and store entries on, tunnels may use any peers in -the network in any order (and even any number of times) in a single tunnel. If -perfect latency and capacity data were globally known, selection and ordering -would be driven by the particular needs of the client in tandem with their threat -model. Unfortunately, latency and capacity data is not trivial to gather -anonymously, and depending upon untrusted peers to provide this information has -its own serious anonymity implications. -

- -

-From an anonymity perspective, the simplest technique would be to pick peers -randomly from the entire network, order them randomly, and use those peers -in that order for all eternity. From a performance perspective, the simplest -technique would be to pick the fastest peers with the necessary spare capacity, -spreading the load across different peers to handle transparent failover, and -to rebuild the tunnel whenever capacity information changes. While the former -is both brittle and inefficient, the latter requires inaccessible information -and offers insufficient anonymity. I2P is instead working on offering a range -of peer selection strategies, coupled with anonymity aware measurement code to -organize the peers by their profiles. -

- -

-As a base, I2P is constantly profiling the peers with which it interacts -by measuring their indirect behavior - for instance, when a peer responds to -a netDb lookup in 1.3 seconds, that round trip latency is recorded in the -profiles for all of the routers involved in the two tunnels (inbound and -outbound) through which the request and response passed, as well as the queried -peer's profile. Direct measurement, such as transport layer latency or -congestion, is not used as part of the profile, as it can be manipulated and -associated with the measuring router, exposing it to trivial attacks. While -gathering these profiles, a series of calculations are run on each to summarize -its performance - its latency, capacity to handle lots of activity, whether it -is currently overloaded, and how well integrated into the network it seems to -be. These calculations are then compared for active peers to organize the routers -into four tiers - fast and high capacity, high capacity, not failing, and failing. -The thresholds for those tiers are determined dynamically, and while they -currently use fairly simple algorithms, alternatives exist. -

- -

-Using this profile data, the simplest reasonable peer selection strategy is to -pick peers randomly from the top tier (fast and high capacity), and this is -currently deployed for client tunnels. Exploratory tunnels (used for netDb -and tunnel management) pick peers randomly from the not failing tier (which -includes routers in 'better' tiers as well), allowing the peer to sample -routers more widely, in effect optimizing the peer selection through randomized -hill climbing. These strategies alone do, however, leak information regarding the -peers in the router's top tier through predecessor and netDb harvesting attacks. -In turn, several alternatives exist which, while not balancing the load as evenly, -will address the attacks mounted by particular classes of adversaries. -
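
The tiered random strategy described above can be sketched roughly as follows. This is a minimal illustration, not the router's code: the tier names, the `peers_by_tier` mapping, and the function names are assumptions, and the sketch picks distinct peers for simplicity even though the text notes that a tunnel may reuse a peer.

```python
import random

# Tiers from best to worst; "fast" stands for "fast and high capacity".
TIERS = ["fast", "high_capacity", "not_failing", "failing"]


def eligible(peers_by_tier, minimum_tier):
    """Return the peers in minimum_tier plus those in every better tier."""
    cutoff = TIERS.index(minimum_tier)
    pool = []
    for tier in TIERS[:cutoff + 1]:
        pool.extend(peers_by_tier.get(tier, []))
    return pool


def pick_hops(peers_by_tier, minimum_tier, hops, rng=random):
    """Pick `hops` distinct peers uniformly at random from the eligible pool."""
    pool = eligible(peers_by_tier, minimum_tier)
    if len(pool) < hops:
        raise ValueError("not enough peers in the requested tiers")
    return rng.sample(pool, hops)
```

Under these assumptions a client tunnel would call `pick_hops(peers, "fast", n)`, while an exploratory tunnel would call `pick_hops(peers, "not_failing", n)` and so sample from the better tiers as well.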

- -

-By picking a random key and ordering the peers according to their XOR distance -from it, the information leaked is reduced in predecessor and harvesting attacks -according to the peers' failure rate and the tier's churn. Another simple strategy -for dealing with netDb harvesting attacks is to simply fix the inbound tunnel -gateway(s) yet randomize the peers further on in the tunnels. To deal with -predecessor attacks for adversaries which the client contacts, the outbound tunnel -endpoints would also remain fixed. The selection of which peer to fix on the most -exposed point would of course need to have a limit to the duration, as all peers -fail eventually, so it could either be reactively adjusted or proactively avoided -to mimic a measured mean time between failures of other routers. These two strategies -can in turn be combined, using a fixed exposed peer and an XOR based ordering within -the tunnels themselves. A more rigid strategy would fix the exact peers and ordering -of a potential tunnel, only using individual peers if all of them agree to participate -in the same way each time. This varies from the XOR based ordering in that the -predecessor and successor of each peer is always the same, while the XOR only makes -sure their order doesn't change. -
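
As a rough illustration of the XOR based ordering, the sketch below sorts peers by their XOR distance from a randomly chosen key. It assumes that peers are represented by 32 byte hashes and that the key is drawn once and then kept; the function names are hypothetical.

```python
import hashlib
import os


def new_ordering_key() -> bytes:
    """Draw a random 32 byte key; it is kept fixed so the ordering stays stable."""
    return os.urandom(32)


def xor_distance(a: bytes, b: bytes) -> int:
    """Interpret both values as big-endian integers and XOR them."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def order_peers(peer_hashes, ordering_key: bytes):
    """Order the selected peers by XOR distance from the ordering key."""
    return sorted(peer_hashes, key=lambda h: xor_distance(h, ordering_key))
```

Because the ordering depends only on the key, any two peers appear in the same relative order in every tunnel that uses both, which is the property the paragraph above contrasts with the more rigid strategy of fixing each peer's predecessor and successor.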

- -

-As mentioned before, I2P currently (release 0.6.1.1) includes the tiered random -strategy above, but the others are planned for the 0.6.2 release. A more detailed -discussion of the mechanics involved in tunnel operation, management, and peer -selection can be found in the -tunnel spec. -

- -

Network Database

- -

-As mentioned earlier, I2P's netDb works to share the network's metadata. Two -algorithms are used to accomplish this - primarily, a small set of routers are -designated as "floodfill peers", while the rest of the routers participate in -the Kademlia derived -distributed hash table for redundancy. To integrate the two algorithms, each -router always uses the Kademlia style store and fetch, but acts as if the -floodfill peers are 'closest' to the key in question. Additionally, when a -peer publishes a key into the netDb, after a brief delay they query another -random floodfill peer, asking them for the key, and if that peer does not have -it, they move on and republish the key again. Behind the scenes, when one of -the floodfill peers receives a new valid key, they republish it to the other -floodfill peers who then cache it locally. -

- -

-Each piece of data in the netDb is self authenticating - signed by the -appropriate party and verified by anyone who uses or stores it. In addition, -the data has liveliness information within it, allowing irrelevant entries to be -dropped, newer entries to replace older ones, and, for the paranoid, protection -against certain classes of attack. This is also why I2P bundles the necessary -code for maintaining the correct time, occasionally querying some SNTP servers -(the pool.ntp.org round robin by default) -and detecting skew between routers at the transport layer. -

- -

-The routerInfo structure itself contains all of the information that one router -needs to know to securely send messages to another router. This includes their -identity (made up of a 2048bit ElGamal public key, a 1024bit DSA public key, and -a certificate), the transport addresses which they can be reached on, such as -an IP address and port, when the structure was published, and a set of arbitrary -uninterpreted text options. In addition, there is a signature against all of -that data as generated by the included DSA public key. The key for this routerInfo -structure in the netDb is the SHA256 hash of the router's identity. The options -published are often filled with information helpful in debugging I2P's operation, -but when I2P reaches the 1.0 release, the options will be disabled and kept blank. -

- -

-The leaseSet structure is similar, in that it includes the I2P destination -(comprised of a 2048bit ElGamal public key, a 1024bit DSA public key, and a -certificate), a list of "leases", and a pair of public keys for garlic encrypting -messages to the destination. Each of the leases specifies one of the destination's -inbound tunnel gateways by including the SHA256 of the gateway's identity, a 4 -byte tunnel id on that gateway, and when that tunnel will expire. The key for -the leaseSet in the netDb is the SHA256 of the destination itself. -

- -

-As the router currently automatically bundles the leaseSet for the sender inside -a garlic message to the recipient, the leaseSets for destinations which will not -receive unsolicited messages do not need to be published in the netDb at all. If -the destination itself is sensitive, the leaseSet could instead be transmitted -through other means without ever going into the netDb. -

- -

-Bootstrapping the netDb itself is simple - once a router has at least one routerInfo -of a reachable peer, they query that router for references to other routers in the -network with the Kademlia healing algorithm. Each routerInfo reference is stored in -an individual file in the router's netDb subdirectory, allowing people to easily -share their references to bootstrap new users. -

- -

-Unlike traditional DHTs, the very act of conducting a search distributes the data -as well, since rather than passing Kademlia's standard IP+port pairs, references are given -to the routers that the peer should query next (namely, the SHA256 of those routers' -identities). As such, iteratively searching for a particular destination's leaseSet -or router's routerInfo will also provide you with the routerInfo of the peers along -the way. In addition, due to the time sensitivity of the data published, the information -doesn't often need to migrate between peers - since a tunnel is only valid for 10 -minutes, the leaseSet can be dropped after that time has passed. To take into -account Sybil attacks on the netDb, the Kademlia routing location used for any given -key varies over time. For instance, rather than storing a routerInfo on the peers -closest to SHA256(routerInfo.identity), they are stored on the peers closest to -SHA256(routerInfo.identity + YYYYMMDD), requiring an adversary to remount the attack -daily so as to maintain their closeness to the current routing key. As the -very fact that a router is making a lookup for a given key may expose sensitive data -(and the fact that a router is publishing a given key even more so), all netDb -messages are transmitted through the router's exploratory tunnels. -
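
The daily rotation of the routing key can be sketched as below. The text only gives the formula SHA256(routerInfo.identity + YYYYMMDD); encoding the date as ASCII digits, the choice of calendar day, and the helper names are assumptions made for this example.

```python
import hashlib
from datetime import date


def routing_key(identity: bytes, day: date) -> bytes:
    """SHA256(identity + YYYYMMDD), with the date appended as ASCII digits (assumed)."""
    return hashlib.sha256(identity + day.strftime("%Y%m%d").encode("ascii")).digest()


def closest_peers(key: bytes, peer_hashes, count: int):
    """Return the `count` peers whose hashes are closest to `key` by XOR distance."""
    target = int.from_bytes(key, "big")
    ranked = sorted(peer_hashes, key=lambda h: int.from_bytes(h, "big") ^ target)
    return ranked[:count]
```

Since the routing key changes every day, the set returned by `closest_peers` for the same identity changes too, so an adversary who positioned routers near yesterday's key has to repeat the work for today's.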

- -

-The netDb plays a very specific role in the I2P network, and the algorithms have -been tuned towards our needs. This also means that it hasn't been tuned to address the -needs we have yet to run into. As the network grows, the primary floodfill algorithm -will need to be refined to exploit the capacity available, or perhaps replaced with -another technique for securely distributing the network metadata. -

- -

Transport protocols

- -

-Communication between routers needs to provide confidentiality and integrity -against external adversaries while authenticating that the router contacted -is the one who should receive a given message. The particulars of how routers -communicate with other routers aren't critical - three separate protocols have -been used at different points to provide those bare necessities. To accommodate -the need for high degree communication (as a number of routers will end up -speaking with many others), I2P moved from a TCP based transport -to a UDP based one - "Secure Semireliable UDP", or "SSU". As described in the -SSU spec:

- -
-The goal of this protocol is to provide secure, authenticated, -semireliable, and unordered message delivery, exposing only a minimal amount of -data easily discernible to third parties. It should support high degree -communication as well as TCP-friendly congestion control, and may include -PMTU detection. It should be capable of efficiently moving bulk data at rates -sufficient for home users. In addition, it should support techniques for -addressing network obstacles, like most NATs or firewalls. -
- -

Cryptography

- -

-A bare minimum set of cryptographic primitives is combined to provide I2P's -layered defenses against a variety of adversaries. At the lowest level, interrouter -communication is protected by the transport layer security - SSU -encrypts each packet with AES256/CBC with both an explicit IV and MAC (HMAC-MD5-128) -after agreeing upon an ephemeral session key through a 2048bit Diffie-Hellman exchange -with station-to-station authentication using the other router's DSA key. In addition, each network -message has its own hash for local integrity checking. -Tunnel messages passed over the transports have their own -layered AES256/CBC encryption with an explicit IV and are verified at the tunnel endpoint -with an additional SHA256 hash. Various other messages are passed along inside -"garlic messages", which are encrypted with ElGamal/AES+SessionTags (explained below). -

- -

Garlic messages

- -

-Garlic messages are an extension of "onion" layered encryption, allowing the contents -of a single message to contain multiple "cloves" - fully formed messages alongside -their own instructions for delivery. Messages are wrapped into a garlic message whenever -the message would otherwise be passing in cleartext through a peer who should not have -access to the information - for instance, when a router wants to ask another router to -participate in a tunnel, they wrap the request inside a garlic, encrypt that garlic to -the receiving router's 2048bit ElGamal public key, and forward it through a tunnel. -Another example is when a client wants to send a message to a destination - the sender's -router will wrap up that data message (alongside some other messages) into a garlic, -encrypt that garlic to the 2048bit ElGamal public key published in the recipient's -leaseSet, and forward it through the appropriate tunnels. -

- -

-The "instructions" attached to each clove inside the encryption layer include the -ability to request that the clove be forwarded locally, to a remote router, or to a -remote tunnel on a remote router. There are fields in those instructions allowing a -peer to request that the delivery be delayed until a certain time or condition has -been met, though they won't be honored until the -nontrivial delays are deployed. It is possible to -explicitly route garlic messages any number of hops without building tunnels, or even -to reroute tunnel messages by wrapping them in garlic messages and forwarding them a -number of hops prior to delivering them to the next hop in the tunnel, but those -techniques are not currently used in the existing implementation. -

- -

Session tags

- -

-As an unreliable, unordered, message based system, I2P uses a simple combination of -asymmetric and symmetric encryption algorithms to provide data confidentiality and -integrity to garlic messages. As a whole, the combination is referred to as -ElGamal/AES+SessionTags, but that is an excessively verbose way to describe the simple -use of 2048bit ElGamal, AES256, SHA256, and 32 byte nonces. -

- -

-The first time a router wants to encrypt a garlic message to another router, they encrypt -the keying material for an AES256 session key with ElGamal and append the AES256/CBC -encrypted payload after that encrypted ElGamal block. In addition to the encrypted -payload, the AES encrypted section contains the payload length, the SHA256 hash of the -unencrypted payload, as well as a number of "session tags" - random 32 byte nonces. The -next time the sender wants to encrypt a garlic message to another router, rather than -ElGamal encrypt a new session key they simply pick one of the previously delivered session -tags and AES encrypt the payload like before, using the session key used with that -session tag, prepended with the session tag itself. When a router receives a garlic encrypted -message, they check the first 32 bytes to see if it matches an available session tag - if -it does, they simply AES decrypt the message, but if it does not, they ElGamal decrypt the -first block. -

- -

-Each session tag can be used only once so as to prevent internal adversaries from unnecessarily -correlating different messages as being between the same routers. The sender of an -ElGamal/AES+SessionTag encrypted message chooses when and how many tags to deliver, -prestocking the recipient with enough tags to cover a volley of messages. Garlic messages -may detect the successful tag delivery by bundling a small additional message as a clove (a -"delivery status message") - when the garlic message arrives at the intended recipient and -is decrypted successfully, this small delivery status message is one of the cloves exposed and -has instructions for the recipient to send the clove back to the original sender (through an -inbound tunnel, of course). When the original sender receives this delivery status message, -they know that the session tags bundled in the garlic message were successfully delivered. -
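
The receiving side of the session tag scheme can be sketched as a lookup table from tag to session key. This is only an illustration of the dispatch logic: no real ElGamal or AES is performed, and the class and method names are hypothetical.

```python
TAG_SIZE = 32  # session tags are random 32 byte nonces


class ReceiverTagStore:
    """Maps delivered session tags to the AES session key they belong to."""

    def __init__(self):
        self._tags = {}

    def stock(self, session_key: bytes, tags):
        """Record the tags that arrived inside a successfully decrypted message."""
        for tag in tags:
            self._tags[tag] = session_key

    def classify(self, message: bytes):
        """Decide how an incoming garlic encrypted message should be decrypted.

        Returns ("aes", session_key, body) when the first 32 bytes match a
        stored tag, and ("elgamal", None, message) otherwise.
        """
        tag = message[:TAG_SIZE]
        # pop() removes the tag, so each tag can be matched only once.
        session_key = self._tags.pop(tag, None)
        if session_key is not None:
            return "aes", session_key, message[TAG_SIZE:]
        return "elgamal", None, message
```

A message that reuses an already consumed tag falls through to the ElGamal path in this sketch, where decryption would then fail; how a real router handles that case is not covered here.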

- -

-Session tags themselves have a very short lifetime, after which they are discarded -if not used. In addition, the quantity stored for each key is limited, as are the -number of keys themselves - if too many arrive, either new or old messages may be -dropped. The sender keeps track of whether messages using session tags are getting -through, and if there isn't sufficient communication it may drop the ones previously -assumed to be properly delivered, reverting back to the full expensive ElGamal -encryption. -

- -

-One alternative is to transmit only a single session tag, and from that, seed a -deterministic PRNG for determining what tags to use or expect. By keeping this -PRNG roughly synchronized between the sender and recipient (the recipient precomputes a -window of the next e.g. 50 tags), the overhead of periodically bundling a large number -of tags is removed, allowing more options in the space/time tradeoff, and perhaps -reducing the number of ElGamal encryptions necessary. However, it would depend -upon the strength of the PRNG to provide the necessary cover against internal -adversaries, though perhaps by limiting the number of times each PRNG is used, any -weaknesses can be minimized. At the moment, there are no immediate plans to move -towards these synchronized PRNGs. -
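
A synchronized tag generator of this kind might look roughly like the sketch below. The text does not specify a PRNG, so deriving tag i as SHA256(seed + i) is purely an assumption for illustration, as are the names `tag_stream` and `TagWindow`; only the window of 50 precomputed tags comes from the paragraph above.

```python
import hashlib


def tag_stream(seed: bytes):
    """Yield an endless sequence of 32 byte tags derived from the seed tag."""
    counter = 0
    while True:
        yield hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1


class TagWindow:
    """Recipient side: precompute the next `size` expected tags."""

    def __init__(self, seed: bytes, size: int = 50):
        self._stream = tag_stream(seed)
        self._expected = [next(self._stream) for _ in range(size)]

    def accept(self, tag: bytes) -> bool:
        """Accept a tag at most once, then extend the window by one tag."""
        if tag not in self._expected:
            return False
        self._expected.remove(tag)
        self._expected.append(next(self._stream))
        return True
```

Keeping unused tags in the window lets messages arrive out of order, at the cost that tags from lost messages occupy window slots until some expiry policy, not shown here, removes them.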

- -

Future

-

-While I2P is currently functional and sufficient for many scenarios, there are -several areas which require further improvement to meet the needs of those -facing more powerful adversaries as well as substantial user experience optimization. -

- -

Restricted route operation

- -

-I2P is an overlay network designed to be run on top of a functional packet switched -network, exploiting the end to end principle to offer anonymity and security. -While the Internet no longer fully embraces the end to end principle, I2P does require a -substantial portion of the network to be reachable - there may be a number of peers -along the edges running using restricted routes, but I2P does not include an -appropriate routing algorithm for the degenerate case where most peers are -unreachable. It would, however, work on top of a network employing such an -algorithm. -

- -

-Restricted route operation, where there are limits to what peers are -reachable directly, has several different functional and anonymity -implications, dependent upon how the restricted routes are handled. At the most -basic level, restricted routes exist when a peer is behind a NAT or firewall which -does not allow inbound connections. This was largely addressed in I2P 0.6.0.6 by -integrating distributed hole punching into the transport layer, allowing people -behind most NATs and firewalls to receive unsolicited connections without any -configuration. However, this does not limit the exposure of the peer's IP address to -routers inside the network, as they can simply get introduced to the peer through -the published introducer. -

- -

-Beyond the functional handling of restricted routes, there are two levels of -restricted operation that can be used to limit the exposure of one's IP address - -using router-specific tunnels for communication, and offering 'client routers'. For -the former, routers can either build a new pool of tunnels or reuse their exploratory -pool, publishing the inbound gateways to some of them as part of their routerInfo in -place of their transport addresses. When a peer wants to get in touch with them, -they see those tunnel gateways in the netDb and simply send the relevant message to -them through one of the published tunnels. If the peer behind the restricted route -wants to reply, it may do so either directly (if they are willing to expose their IP -to the peer) or indirectly through their outbound tunnels. When the routers that the -peer has direct connections to want to reach it (to forward tunnel messages, for -instance), they simply prioritize their direct connection over the published tunnel -gateway. The concept of 'client routers' simply extends the restricted route by not -publishing any router addresses. Such a router would not even need to publish their -routerInfo in the netDb, merely providing their self signed routerInfo to the peers -that it contacts (necessary to pass the router's public keys). Both levels of -restricted route operation are planned for I2P 2.0. -

- -

-There are tradeoffs for those behind restricted routes, as they would likely -participate in other people's tunnels less frequently, and the routers which -they are connected to would be able to infer traffic patterns that would not -otherwise be exposed. On the other hand, if the cost of that exposure is less -than the cost of an IP being made available, it may be worthwhile. This, of course, -assumes that the peers that the router behind a restricted route contacts are not -hostile - either the network is large enough that the probability of using a hostile -peer to get connected is small enough, or trusted (and perhaps temporary) peers are -used instead. -

- -

Variable latency

- -

-Even though the bulk of I2P's initial efforts have been on low latency communication, -it was designed with variable latency services in mind from the beginning. At the -most basic level, applications running on top of I2P can offer the anonymity of -medium and high latency communication while still blending their traffic patterns -in with low latency traffic. Internally though, I2P can offer its own medium and -high latency communication through the garlic encryption - specifying that the -message should be sent after a certain delay, at a certain time, after a certain -number of messages have passed, or another mix strategy. With the layered encryption, -only the router to which the clove exposed the delay request would know that the message -requires high latency, allowing the traffic to blend in further with the low latency -traffic. Once the transmission precondition is met, the router holding on to the -clove (which itself would likely be a garlic message) simply forwards it as -requested - to a router, to a tunnel, or, most likely, to a remote client destination. -

- -

-There are a substantial number of ways to exploit this capacity for high latency -comm in I2P, but for the moment, doing so has been scheduled for the I2P 3.0 release. -In the meantime, those requiring the anonymity that high latency comm can offer should -look towards the application layer to provide it. -

- -

Open questions

-
-How to get rid of the timing constraint?
-Can we deal with the sessionTags more efficiently?
-What, if any, batching/mixing strategies should be made available on the tunnels?
-What other tunnel peer selection and ordering strategies should be available?
-
- -

Similar systems

-

-I2P's architecture builds on the concepts of message oriented middleware, the topology -of DHTs, the anonymity and cryptography of free route mixnets, and the adaptability of -packet switched networking. The value comes not from novel concepts or algorithms -though, but from careful engineering combining the research results of existing -systems and papers. While there are a few similar efforts worth reviewing, both for -technical and functional comparisons, two in particular are pulled out here - Tor -and Freenet. -

- -

Tor

-

website

- -

-At first glance, Tor and I2P have many functional and anonymity related similarities. -While I2P's development began before we were aware of the early stage efforts on Tor, -many of the lessons of the original onion routing and ZKS efforts were integrated into -I2P's design. Rather than building an essentially trusted, centralized system with -directory servers, I2P has a self organizing network database with each peer taking on -the responsibility of profiling other routers to determine how best to exploit available -resources. Another key difference is that while both I2P and Tor use layered and -ordered paths (tunnels and circuits/streams), I2P is fundamentally a packet switched -network, while Tor is fundamentally a circuit switched one, allowing I2P to -transparently route around congestion or other network failures, operate redundant -pathways, and load balance the data across available resources. While Tor offers -the useful outproxy functionality by offering integrated outproxy discovery and -selection, I2P leaves such application layer decisions up to applications running on -top of I2P - in fact, I2P has even externalized the TCP-like streaming library itself -to the application layer, allowing developers to experiment with different strategies, -exploiting their domain specific knowledge to offer better performance. -

- -

-From an anonymity perspective, there is much similarity when the core networks are -compared. However, there are a few key differences. When dealing with an internal -adversary or most external adversaries, I2P's simplex tunnels expose half as much -traffic data as would be exposed with Tor's duplex circuits by simply looking at -the flows themselves - an HTTP request and response would follow the same path in -Tor, while in I2P the packets making up the request would go out through one or -more outbound tunnels and the packets making up the response would come back through -one or more different inbound tunnels. While I2P's peer selection and ordering -strategies should sufficiently address predecessor attacks, I2P can trivially -mimic Tor's non-redundant duplex tunnels by simply building an inbound and -outbound tunnel along the same routers.

- -

-Another anonymity issue comes up in Tor's use of telescopic tunnel creation, as -simple packet counting and timing measurements as the cells in a circuit pass -through an adversary's node expose statistical information regarding where the -adversary is within the circuit. I2P instead creates its unidirectional tunnels with a -single message so that this data is not exposed. Protecting the position in a -tunnel is important, as an adversary would otherwise be able to mount a -series of powerful predecessor, intersection, and traffic confirmation attacks. -

- -

-Tor's support for a second tier of "onion proxies" does offer a nontrivial degree -of anonymity while requiring a low cost of entry, while I2P will not offer this -topology until 2.0. -

- -

-On the whole, Tor and I2P complement each other in their focus - Tor works towards -offering high speed anonymous Internet outproxying, while I2P works towards offering -a decentralized resilient network in itself. In theory, both can be used to achieve -both purposes, but given limited development resources, they both have their -strengths and weaknesses. The I2P developers have considered the steps necessary to -modify Tor to take advantage of I2P's design, but concerns of Tor's viability under -resource scarcity suggest that I2P's packet switching architecture will be able to -exploit scarce resources more effectively. -

- -

Freenet

-

website

- -

-Freenet played a large part in the initial stages of I2P's design - giving proof to -the viability of a vibrant pseudonymous community completely contained within the -network, demonstrating that the dangers inherent in outproxies could be avoided. -The first seed of I2P began as a replacement communication layer for Freenet, -attempting to factor out the complexities of a scalable, anonymous and secure point -to point communication from the complexities of a censorship resistant distributed -data store. Over time however, some of the anonymity and scalability issues -inherent in Freenet's algorithms made it clear that I2P's focus should stay strictly -on providing a generic anonymous communication layer, rather than as a component of -Freenet. Over the years, the Freenet developers have come to see the weaknesses -in the older design, prompting them to suggest that they will require a "premix" -layer to offer substantial anonymity. In other words, Freenet needs to run on top -of a mixnet such as I2P or Tor, with "client nodes" requesting and publishing data -through the mixnet to the "server nodes" which then fetch and store the data according -to Freenet's heuristic distributed data storage algorithms. -

- -

-Freenet's functionality is very complementary to I2P's, as Freenet natively provides -many of the tools for operating medium and high latency systems, while I2P natively -provides the low latency mix network suitable for offering adequate anonymity. The -logic of separating the mixnet from the censorship resistant distributed data store -still seems self evident from an engineering, anonymity, security, and resource -allocation perspective, so hopefully the Freenet team will pursue efforts in that -direction, if not simply reusing (or helping to improve, as necessary) existing -mixnets like I2P or Tor. -

- -

-It is worth mentioning that there has recently been discussion and work by the -Freenet developers on a "globally scalable darknet" using restricted routes between -peers of various trust. While insufficient information has been made publicly -available regarding how such a system would operate for a full review, from what -has been said the anonymity and scalability claims seem highly dubious. In -particular, the appropriateness for use in hostile regimes against state level -adversaries has been tremendously overstated, and any analysis on the implications -of resource scarcity upon the scalability of the network has seemingly been avoided. -Further questions regarding susceptibility to traffic analysis, trust, and other topics -do exist, but a more in-depth review of this "globally scalable darknet" will have -to wait until the Freenet team makes more information available. -

- -

Appendix A: Application layer

- -

-I2P itself doesn't really do much - it simply sends messages to remote destinations -and receives messages targeting local destinations - most of the interesting work -goes on at the layers above it. By itself, I2P could be seen as an anonymous and -secure IP layer, and the bundled streaming library as -an implementation of an anonymous and secure TCP layer on top of it. Beyond that, -I2PTunnel exposes a generic TCP proxying system for -either getting into or out of the I2P network, plus a variety of network -applications provide further functionality for end users. -

- -

Streaming library

- -

-The streaming library has grown organically for I2P - first mihi implemented the -"mini streaming library" as part of I2PTunnel, which was limited to a window -size of 1 message (requiring an ACK before sending the next one), and then it was -refactored out into a generic streaming interface (mirroring TCP sockets) and the -full streaming implementation was deployed with a sliding window protocol and -optimizations to take into account the high bandwidth x delay product. Individual -streams may adjust the maximum packet size and other options, though the default -of 4KB compressed seems a reasonable tradeoff between the bandwidth costs of -retransmitting lost messages and the latency of multiple messages. -

- -

-In addition, in consideration of the relatively high cost of subsequent messages, -the streaming library's protocol for scheduling and delivering messages has been optimized to -allow individual messages passed to contain as much information as is available. -For instance, a small HTTP transaction proxied through the streaming library can -be completed in a single round trip - the first message bundles a SYN, FIN, and -the small payload (an HTTP request typically fits) and the reply bundles the SYN, -FIN, ACK, and the small payload (many HTTP responses fit). While an additional -ACK must be transmitted to tell the HTTP server that the SYN/FIN/ACK has been -received, the local HTTP proxy can deliver the full response to the browser -immediately. -

- -

-On the whole, however, the streaming library bears much resemblance to an -abstraction of TCP, with its sliding windows, congestion control algorithms -(both slow start and congestion avoidance), and general packet behavior (ACK, -SYN, FIN, RST, rto calculation, etc). -

- -

Naming library and addressbook

-

Developed by: mihi, Ragnarok

- -

-Naming within I2P has been an oft-debated topic since the very beginning with -advocates across the spectrum of possibilities. However, given I2P's inherent -demand for secure communication and decentralized operation, the traditional -DNS-style naming system is clearly out, as are "majority rules" voting systems. -Instead, I2P ships with a generic naming library and a base implementation -designed to work off a local name to destination mapping, as well as an optional -add-on application called the "addressbook". The addressbook is a web-of-trust -driven secure, distributed, and human readable naming system, sacrificing only -the call for all human readable names to be globally unique by mandating only -local uniqueness. While all messages in I2P are cryptographically addressed -by their destination, different people can have local addressbook entries for -"Alice" which refer to different destinations. People can still discover new -names by importing published addressbooks of peers specified in their web of trust, -by adding in the entries provided through a third party, or (if some people organize -a series of published addressbooks using a first come first serve registration -system) people can choose to treat these addressbooks as name servers, emulating -traditional DNS. -
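The local-uniqueness rule described above can be sketched as a small merge routine. This is only an illustration under stated assumptions, not the addressbook application's actual logic: the function name `merge_addressbooks`, the plain name-to-destination dictionaries, and the "existing entry wins" policy are all hypothetical.

```python
def merge_addressbooks(local, subscriptions):
    """Merge published addressbooks into a local one.

    `local` maps a human readable name to a destination. `subscriptions` is
    a list of such mappings, ordered from most to least trusted (an assumed
    policy). A name already known locally is never overwritten, which keeps
    names locally unique without requiring global uniqueness.
    """
    merged = dict(local)
    conflicts = []
    for book in subscriptions:
        for name, destination in book.items():
            if name not in merged:
                merged[name] = destination
            elif merged[name] != destination:
                # Someone else maps this name elsewhere; keep ours, note theirs.
                conflicts.append((name, destination))
    return merged, conflicts
```

Two users running this against different local addressbooks can end up with different destinations for "Alice", which is exactly the tradeoff the text describes.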

- -

-I2P does not promote the use of DNS-like services though, as the damage done -by hijacking a site can be tremendous - and insecure destinations have no -value. DNSsec itself still falls back on registrars and certificate authorities, -while with I2P, requests sent to a destination cannot be intercepted or the reply -spoofed, as they are encrypted to the destination's public keys, and a destination -itself is just a pair of public keys and a certificate. DNS-style systems on the -other hand allow any of the name servers on the lookup path to mount simple denial -of service and spoofing attacks. Adding on a certificate authenticating the -responses as signed by some centralized certificate authority would address many of -the hostile nameserver issues but would leave open replay attacks as well as -hostile certificate authority attacks. -

- -

-Voting style naming is dangerous as well, especially given the effectiveness of -Sybil attacks in anonymous systems - the attacker can simply create an arbitrarily -high number of peers and "vote" with each to take over a given name. Proof-of-work -methods can be used to make identity non-free, but as the network grows the load -required to contact everyone to conduct online voting is implausible, or if the -full network is not queried, different sets of answers may be reachable. -

- -

-As with the Internet however, I2P is keeping the design and operation of a -naming system out of the (IP-like) communication layer. The bundled naming library -includes a simple service provider interface which alternate naming systems can -plug into, allowing end users to drive what sort of naming tradeoffs they prefer. -

- -

Syndie

- -

-Syndie is a safe, anonymous blogging / content publication / content aggregation system. -It lets you create information, share it with others, and read posts from those you're -interested in, all while taking into consideration your needs for security and anonymity. -Rather than building its own content distribution network, Syndie is designed to run on -top of existing networks, syndicating content through eepsites, Tor hidden services, -Freenet freesites, normal websites, usenet newsgroups, email lists, RSS feeds, etc. Data -published with Syndie is done so as to offer pseudonymous authentication to anyone -reading or archiving it. -

- -

I2PTunnel

-

Developed by: mihi

- -

-I2PTunnel is probably I2P's most popular and versatile client application, allowing -generic proxying both into and out of the I2P network. I2PTunnel can be viewed as -four separate proxying applications - a "client" which receives inbound TCP connections -and forwards them to a given I2P destination, an "httpclient" (aka "eepproxy") which -acts like an HTTP proxy and forwards the requests to the appropriate I2P destination -(after querying the naming service if necessary), a "server" which receives inbound I2P -streaming connections on a destination and forwards them to a given TCP host+port, -and an "httpserver" which extends the "server" by parsing the HTTP requests and -responses to allow safer operation. There is an additional "socksclient" application, -but its use is not encouraged for reasons previously mentioned. -

- -

-I2P itself is not an outproxy network - the anonymity and security concerns inherent -in a mix net which forwards data into and out of the mix have kept I2P's design focused -on providing an anonymous network which is capable of meeting the user's needs without -requiring external resources. However, the I2PTunnel "httpclient" application offers -a hook for outproxying - if the hostname requested doesn't end in ".i2p", it picks a -random destination from a user-provided set of outproxies and forwards the request to -them. These destinations are simply I2PTunnel "server" instances run by volunteers -who have explicitly chosen to run outproxies - no one is an outproxy by default, and -running an outproxy doesn't automatically tell other people to proxy through you. -While outproxies do have inherent weaknesses, they offer a simple proof of concept for -using I2P and provide some functionality under a threat model which may be sufficient -for some users. -
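The outproxy hook described above amounts to a single routing decision per request. A minimal sketch follows, assuming a hypothetical `pick_target` helper and a plain list of outproxy destination names; it is not I2PTunnel's code.

```python
import random


def pick_target(hostname, outproxies, rng=random):
    """Decide where an "httpclient" style proxy would send a request.

    Returns ("i2p", hostname) for names ending in ".i2p", otherwise
    ("outproxy", destination) with a destination chosen at random from the
    user-provided list. `rng` is injectable only to make testing easier.
    """
    if hostname.lower().endswith(".i2p"):
        return ("i2p", hostname)
    if not outproxies:
        # No one is an outproxy by default, so there may be nothing to pick.
        raise ValueError("no outproxies configured")
    return ("outproxy", rng.choice(outproxies))
```

With an empty outproxy list, requests for non-I2P hosts fail rather than leaving the network through some implicit default.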

- -

-I2PTunnel enables most of the applications in use. An "httpserver" pointing at a -webserver lets anyone run their own anonymous website (or "eepsite") - a webserver -is bundled with I2P for this purpose, but any webserver can be used. Anyone may -run a "client" pointing at one of the anonymously hosted IRC servers, each of which -is running a "server" pointing at their local IRCd and communicating between IRCds -over their own "client" tunnels. End users also have "client" tunnels pointing at -I2Pmail's POP3 and SMTP destinations (which in turn are -simply "server" instances pointing at POP3 and SMTP servers), as well as "client" -tunnels pointing at I2P's CVS server, allowing anonymous development. At times people have -even run "client" proxies to access the "server" instances pointing at an NNTP server. -

- -

i2p-bt

-

Developed by: duck, et al

- -

-i2p-bt is a port of the mainline python BitTorrent client to run both the tracker and -peer communication over I2P. Tracker requests are forwarded through the eepproxy to -eepsites specified in the torrent file while tracker responses refer to peers by their -destination explicitly, allowing i2p-bt to open up a -streaming lib connection to query them for blocks. -

- -

-In addition to i2p-bt, a port of bytemonsoon has been made to I2P, making a few -modifications as necessary to strip any anonymity-compromising information from the -application and to take into consideration the fact that IPs cannot be used for -identifying peers. -

- -

I2PSnark

-

Developed by: jrandom, et al, ported from mjw's Snark client

- -

-Bundled with the I2P install, I2PSnark offers a simple anonymous bittorrent -client with multitorrent capabilities, exposing all of the functionality through -a plain HTML web interface. -

- -

Azureus/azneti2p

-

Developed by: parg, et al

- -

-The developers of the Azureus BitTorrent client -have created an "azneti2p" plugin, allowing Azureus users to participate in anonymous -swarms over I2P, or simply to access anonymously hosted trackers while contacting -each peer directly. In addition, Azureus' built in tracker lets people run their -own anonymous trackers without running bytemonsoon (which has substantial prerequisites) -or i2p-bt's tracker. The plugin is currently (July 2005) fully functional, but is in early -beta and has a fairly complicated configuration process, though it is hopefully going -to be streamlined further. -

- -

I2Phex

-

Developed by: sirup

- -

-I2Phex is a fairly direct port of the Phex Gnutella filesharing client to run -entirely on top of I2P. While it has disabled some of Phex's functionality, -such as integration with Gnutella webcaches, the basic file sharing and chatting -system is fully functional. -

- -

I2Pmail/susimail

-

Developed by: postman, susi23, mastiejaner

- -

-I2Pmail is more a service than an application - postman offers both internal and -external email with POP3 and SMTP service through I2PTunnel instances accessing a -series of components developed with mastiejaner, allowing people to use their -preferred mail clients to send and receive mail pseudonymously. However, as most -mail clients expose substantial identifying information, I2P bundles susi23's -web based susimail client which has been built specifically with I2P's anonymity -needs in mind. The I2Pmail/mail.i2p service offers transparent virus filtering as -well as denial of service prevention with hashcash augmented quotas. -In addition, each user has control of their batching strategy prior to delivery -through the mail.i2p outproxies, which are separate from the mail.i2p SMTP and -POP3 servers - both the outproxies and inproxies communicate with the mail.i2p -SMTP and POP3 servers through I2P itself, so compromising those non-anonymous -locations does not give access to the mail accounts or activity patterns of the -user. At the moment the developers are working on a decentralized mail system, called -"v2mail". More information can be found on the eepsite -hq.postman.i2p. -

- - - diff --git a/router/doc/tunnel-alt-creation.html b/router/doc/tunnel-alt-creation.html deleted file mode 100644 index 0eb4a5d901..0000000000 --- a/router/doc/tunnel-alt-creation.html +++ /dev/null @@ -1,163 +0,0 @@ -$Id: tunnel-alt-creation.html,v 1.1.2.1 2006/02/01 20:28:34 jrandom Exp $ -
-1) Tunnel creation
-1.1) Tunnel creation request record
-1.2) Hop processing
-1.3) Tunnel creation reply record
-1.4) Request preparation
-1.5) Request delivery
-1.6) Endpoint handling
-1.7) Reply processing
-2) Notes
-
- -

1) Tunnel creation encryption:

- -

The tunnel creation is accomplished by a single message passed along -the path of peers in the tunnel, rewritten in place, and transmitted -back to the tunnel creator. This single tunnel message is made up -of a fixed number of records (8) - one for each potential peer in -the tunnel. Individual records are asymmetrically encrypted to be -read only by a specific peer along the path, while an additional -symmetric layer of encryption is added at each hop so as to expose -the asymmetrically encrypted record only at the appropriate time.

- -

1.1) Tunnel creation request record

- -

Cleartext of the record, visible only to the hop being asked:

-  bytes     0-3: tunnel ID to receive messages as
-  bytes    4-35: local router identity hash
-  bytes   36-39: next tunnel ID
-  bytes   40-71: next router identity hash
-  bytes  72-103: AES-256 tunnel layer key
-  bytes 104-135: AES-256 tunnel IV key
-  bytes 136-167: AES-256 reply key
-  bytes 168-183: reply IV
-  byte      184: flags
-  bytes 185-188: request time (in hours since the epoch)
-  bytes 189-192: next message ID
-  bytes 193-222: uninterpreted / random padding
- -

The next tunnel ID and next router identity hash fields are used to -specify the next hop in the tunnel, though for an outbound tunnel -endpoint, they specify where the rewritten tunnel creation reply -message should be sent. In addition, the next message ID specifies the -message ID that the message (or reply) should use.

- -

The flags field currently has two bits defined:

- bit 0: if set, allow messages from anyone
- bit 1: if set, allow messages to anyone, and send the reply to the
-        specified next hop in a tunnel message
- -

That cleartext record is ElGamal 2048 encrypted with the hop's -public encryption key and formatted into a 528 byte record:

-  bytes   0-15: SHA-256-128 of the current hop's router identity
-  bytes 16-527: ElGamal-2048 encrypted request record
- -

Since the cleartext uses the full field, there is no need for -additional padding beyond SHA256(cleartext) + cleartext.

- -

1.2) Hop processing

- -

When a hop receives a TunnelBuildMessage, it looks through the 8 -records contained within it for one starting with their own identity -hash (trimmed to 16 bytes). It then decrypts the ElGamal block from -that record and retrieves the protected cleartext. At that point, -they make sure the tunnel request is not a duplicate by feeding the -AES-256 reply key into a bloom filter and making sure the request -time is within an hour of current. Duplicates or invalid requests -are dropped.

- -

After deciding whether they will agree to participate in the tunnel -or not, they replace the record that had contained the request with -an encrypted reply block. All other records are AES-256/CBC -encrypted with the included reply key and IV (though each is -encrypted separately, rather than chained across records).
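The record lookup and freshness check from this section can be sketched as follows. The sketch uses the 16 byte hash prefix from the record layout in 1.1 and assumes that "SHA-256-128" means the first 16 bytes of the SHA-256 digest; the helper names are hypothetical, and the ElGamal decryption and Bloom filter steps are left out.

```python
import hashlib

PREFIX_LEN = 16  # bytes 0-15 of each 528 byte record


def find_own_record(records, router_identity):
    """Return (index, encrypted_block) for the record addressed to us, or None."""
    prefix = hashlib.sha256(router_identity).digest()[:PREFIX_LEN]
    for index, record in enumerate(records):
        if record[:PREFIX_LEN] == prefix:
            return index, record[PREFIX_LEN:]
    return None


def is_fresh(request_hours, now_seconds):
    """Accept a request whose hour stamp is within one hour of the current hour."""
    current_hours = now_seconds // 3600
    return abs(current_hours - request_hours) <= 1
```

A hop that finds no matching record, or a stale time stamp, would drop the message as described above.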

- -

1.3) Tunnel creation reply record

- -

After the current hop reads their record, they replace it with a -reply record stating whether or not they agree to participate in the -tunnel, and if they do not, they classify their reason for -rejection. This is simply a 1 byte value, with 0x0 meaning they -agree to participate in the tunnel, and higher values meaning higher -levels of rejection. The reply is encrypted with the AES session -key delivered to it in the encrypted block, padded with random data -until it reaches the full record size:

-  AES-256-CBC(SHA-256(padding+status) + padding + status, key, IV)
- -
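The plaintext that goes into that AES step can be assembled and checked with standard hashing alone. This sketch assumes the "full record size" is the 528 bytes given in 1.1 and omits the AES-256-CBC encryption itself; `build_reply_plaintext` and `read_reply_plaintext` are hypothetical names.

```python
import hashlib
import os

RECORD_SIZE = 528  # assumed to match the request record size from 1.1
HASH_LEN = 32      # SHA-256 digest length


def build_reply_plaintext(status, padding=None):
    """Return SHA-256(padding + status) + padding + status, RECORD_SIZE bytes long."""
    pad_len = RECORD_SIZE - HASH_LEN - 1
    if padding is None:
        padding = os.urandom(pad_len)
    if len(padding) != pad_len:
        raise ValueError("padding must be %d bytes" % pad_len)
    if not 0 <= status <= 255:
        raise ValueError("status is a single byte")
    body = padding + bytes([status])
    return hashlib.sha256(body).digest() + body


def read_reply_plaintext(plaintext):
    """Verify the leading hash and return the status byte (0 means accepted)."""
    if len(plaintext) != RECORD_SIZE:
        raise ValueError("unexpected record size")
    digest, body = plaintext[:HASH_LEN], plaintext[HASH_LEN:]
    if hashlib.sha256(body).digest() != digest:
        raise ValueError("reply hash mismatch")
    return body[-1]
```

Since 528 is a multiple of 16, the plaintext fits whole AES blocks without extra padding.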

1.4) Request preparation

- -

When building a new request, all of the records must first be -built and asymmetrically encrypted. Each record should then be -decrypted with the reply keys and IVs of the hops earlier in the -path. That decryption should be run in reverse order so that the -asymmetrically encrypted data will show up in the clear at the -right hop after their predecessor encrypts it.

- -

The excess records not needed for individual requests are simply -filled with random data by the creator.
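The reverse-order pre-decryption can be checked with a small simulation. To stay within the standard library, the sketch swaps AES-256/CBC for a toy keyed transform (`toy_encrypt` / `toy_decrypt`) that is NOT secure; it only preserves the two properties that matter here, namely that decryption inverts encryption and that layers do not commute. All names are hypothetical.

```python
import hashlib


def _keystream(key, iv, length):
    # Expand key and IV into `length` pseudo-random bytes (toy construction).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(data, key, iv):
    # Stand-in for AES-256/CBC: add the keystream, then rotate the bytes.
    ks = _keystream(key, iv, len(data))
    mixed = bytes((b + k) % 256 for b, k in zip(data, ks))
    shift = ks[0] % len(data)
    return mixed[shift:] + mixed[:shift]


def toy_decrypt(data, key, iv):
    # Exact inverse of toy_encrypt: undo the rotation, then subtract.
    ks = _keystream(key, iv, len(data))
    split = len(data) - ks[0] % len(data)
    mixed = data[split:] + data[:split]
    return bytes((b - k) % 256 for b, k in zip(mixed, ks))


def prepare_records(records, reply_keys):
    """records[i] is the asymmetrically encrypted record for hop i and
    reply_keys[i] is that hop's (reply key, reply IV) pair."""
    prepared = []
    for i, record in enumerate(records):
        # Hops 0..i-1 will each encrypt this record before hop i sees it,
        # so strip those layers in advance, last-applied layer first.
        for key, iv in reversed(reply_keys[:i]):
            record = toy_decrypt(record, key, iv)
        prepared.append(record)
    return prepared
```

Walking the prepared message through the hops in order, with each hop encrypting every record but its own, leaves each hop's record in the clear exactly when that hop receives it.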

- -

1.5) Request delivery

- -

For outbound tunnels, the delivery is done directly from the tunnel -creator to the first hop, packaging up the TunnelBuildMessage as if -the creator was just another hop in the tunnel. For inbound -tunnels, the delivery is done through an existing outbound tunnel -(and during startup, when no outbound tunnel exists yet, a fake 0 -hop outbound tunnel is used).

- -

1.6) Endpoint handling

- -

When the request reaches an outbound endpoint (as determined by the -'allow messages to anyone' flag), the hop is processed as usual, -encrypting a reply in place of the record and encrypting all of the -other records, but since there is no 'next hop' to forward the -TunnelBuildMessage on to, it instead places the encrypted reply -records into a TunnelBuildReplyMessage and delivers it to the -reply tunnel specified within the request record. That reply tunnel -forwards the reply records down to the tunnel creator for -processing, as below.

- -

When the request reaches the inbound endpoint (also known as the -tunnel creator), the router processes each of the replies, as below.

- -

1.7) Reply processing

- -

To process the reply records, the creator simply has to AES decrypt -each record individually, using the reply key and IV of each hop in -the tunnel after the peer (in reverse order). This then exposes the -reply specifying whether they agree to participate in the tunnel or -why they refuse. If they all agree, the tunnel is considered -created and may be used immediately, but if anyone refuses, the -tunnel is discarded.

- -

2) Notes

- diff --git a/router/doc/tunnel-alt.html b/router/doc/tunnel-alt.html deleted file mode 100644 index 2d5f2be2ff..0000000000 --- a/router/doc/tunnel-alt.html +++ /dev/null @@ -1,467 +0,0 @@ -$Id: tunnel-alt.html,v 1.9 2005/07/27 14:04:07 jrandom Exp $ -
-1) Tunnel overview
-2) Tunnel operation
-2.1) Message preprocessing
-2.2) Gateway processing
-2.3) Participant processing
-2.4) Endpoint processing
-2.5) Padding
-2.6) Tunnel fragmentation
-2.7) Alternatives
-2.7.1) Adjust tunnel processing midstream
-2.7.2) Use bidirectional tunnels
-2.7.3) Backchannel communication
-2.7.4) Variable size tunnel messages
-3) Tunnel building
-3.1) Peer selection
-3.1.1) Exploratory tunnel peer selection
-3.1.2) Client tunnel peer selection
-3.2) Request delivery
-3.3) Pooling
-3.4) Alternatives
-3.4.1) Telescopic building
-3.4.2) Non-exploratory tunnels for management
-4) Tunnel throttling
-5) Mixing/batching
-
- -

1) Tunnel overview

- -

Within I2P, messages are passed in one direction through a virtual -tunnel of peers, using whatever means are available to pass the -message on to the next hop. Messages arrive at the tunnel's -gateway, get bundled up and/or fragmented into fixed size tunnel messages, -and are forwarded on to the next hop in the tunnel, which processes and verifies -the validity of the message and sends it on to the next hop, and so on, until -it reaches the tunnel endpoint. That endpoint takes the messages -bundled up by the gateway and forwards them as instructed - either -to another router, to another tunnel on another router, or locally.

- -

Tunnels all work the same, but can be segmented into two different -groups - inbound tunnels and outbound tunnels. The inbound tunnels -have an untrusted gateway which passes messages down towards the -tunnel creator, which serves as the tunnel endpoint. For outbound -tunnels, the tunnel creator serves as the gateway, passing messages -out to the remote endpoint.

- -

The tunnel's creator selects exactly which peers will participate -in the tunnel, and provides each with the necessary configuration -data. Tunnels may have any number of hops, though adding hops may be -constrained by various proof-of-work requests. It is the intent to make -it hard for either participants or third parties to determine the length of -a tunnel, or even for colluding participants to determine whether they are a -part of the same tunnel at all (barring the situation where colluding peers are -next to each other in the tunnel).

- -

Beyond their length, there are additional configurable parameters -for each tunnel that can be used, such as a throttle on the frequency of -messages delivered, how padding should be used, how long a tunnel should be -in operation, whether to inject chaff messages, and what, if any, batching -strategies should be employed.

- -

In practice, a series of tunnel pools are used for different -purposes - each local client destination has its own set of inbound -tunnels and outbound tunnels, configured to meet its anonymity and -performance needs. In addition, the router itself maintains a series -of pools for participating in the network database and for managing -the tunnels themselves.

- -

I2P is an inherently packet switched network, even with these -tunnels, allowing it to take advantage of multiple tunnels running -in parallel, increasing resilience and balancing load. Outside of -the core I2P layer, there is an optional end to end streaming library -available for client applications, exposing TCP-esque operation, -including message reordering, retransmission, congestion control, etc.

- -

2) Tunnel operation

- -

Tunnel operation has four distinct processes, taken on by various -peers in the tunnel. First, the tunnel gateway accumulates a number -of tunnel messages and preprocesses them into something for tunnel -delivery. Next, that gateway encrypts that preprocessed data, then -forwards it to the first hop. That peer, and subsequent tunnel -participants, unwrap a layer of the encryption, verifying that it isn't -a duplicate, then forward it on to the next peer. -Eventually, the message arrives at the endpoint where the messages -bundled by the gateway are split out again and forwarded on as -requested.

- -

Tunnel IDs are 4 byte numbers used at each hop - participants know what -tunnel ID to listen for messages with and what tunnel ID they should be forwarded -on as to the next hop, and each hop chooses the tunnel ID which they receive messages -on. Tunnels themselves are short lived (10 minutes at the -moment), and even if subsequent tunnels are built using the same sequence of -peers, each hop's tunnel ID will change.

- -

2.1) Message preprocessing

- -

When the gateway wants to deliver data through the tunnel, it first -gathers zero or more I2NP messages, selects how much padding will be used, -fragments it across the necessary number of 1KB tunnel messages, and decides how -each I2NP message should be handled by the tunnel endpoint, encoding that -data into the raw tunnel payload:

- - -

The instructions are encoded with a single control byte, followed by any -necessary additional information. The first bit in that control byte determines -how the remainder of the header is interpreted - if it is not set, the message -is either not fragmented or this is the first fragment in the message. If it is -set, this is a follow on fragment.

- -

With the first bit being 0, the instructions are:

- - -

With the first bit being 1, the instructions are:

- - -

The I2NP message is encoded in its standard form, and the -preprocessed payload must be padded to a multiple of 16 bytes.
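The padding requirement is a one-line calculation. The helper name `padding_needed` is hypothetical, and the sketch only covers the 16 byte alignment, not the fragmentation across tunnel messages.

```python
def padding_needed(payload_len, block=16):
    """Bytes of padding required to reach the next multiple of `block`."""
    return -payload_len % block
```

For example, a 17 byte payload needs 15 bytes of padding to reach 32, and a payload that is already aligned needs none.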

- -

2.2) Gateway processing

- -

After the preprocessing of messages into a padded payload, the gateway builds -a random 16 byte IV value, iteratively encrypting it and the tunnel message as -necessary, and forwards the tuple {tunnelID, IV, encrypted tunnel message} to the next hop.

- -

How encryption at the gateway is done depends on whether the tunnel is an -inbound or an outbound tunnel. For inbound tunnels, they simply select a random -IV, postprocessing and updating it to generate the IV for the gateway and using -that IV alongside their own layer key to encrypt the preprocessed data. For outbound -tunnels they must iteratively decrypt the (unencrypted) IV and preprocessed -data with the IV and layer keys for all hops in the tunnel. The result of the outbound -tunnel encryption is that when each peer encrypts it, the endpoint will recover -the initial preprocessed data.

- -

2.3) Participant processing

- -

When a peer receives a tunnel message, it checks that the message came from the same previous hop as before (initialized when the first message comes through the tunnel). If the previous peer is a different router, or if the message has already been seen, the message is dropped. The participant then encrypts the received IV with AES256/ECB using their IV key to determine the current IV, uses that IV with the participant's layer key to encrypt the data, encrypts the current IV with AES256/ECB using their IV key again, then forwards the tuple {nextTunnelId, nextIV, encryptedData} to the next hop. This double encryption of the IV (both before and after use) helps address a certain class of confirmation attacks.

- -

Duplicate message detection is handled by a decaying Bloom filter on message -IVs. Each router maintains a single Bloom filter to contain the XOR of the IV and -the first block of the message received for all of the tunnels it is participating -in, modified to drop seen entries after 10-20 minutes (when the tunnels will have -expired). The size of the bloom filter and the parameters used are sufficient to -more than saturate the router's network connection with a negligible chance of -false positive. The unique value fed into the Bloom filter is the XOR of the IV -and the first block so as to prevent nonsequential colluding peers in the tunnel -from tagging a message by resending it with the IV and first block switched.
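
One way to picture a decaying Bloom filter is as two ordinary Bloom filters, a current and a previous generation, rotated on a timer so that entries survive for between one and two periods (10 to 20 minutes with a 10 minute period). The Python sketch below follows that idea. It is an assumption about how such a filter could be built, not a description of the router's actual data structure, and the class name, sizes, and hash construction are all hypothetical.

```python
import hashlib


class DecayingBloomFilter:
    """Two-generation Bloom filter; entries expire after 1-2 periods."""

    def __init__(self, bits=1 << 20, hashes=4, period=600.0, start=0.0):
        if bits % 8 or not 1 <= hashes <= 8:
            raise ValueError("bits must be a multiple of 8 and 1 <= hashes <= 8")
        self.bits, self.hashes, self.period = bits, hashes, period
        self.current = bytearray(bits // 8)
        self.previous = bytearray(bits // 8)
        self.rotated_at = start

    def _positions(self, key):
        # Derive the bit positions from 4-byte slices of one SHA-256 digest.
        digest = hashlib.sha256(key).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.bits
                for i in range(self.hashes)]

    def _rotate(self, now):
        elapsed = now - self.rotated_at
        if elapsed >= 2 * self.period:
            # Both generations are stale: forget everything.
            self.current = bytearray(self.bits // 8)
            self.previous = bytearray(self.bits // 8)
            self.rotated_at = now
        elif elapsed >= self.period:
            # Age the current generation and start a fresh one.
            self.previous = self.current
            self.current = bytearray(self.bits // 8)
            self.rotated_at += self.period

    def seen_before(self, iv, first_block, now):
        """Record a message; return True if it was (probably) seen already."""
        self._rotate(now)
        # The key is the XOR of the IV and the first block, as in the text.
        key = bytes(a ^ b for a, b in zip(iv, first_block))
        positions = self._positions(key)
        hit = any(all(gen[p // 8] & (1 << (p % 8)) for p in positions)
                  for gen in (self.current, self.previous))
        for p in positions:
            self.current[p // 8] |= 1 << (p % 8)
        return hit
```

Because XOR is symmetric, a message resent with the IV and first block switched produces the same key and is flagged as a duplicate.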

- -

2.4) Endpoint processing

- -

After receiving and validating a tunnel message at the last hop in the tunnel, -how the endpoint recovers the data encoded by the gateway depends upon whether -the tunnel is an inbound or an outbound tunnel. For outbound tunnels, the -endpoint encrypts the message with its layer key just like any other participant, -exposing the preprocessed data. For inbound tunnels, the endpoint is also the -tunnel creator so they can merely iteratively decrypt the IV and message, using the -layer and IV keys of each step in reverse order.

- -

At this point, the tunnel endpoint has the preprocessed data sent by the gateway, which it may then parse out into the included I2NP messages and forward them as requested in their delivery instructions.

- -

2.5) Padding

- -

Several tunnel padding strategies are possible, each with their own merits:

- - - -

These padding strategies can be used on a variety of levels, addressing the -exposure of message size information to different adversaries. After gathering -and reviewing some statistics -from the 0.4 network, as well as exploring the anonymity tradeoffs, we're starting -with a fixed tunnel message size of 1024 bytes. Within this however, the fragmented -messages themselves are not padded by the tunnel at all (though for end to end -messages, they may be padded as part of the garlic wrapping).

- -

2.6) Tunnel fragmentation

- -

To prevent adversaries from tagging the messages along the path by adjusting -the message size, all tunnel messages are a fixed 1024 bytes in size. To accommodate -larger I2NP messages as well as to support smaller ones more efficiently, the -gateway splits up the larger I2NP messages into fragments contained within each -tunnel message. The endpoint will attempt to rebuild the I2NP message from the -fragments for a short period of time, but will discard them as necessary.
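
The split and rebuild steps can be sketched as follows. The per-message `capacity` is a hypothetical parameter standing in for whatever room is left in a 1024 byte tunnel message after its headers, which this section does not specify; the function names are likewise illustrative.

```python
from typing import Dict, List, Optional


def fragment(message: bytes, capacity: int) -> List[bytes]:
    """Split one I2NP message into fragments of at most `capacity` bytes."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return [message[i:i + capacity] for i in range(0, len(message), capacity)]


def reassemble(fragments: Dict[int, bytes], total: int) -> Optional[bytes]:
    """Rebuild the message once every fragment 0..total-1 has arrived.

    Returns None while fragments are missing; an endpoint would keep the
    partial set for a short time and then discard it.
    """
    if any(index not in fragments for index in range(total)):
        return None
    return b"".join(fragments[index] for index in range(total))
```

Keying the received fragments by index lets them arrive in any order, which matters because no ordering guarantees are made for tunnel messages.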

- -

Routers have a lot of leeway as to how the fragments are arranged, whether -they are stuffed inefficiently as discrete units, batched for a brief period to -fit more payload into the 1024 byte tunnel messages, or opportunistically padded -with other messages that the gateway wanted to send out.

- -

2.7) Alternatives

- -

2.7.1) Adjust tunnel processing midstream

- -

While the simple tunnel routing algorithm should be sufficient for most cases, -there are three alternatives that can be explored:

- - -

2.7.2) Use bidirectional tunnels

- -

The current strategy of using two separate tunnels for inbound and outbound -communication is not the only technique available, and it does have anonymity -implications. On the positive side, by using separate tunnels it lessens the -traffic data exposed for analysis to participants in a tunnel - for instance, -peers in an outbound tunnel from a web browser would only see the traffic of -an HTTP GET, while the peers in an inbound tunnel would see the payload -delivered along the tunnel. With bidirectional tunnels, all participants would -have access to the fact that e.g. 1KB was sent in one direction, then 100KB -in the other. On the negative side, using unidirectional tunnels means that -there are two sets of peers which need to be profiled and accounted for, and -additional care must be taken to address the increased speed of predecessor -attacks. The tunnel pooling and building process outlined below should -minimize the worries of the predecessor attack, though if it were desired, -it wouldn't be much trouble to build both the inbound and outbound tunnels -along the same peers.

- -

2.7.3) Backchannel communication

- -

At the moment, the IV values used are random values. However, it is possible for that 16 byte value to be used to send control messages from the gateway to the endpoint, or on outbound tunnels, from the gateway to any of the peers. The inbound gateway could encode certain values in the IV once, which the endpoint would be able to recover (since it knows the endpoint is also the creator). For outbound tunnels, the creator could deliver certain values to the participants during the tunnel creation (e.g. "if you see 0x0 as the IV, that means X", "0x1 means Y", etc). Since the gateway on the outbound tunnel is also the creator, they can build an IV so that any of the peers will receive the correct value. The tunnel creator could even give the inbound tunnel gateway a series of IV values which that gateway could use to communicate with individual participants exactly one time (though this would have issues regarding collusion detection).

- -

This technique could later be used to deliver messages mid stream, or to allow the inbound gateway to tell the endpoint that it is being DoS'ed or otherwise soon to fail. At the moment, there are no plans to exploit this backchannel.

- -

2.7.4) Variable size tunnel messages

- -

While the transport layer may have its own fixed or variable message size, using its own fragmentation, the tunnel layer may instead use variable size tunnel messages. The difference is an issue of threat models - a fixed size at the transport layer helps reduce the information exposed to external adversaries (though overall flow analysis still works), but for internal adversaries (aka tunnel participants) the message size is exposed. Fixed size tunnel messages help reduce the information exposed to tunnel participants, but do not hide the information exposed to tunnel endpoints and gateways. Fixed size end to end messages hide the information exposed to all peers in the network.

- -

As always, it's a question of who I2P is trying to protect against. Variable sized tunnel messages are dangerous, as they allow participants to use the message size itself as a backchannel to other participants - e.g. if you see a 1337 byte message, you're on the same tunnel as another colluding peer. Even with a fixed set of allowable sizes (1024, 2048, 4096, etc), that backchannel still exists as peers could use the frequency of each size as the carrier (e.g. two 1024 byte messages followed by an 8192). Smaller messages do incur the overhead of the headers (IV, tunnel ID, hash portion, etc), but larger fixed size messages either increase latency (due to batching) or dramatically increase overhead (due to padding). Fragmentation helps amortize the overhead, at the cost of potential message loss due to lost fragments.

- -

Timing attacks are also relevant when reviewing the effectiveness of fixed size messages, though they require a substantial view of network activity patterns to be effective. Excessive artificial delays in the tunnel will be detected by the tunnel's creator, due to periodic testing, causing that entire tunnel to be scrapped and the profiles for peers within it to be adjusted.

- -

3) Tunnel building

- -

When building a tunnel, the creator must send a request with the necessary configuration data to each of the hops and wait for all of them to agree before enabling the tunnel. The requests are encrypted so that only the peers who need to know a piece of information (such as the tunnel layer or IV key) have that data. In addition, only the tunnel creator will have access to the peer's reply. There are three important dimensions to keep in mind when producing the tunnels: what peers are used (and where), how the requests are sent (and replies received), and how they are maintained.

- -

3.1) Peer selection

- -

Beyond the two types of tunnels - inbound and outbound - there are two styles -of peer selection used for different tunnels - exploratory and client. -Exploratory tunnels are used for both network database maintenance and tunnel -maintenance, while client tunnels are used for end to end client messages.

- -

3.1.1) Exploratory tunnel peer selection

- -

Exploratory tunnels are built out of a random selection of peers from a subset -of the network. The particular subset varies on the local router and on what their -tunnel routing needs are. In general, the exploratory tunnels are built out of -randomly selected peers who are in the peer's "not failing but active" profile -category. The secondary purpose of the tunnels, beyond merely tunnel routing, -is to find underutilized high capacity peers so that they can be promoted for -use in client tunnels.

- -

3.1.2) Client tunnel peer selection

- -

Client tunnels are built with a more stringent set of requirements - the local -router will select peers out of its "fast and high capacity" profile category so -that performance and reliability will meet the needs of the client application. -However, there are several important details beyond that basic selection that -should be adhered to, depending upon the client's anonymity needs.

- -

For some clients who are worried about adversaries mounting a predecessor -attack, the tunnel selection can keep the peers selected in a strict order - -if A, B, and C are in a tunnel, the hop after A is always B, and the hop after -B is always C. A less strict ordering is also possible, assuring that while -the hop after A may be B, B may never be before A. Other configuration options -include the ability for just the inbound tunnel gateways and outbound tunnel -endpoints to be fixed, or rotated on an MTBF rate.
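
The less strict ordering (B may never come before A) can be sketched by giving every peer a stable rank and sorting each tunnel's peers by it. The Python below derives that rank from an HMAC over the peer's identity with a secret kept by the local router. This is only one possible way to get a consistent order and is an assumption, not the router's method; `order_peers` and `ordering_key` are hypothetical names.

```python
import hashlib
import hmac
from typing import Iterable, List


def order_peers(selected: Iterable[bytes], ordering_key: bytes) -> List[bytes]:
    """Arrange the selected peers in a stable, router-specific order.

    Each peer's rank is the HMAC-SHA256 of its identity under a secret
    key, so two peers always appear in the same relative order no matter
    which other peers share the tunnel.
    """
    return sorted(
        selected,
        key=lambda peer: hmac.new(ordering_key, peer, hashlib.sha256).digest(),
    )
```

Keeping the key secret means other routers cannot predict where a given peer will sit relative to another in this router's tunnels.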

- -

In the initial implementation, only random ordering has been implemented, -though more strict ordering will be developed and deployed over time, as well -as controls for the user to select which strategy to use for individual clients.

- -

3.2) Request delivery

- -

A new tunnel request preparation, delivery, and response method has been -devised, which reduces the number of -predecessors exposed, cuts the number of messages transmitted, verifies proper -connectivity, and avoids the message counting attack of traditional telescopic -tunnel creation. The old technique is listed below as an alternative.

- -

Peers may reject tunnel creation requests for a variety of reasons, though a series of four increasingly severe rejections are known: probabilistic rejection (due to approaching the router's capacity, or in response to a flood of requests), transient overload, bandwidth overload, and critical failure. When received, those four are interpreted by the tunnel creator to help adjust their profile of the router in question.
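
The four rejection levels lend themselves to an ordered enumeration. In the sketch below the member names follow the text, but the numeric values and the helper function are hypothetical; the text only says that the severities increase in this order.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Rejection(IntEnum):
    """Tunnel request rejections, ordered from least to most severe."""
    PROBABILISTIC = 1       # near capacity, or reacting to a request flood
    TRANSIENT_OVERLOAD = 2
    BANDWIDTH_OVERLOAD = 3
    CRITICAL_FAILURE = 4


def worst_rejection(replies: Iterable[Rejection]) -> Optional[Rejection]:
    """Return the most severe rejection seen, or None if there were none."""
    return max(replies, default=None)
```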

- -

3.3) Pooling

- -

To allow efficient operation, the router maintains a series of tunnel pools, -each managing a group of tunnels used for a specific purpose with their own -configuration. When a tunnel is needed for that purpose, the router selects one -out of the appropriate pool at random. Overall, there are two exploratory tunnel -pools - one inbound and one outbound - each using the router's exploration -defaults. In addition, there is a pair of pools for each local destination - -one inbound and one outbound tunnel. Those pools use the configuration specified -when the local destination connected to the router, or the router's defaults if -not specified.

- -

Each pool has within its configuration a few key settings, defining how many -tunnels to keep active, how many backup tunnels to maintain in case of failure, -how frequently to test the tunnels, how long the tunnels should be, whether those -lengths should be randomized, how often replacement tunnels should be built, as -well as any of the other settings allowed when configuring individual tunnels.
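
A pool and its settings might be modelled along these lines. The field names mirror the settings listed above, but every default value is a placeholder chosen for illustration, and tunnels are represented as plain strings to keep the sketch self-contained.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class PoolSettings:
    active_tunnels: int = 2           # how many tunnels to keep active
    backup_tunnels: int = 0           # spares kept in case of failure
    test_interval_s: float = 60.0     # how frequently to test tunnels
    length: int = 2                   # hops per tunnel
    randomize_length: bool = False    # whether to vary the length
    rebuild_interval_s: float = 600.0 # how often replacements are built


@dataclass
class TunnelPool:
    settings: PoolSettings
    tunnels: List[str] = field(default_factory=list)

    def select(self) -> str:
        """Pick one tunnel from the pool at random."""
        if not self.tunnels:
            raise LookupError("pool has no tunnels")
        return random.choice(self.tunnels)

    def shortfall(self) -> int:
        """Number of tunnels still to build to reach active plus backup."""
        wanted = self.settings.active_tunnels + self.settings.backup_tunnels
        return max(0, wanted - len(self.tunnels))
```

A router would hold one inbound and one outbound pool of this kind for exploration, plus a pair for each local destination.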

- -

3.4) Alternatives

- -

3.4.1) Telescopic building

- -

One question that may arise regarding the use of the exploratory tunnels for -sending and receiving tunnel creation messages is how that impacts the tunnel's -vulnerability to predecessor attacks. While the endpoints and gateways of -those tunnels will be randomly distributed across the network (perhaps even -including the tunnel creator in that set), another alternative is to use the -tunnel pathways themselves to pass along the request and response, as is done -in TOR. This, however, may lead to leaks -during tunnel creation, allowing peers to discover how many hops there are later -on in the tunnel by monitoring the timing or packet count as -the tunnel is built.

- -

3.4.2) Non-exploratory tunnels for management

- -

A second alternative to the tunnel building process is to give the router -an additional set of non-exploratory inbound and outbound pools, using those for -the tunnel request and response. Assuming the router has a well integrated view -of the network, this should not be necessary, but if the router was partitioned -in some way, using non-exploratory pools for tunnel management would reduce the -leakage of information about what peers are in the router's partition.

- -

3.4.3) Exploratory request delivery

- -

A third alternative, used until I2P 0.6.2, garlic encrypts individual tunnel -request messages and delivers them to the hops individually, transmitting them -through exploratory tunnels with their reply coming back in a separate -exploratory tunnel. This strategy has been dropped in favor of the one outlined -above.

- -

4) Tunnel throttling

- -

Even though the tunnels within I2P bear a resemblance to a circuit switched -network, everything within I2P is strictly message based - tunnels are merely -accounting tricks to help organize the delivery of messages. No assumptions are -made regarding reliability or ordering of messages, and retransmissions are left -to higher levels (e.g. I2P's client layer streaming library). This allows I2P -to take advantage of throttling techniques available to both packet switched and -circuit switched networks. For instance, each router may keep track of the -moving average of how much data each tunnel is using, combine that with all of -the averages used by other tunnels the router is participating in, and be able -to accept or reject additional tunnel participation requests based on its -capacity and utilization. On the other hand, each router can simply drop -messages that are beyond its capacity, exploiting the research used on the -normal internet.
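
The admission decision described above can be sketched with a per tunnel moving average. The example uses an exponential moving average, which is one choice among several since the text only says "moving average"; the class name, the smoothing factor, and the idea of passing an expected rate for the new tunnel are all assumptions.

```python
from typing import Dict


class ParticipationThrottle:
    """Accept or reject tunnel requests based on smoothed bandwidth use."""

    def __init__(self, capacity_bps: float, alpha: float = 0.2):
        self.capacity_bps = capacity_bps
        self.alpha = alpha                      # weight of the newest sample
        self.averages: Dict[int, float] = {}    # tunnel ID -> bytes/second

    def record(self, tunnel_id: int, observed_bps: float) -> None:
        """Fold a new bandwidth sample into that tunnel's moving average."""
        previous = self.averages.get(tunnel_id, observed_bps)
        self.averages[tunnel_id] = (
            (1 - self.alpha) * previous + self.alpha * observed_bps
        )

    def accept_request(self, expected_bps: float) -> bool:
        """Accept only if current use plus the new tunnel fits in capacity."""
        return sum(self.averages.values()) + expected_bps <= self.capacity_bps
```

The alternative mentioned in the text, simply dropping messages beyond capacity, needs no per tunnel accounting at all.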

- -

5) Mixing/batching

- -

What strategies should be used at the gateway and at each hop for delaying, reordering, rerouting, or padding messages? To what extent should this be done automatically, how much should be configured as a per tunnel or per hop setting, and how should the tunnel's creator (and in turn, user) control this operation? All of this is left as unknown, to be worked out for I2P 3.0.

diff --git a/router/doc/tunnel.html b/router/doc/tunnel.html deleted file mode 100644 index a15d19c12f..0000000000 --- a/router/doc/tunnel.html +++ /dev/null @@ -1,529 +0,0 @@ -Note: NOT used! see tunnel-alt.html - -$Id: tunnel.html,v 1.10 2005/01/16 01:07:07 jrandom Exp $ -
-1) Tunnel overview
-2) Tunnel operation
-2.1) Message preprocessing
-2.2) Gateway processing
-2.3) Participant processing
-2.4) Endpoint processing
-2.5) Padding
-2.6) Tunnel fragmentation
-2.7) Alternatives
-2.7.1) Don't use a checksum block
-2.7.2) Adjust tunnel processing midstream
-2.7.3) Use bidirectional tunnels
-2.7.4) Use smaller hashes
-3) Tunnel building
-3.1) Peer selection
-3.1.1) Exploratory tunnel peer selection
-3.1.2) Client tunnel peer selection
-3.2) Request delivery
-3.3) Pooling
-3.4) Alternatives
-3.4.1) Telescopic building
-3.4.2) Non-exploratory tunnels for management
-4) Tunnel throttling
-5) Mixing/batching
-
- -

1) Tunnel overview

- -

Within I2P, messages are passed in one direction through a virtual -tunnel of peers, using whatever means are available to pass the -message on to the next hop. Messages arrive at the tunnel's -gateway, get bundled up for the path, and are forwarded on to the -next hop in the tunnel, which processes and verifies the validity -of the message and sends it on to the next hop, and so on, until -it reaches the tunnel endpoint. That endpoint takes the messages -bundled up by the gateway and forwards them as instructed - either -to another router, to another tunnel on another router, or locally.

- -

Tunnels all work the same, but can be segmented into two different -groups - inbound tunnels and outbound tunnels. The inbound tunnels -have an untrusted gateway which passes messages down towards the -tunnel creator, which serves as the tunnel endpoint. For outbound -tunnels, the tunnel creator serves as the gateway, passing messages -out to the remote endpoint.

- -

The tunnel's creator selects exactly which peers will participate in the tunnel, and provides each with the necessary configuration data. Tunnels may vary in length from 0 hops (where the gateway is also the endpoint) to 8 hops (where there are 6 peers after the gateway and before the endpoint). It is the intent to make it hard for either participants or third parties to determine the length of a tunnel, or even for colluding participants to determine whether they are a part of the same tunnel at all (barring the situation where colluding peers are next to each other in the tunnel). Messages that have been corrupted are also dropped as soon as possible, reducing network load.

- -

Beyond their length, there are additional configurable parameters -for each tunnel that can be used, such as a throttle on the size or -frequency of messages delivered, how padding should be used, how -long a tunnel should be in operation, whether to inject chaff -messages, whether to use fragmentation, and what, if any, batching -strategies should be employed.

- -

In practice, a series of tunnel pools are used for different -purposes - each local client destination has its own set of inbound -tunnels and outbound tunnels, configured to meet its anonymity and -performance needs. In addition, the router itself maintains a series -of pools for participating in the network database and for managing -the tunnels themselves.

- -

I2P is an inherently packet switched network, even with these tunnels, allowing it to take advantage of multiple tunnels running in parallel, increasing resilience and balancing load. Outside of the core I2P layer, there is an optional end to end streaming library available for client applications, exposing TCP-esque operation, including message reordering, retransmission, congestion control, etc.

- -

2) Tunnel operation

- -

Tunnel operation has four distinct processes, taken on by various -peers in the tunnel. First, the tunnel gateway accumulates a number -of tunnel messages and preprocesses them into something for tunnel -delivery. Next, that gateway encrypts that preprocessed data, then -forwards it to the first hop. That peer, and subsequent tunnel -participants, unwrap a layer of the encryption, verifying the -integrity of the message, then forward it on to the next peer. -Eventually, the message arrives at the endpoint where the messages -bundled by the gateway are split out again and forwarded on as -requested.

- -

Tunnel IDs are 4 byte numbers used at each hop - participants know what -tunnel ID to listen for messages with and what tunnel ID they should be forwarded -on as to the next hop. Tunnels themselves are short lived (10 minutes at the -moment), but depending upon the tunnel's purpose, and though subsequent tunnels -may be built using the same sequence of peers, each hop's tunnel ID will change.

- -

2.1) Message preprocessing

- -

When the gateway wants to deliver data through the tunnel, it first -gathers zero or more I2NP messages (no more than 32KB worth), -selects how much padding will be used, and decides how each I2NP -message should be handled by the tunnel endpoint, encoding that -data into the raw tunnel payload:

- - -

The instructions are encoded as follows:

- - -

The I2NP message is encoded in its standard form, and the -preprocessed payload must be padded to a multiple of 16 bytes.

- -

2.2) Gateway processing

- -

After the preprocessing of messages into a padded payload, the gateway -encrypts the payload with the eight keys, building a checksum block so -that each peer can verify the integrity of the payload at any time, as -well as an end to end verification block for the tunnel endpoint to -verify the integrity of the checksum block. The specific details follow.

- -

The encryption used is such that decryption -merely requires running over the data with AES in CBC mode, calculating the -SHA256 of a certain fixed portion of the message (bytes 16 through $size-144), -and searching for the first 16 bytes of that hash in the checksum block. There is a fixed number -of hops defined (8 peers) so that we can verify the message -without either leaking the position in the tunnel or having the message -continually "shrink" as layers are peeled off. For tunnels shorter than 8 -hops, the tunnel creator will take the place of the excess hops, decrypting -with their keys (for outbound tunnels, this is done at the beginning, and for -inbound tunnels, the end).

- -

The hard part in the encryption is building that entangled checksum block, -which requires essentially finding out what the hash of the payload will look -like at each step, randomly ordering those hashes, then building a matrix of -what each of those randomly ordered hashes will look like at each step. The -gateway itself must pretend that it is one of the peers within the checksum -block so that the first hop cannot tell that the previous hop was the gateway. -To visualize this a bit:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
IV | Payload | eH[0] | eH[1] | eH[2] | eH[3] | eH[4] | eH[5] | eH[6] | eH[7] | V
peer0, key=K[0]
  recv:
  send: IV[0] | P[0] | H(P[0]) | V[0]
peer1, key=K[1]
  recv:
  send: IV[1] | P[1] | H(P[1]) | V[1]
peer2, key=K[2]
  recv:
  send: IV[2] | P[2] | H(P[2]) | V[2]
peer3, key=K[3]
  recv:
  send: IV[3] | P[3] | H(P[3]) | V[3]
peer4, key=K[4]
  recv:
  send: IV[4] | P[4] | H(P[4]) | V[4]
peer5, key=K[5]
  recv:
  send: IV[5] | P[5] | H(P[5]) | V[5]
peer6, key=K[6]
  recv:
  send: IV[6] | P[6] | H(P[6]) | V[6]
peer7, key=K[7]
  recv:
  send: IV[7] | P[7] | H(P[7]) | V[7]
- -

In the above, P[7] is the same as the original data being passed through the -tunnel (the preprocessed messages), and V[7] is the first 16 bytes of the SHA256 of eH[0-7] as seen on -peer7 after decryption. For -cells in the matrix "higher up" than the hash, their value is derived by encrypting -the cell below it with the key for the peer below it, using the end of the column -to the left of it as the IV. For cells in the matrix "lower down" than the hash, -they're equal to the cell above them, decrypted by the current peer's key, using -the end of the previous encrypted block on that row.

- -

With this randomized matrix of checksum blocks, each peer will be able to find -the hash of the payload, or if it is not there, know that the message is corrupt. -The entanglement by using CBC mode increases the difficulty in tagging the -checksum blocks themselves, but it is still possible for that tagging to go -briefly undetected if the columns after the tagged data have already been used -to check the payload at a peer. In any case, the tunnel endpoint (peer 7) knows -for certain whether any of the checksum blocks have been tagged, as that would -corrupt the verification block (V[7]).

- -

The IV[0] is a random 16 byte value, and IV[i] is the first 16 bytes of H(D(IV[i-1], K[i-1]) xor IV_WHITENER). We don't use the same IV along the path, as that would allow trivial collusion, and we use the hash of the decrypted value to propagate the IV so as to hamper key leakage. IV_WHITENER is a fixed 16 byte value.
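
Written out as a recurrence, the IV chain reads as follows. The grouping, with the XOR applied before hashing, is taken from the participant processing description (decrypt, XOR with the whitener, then hash); H is taken to be SHA256 as elsewhere in this document, and the "first 16" operator is just notation for truncation.

```latex
\mathrm{IV}_0 \leftarrow \text{random 16 bytes}, \qquad
\mathrm{IV}_i = \operatorname{first}_{16}\!\Big(
    H\big( D(\mathrm{IV}_{i-1},\, K_{i-1}) \oplus \mathrm{IV\_WHITENER} \big)
\Big), \quad i \ge 1
```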

- -

When the gateway wants to send the message, they export the right row for the -peer who is the first hop (usually the peer1.recv row) and forward that entirely.

- -

2.3) Participant processing

- -

When a participant in a tunnel receives a message, they decrypt a layer with their tunnel key using AES256 in CBC mode with the first 16 bytes as the IV. They then calculate the hash of what they see as the payload (bytes 16 through $size-144) and search for the first 16 bytes of that hash within the decrypted checksum block. If no match is found, the message is discarded. Otherwise, the IV is updated by decrypting it, XORing that value with the IV_WHITENER, and replacing it with the first 16 bytes of its hash. The resulting message is then forwarded on to the next peer for processing.
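
The hash lookup step can be sketched without any cipher by working on a message that has already had its layer decrypted. The layout assumed here is inferred from the figures in the text: a 16 byte IV, the payload, eight 16 byte checksum columns, and a 16 byte verification hash, which together account for the trailing 144 bytes. The function and constant names are hypothetical.

```python
import hashlib

SLOT = 16                      # bytes per checksum column / truncated hash
HOPS = 8                       # fixed number of checksum columns
TRAILER = SLOT * HOPS + SLOT   # 144: checksum block plus verification hash


def payload_is_valid(decrypted: bytes) -> bool:
    """Check that the truncated payload hash appears in the checksum block."""
    if len(decrypted) < SLOT + TRAILER:
        raise ValueError("message too short for IV and trailer")
    end = len(decrypted)
    payload = decrypted[SLOT:end - TRAILER]          # bytes 16 .. size-144
    wanted = hashlib.sha256(payload).digest()[:SLOT]
    checksum_block = decrypted[end - TRAILER:end - SLOT]
    columns = [checksum_block[i:i + SLOT]
               for i in range(0, len(checksum_block), SLOT)]
    return wanted in columns
```

Because the columns are randomly ordered, the participant has to search all eight rather than look at a fixed position.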

- -

To prevent replay attacks at the tunnel level, each participant keeps track of -the IVs received during the tunnel's lifetime, rejecting duplicates. The memory -usage required should be minor, as each tunnel has only a very short lifespan (10m -at the moment). A constant 100KBps through a tunnel with full 32KB messages would -give 1875 messages, requiring less than 30KB of memory. Gateways and endpoints -handle replay by tracking the message IDs and expirations on the I2NP messages -contained in the tunnel.
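
The memory estimate can be checked with a few lines of arithmetic. The calculation assumes that only the raw 16 byte IV of each message is stored and that the tunnel lives for 10 minutes.

```python
rate_kb_per_s = 100        # constant throughput through the tunnel
lifetime_s = 10 * 60       # tunnel lifetime: 10 minutes
message_kb = 32            # full sized messages
iv_bytes = 16              # one stored IV per message

messages = rate_kb_per_s * lifetime_s // message_kb
memory_bytes = messages * iv_bytes

print(messages)       # 1875
print(memory_bytes)   # 30000, i.e. just under 30KB
```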

- -

2.4) Endpoint processing

- -

When a message reaches the tunnel endpoint, the endpoint decrypts and verifies it like a normal participant. If the checksum block has a valid match, the endpoint then computes the hash of the checksum block itself (as seen after decryption) and compares that to the decrypted verification hash (the last 16 bytes). If that verification hash does not match, the endpoint takes note of the tagging attempt by one of the tunnel participants and perhaps discards the message.
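
The endpoint's extra check can be sketched in the same style, again on an already decrypted message. It assumes the checksum block is the 128 bytes immediately before the final 16 byte verification hash, which is inferred from the sizes given in this document; the function name is hypothetical.

```python
import hashlib
import hmac


def checksum_block_untagged(decrypted: bytes) -> bool:
    """Compare the truncated hash of the checksum block to the last 16 bytes."""
    if len(decrypted) < 144:
        raise ValueError("message too short to hold the 144 byte trailer")
    checksum_block = decrypted[-144:-16]
    verification = decrypted[-16:]
    expected = hashlib.sha256(checksum_block).digest()[:16]
    return hmac.compare_digest(expected, verification)
```

A mismatch here means some participant altered one of the checksum columns, even if the payload hash itself was still found.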

- -

At this point, the tunnel endpoint has the preprocessed data sent by the gateway, which it may then parse out into the included I2NP messages and forward them as requested in their delivery instructions.

- -

2.5) Padding

- -

Several tunnel padding strategies are possible, each with their own merits:

- - - -

Which to use? No padding is most efficient, random padding is what we have now, and fixed size would either be an extreme waste or force us to implement fragmentation. Padding to the closest exponential size (a la Freenet) seems promising. Perhaps we should gather some stats on the net as to what size messages are, then see what costs and benefits would arise from different strategies?
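
Padding to the closest exponential size can be sketched as rounding up to the next power of two multiple of a base size. The 1024 byte minimum is a placeholder chosen for illustration, and the function name is hypothetical.

```python
def exponential_padded_size(length: int, minimum: int = 1024) -> int:
    """Return the smallest size minimum * 2**k that can hold `length` bytes."""
    if length < 0 or minimum <= 0:
        raise ValueError("length must be >= 0 and minimum must be > 0")
    size = minimum
    while size < length:
        size *= 2
    return size
```

With this scheme the padding never exceeds the message itself once the message is larger than the minimum, so the overhead stays below 100 percent.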

- -

2.6) Tunnel fragmentation

- -

For various padding and mixing schemes, it may be useful from an anonymity perspective to fragment a single I2NP message into multiple parts, each delivered separately through different tunnel messages. The endpoint may or may not support that fragmentation (discarding or hanging on to fragments as needed), and handling fragmentation will not immediately be implemented.

- -

2.7) Alternatives

- -

2.7.1) Don't use a checksum block

- -

One alternative to the above process is to remove the checksum block -completely and replace the verification hash with a plain hash of the payload. -This would simplify processing at the tunnel gateway and save 144 bytes of -bandwidth at each hop. On the other hand, attackers within the tunnel could -trivially adjust the message size to one which is easily traceable by -colluding external observers in addition to later tunnel participants. The -corruption would also incur the waste of the entire bandwidth necessary to -pass on the message. Without the per-hop validation, it would also be possible -to consume excess network resources by building extremely long tunnels, or by -building loops into the tunnel.

- -

2.7.2) Adjust tunnel processing midstream

- -

While the simple tunnel routing algorithm should be sufficient for most cases, -there are three alternatives that can be explored:

- - -

2.7.3) Use bidirectional tunnels

- -

The current strategy of using two separate tunnels for inbound and outbound communication is not the only technique available, and it does have anonymity implications. On the positive side, by using separate tunnels it lessens the traffic data exposed for analysis to participants in a tunnel - for instance, peers in an outbound tunnel from a web browser would only see the traffic of an HTTP GET, while the peers in an inbound tunnel would see the payload delivered along the tunnel. With bidirectional tunnels, all participants would have access to the fact that e.g. 1KB was sent in one direction, then 100KB in the other. On the negative side, using unidirectional tunnels means that there are two sets of peers which need to be profiled and accounted for, and additional care must be taken to address the increased speed of predecessor attacks. The tunnel pooling and building process outlined below should minimize the worries of the predecessor attack, though if it were desired, it wouldn't be much trouble to build both the inbound and outbound tunnels along the same peers.

- -

2.7.4) Use smaller blocksize

- -

At the moment, our use of AES limits our block size to 16 bytes, which -in turn provides the minimum size for each of the checksum block columns. -If another algorithm was used with a smaller block size, or could otherwise -allow the safe building of the checksum block with smaller portions of the -hash, it might be worth exploring. The 16 bytes used now at each hop should -be more than sufficient.

- -

3) Tunnel building

- -

When building a tunnel, the creator must send a request with the necessary -configuration data to each of the hops, then wait for the potential participant -to reply stating that they either agree or do not agree. These tunnel request -messages and their replies are garlic wrapped so that only the router who knows -the key can decrypt it, and the path taken in both directions is tunnel routed -as well. There are three important dimensions to keep in mind when producing -the tunnels: what peers are used (and where), how the requests are sent (and -replies received), and how they are maintained.

- -

3.1) Peer selection

- -

Beyond the two types of tunnels - inbound and outbound - there are two styles -of peer selection used for different tunnels - exploratory and client. -Exploratory tunnels are used for both network database maintenance and tunnel -maintenance, while client tunnels are used for end to end client messages.

- -

3.1.1) Exploratory tunnel peer selection

- -

Exploratory tunnels are built out of a random selection of peers from a subset -of the network. The particular subset varies on the local router and on what their -tunnel routing needs are. In general, the exploratory tunnels are built out of -randomly selected peers who are in the peer's "not failing but active" profile -category. The secondary purpose of the tunnels, beyond merely tunnel routing, -is to find underutilized high capacity peers so that they can be promoted for -use in client tunnels.

- -

3.1.2) Client tunnel peer selection

- -

Client tunnels are built with a more stringent set of requirements - the local -router will select peers out of its "fast and high capacity" profile category so -that performance and reliability will meet the needs of the client application. -However, there are several important details beyond that basic selection that -should be adhered to, depending upon the client's anonymity needs.

- -

For some clients who are worried about adversaries mounting a predecessor -attack, the tunnel selection can keep the peers selected in a strict order - -if A, B, and C are in a tunnel, the hop after A is always B, and the hop after -B is always C. A less strict ordering is also possible, assuring that while -the hop after A may be B, B may never be before A. Other configuration options -include the ability for just the inbound tunnel gateways and outbound tunnel -endpoints to be fixed, or rotated on an MTBF rate.
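The consistent ordering described above can be pictured as sorting the selected peers by a value that never changes for the local router. The following is only a sketch under stated assumptions: the `order_peers` helper, the per-router `secret`, and the choice of SHA-256 are hypothetical and are not specified by this document.

```python
import hashlib

def order_peers(peers, secret):
    """Return peers in a stable order derived from a local secret.

    Hypothetical helper: the sort key depends only on the peer
    identifier and the router's own secret, so any two peers keep the
    same relative order in every tunnel that contains both of them.
    """
    def rank(peer_id):
        return hashlib.sha256(secret + peer_id).digest()
    return sorted(peers, key=rank)

secret = b"local-router-secret"   # assumed per-router random value
tunnel_1 = order_peers([b"A", b"B", b"C"], secret)
tunnel_2 = order_peers([b"C", b"A"], secret)
```

With this approach, if A precedes C in one tunnel, C can never precede A in another tunnel built by the same router.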

- -

3.2) Request delivery

- -

As mentioned above, once the tunnel creator knows what peers should go into -a tunnel and in what order, the creator builds a series of tunnel request -messages, each containing the necessary information for that peer. For instance, -participating tunnels will be given the 4 byte tunnel ID on which they are to -receive messages, the 4 byte tunnel ID on which they are to send out the messages, -the 32 byte hash of the next hop's identity, and the 32 byte layer key used to -remove a layer from the tunnel. Of course, outbound tunnel endpoints are not -given any "next hop" or "next tunnel ID" information. Inbound tunnel gateways -are however given the 8 layer keys in the order they should be encrypted (as -described above). To allow replies, the request contains a random session tag -and a random session key with which the peer may garlic encrypt their decision, -as well as the tunnel to which that garlic should be sent. In addition to the -above information, various client specific options may be included, such as -what throttling to place on the tunnel, what padding or batch strategies to use, -etc.

- -

After building all of the request messages, they are garlic wrapped for the -target router and sent out an exploratory tunnel. Upon receipt, that peer -determines whether they can or will participate, creating a reply message and -both garlic wrapping and tunnel routing the response with the supplied -information. Upon receipt of the reply at the tunnel creator, the tunnel is -considered valid on that hop (if accepted). Once all peers have accepted, the -tunnel is active.

- -

3.3) Pooling

- -

To allow efficient operation, the router maintains a series of tunnel pools, -each managing a group of tunnels used for a specific purpose with their own -configuration. When a tunnel is needed for that purpose, the router selects one -out of the appropriate pool at random. Overall, there are two exploratory tunnel -pools - one inbound and one outbound - each using the router's exploration -defaults. In addition, there is a pair of pools for each local destination - -one inbound and one outbound tunnel. Those pools use the configuration specified -when the local destination connected to the router, or the router's defaults if -not specified.

- -

Each pool has within its configuration a few key settings, defining how many -tunnels to keep active, how many backup tunnels to maintain in case of failure, -how frequently to test the tunnels, how long the tunnels should be, whether those -lengths should be randomized, how often replacement tunnels should be built, as -well as any of the other settings allowed when configuring individual tunnels.

- -

3.4) Alternatives

- -

3.4.1) Telescopic building

- -

One question that may arise regarding the use of the exploratory tunnels for -sending and receiving tunnel creation messages is how that impacts the tunnel's -vulnerability to predecessor attacks. While the endpoints and gateways of -those tunnels will be randomly distributed across the network (perhaps even -including the tunnel creator in that set), another alternative is to use the -tunnel pathways themselves to pass along the request and response, as is done -in Tor. This, however, may lead to leaks -during tunnel creation, allowing peers to discover how many hops there are later -on in the tunnel by monitoring the timing or packet count as the tunnel is -built. Techniques could be used to minimize this issue, such as using each of -the hops as endpoints (per 2.7.2) for a random -number of messages before continuing on to build the next hop.

- -

3.4.2) Non-exploratory tunnels for management

- -

A second alternative to the tunnel building process is to give the router -an additional set of non-exploratory inbound and outbound pools, using those for -the tunnel request and response. Assuming the router has a well integrated view -of the network, this should not be necessary, but if the router was partitioned -in some way, using non-exploratory pools for tunnel management would reduce the -leakage of information about what peers are in the router's partition.

- -

4) Tunnel throttling

- -

Even though the tunnels within I2P bear a resemblance to a circuit switched -network, everything within I2P is strictly message based - tunnels are merely -accounting tricks to help organize the delivery of messages. No assumptions are -made regarding reliability or ordering of messages, and retransmissions are left -to higher levels (e.g. I2P's client layer streaming library). This allows I2P -to take advantage of throttling techniques available to both packet switched and -circuit switched networks. For instance, each router may keep track of the -moving average of how much data each tunnel is using, combine that with all of -the averages used by other tunnels the router is participating in, and be able -to accept or reject additional tunnel participation requests based on its -capacity and utilization. On the other hand, each router can simply drop -messages that are beyond its capacity, exploiting the research used on the -normal internet.
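The moving-average admission check described above could look roughly like the sketch below. The class name `TunnelThrottle`, the smoothing factor `alpha`, and the 90% `limit` are illustrative assumptions, not values taken from this document.

```python
class TunnelThrottle:
    """Accept or reject tunnel requests based on smoothed per-tunnel usage."""

    def __init__(self, capacity_bps, alpha=0.2):
        self.capacity_bps = capacity_bps  # assumed router bandwidth budget
        self.alpha = alpha                # assumed smoothing factor
        self.avg_bps = {}                 # tunnel id -> moving average

    def record(self, tunnel_id, observed_bps):
        # Exponential moving average; the first sample seeds the average.
        prev = self.avg_bps.get(tunnel_id, observed_bps)
        self.avg_bps[tunnel_id] = (1 - self.alpha) * prev + self.alpha * observed_bps

    def accept_request(self, expected_bps, limit=0.9):
        # Combine the averages of all participating tunnels with the
        # expected load of the new one and compare against capacity.
        projected = sum(self.avg_bps.values()) + expected_bps
        return projected <= limit * self.capacity_bps
```

A router using this sketch would call `record` as traffic is measured and `accept_request` when a new participation request arrives.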

- -

5) Mixing/batching

- -

What strategies should be used at the gateway and at each hop for delaying, -reordering, rerouting, or padding messages? To what extent should this be done -automatically, how much should be configured as a per tunnel or per hop setting, -and how should the tunnel's creator (and in turn, user) control this operation? -All of this is left as unknown, to be worked out for -I2P 3.0

diff --git a/router/doc/udp.html b/router/doc/udp.html deleted file mode 100644 index 4a855ece99..0000000000 --- a/router/doc/udp.html +++ /dev/null @@ -1,759 +0,0 @@ -$Id: udp.html,v 1.19 2006/02/15 00:33:32 jrandom Exp $ - -

Secure Semireliable UDP (SSU)

-DRAFT - -

-The goal of this protocol is to provide secure, authenticated, -semireliable, and unordered message delivery, exposing only a minimal -amount of data easily discernible to third parties. It should -support high degree communication as well as TCP-friendly congestion -control, and may include PMTU detection. It should be capable of -efficiently moving bulk data at rates sufficient for home users. -In addition, it should support techniques for addressing network -obstacles, like most NATs or firewalls.

- -

Addressing and introduction

- -

To contact an SSU peer, one of two sets of information is necessary: -a direct address, for when the peer is publicly reachable, or an -indirect address, for using a third party to introduce the peer. -There is no restriction on the number of addresses a peer may have.

- -
-    Direct: ssu://host:port/introKey[?opts=[A-Z]*]
-  Indirect: ssu://tag@relayhost:port/relayIntroKey/targetIntroKey[?opts=[A-Z]*]
-
- -

These introduction keys are delivered through an external channel -and must be used when establishing a session key. For the indirect -address, the peer must first contact the relayhost and ask them for -an introduction to the peer known at that relayhost under the given -tag. If possible, the relayhost sends a message to the addressed -peer telling them to contact the requesting peer, and also gives -the requesting peer the IP and port on which the addressed peer is -located. In addition, the peer establishing the connection must -already know the public keys of the peer they are connecting to (but -not necessarily those of any intermediary relay peer).

- -

Each of the addresses may also expose a series of options - special -capabilities of that particular peer. For a list of available -capabilities, see below.
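As an illustration of the two address forms above, the following sketch splits an address into its parts using Python's standard `urllib.parse`. The function name `parse_ssu_address` and the returned dictionary layout are assumptions made for this example, and it assumes the encoded keys contain no `/` or `?` characters.

```python
from urllib.parse import urlsplit, parse_qs

def parse_ssu_address(address):
    """Split a direct or indirect ssu:// address into its components."""
    parts = urlsplit(address)
    if parts.scheme != "ssu":
        raise ValueError("not an ssu:// address")
    keys = [k for k in parts.path.split("/") if k]
    opts = parse_qs(parts.query).get("opts", [""])[0]
    result = {"host": parts.hostname, "port": parts.port, "opts": set(opts)}
    if parts.username is None:
        # Direct: ssu://host:port/introKey
        if len(keys) != 1:
            raise ValueError("direct address needs exactly one intro key")
        result.update(kind="direct", intro_key=keys[0])
    else:
        # Indirect: ssu://tag@relayhost:port/relayIntroKey/targetIntroKey
        if len(keys) != 2:
            raise ValueError("indirect address needs two intro keys")
        result.update(kind="indirect", tag=parts.username,
                      relay_intro_key=keys[0], target_intro_key=keys[1])
    return result
```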

- -

Header

- -

All UDP datagrams begin with a MAC and an IV, followed by a variable -size payload encrypted with the appropriate key. The MAC used is -HMAC-MD5, truncated to 16 bytes, while the key is a full AES256 -key. The specific construct of the MAC is the first 16 bytes from:

-
-  HMAC-MD5(payload || IV || (payloadLength ^ protocolVersion), macKey)
-
- -

The payload itself is AES256/CBC encrypted with the IV and the -sessionKey, with replay prevention addressed within its body, -explained below. The payloadLength in the MAC is a 2 byte unsigned -integer in 2s complement.

- -

The protocolVersion is a 2 byte unsigned integer in 2s complement, -and currently set to 0. Peers using a different protocol version will -not be able to communicate with this peer, though earlier versions not -using this flag are.
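The MAC construct can be sketched as follows. This is a minimal illustration that assumes the standard HMAC-MD5 from Python's `hmac` module and a big endian encoding for the 2 byte `payloadLength ^ protocolVersion` value; neither detail is confirmed by this document, and the helper names are hypothetical.

```python
import hashlib
import hmac
import struct

PROTOCOL_VERSION = 0  # value stated in the text above

def ssu_mac(payload, iv, mac_key, version=PROTOCOL_VERSION):
    """First 16 bytes of HMAC-MD5(payload || IV || (len ^ version), macKey)."""
    # Assumption: the 2 byte value is serialized big endian.
    tail = struct.pack(">H", (len(payload) ^ version) & 0xFFFF)
    return hmac.new(mac_key, payload + iv + tail, hashlib.md5).digest()[:16]

def verify_mac(mac, payload, iv, mac_key):
    """Constant-time comparison of a received MAC against the expected one."""
    return hmac.compare_digest(mac, ssu_mac(payload, iv, mac_key))
```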

- -

Payload

- -

Within the AES encrypted payload, there is a minimal common structure -to the various messages - a one byte flag and a four byte sending -timestamp (*seconds* since the unix epoch). The flag byte contains -the following bitfields:

-
-  bits 0-3: payload type
-     bit 4: rekey?
-     bit 5: extended options included
-  bits 6-7: reserved
-
- -

If the rekey flag is set, 64 bytes of keying material follow the -timestamp. If the extended options flag is set, a one byte option -size value is appended, followed by that many extended option -bytes, which are currently uninterpreted.
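The common payload structure can be read with a few lines of code. The sketch below assumes that bit 0 is the most significant bit of the flag byte (which matches the ACK bitfield example later in this document, where bit 0 is called the high bit) and that the timestamp is big endian; the function name is hypothetical.

```python
import struct

def parse_payload_header(data):
    """Parse flag, timestamp, and optional rekey/extended option fields."""
    flag = data[0]
    (timestamp,) = struct.unpack(">I", data[1:5])
    header = {
        "type": flag >> 4,              # bits 0-3: payload type
        "rekey": bool(flag & 0x08),     # bit 4: rekey?
        "extended": bool(flag & 0x04),  # bit 5: extended options included
        "timestamp": timestamp,         # seconds since the unix epoch
    }
    offset = 5
    if header["rekey"]:
        header["keying_material"] = data[offset:offset + 64]
        offset += 64
    if header["extended"]:
        size = data[offset]
        header["options"] = data[offset + 1:offset + 1 + size]
        offset += 1 + size
    # Whatever remains is the message-specific body.
    return header, data[offset:]
```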

- -

When rekeying, the first 32 bytes of the keying material is fed -into a SHA256 to produce the new MAC key, and the next 32 bytes are -fed into a SHA256 to produce the new session key, though the keys are -not immediately used. The other side should also reply with the -rekey flag set and that same keying material. Once both sides have -sent and received those values, the new keys should be used and the -previous keys discarded. It may be useful to keep the old keys -around briefly, to address packet loss and reordering.
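The key derivation step of rekeying is small enough to show directly. The function name below is hypothetical; the split of the keying material and the use of SHA256 follow the paragraph above.

```python
import hashlib

def derive_rekeyed_keys(keying_material):
    """Derive the new MAC key and session key from 64 bytes of material."""
    if len(keying_material) != 64:
        raise ValueError("expected 64 bytes of keying material")
    mac_key = hashlib.sha256(keying_material[:32]).digest()
    session_key = hashlib.sha256(keying_material[32:]).digest()
    return mac_key, session_key
```

Both returned keys are 32 bytes. Per the text, they would only be put into use once both sides have sent and received the keying material.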

- -
- Header: 37+ bytes
- +----+----+----+----+----+----+----+----+
- |                  MAC                  |
- |                                       |
- +----+----+----+----+----+----+----+----+
- |                   IV                  |
- |                                       |
- +----+----+----+----+----+----+----+----+
- |flag|        time       | (optionally  |
- +----+----+----+----+----+              |
- | this may have 64 byte keying material |
- | and/or a one+N byte extended options) |
- +----+----+----+----+----+----+----+----+
-
- -

Messages

- -

SessionRequest (type 0)

- - - - - - - -
Peer:Alice to Bob
Data:
    -
  • 256 byte X, to begin the DH agreement
  • -
  • 1 byte IP address size
  • -
  • that many byte representation of Bob's IP address
  • -
  • N bytes, currently uninterpreted (later, for challenges)
  • -
Key used:introKey
- -
- +----+----+----+----+----+----+----+----+
- |         X, as calculated from DH      |
- |                                       |
-                 .   .   .               
- |                                       |
- +----+----+----+----+----+----+----+----+
- |size| that many byte IP address (4-16) |
- +----+----+----+----+----+----+----+----+
- |           arbitrary amount            |
- |        of uninterpreted data          |
-                 .   .   .               
- |                                       |
- +----+----+----+----+----+----+----+----+
-
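To make the SessionRequest layout concrete, the sketch below assembles the message body (before encryption with the introKey). It assumes X is serialized as a 256 byte big endian unsigned integer, which this document does not state; the function name is hypothetical.

```python
import ipaddress

def build_session_request(dh_x, bob_ip, extra=b""):
    """Serialize X, the IP size byte, Bob's IP, and any trailing bytes."""
    x = dh_x.to_bytes(256, "big")                 # assumed encoding of X
    ip = ipaddress.ip_address(bob_ip).packed      # 4 bytes (IPv4) or 16 (IPv6)
    return x + bytes([len(ip)]) + ip + extra
```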
- -

SessionCreated (type 1)

- - - - - - - -
Peer:Bob to Alice
Data:
    -
  • 256 byte Y, to complete the DH agreement
  • -
  • 1 byte IP address size
  • -
  • that many byte representation of Alice's IP address
  • -
  • 2 byte port number (unsigned, big endian 2s complement)
  • -
  • 4 byte relay tag which Alice can publish (else 0x0)
  • -
  • 4 byte timestamp (seconds from the epoch) for use in the DSA - signature
  • -
  • 40 byte DSA signature of the critical exchanged data - (X + Y + Alice's IP + Alice's port + Bob's IP + Bob's port + Alice's - new relay tag + Bob's signed on time), encrypted with another - layer of encryption using the negotiated sessionKey. The IV - is reused here.
  • -
  • 8 bytes padding, encrypted with an additional layer of encryption - using the negotiated session key as part of the DSA block
  • -
  • N bytes, currently uninterpreted (later, for challenges)
  • -
Key used:introKey, with an additional layer of encryption over the 40 byte - signature and the following 8 bytes padding.
- -
- +----+----+----+----+----+----+----+----+
- |         Y, as calculated from DH      |
- |                                       |
-                 .   .   .               
- |                                       |
- +----+----+----+----+----+----+----+----+
- |size| that many byte IP address (4-16) |
- +----+----+----+----+----+----+----+----+
- | Port (A)| public relay tag  |  signed
- +----+----+----+----+----+----+----+----+
-   on time |                             |
- +----+----+                             |
- |              DSA signature            |
- |                                       |
- |                                       |
- |                                       |
- |         +----+----+----+----+----+----+
- |         |     (8 bytes of padding) 
- +----+----+----+----+----+----+----+----+
-           |                             |
- +----+----+                             |
- |           arbitrary amount            |
- |        of uninterpreted data          |
-                 .   .   .               
- |                                       |
- +----+----+----+----+----+----+----+----+
-
- -

SessionConfirmed (type 2)

- - - - - - - -
Peer:Alice to Bob
Data:
    -
  • 1 byte identity fragment info:
    -bits 0-3: current identity fragment #
    -bits 4-7: total identity fragments
  • -
  • 2 byte size of the current identity fragment
  • -
  • that many byte fragment of Alice's identity.
  • -
  • on the last identity fragment, the signed on time is - included after the identity fragment, and the last 40 - bytes contain the DSA signature of the critical exchanged - data (X + Y + Alice's IP + Alice's port + Bob's IP + Bob's port - + Alice's new relay tag + Alice's signed on time)
  • -
Key used:sessionKey
- -
- Fragment 1 through N-1
- +----+----+----+----+----+----+----+----+
- |info| cursize |                        |
- +----+----+----+                        |
- |      fragment of Alice's full         |
- |            identity keys              |
-                 .   .   .               
- |                                       |
- +----+----+----+----+----+----+----+----+
- 
- Fragment N:
- +----+----+----+----+----+----+----+----+
- |info| cursize |                        |
- +----+----+----+                        |
- |      fragment of Alice's full         |
- |            identity keys              |
-                 .   .   .               
- |                                       |
- +----+----+----+----+----+----+----+----+
- |  signed on time   |                   |
- +----+----+----+----+                   |
- |  arbitrary amount of uninterpreted    |
- |        data, up from the end of the   |
- |  identity key to 40 bytes prior to    |
- |       end of the current packet       |
- +----+----+----+----+----+----+----+----+
- | DSA signature                         |
- |                                       |
- |                                       |
- |                                       |
- |                                       |
- +----+----+----+----+----+----+----+----+
-
- -

RelayRequest (type 3)

- - - - - - - -
Peer:Alice to Bob
Data:
    -
  • 4 byte relay tag
  • -
  • 1 byte IP address size
  • -
  • that many byte representation of Alice's IP address
  • -
  • 2 byte port number (of Alice)
  • -
  • 1 byte challenge size
  • -
  • that many bytes to be relayed to Charlie in the intro
  • -
  • Alice's intro key (so Bob can reply with Charlie's info)
  • -
  • 4 byte nonce of alice's relay request
  • -
  • N bytes, currently uninterpreted
  • -
Key used:introKey (or sessionKey, if Alice/Bob is established)
- -
- +----+----+----+----+----+----+----+----+
- |      relay tag    |size| that many    |
- +----+----+----+----+----+         +----|
- | bytes for Alice's IP address     |port
- +----+----+----+----+----+----+----+----+
-  (A) |size| that many challenge bytes   |
- +----+----+                             |
- | to be delivered to Charlie            |
- +----+----+----+----+----+----+----+----+
- | Alice's intro key                     |
- |                                       |
- |                                       |
- |                                       |
- +----+----+----+----+----+----+----+----+
- |       nonce       |                   |
- +----+----+----+----+                   |
- | arbitrary amount of uninterpreted data|
- +----+----+----+----+----+----+----+----+
-
- -

RelayResponse (type 4)

- - - - - - - -
Peer:Bob to Alice
Data:
    -
  • 1 byte IP address size
  • -
  • that many byte representation of Charlie's IP address
  • -
  • 2 byte port number
  • -
  • 1 byte IP address size
  • -
  • that many byte representation of Alice's IP address
  • -
  • 2 byte port number
  • -
  • 4 byte nonce sent by Alice
  • -
  • N bytes, currently uninterpreted
  • -
Key used:introKey (or sessionKey, if Alice/Bob is established)
- -
- +----+----+----+----+----+----+----+----+
- |size| that many bytes making up        |
- +----+                        +----+----+
- | Charlie's IP address        | Port (C)|
- +----+----+----+----+----+----+----+----+
- |size| that many bytes making up        |
- +----+                        +----+----+
- | Alice's IP address          | Port (A)|
- +----+----+----+----+----+----+----+----+
- |       nonce       |                   |
- +----+----+----+----+                   |
- | arbitrary amount of uninterpreted data|
- +----+----+----+----+----+----+----+----+
-
- -

RelayIntro (type 5)

- - - - - - - -
Peer:Bob to Charlie
Data:
    -
  • 1 byte IP address size
  • -
  • that many byte representation of Alice's IP address
  • -
  • 2 byte port number (of Alice)
  • -
  • 1 byte challenge size
  • -
  • that many bytes relayed from Alice
  • -
  • N bytes, currently uninterpreted
  • -
Key used:sessionKey
- -
- +----+----+----+----+----+----+----+----+
- |size| that many bytes making up        |
- +----+                        +----+----+
- | Alice's IP address          | Port (A)|
- +----+----+----+----+----+----+----+----+
- |size| that many bytes of challenge     |
- +----+                                  |
- | data relayed from Alice               |
- +----+----+----+----+----+----+----+----+
- | arbitrary amount of uninterpreted data|
- +----+----+----+----+----+----+----+----+
-
- -

Data (type 6)

- - - - - - - -
Peer:Any
Data:
    -
  • 1 byte flags:
    -   bit 0: explicit ACKs included
    -   bit 1: ACK bitfields included
    -   bit 2: reserved
    -   bit 3: explicit congestion notification
    -   bit 4: request previous ACKs
    -   bit 5: want reply
    -   bit 6: extended data included
    -   bit 7: reserved
  • -
  • if explicit ACKs are included:
      -
    • a 1 byte number of ACKs
    • -
    • that many 4 byte MessageIds being fully ACKed
    • -
  • -
  • if ACK bitfields are included:
      -
    • a 1 byte number of ACK bitfields
    • -
    • that many 4 byte MessageIds + a 1 or more byte ACK bitfield. - The bitfield uses the 7 low bits of each byte, with the high - bit specifying whether an additional bitfield byte follows it - (1 = true, 0 = the current bitfield byte is the last). These - sequence of 7 bit arrays represent whether a fragment has been - received - if a bit is 1, the fragment has been received. To - clarify, assuming fragments 0, 2, 5, and 9 have been received, - the bitfield bytes would be as follows:
      -byte 0
      -   bit 0: 1 (further bitfield bytes follow)
      -   bit 1: 1 (fragment 0 received)
      -   bit 2: 0 (fragment 1 not received)
      -   bit 3: 1 (fragment 2 received)
      -   bit 4: 0 (fragment 3 not received)
      -   bit 5: 0 (fragment 4 not received)
      -   bit 6: 1 (fragment 5 received)
      -   bit 7: 0 (fragment 6 not received)
      -byte 1
      -   bit 0: 0 (no further bitfield bytes)
      -   bit 1: 0 (fragment 7 not received)
      -   bit 2: 0 (fragment 8 not received)
      -   bit 3: 1 (fragment 9 received)
      -   bit 4: 0 (fragment 10 not received)
      -   bit 5: 0 (fragment 11 not received)
      -   bit 6: 0 (fragment 12 not received)
      -   bit 7: 0 (fragment 13 not received)
    • -
  • -
  • If extended data included:
      -
    • 1 byte data size
    • -
    • that many bytes of extended data (currently uninterpreted)
    • -
    • 1 byte number of fragments
    • -
    • that many message fragments:
        -
      • 4 byte messageId
      • -
      • 3 byte fragment info:
        -  bits 0-6: fragment #
        -     bit 7: isLast (1 = true)
        -  bits 8-9: unused
        -bits 10-23: fragment size
      • -
      • that many bytes
      -
    • N bytes padding, uninterpreted
    • -
Key used:sessionKey
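The ACK bitfield example above (fragments 0, 2, 5, and 9 received) can be reproduced with the sketch below. It follows the listing literally and assumes bit 0 is the most significant bit of each byte, so the continuation flag is 0x80 and fragment 0 maps to 0x40. Actual implementations may order the fragment bits differently, and the function names are hypothetical.

```python
def encode_ack_bitfield(received, total_fragments):
    """Encode receipt flags, 7 fragments per byte plus a continuation bit."""
    n_bytes = max(1, -(-total_fragments // 7))  # ceiling division
    out = bytearray()
    for i in range(n_bytes):
        byte = 0x80 if i < n_bytes - 1 else 0x00  # more bytes follow?
        for j in range(7):
            if i * 7 + j in received:
                byte |= 0x40 >> j
        out.append(byte)
    return bytes(out)

def decode_ack_bitfield(data):
    """Return the set of fragment numbers marked as received."""
    received = set()
    for i, byte in enumerate(data):
        for j in range(7):
            if byte & (0x40 >> j):
                received.add(i * 7 + j)
        if not byte & 0x80:  # last bitfield byte
            break
    return received
```

Under these assumptions, the example from the list encodes to the two bytes 0xD2 0x10.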
- -
- +----+----+----+----+----+----+----+----+
- |flag| (additional headers, determined  |
- +----+                                  |
- | by the flags, such as ACKs or         |
- | bitfields                             |
- +----+----+----+----+----+----+----+----+
- |#frg|     messageId     |   frag info  |
- +----+----+----+----+----+----+----+----+
- | that many bytes of fragment data      |
-                  .  .  .                                       
- |                                       |
- +----+----+----+----+----+----+----+----+
- |     messageId     |   frag info  |    |
- +----+----+----+----+----+----+----+    |
- | that many bytes of fragment data      |
-                  .  .  .                                       
- |                                       |
- +----+----+----+----+----+----+----+----+
- |     messageId     |   frag info  |    |
- +----+----+----+----+----+----+----+    |
- | that many bytes of fragment data      |
-                  .  .  .                                       
- |                                       |
- +----+----+----+----+----+----+----+----+
- | arbitrary amount of uninterpreted data|
- +----+----+----+----+----+----+----+----+
-
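The 3 byte fragment info field from the Data message can be packed and unpacked as sketched below, again assuming bit 0 is the most significant bit and that the field is big endian. The function names are hypothetical.

```python
def pack_fragment_info(fragment_num, is_last, size):
    """Pack fragment # (7 bits), isLast (1 bit), unused (2), size (14 bits)."""
    if not 0 <= fragment_num < 128:
        raise ValueError("fragment number must fit in 7 bits")
    if not 0 <= size < (1 << 14):
        raise ValueError("fragment size must fit in 14 bits")
    value = (fragment_num << 17) | (int(is_last) << 16) | size
    return value.to_bytes(3, "big")

def unpack_fragment_info(raw):
    """Return (fragment_num, is_last, size) from the 3 byte field."""
    value = int.from_bytes(raw, "big")
    return value >> 17, bool((value >> 16) & 1), value & 0x3FFF
```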
- -

PeerTest (type 7)

- - - - - - - -
Peer:Any
Data:
    -
  • 4 byte nonce
  • -
  • 1 byte IP address size
  • -
  • that many byte representation of Alice's IP address
  • -
  • 2 byte port number
  • -
  • Alice's introduction key
  • -
  • N bytes, currently uninterpreted
  • -
Key used:introKey (or sessionKey if the connection has already been established)
- -
- +----+----+----+----+----+----+----+----+
- |    test nonce     |size| that many    |
- +----+----+----+----+----+              |
- |bytes making up Alice's IP address     |
- +----+----+----+----+----+----+----+----+
- | Port (A)| Alice or Charlie's          |
- +----+----+                             |
- | introduction key (Alice's is sent to  |
- | Bob and Charlie, while Charlie's is   |
- | sent to Alice)                        |
- |         +----+----+----+----+----+----+
- |         | arbitrary amount of         |
- +----+----+                             |
- | uninterpreted data                    |
- +----+----+----+----+----+----+----+----+
-
- -

Congestion control

- -

SSU's need for only semireliable delivery, TCP-friendly operation, -and the capacity for high throughput allows a great deal of latitude in -congestion control. The congestion control algorithm outlined below is -meant to be both efficient in bandwidth as well as simple to implement.

- -

Packets are scheduled according to the router's policy, taking care -not to exceed the router's outbound capacity or to exceed the measured -capacity of the remote peer. The measured capacity should operate along the -lines of TCP's slow start and congestion avoidance, with additive increases -to the sending capacity and multiplicative decreases in face of congestion. -Veering away from TCP, however, routers may give up on some messages after -a given period or number of retransmissions while continuing to transmit -other messages.
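The slow start and additive increase / multiplicative decrease behavior described above might be tracked per peer roughly as follows. All constants, the `mss` parameter, and the class name are illustrative assumptions rather than values from this document.

```python
class SendWindow:
    """Per-peer sending capacity with slow start and AIMD."""

    def __init__(self, initial=8192, ssthresh=65536, mss=1024,
                 minimum=1024, maximum=262144):
        self.window = initial      # bytes allowed in flight
        self.ssthresh = ssthresh   # slow start threshold
        self.mss = mss             # assumed typical packet size
        self.minimum = minimum
        self.maximum = maximum

    def on_ack(self, acked_bytes):
        if self.window < self.ssthresh:
            # Slow start: grow by the amount acknowledged.
            self.window += acked_bytes
        else:
            # Congestion avoidance: additive increase.
            self.window += max(1, self.mss * acked_bytes // self.window)
        self.window = min(self.window, self.maximum)

    def on_congestion(self):
        # Multiplicative decrease: halve the window.
        self.ssthresh = max(self.minimum, self.window // 2)
        self.window = self.ssthresh
```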

- -

The congestion detection techniques vary from TCP as well, since each -message has its own unique and nonsequential identifier, and each message -has a limited size - at most, 32KB. To efficiently transmit this feedback -to the sender, the receiver periodically includes a list of fully ACKed -message identifiers and may also include bitfields for partially received -messages, where each bit represents the reception of a fragment. If -duplicate fragments arrive, the message should be ACKed again, or if the -message has still not been fully received, the bitfield should be -retransmitted with any new updates.

- -

The simplest possible implementation does not need to pad the packets to -any particular size, but instead just places a single message fragment into -a packet and sends it off (careful not to exceed the MTU). A more efficient -strategy would be to bundle multiple message fragments into the same packet, -so long as it doesn't exceed the MTU, but this is not necessary. Eventually, -a set of fixed packet sizes may be appropriate to further hide the data -fragmentation to external adversaries, but the tunnel, garlic, and end to -end padding should be sufficient for most needs until then.

- -

Keys

- -

All encryption used is AES256/CBC with 32 byte keys and 16 byte IVs. -The MAC and session keys are negotiated as part of the DH exchange, used -for the HMAC and encryption, respectively. Prior to the DH exchange, -the publicly knowable introKey is used for the MAC and encryption.

- -

When using the introKey, both the initial message and any subsequent -reply use the introKey of the responder (Bob) - the responder does -not need to know the introKey of the requestor (Alice). The DSA -signing key used by Bob should already be known to Alice when she -contacts him, though Alice's DSA key may not already be known by -Bob.

- -

Upon receiving a message, the receiver checks the from IP address -with any established sessions - if there is one or more matches, -those session's MAC keys are tested sequentially in the HMAC. If none -of those verify or if there are no matching IP addresses, the -receiver tries their introKey in the MAC. If that does not verify, -the packet is dropped. If it does verify, it is interpreted -according to the message type, though if the receiver is overloaded, -it may be dropped anyway.

- -

If Alice and Bob have an established session, but Alice loses the -keys for some reason and she wants to contact Bob, she may at any -time simply establish a new session through the SessionRequest and -related messages. If Bob has lost the key but Alice does not know -that, she will first attempt to prod him to reply, by sending a -DataMessage with the wantReply flag set, and if Bob continually -fails to reply, she will assume the key is lost and reestablish a -new one.

- -

For the DH key agreement, -RFC3526 2048bit -MODP group (#14) is used:

-
-  p = 2^2048 - 2^1984 - 1 + 2^64 * { [2^1918 pi] + 124476 }
-  g = 2
-
- -

The DSA p, q, and g are shared according to the scope of the -identity which created them.

- -

Replay prevention

- -

Replay prevention at the SSU layer occurs by rejecting packets -with exceedingly old timestamps or those which reuse an IV. To -detect duplicate IVs, a sequence of Bloom filters is employed, which -"decay" periodically so that only recently added IVs are detected.
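One possible reading of the decaying Bloom filters is a pair of filters that are rotated periodically, as in the sketch below. The filter size, number of hash functions, `max_age` of 60 seconds, and the class names are all assumptions chosen for illustration.

```python
import hashlib

class BloomFilter:
    def __init__(self, bits=1 << 16, hashes=4):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item).digest()
        for i in range(self.hashes):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.bits

    def add(self, item):
        for p in self._positions(item):
            self.array[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.array[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

class DecayingIVFilter:
    """Rejects old timestamps and IVs seen in the last two periods."""

    def __init__(self, max_age=60):
        self.max_age = max_age          # assumed limit, in seconds
        self.current = BloomFilter()
        self.previous = BloomFilter()

    def decay(self):
        # Called periodically: the oldest filter is forgotten.
        self.previous, self.current = self.current, BloomFilter()

    def accept(self, iv, timestamp, now):
        if now - timestamp > self.max_age:
            return False                # exceedingly old timestamp
        if iv in self.current or iv in self.previous:
            return False                # IV reuse (or a false positive)
        self.current.add(iv)
        return True
```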

- -

The messageIds used in DataMessages are defined at layers above -the SSU transport and are passed through transparently. These IDs -are not in any particular order - in fact, they are likely to be -entirely random. The SSU layer makes no attempt at messageId -replay prevention - higher layers should take that into account.

- -

Introduction

- -

Indirect session establishment by means of a third party introduction -is necessary for efficient NAT traversal. Charlie, a router behind a -NAT or firewall which does not allow unsolicited inbound UDP packets, -first contacts a few peers, choosing some to serve as introducers. Each -of these peers (Bob, Bill, Betty, etc) provides Charlie with an introduction -tag - a 4 byte random number - which he then makes available to the public -as a method of contacting him. Alice, a router who has Charlie's published -contact methods, first sends a RelayRequest packet to one or more of the -introducers, asking each to introduce her to Charlie (offering the -introduction tag to identify Charlie). Bob then forwards a RelayIntro -packet to Charlie including Alice's public IP and port number, then sends -Alice back a RelayResponse packet containing Charlie's public IP and port -number. When Charlie receives the RelayIntro packet, he sends off a small -random packet to Alice's IP and port (poking a hole in his NAT/firewall), -and when Alice receives Bob's RelayResponse packet, she begins a new -full direct session establishment with the specified IP and port.

- - - -

Peer testing

- -

The automation of collaborative reachability testing for peers is -enabled by a sequence of PeerTest messages. With its proper -execution, a peer will be able to determine their own reachability -and may update its behavior accordingly. The testing process is -quite simple:

- -
-        Alice                  Bob                  Charlie
-    PeerTest ------------------->
-                             PeerTest-------------------->
-                                <-------------------PeerTest
-         <-------------------PeerTest
-         <------------------------------------------PeerTest
-    PeerTest------------------------------------------>
-         <------------------------------------------PeerTest
-
- -

Each of the PeerTest messages carries a nonce identifying the -test series itself, as initialized by Alice. If Alice doesn't -get a particular message that she expects, she will retransmit -accordingly, and based upon the data received or the messages -missing, she will know her reachability. The various end states -that may be reached are as follows:

- - - -

Alice should choose Bob arbitrarily from known peers who seem -to be capable of participating in peer tests. Bob in turn should -choose Charlie arbitrarily from peers that he knows who seem to be -capable of participating in peer tests and who are on a different -IP from both Bob and Alice. If the first error condition occurs -(Alice doesn't get PeerTest messages from Bob), Alice may decide -to designate a new peer as Bob and try again with a different nonce.

- -

Alice's introduction key is included in all of the PeerTest -messages so that she doesn't need to already have an established -session with Bob and so that Charlie can contact her without knowing -any additional information. Alice may go on to establish a session -with either Bob or Charlie, but it is not required.

- -

Message sequences

- -

Connection establishment (direct)

- -
-        Alice                         Bob
-    SessionRequest--------------------->
-          <---------------------SessionCreated
-    SessionConfirmed------------------->
-    SessionConfirmed------------------->
-    SessionConfirmed------------------->
-    SessionConfirmed------------------->
-          <--------------------------Data
-
- -

Connection establishment (indirect)

- -
-        Alice                         Bob                  Charlie
-    RelayRequest ---------------------->
-         <--------------RelayResponse    RelayIntro----------->
-         <--------------------------------------------Data (ignored)
-    SessionRequest-------------------------------------------->
-         <--------------------------------------------SessionCreated
-    SessionConfirmed------------------------------------------>
-    SessionConfirmed------------------------------------------>
-    SessionConfirmed------------------------------------------>
-    SessionConfirmed------------------------------------------>
-         <---------------------------------------------------Data
-
- -

Sample datagrams

- -Minimal data message (no fragments, no ACKs, no NACKs, etc)
-(Size: 39 bytes) - -
- +----+----+----+----+----+----+----+----+
- |                  MAC                  |
- |                                       |
- +----+----+----+----+----+----+----+----+
- |                   IV                  |
- |                                       |
- +----+----+----+----+----+----+----+----+
- |flag|        time       |flag|#frg|    |
- +----+----+----+----+----+----+----+    |
- |  padding to fit a full AES256 block   |
- +----+----+----+----+----+----+----+----+
-
- -Minimal data message with payload
-(Size: 46+fragmentSize bytes) - -
- +----+----+----+----+----+----+----+----+
- |                  MAC                  |
- |                                       |
- +----+----+----+----+----+----+----+----+
- |                   IV                  |
- |                                       |
- +----+----+----+----+----+----+----+----+
- |flag|        time       |flag|#frg| 
- +----+----+----+----+----+----+----+----+
-   messageId    |   frag info  |         |
- +----+----+----+----+----+----+         |
- | that many bytes of fragment data      |
-                  .  .  .                                       
- |                                       |
- +----+----+----+----+----+----+----+----+
-
- -
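The two stated sizes can be checked by adding up the field widths shown in the diagrams, reading each cell as one byte. The sketch below does that arithmetic in Python; the dictionary and function names are illustrative only, and the totals deliberately leave out the AES padding, as the stated sizes appear to do.

```python
# Field widths in bytes, read off the diagrams above (one cell = one byte).
HEADER_FIELDS = {"MAC": 16, "IV": 16, "flag": 1, "time": 4}
DATA_FIELDS = {"data flag": 1, "fragment count": 1}
FRAGMENT_FIELDS = {"messageId": 4, "frag info": 3}


def minimal_data_size():
    """Size of a data message with no fragments and no ACKs/NACKs, before padding."""
    return sum(HEADER_FIELDS.values()) + sum(DATA_FIELDS.values())


def data_size_with_payload(fragment_size):
    """Size of a data message carrying one fragment of fragment_size bytes, before padding."""
    if fragment_size < 0:
        raise ValueError("fragment_size must be non-negative")
    return minimal_data_size() + sum(FRAGMENT_FIELDS.values()) + fragment_size


print(minimal_data_size())          # 39
print(data_size_with_payload(0))    # 46
print(data_size_with_payload(100))  # 146
```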

Peer capabilities

- -
-
B
-
If the peer address contains the 'B' capability, the peer is willing and able to participate in peer tests as a 'Bob' or 'Charlie'.
-
C
-
If the peer address contains the 'C' capability, the peer is willing and able to serve as an introducer, that is, to act as a Bob for an otherwise unreachable Alice.
-
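A capability check along these lines might look as follows in Python. The sketch assumes the capabilities reach the caller as a plain string of letters such as "BC"; that representation and the function names are illustrative assumptions, not something this section defines.

```python
def can_peer_test(capabilities):
    """True if the peer advertises 'B': willing to act as Bob or Charlie in peer tests."""
    return "B" in capabilities


def can_introduce(capabilities):
    """True if the peer advertises 'C': willing to serve as an introducer."""
    return "C" in capabilities


# Example: a peer advertising both capabilities, and one advertising neither.
print(can_peer_test("BC"), can_introduce("BC"))  # True True
print(can_peer_test(""), can_introduce(""))      # False False
```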
diff --git a/router/doc/udp.png b/router/doc/udp.png deleted file mode 100644 index b4fd6241a1..0000000000 Binary files a/router/doc/udp.png and /dev/null differ