diff --git a/router/doc/tunnel-alt.html b/router/doc/tunnel-alt.html new file mode 100644 index 0000000000000000000000000000000000000000..075709e45f588859e98478055c320cbd616680e6 --- /dev/null +++ b/router/doc/tunnel-alt.html @@ -0,0 +1,428 @@ +<code>$Id: tunnel.html,v 1.10 2005/01/16 01:07:07 jrandom Exp $</code> +<pre> +1) <a href="#tunnel.overview">Tunnel overview</a> +2) <a href="#tunnel.operation">Tunnel operation</a> +2.1) <a href="#tunnel.preprocessing">Message preprocessing</a> +2.2) <a href="#tunnel.gateway">Gateway processing</a> +2.3) <a href="#tunnel.participant">Participant processing</a> +2.4) <a href="#tunnel.endpoint">Endpoint processing</a> +2.5) <a href="#tunnel.padding">Padding</a> +2.6) <a href="#tunnel.fragmentation">Tunnel fragmentation</a> +2.7) <a href="#tunnel.prng">PRNG pairs</a> +2.8) <a href="#tunnel.alternatives">Alternatives</a> +2.8.1) <a href="#tunnel.reroute">Adjust tunnel processing midstream</a> +2.8.2) <a href="#tunnel.bidirectional">Use bidirectional tunnels</a> +3) <a href="#tunnel.building">Tunnel building</a> +3.1) <a href="#tunnel.peerselection">Peer selection</a> +3.1.1) <a href="#tunnel.selection.exploratory">Exploratory tunnel peer selection</a> +3.1.2) <a href="#tunnel.selection.client">Client tunnel peer selection</a> +3.2) <a href="#tunnel.request">Request delivery</a> +3.3) <a href="#tunnel.pooling">Pooling</a> +3.4) <a href="#tunnel.building.alternatives">Alternatives</a> +3.4.1) <a href="#tunnel.building.telescoping">Telescopic building</a> +3.4.2) <a href="#tunnel.building.nonexploratory">Non-exploratory tunnels for management</a> +4) <a href="#tunnel.throttling">Tunnel throttling</a> +5) <a href="#tunnel.mixing">Mixing/batching</a> +</pre> + +<h2>1) <a name="tunnel.overview">Tunnel overview</a></h2> + +<p>Within I2P, messages are passed in one direction through a virtual +tunnel of peers, using whatever means are available to pass the +message on to the next hop. 
Messages arrive at the tunnel's
+gateway, get bundled up and/or fragmented into fixed size tunnel messages,
+and are forwarded on to the next hop in the tunnel, which processes and verifies
+the validity of the message and sends it on to the next hop, and so on, until
+it reaches the tunnel endpoint.  That endpoint takes the messages
+bundled up by the gateway and forwards them as instructed - either
+to another router, to another tunnel on another router, or locally.</p>
+
+<p>Tunnels all work the same, but can be segmented into two different
+groups - inbound tunnels and outbound tunnels.  The inbound tunnels
+have an untrusted gateway which passes messages down towards the
+tunnel creator, which serves as the tunnel endpoint.  For outbound
+tunnels, the tunnel creator serves as the gateway, passing messages
+out to the remote endpoint.</p>
+
+<p>The tunnel's creator selects exactly which peers will participate
+in the tunnel, and provides each with the necessary configuration
+data.  Tunnels may have any number of hops, but may be constrained with various
+proof-of-work requests to add on additional steps.  It is the intent to make
+it hard for either participants or third parties to determine the length of
+a tunnel, or even for colluding participants to determine whether they are a
+part of the same tunnel at all (barring the situation where colluding peers are
+next to each other in the tunnel).
A pair of synchronized PRNGs is used at
+each hop in the tunnel to validate incoming messages and prevent abuse through
+loops.</p>
+
+<p>Beyond their length, there are additional configurable parameters
+for each tunnel that can be used, such as a throttle on the frequency of
+messages delivered, how padding should be used, how long a tunnel should be
+in operation, whether to inject chaff messages, and what, if any, batching
+strategies should be employed.</p>
+
+<p>In practice, a series of tunnel pools are used for different
+purposes - each local client destination has its own set of inbound
+tunnels and outbound tunnels, configured to meet its anonymity and
+performance needs.  In addition, the router itself maintains a series
+of pools for participating in the network database and for managing
+the tunnels themselves.</p>
+
+<p>I2P is an inherently packet switched network, even with these
+tunnels, allowing it to take advantage of multiple tunnels running
+in parallel, increasing resilience and balancing load.  Outside of
+the core I2P layer, there is an optional end to end streaming library
+available for client applications, exposing TCP-esque operation,
+including message reordering, retransmission, congestion control, etc.</p>
+
+<h2>2) <a name="tunnel.operation">Tunnel operation</a></h2>
+
+<p>Tunnel operation has four distinct processes, taken on by various
+peers in the tunnel.  First, the tunnel gateway accumulates a number
+of tunnel messages and preprocesses them into something for tunnel
+delivery.  Next, that gateway encrypts that preprocessed data, then
+forwards it to the first hop.  That peer, and subsequent tunnel
+participants, unwrap a layer of the encryption, verifying the
+integrity of the message, then forward it on to the next peer.
+Eventually, the message arrives at the endpoint where the messages
+bundled by the gateway are split out again and forwarded on as
+requested.</p>
+
+<p>Tunnel IDs are 4 byte numbers used at each hop - participants know which
+tunnel ID to expect on incoming messages and which tunnel ID to attach to
+messages forwarded on to the next hop.  Tunnels themselves are short lived
+(10 minutes at the moment, though this depends upon the tunnel's purpose), and
+even when subsequent tunnels are built using the same sequence of peers, each
+hop's tunnel ID will change.</p>
+
+<h3>2.1) <a name="tunnel.preprocessing">Message preprocessing</a></h3>
+
+<p>When the gateway wants to deliver data through the tunnel, it first
+gathers zero or more I2NP messages, selects how much padding will be used,
+fragments them across the necessary number of 1KB tunnel messages, and decides how
+each I2NP message should be handled by the tunnel endpoint, encoding that
+data into the raw tunnel payload:</p>
+<ul>
+<li>the first 4 bytes of the SHA256 of the remaining preprocessed data</li>
+<li>0 or more bytes containing random nonzero integers</li>
+<li>1 byte containing 0x00</li>
+<li>a series of zero or more { instructions, message } pairs</li>
+</ul>
+
+<p>The instructions are encoded as follows:</p>
+<ul>
+<li>1 byte value:<pre>
+   bits 0-1: delivery type
+             (0x00 = LOCAL, 0x01 = TUNNEL, 0x02 = ROUTER)
+   bit 2: delay included?  (1 = true, 0 = false)
+   bit 3: fragmented?  (1 = true, 0 = false)
+   bit 4: extended options?  (1 = true, 0 = false)
+   bits 5-7: reserved</pre></li>
+<li>if the delivery type was TUNNEL, a 4 byte tunnel ID</li>
+<li>if the delivery type was TUNNEL or ROUTER, a 32 byte router hash</li>
+<li>if the delay included flag is true, a 1 byte value:<pre>
+   bit 0: type (0 = strict, 1 = randomized)
+   bits 1-7: delay exponent (2^value minutes)</pre></li>
+<li>if the fragmented flag is true, a 4 byte message ID, and a 1 byte value:<pre>
+   bits 0-6: fragment number
+   bit 7: is last?
(1 = true, 0 = false)</pre></li>
+<li>if the extended options flag is true:<pre>
+   = a 1 byte option size (in bytes)
+   = that many bytes</pre></li>
+<li>2 byte size of the I2NP message</li>
+</ul>
+
+<p>The I2NP message is encoded in its standard form, and the
+preprocessed payload must be padded to a multiple of 16 bytes.</p>
+
+<h3>2.2) <a name="tunnel.gateway">Gateway processing</a></h3>
+
+<p>After the preprocessing of messages into a padded payload, the gateway builds
+a random 4 byte preIV value, iteratively encrypting it and the tunnel message as
+necessary, selects the next message ID from its outbound PRNG, and forwards the tuple
+{tunnelID, messageID, preIV, encrypted tunnel message} to the next hop.</p>
+
+<p>How encryption at the gateway is done depends on whether the tunnel is an
+inbound or an outbound tunnel.  For inbound tunnels, the gateway simply selects
+a random preIV, postprocessing and updating it to generate its own IV, and uses
+that IV alongside its own layer key to encrypt the preprocessed data.  For outbound
+tunnels the gateway must iteratively decrypt the (unencrypted) preIV and preprocessed
+data with the layer keys for all hops in the tunnel.  The result of the outbound
+tunnel encryption is that when each peer encrypts it, the endpoint will recover
+the initial preprocessed data.</p>
+
+<p>The preIV postprocessing should be a secure transform of the received value
+with sufficient expansion to provide the full 16 byte IV necessary for AES256.
+<i>What transform should be used - HMAC-SHA256(preIV, layerKey), using bytes
+0:15 as the IV, passing on bytes 16-19 as the next step's preIV?  Should
+we deliver an additional postprocessing layer key to each peer during the
+<a href="#tunnel.request">tunnel creation</a> to reduce the potential exposure
+of the layerKey?
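As one illustration of the open question above, the candidate HMAC-SHA256 transform could be sketched as follows. This is a hypothetical sketch of an undecided design, not part of the spec; the 0:15/16:19 byte split is taken from the question as posed:

```python
import hashlib
import hmac

def postprocess_preiv(pre_iv: bytes, layer_key: bytes) -> tuple[bytes, bytes]:
    """Expand a 4 byte preIV into a 16 byte AES IV for this hop plus the
    4 byte preIV to pass on to the next hop (hypothetical transform)."""
    digest = hmac.new(layer_key, pre_iv, hashlib.sha256).digest()
    iv = digest[0:16]            # bytes 0:15 - the IV for this hop's layer
    next_pre_iv = digest[16:20]  # bytes 16-19 - the next step's preIV
    return iv, next_pre_iv
```

Because HMAC is keyed with the layer key, colluding hops without that key cannot correlate the preIV they see with the IV actually used at another hop.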
Should we replace the 4 byte preIV with a full 16 byte preIV
+(even though 4 bytes will likely provide a sufficient keyspace in which to
+operate, as a single tunnel pumping 100KBps would only use 60,000 IVs)?</i></p>
+
+<h3>2.3) <a name="tunnel.participant">Participant processing</a></h3>
+
+<p>When a peer receives a tunnel message, it checks the inbound PRNG for that
+tunnel, verifying that the message ID specified is one of the next available IDs,
+thereby removing it from the PRNG and moving the window.  If the message ID is
+not one of the available IDs, it is dropped.  The participant then postprocesses
+and updates the preIV received to determine the current hop's IV, using that
+with the layer key to encrypt the tunnel message.  It then selects the next
+message ID from its outbound PRNG, forwarding the tuple
+{nextTunnelID, nextMessageID, nextPreIV, encrypted tunnel message} to the next hop.</p>
+
+<p>Each participant also maintains a bloom filter of preIV values used for the
+lifetime of the tunnel at their hop, allowing them to drop any messages with
+duplicate preIVs.  <i>The details of the hash functions used in the bloom filter
+are not yet worked out.  Suggestions?</i></p>
+
+<h3>2.4) <a name="tunnel.endpoint">Endpoint processing</a></h3>
+
+<p>After receiving and validating a tunnel message at the last hop in the tunnel,
+how the endpoint recovers the data encoded by the gateway depends upon whether
+the tunnel is an inbound or an outbound tunnel.  For outbound tunnels, the
+endpoint encrypts the message with its layer key just like any other participant,
+exposing the preprocessed data.
For inbound tunnels, the endpoint is also the
+tunnel creator, so it can merely iteratively decrypt the preIV and message, using the
+layer keys of each step in reverse order.</p>
+
+<p>At this point, the tunnel endpoint has the preprocessed data sent by the gateway,
+which it may then parse out into the included I2NP messages, forwarding them as
+requested in their delivery instructions.</p>
+
+<h3>2.5) <a name="tunnel.padding">Padding</a></h3>
+
+<p>Several tunnel padding strategies are possible, each with their own merits:</p>
+
+<ul>
+<li>No padding</li>
+<li>Padding to a random size</li>
+<li>Padding to a fixed size</li>
+<li>Padding to the closest KB</li>
+<li>Padding to the closest exponential size (2^n bytes)</li>
+</ul>
+
+<p><i>Which to use?  No padding is most efficient, random padding is what
+we have now, fixed size would either be an extreme waste or force us to
+implement fragmentation.  Padding to the closest exponential size (a la Freenet)
+seems promising.  Perhaps we should gather some stats on the net as to what size
+messages are, then see what costs and benefits would arise from different
+strategies?  <b>See <a href="http://dev.i2p.net/~jrandom/messageSizes/">gathered
+stats</a></b></i></p>
+
+<h3>2.6) <a name="tunnel.fragmentation">Tunnel fragmentation</a></h3>
+
+<p>To prevent adversaries from tagging the messages along the path by adjusting
+the message size, all tunnel messages are a fixed 1KB in size.  To accommodate
+larger I2NP messages as well as to support smaller ones more efficiently, the
+gateway splits up the larger I2NP messages into fragments contained within each
+tunnel message.
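The fragment-and-reassemble flow could be sketched as below. This is illustrative only - the real per-fragment encoding is the instruction format given in 2.1, and the 1003 byte payload figure is an assumed placeholder for 1KB minus per-fragment overhead:

```python
FRAGMENT_PAYLOAD = 1003  # assumed: 1KB tunnel message minus per-fragment overhead

def fragment(message_id: int, data: bytes) -> list[tuple[int, int, bool, bytes]]:
    """Split an I2NP message into (message ID, fragment number, is last?, payload)
    tuples, mirroring the fragmentation fields of the delivery instructions."""
    chunks = [data[i:i + FRAGMENT_PAYLOAD]
              for i in range(0, len(data), FRAGMENT_PAYLOAD)] or [b""]
    return [(message_id, n, n == len(chunks) - 1, chunk)
            for n, chunk in enumerate(chunks)]

def reassemble(fragments: list[tuple[int, int, bool, bytes]]) -> bytes:
    """Rebuild the original I2NP message once every fragment, including the one
    with the 'is last' bit set, has arrived."""
    frags = sorted(fragments, key=lambda f: f[1])  # order by fragment number
    assert frags[-1][2], "last fragment not yet received"
    return b"".join(f[3] for f in frags)
```

In the real protocol the endpoint holds incomplete fragment sets only for a short period before discarding them, as described next.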
The endpoint will attempt to rebuild the I2NP message from the
+fragments for a short period of time, but will discard them as necessary.</p>
+
+<h3>2.7) <a name="tunnel.prng">PRNG pairs</a></h3>
+
+<p>To minimize the damage from a DoS attack created by looped tunnels, a series
+of synchronized PRNGs are used across the tunnel - the gateway has one, the
+endpoint has one, and every participant has two.  These in turn are broken down
+into the inbound and outbound PRNG for each tunnel - the outbound PRNG is
+synchronized with the inbound PRNG of the next peer (the obvious exception being
+the endpoint, which has no peer after it).  Outside of the PRNG with which each
+is synchronized, there is no relationship between any of the other PRNGs.
+This is accomplished by using a common PRNG algorithm <i>[tbd, perhaps
+<a href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Random.html">java.util.Random</a>?]</i>,
+seeded with the values delivered with the tunnel creation request.  Each peer
+prefetches the next few values out of the inbound PRNG so that it can handle
+lost or briefly out of order delivery, using these values to compare against the
+received message IDs.</p>
+
+<p>An adversary can still build loops within the tunnels, but the damage done is
+minimized in two ways.  First, if there is a loop created by providing a later
+hop with its next hop pointing at a previous peer, that loop will need to be
+seeded with the right value so that its PRNG stays synchronized with the previous
+peer's inbound PRNG.  While some messages would go into the loop, as they start
+to actually loop back, two things would happen.
Either they would be accepted
+by that peer, thereby breaking the synchronization with the other PRNG which is
+really "earlier" in the tunnel, or the messages would be rejected if the real
+"earlier" peer sent enough messages into the loop to break the synchronization.</p>
+
+<p>If the adversary is very well coordinated and is colluding with several
+participants, they could still build a functioning loop, though that loop would
+expire when the tunnel does.  This still allows an expansion of their work factor
+against the overall network load, but with tunnel throttling this could even
+be a useful positive tool for mitigating active traffic analysis.</p>
+
+<h3>2.8) <a name="tunnel.alternatives">Alternatives</a></h3>
+
+<h4>2.8.1) <a name="tunnel.reroute">Adjust tunnel processing midstream</a></h4>
+
+<p>While the simple tunnel routing algorithm should be sufficient for most cases,
+there are three alternatives that can be explored:</p>
+<ul>
+<li>Have a peer other than the endpoint temporarily act as the termination
+point for a tunnel by adjusting the encryption used at the gateway to give them
+the plaintext of the preprocessed I2NP messages.  Each peer could check to see
+whether they had the plaintext, processing the message when received as if they
+did.</li>
+<li>Allow routers participating in a tunnel to remix the message before
+forwarding it on - bouncing it through one of that peer's own outbound tunnels,
+bearing instructions for delivery to the next hop.</li>
+<li>Implement code for the tunnel creator to redefine a peer's "next hop" in
+the tunnel, allowing further dynamic redirection.</li>
+</ul>
+
+<h4>2.8.2) <a name="tunnel.bidirectional">Use bidirectional tunnels</a></h4>
+
+<p>The current strategy of using two separate tunnels for inbound and outbound
+communication is not the only technique available, and it does have anonymity
+implications.
On the positive side, by using separate tunnels it lessens the +traffic data exposed for analysis to participants in a tunnel - for instance, +peers in an outbound tunnel from a web browser would only see the traffic of +an HTTP GET, while the peers in an inbound tunnel would see the payload +delivered along the tunnel. With bidirectional tunnels, all participants would +have access to the fact that e.g. 1KB was sent in one direction, then 100KB +in the other. On the negative side, using unidirectional tunnels means that +there are two sets of peers which need to be profiled and accounted for, and +additional care must be taken to address the increased speed of predecessor +attacks. The tunnel pooling and building process outlined below should +minimize the worries of the predecessor attack, though if it were desired, +it wouldn't be much trouble to build both the inbound and outbound tunnels +along the same peers.</p> + +<h2>3) <a name="tunnel.building">Tunnel building</a></h2> + +<p>When building a tunnel, the creator must send a request with the necessary +configuration data to each of the hops, then wait for the potential participant +to reply stating that they either agree or do not agree. These tunnel request +messages and their replies are garlic wrapped so that only the router who knows +the key can decrypt it, and the path taken in both directions is tunnel routed +as well. There are three important dimensions to keep in mind when producing +the tunnels: what peers are used (and where), how the requests are sent (and +replies received), and how they are maintained.</p> + +<h3>3.1) <a name="tunnel.peerselection">Peer selection</a></h3> + +<p>Beyond the two types of tunnels - inbound and outbound - there are two styles +of peer selection used for different tunnels - exploratory and client. +Exploratory tunnels are used for both network database maintenance and tunnel +maintenance, while client tunnels are used for end to end client messages. 
</p>
+
+<h4>3.1.1) <a name="tunnel.selection.exploratory">Exploratory tunnel peer selection</a></h4>
+
+<p>Exploratory tunnels are built out of a random selection of peers from a subset
+of the network.  The particular subset varies with the local router and with its
+tunnel routing needs.  In general, the exploratory tunnels are built out of
+randomly selected peers who are in the router's "not failing but active" profile
+category.  The secondary purpose of the tunnels, beyond merely tunnel routing,
+is to find underutilized high capacity peers so that they can be promoted for
+use in client tunnels.</p>
+
+<h4>3.1.2) <a name="tunnel.selection.client">Client tunnel peer selection</a></h4>
+
+<p>Client tunnels are built with a more stringent set of requirements - the local
+router will select peers out of its "fast and high capacity" profile category so
+that performance and reliability will meet the needs of the client application.
+However, there are several important details beyond that basic selection that
+should be adhered to, depending upon the client's anonymity needs.</p>
+
+<p>For some clients who are worried about adversaries mounting a predecessor
+attack, the tunnel selection can keep the peers selected in a strict order -
+if A, B, and C are in a tunnel, the hop after A is always B, and the hop after
+B is always C.  A less strict ordering is also possible, ensuring that while
+the hop after A may be B, B may never be before A.  Other configuration options
+include the ability for just the inbound tunnel gateways and outbound tunnel
+endpoints to be fixed, or rotated on an MTBF rate.</p>
+
+<h3>3.2) <a name="tunnel.request">Request delivery</a></h3>
+
+<p>As mentioned above, once the tunnel creator knows what peers should go into
+a tunnel and in what order, the creator builds a series of tunnel request
+messages, each containing the necessary information for that peer.
For instance,
+participating peers will be given the 4 byte tunnel ID on which they are to
+receive messages, the 4 byte tunnel ID on which they are to send out the messages,
+the 32 byte hash of the next hop's identity, the pair of PRNG seeds for the inbound
+and outbound PRNG, and the 32 byte layer key used to
+remove a layer from the tunnel.  Of course, outbound tunnel endpoints are not
+given any "next hop" or "next tunnel ID" information, and neither the inbound
+tunnel gateways nor the outbound tunnel endpoints need both PRNG seeds.  To allow
+replies, the request contains a random session tag and a random session key with
+which the peer may garlic encrypt their decision, as well as the tunnel to which
+that garlic should be sent.  In addition to the above information, various client
+specific options may be included, such as what throttling to place on the tunnel,
+what padding or batch strategies to use, etc.</p>
+
+<p>After building all of the request messages, they are garlic wrapped for the
+target router and sent out an exploratory tunnel.  Upon receipt, that peer
+determines whether they can or will participate, creating a reply message and
+both garlic wrapping and tunnel routing the response with the supplied
+information.  Upon receipt of the reply at the tunnel creator, the tunnel is
+considered valid on that hop (if accepted).  Once all peers have accepted, the
+tunnel is active.</p>
+
+<h3>3.3) <a name="tunnel.pooling">Pooling</a></h3>
+
+<p>To allow efficient operation, the router maintains a series of tunnel pools,
+each managing a group of tunnels used for a specific purpose with their own
+configuration.  When a tunnel is needed for that purpose, the router selects one
+out of the appropriate pool at random.  Overall, there are two exploratory tunnel
+pools - one inbound and one outbound - each using the router's exploration
+defaults.  In addition, there is a pair of pools for each local destination -
+one inbound and one outbound pool.
Those pools use the configuration specified
+when the local destination connected to the router, or the router's defaults if
+not specified.</p>
+
+<p>Each pool has within its configuration a few key settings, defining how many
+tunnels to keep active, how many backup tunnels to maintain in case of failure,
+how frequently to test the tunnels, how long the tunnels should be, whether those
+lengths should be randomized, how often replacement tunnels should be built, as
+well as any of the other settings allowed when configuring individual tunnels.</p>
+
+<h3>3.4) <a name="tunnel.building.alternatives">Alternatives</a></h3>
+
+<h4>3.4.1) <a name="tunnel.building.telescoping">Telescopic building</a></h4>
+
+<p>One question that may arise regarding the use of the exploratory tunnels for
+sending and receiving tunnel creation messages is how that impacts the tunnel's
+vulnerability to predecessor attacks.  While the endpoints and gateways of
+those tunnels will be randomly distributed across the network (perhaps even
+including the tunnel creator in that set), another alternative is to use the
+tunnel pathways themselves to pass along the request and response, as is done
+in <a href="http://tor.eff.org/">TOR</a>.  This, however, may lead to leaks
+during tunnel creation, allowing peers to discover how many hops there are later
+on in the tunnel by monitoring the timing or packet count as the tunnel is
+built.  Techniques could be used to minimize this issue, such as using each of
+the hops as endpoints (per <a href="#tunnel.reroute">2.8.1</a>) for a random
+number of messages before continuing on to build the next hop.</p>
+
+<h4>3.4.2) <a name="tunnel.building.nonexploratory">Non-exploratory tunnels for management</a></h4>
+
+<p>A second alternative to the tunnel building process is to give the router
+an additional set of non-exploratory inbound and outbound pools, using those for
+the tunnel request and response.
Assuming the router has a well integrated view
+of the network, this should not be necessary, but if the router was partitioned
+in some way, using non-exploratory pools for tunnel management would reduce the
+leakage of information about what peers are in the router's partition.</p>
+
+<h2>4) <a name="tunnel.throttling">Tunnel throttling</a></h2>
+
+<p>Even though the tunnels within I2P bear a resemblance to a circuit switched
+network, everything within I2P is strictly message based - tunnels are merely
+accounting tricks to help organize the delivery of messages.  No assumptions are
+made regarding reliability or ordering of messages, and retransmissions are left
+to higher levels (e.g. I2P's client layer streaming library).  This allows I2P
+to take advantage of throttling techniques available to both packet switched and
+circuit switched networks.  For instance, each router may keep track of the
+moving average of how much data each tunnel is using, combine that with all of
+the averages used by other tunnels the router is participating in, and be able
+to accept or reject additional tunnel participation requests based on its
+capacity and utilization.  On the other hand, each router can simply drop
+messages that are beyond its capacity, exploiting the research done on the
+normal internet.</p>
+
+<h2>5) <a name="tunnel.mixing">Mixing/batching</a></h2>
+
+<p>What strategies should be used at the gateway and at each hop for delaying,
+reordering, rerouting, or padding messages?  To what extent should this be done
+automatically, how much should be configured as a per tunnel or per hop setting,
+and how should the tunnel's creator (and in turn, user) control this operation?
+All of this is left as unknown, to be worked out for
+<a href="http://www.i2p.net/roadmap#3.0">I2P 3.0</a>.</p>
\ No newline at end of file
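The participation throttle described in section 4 could be sketched as below. This is a hypothetical illustration of the moving-average idea, not a decided design; the class name, smoothing factor, and headroom rule are all assumptions:

```python
class ParticipationThrottle:
    """Accept or reject tunnel participation requests based on an exponential
    moving average of per-tunnel bandwidth usage (hypothetical sketch)."""

    def __init__(self, capacity_bps: float, alpha: float = 0.1):
        self.capacity_bps = capacity_bps
        self.alpha = alpha        # EMA smoothing factor (assumed value)
        self.tunnel_rates = {}    # tunnel ID -> smoothed bytes/sec

    def record(self, tunnel_id: int, bytes_per_sec: float) -> None:
        """Fold a new bandwidth sample for a tunnel into its moving average."""
        prev = self.tunnel_rates.get(tunnel_id, bytes_per_sec)
        self.tunnel_rates[tunnel_id] = prev + self.alpha * (bytes_per_sec - prev)

    def accept_request(self) -> bool:
        """Accept a new participation request only if current smoothed usage
        leaves headroom for one more tunnel of average size (assumed rule)."""
        total = sum(self.tunnel_rates.values())
        avg = total / len(self.tunnel_rates) if self.tunnel_rates else 0.0
        return total + avg <= self.capacity_bps
```

The alternative mentioned in the text - simply dropping messages beyond capacity - needs no such bookkeeping, at the cost of pushing loss recovery up to the streaming layer.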