updates with new alternative crypto, including Connelly's suggestions for the IV

Commit 0e5cf81f authored 20 years ago by jrandom Committed by zzz 20 years ago
--- a/router/doc/tunnel-alt.html
+++ b/router/doc/tunnel-alt.html
+<code>$Id: tunnel.html,v 1.10 2005/01/16 01:07:07 jrandom Exp $</code>
+<pre>
+1) <a href="#tunnel.overview">Tunnel overview</a>
+2) <a href="#tunnel.operation">Tunnel operation</a>
+2.1) <a href="#tunnel.preprocessing">Message preprocessing</a>
+2.2) <a href="#tunnel.gateway">Gateway processing</a>
+2.3) <a href="#tunnel.participant">Participant processing</a>
+2.4) <a href="#tunnel.endpoint">Endpoint processing</a>
+2.5) <a href="#tunnel.padding">Padding</a>
+2.6) <a href="#tunnel.fragmentation">Tunnel fragmentation</a>
+2.7) <a href="#tunnel.prng">PRNG pairs</a>
+2.8) <a href="#tunnel.alternatives">Alternatives</a>
+2.8.1) <a href="#tunnel.reroute">Adjust tunnel processing midstream</a>
+2.8.2) <a href="#tunnel.bidirectional">Use bidirectional tunnels</a>
+3) <a href="#tunnel.building">Tunnel building</a>
+3.1) <a href="#tunnel.peerselection">Peer selection</a>
+3.1.1) <a href="#tunnel.selection.exploratory">Exploratory tunnel peer selection</a>
+3.1.2) <a href="#tunnel.selection.client">Client tunnel peer selection</a>
+3.2) <a href="#tunnel.request">Request delivery</a>
+3.3) <a href="#tunnel.pooling">Pooling</a>
+3.4) <a href="#tunnel.building.alternatives">Alternatives</a>
+3.4.1) <a href="#tunnel.building.telescoping">Telescopic building</a>
+3.4.2) <a href="#tunnel.building.nonexploratory">Non-exploratory tunnels for management</a>
+4) <a href="#tunnel.throttling">Tunnel throttling</a>
+5) <a href="#tunnel.mixing">Mixing/batching</a>
+</pre>
+
+<h2>1) <a name="tunnel.overview">Tunnel overview</a></h2>
+
+<p>Within I2P, messages are passed in one direction through a virtual
+tunnel of peers, using whatever means are available to pass the 
+message on to the next hop.  Messages arrive at the tunnel's 
+gateway, get bundled up and/or fragmented into fixed size tunnel messages, 
+and are forwarded on to the next hop in the tunnel, which processes and verifies
+the validity of the message and sends it on to the next hop, and so on, until
+it reaches the tunnel endpoint.  That endpoint takes the messages
+bundled up by the gateway and forwards them as instructed - either
+to another router, to another tunnel on another router, or locally.</p>
+
+<p>Tunnels all work the same, but can be segmented into two different
+groups - inbound tunnels and outbound tunnels.  The inbound tunnels
+have an untrusted gateway which passes messages down towards the 
+tunnel creator, which serves as the tunnel endpoint.  For outbound 
+tunnels, the tunnel creator serves as the gateway, passing messages
+out to the remote endpoint.</p>
+
+<p>The tunnel's creator selects exactly which peers will participate
+in the tunnel, and provides each with the necessary configuration
+data.  Tunnels may have any number of hops, but may be constrained with various
+proof-of-work requests to add on additional steps.  It is the intent to make
+it hard for either participants or third parties to determine the length of 
+a tunnel, or even for colluding participants to determine whether they are a
+part of the same tunnel at all (barring the situation where colluding peers are
+next to each other in the tunnel).  A pair of synchronized PRNGs is used at 
+each hop in the tunnel to validate incoming messages and prevent abuse through
+loops.</p>
+
+<p>Beyond their length, there are additional configurable parameters
+for each tunnel that can be used, such as a throttle on the frequency of 
+messages delivered, how padding should be used, how long a tunnel should be 
+in operation, whether to inject chaff messages, and what, if any, batching
+strategies should be employed.</p>
+
+<p>In practice, a series of tunnel pools are used for different
+purposes - each local client destination has its own set of inbound
+tunnels and outbound tunnels, configured to meet its anonymity and
+performance needs.  In addition, the router itself maintains a series
+of pools for participating in the network database and for managing
+the tunnels themselves.</p>
+
+<p>I2P is an inherently packet switched network, even with these 
+tunnels, allowing it to take advantage of multiple tunnels running 
+in parallel, increasing resilience and balancing load.  Outside of
+the core I2P layer, there is an optional end to end streaming library 
+available for client applications, exposing TCP-esque operation,
+including message reordering, retransmission, congestion control, etc.</p>
+
+<h2>2) <a name="tunnel.operation">Tunnel operation</a></h2>
+
+<p>Tunnel operation has four distinct processes, taken on by various 
+peers in the tunnel.  First, the tunnel gateway accumulates a number
+of tunnel messages and preprocesses them into something for tunnel
+delivery.  Next, that gateway encrypts that preprocessed data, then
+forwards it to the first hop.  That peer, and subsequent tunnel 
+participants, unwrap a layer of the encryption, verifying the 
+integrity of the message, then forward it on to the next peer.  
+Eventually, the message arrives at the endpoint where the messages
+bundled by the gateway are split out again and forwarded on as 
+requested.</p>
+
+<p>Tunnel IDs are 4 byte numbers used at each hop - each participant knows which
+tunnel ID to listen for messages on and which tunnel ID to use when forwarding them
+to the next hop.  Tunnels themselves are short lived (10 minutes at the 
+moment), and though subsequent tunnels may, depending upon the tunnel's purpose,
+be built using the same sequence of peers, each hop's tunnel ID will change.</p>
+
+<h3>2.1) <a name="tunnel.preprocessing">Message preprocessing</a></h3>
+
+<p>When the gateway wants to deliver data through the tunnel, it first
+gathers zero or more I2NP messages, selects how much padding will be used, 
+fragments them across the necessary number of 1KB tunnel messages, and decides how
+each I2NP message should be handled by the tunnel endpoint, encoding that
+data into the raw tunnel payload:</p>
+<ul>
+<li>the first 4 bytes of the SHA256 of the remaining preprocessed data</li>
+<li>0 or more bytes containing random nonzero integers</li>
+<li>1 byte containing 0x00</li>
+<li>a series of zero or more { instructions, message } pairs</li>
+</ul>
+
+<p>The instructions are encoded as follows:</p>
+<ul>
+<li>1 byte value:<pre>
+   bits 0-1: delivery type
+             (0x00 = LOCAL, 0x01 = TUNNEL, 0x02 = ROUTER)
+      bit 2: delay included?  (1 = true, 0 = false)
+      bit 3: fragmented?  (1 = true, 0 = false)
+      bit 4: extended options?  (1 = true, 0 = false)
+   bits 5-7: reserved</pre></li>
+<li>if the delivery type was TUNNEL, a 4 byte tunnel ID</li>
+<li>if the delivery type was TUNNEL or ROUTER, a 32 byte router hash</li>
+<li>if the delay included flag is true, a 1 byte value:<pre>
+      bit 0: type (0 = strict, 1 = randomized)
+   bits 1-7: delay exponent (2^value minutes)</pre></li>
+<li>if the fragmented flag is true, a 4 byte message ID, and a 1 byte value:<pre>
+   bits 0-6: fragment number
+      bit 7: is last?  (1 = true, 0 = false)</pre></li>
+<li>if the extended options flag is true:<pre>
+   = a 1 byte option size (in bytes)
+   = that many bytes</pre></li>
+<li>2 byte size of the I2NP message</li>
+</ul>
+
+<p>The I2NP message is encoded in its standard form, and the 
+preprocessed payload must be padded to a multiple of 16 bytes.</p>
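+
+<p>The instruction layout above can be sketched as a small encoder.  This is
+only an illustration under stated assumptions: bit 0 is taken to be the least 
+significant bit, multi-byte integers are written big-endian, and the function
+and parameter names (<code>encode_instructions</code>, <code>delay</code>, 
+<code>fragment</code>, <code>options</code>) are hypothetical.</p>

```python
import struct

# Delivery types from bits 0-1 of the flag byte.
LOCAL, TUNNEL, ROUTER = 0x00, 0x01, 0x02


def encode_instructions(delivery_type, message_size, tunnel_id=None,
                        router_hash=None, delay=None, fragment=None,
                        options=b""):
    """Encode the delivery instructions for one I2NP message.

    Assumptions (not stated in the text): bit 0 is the least significant
    bit and integers are big-endian.
    delay    -- optional (randomized: bool, exponent: int in 0..127)
    fragment -- optional (message_id: int, number: int in 0..127, is_last)
    """
    flags = delivery_type & 0x03
    if delay is not None:
        flags |= 1 << 2          # bit 2: delay included
    if fragment is not None:
        flags |= 1 << 3          # bit 3: fragmented
    if options:
        flags |= 1 << 4          # bit 4: extended options
    out = bytearray([flags])

    if delivery_type == TUNNEL:
        out += struct.pack(">I", tunnel_id)       # 4 byte tunnel ID
    if delivery_type in (TUNNEL, ROUTER):
        if router_hash is None or len(router_hash) != 32:
            raise ValueError("router hash must be 32 bytes")
        out += router_hash
    if delay is not None:
        randomized, exponent = delay
        out.append((1 if randomized else 0) | ((exponent & 0x7F) << 1))
    if fragment is not None:
        message_id, number, is_last = fragment
        out += struct.pack(">I", message_id)      # 4 byte message ID
        out.append((number & 0x7F) | (0x80 if is_last else 0))
    if options:
        out.append(len(options))                  # 1 byte option size
        out += options
    out += struct.pack(">H", message_size)        # 2 byte I2NP size
    return bytes(out)
```

+<p>For example, <code>encode_instructions(LOCAL, 10)</code> yields just the flag
+byte followed by the 2 byte size, while a fragmented TUNNEL delivery adds the
+tunnel ID, router hash, message ID, and fragment byte in that order.</p>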
+
+<h3>2.2) <a name="tunnel.gateway">Gateway processing</a></h3>
+
+<p>After the preprocessing of messages into a padded payload, the gateway generates
+a random 4 byte preIV value, iteratively encrypts it and the tunnel message as
+necessary, selects the next message ID from its outbound PRNG, and forwards the tuple 
+{tunnelID, messageID, preIV, encrypted tunnel message} to the next hop.</p>
+
+<p>How encryption at the gateway is done depends on whether the tunnel is an
+inbound or an outbound tunnel.  For inbound tunnels, the gateway simply selects a random
+preIV, postprocessing and updating it to generate the IV for the gateway and using 
+that IV alongside its own layer key to encrypt the preprocessed data.  For outbound 
+tunnels, the gateway (which is the tunnel creator) must iteratively decrypt the 
+(unencrypted) preIV and preprocessed data with the layer keys for all hops in the 
+tunnel.  The result of the outbound tunnel encryption is that once each peer has 
+encrypted the message in turn, the endpoint recovers the initial preprocessed data.</p>
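+
+<p>The outbound case can be illustrated with a toy example.  The cipher below is
+<b>not</b> AES256 and is not secure - it is a hypothetical stand-in (a SHA256 
+derived keystream added byte by byte) chosen only so the sketch runs with the 
+standard library, and it ignores the preIV entirely.  It shows the one property 
+that matters here: if the creator "decrypts" with every layer key up front, the 
+per-hop "encrypt" steps cancel out by the time the message reaches the endpoint.</p>

```python
import hashlib


def _keystream(key, length):
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key, data):
    """Stand-in for one hop's layer encryption (NOT AES, NOT secure)."""
    stream = _keystream(key, len(data))
    return bytes((d + k) % 256 for d, k in zip(data, stream))


def toy_decrypt(key, data):
    """Inverse of toy_encrypt for the same key."""
    stream = _keystream(key, len(data))
    return bytes((d - k) % 256 for d, k in zip(data, stream))


def outbound_gateway_wrap(layer_keys, preprocessed):
    """Creator side: decrypt with every hop's layer key, last hop first."""
    data = preprocessed
    for key in reversed(layer_keys):
        data = toy_decrypt(key, data)
    return data


def traverse(layer_keys, data):
    """Simulate each hop, in tunnel order, encrypting with its layer key."""
    for key in layer_keys:
        data = toy_encrypt(key, data)
    return data
```

+<p>With <code>keys = [b"hop1", b"hop2", b"hop3"]</code>, calling 
+<code>traverse(keys, outbound_gateway_wrap(keys, payload))</code> returns the 
+original payload.  The toy cipher happens to commute, so the reversed key order
+is not strictly needed here; with a real chained block cipher mode it would be.</p>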
+
+<p>The preIV postprocessing should be a secure transform of the received value 
+with sufficient expansion to provide the full 16 byte IV necessary for AES256.  
+<i>What transform should be used - HMAC-SHA256(preIV, layerKey), using bytes
+0-15 as the IV, passing on bytes 16-19 as the next step's preIV?  Should
+we deliver an additional postprocessing layer key to each peer during the 
+<a href="#tunnel.request">tunnel creation</a> to reduce the potential exposure
+of the layerKey?  Should we replace the 4 byte preIV with a full 16 byte preIV 
+(even though 4 bytes will likely provide a sufficient keyspace in which to
+operate, as a single tunnel pumping 100KBps would only use 60,000 IVs)?</i></p>
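+
+<p>One reading of the HMAC proposal above, as a sketch rather than a decided 
+design: the layer key is assumed to be the HMAC key and the preIV the message
+(the text does not say which argument is which), and the function names are 
+hypothetical.  The second helper reproduces the 60,000 figure from a 100KBps 
+rate, a 10 minute tunnel lifetime, and 1KB tunnel messages.</p>

```python
import hashlib
import hmac


def postprocess_pre_iv(pre_iv, layer_key):
    """Expand a 4 byte preIV into (16 byte IV, next hop's 4 byte preIV).

    Assumption: layer_key is used as the HMAC key, pre_iv as the message.
    """
    if len(pre_iv) != 4:
        raise ValueError("preIV must be 4 bytes")
    digest = hmac.new(layer_key, pre_iv, hashlib.sha256).digest()
    return digest[:16], digest[16:20]   # bytes 0-15 and bytes 16-19


def ivs_per_tunnel(rate_kb_per_s, lifetime_s=600, message_kb=1):
    """Number of tunnel messages (and so IVs) sent over a tunnel's lifetime."""
    return rate_kb_per_s * lifetime_s // message_kb
```

+<p>Here <code>ivs_per_tunnel(100)</code> gives 60000, far below the 2^32 values
+a 4 byte preIV can take, though random selection means duplicates become likely
+well before the keyspace is exhausted.</p>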
+
+<h3>2.3) <a name="tunnel.participant">Participant processing</a></h3>
+
+<p>When a peer receives a tunnel message, it checks the inbound PRNG for that
+tunnel, verifying that the message ID specified is one of the next available IDs,
+thereby removing it from the PRNG and moving the window.  If the message ID is
+not one of the available IDs, it is dropped.  The participant then postprocesses
+and updates the preIV received to determine the current hop's IV, using that 
+with the layer key to encrypt the tunnel message.  It then 
+selects the next message ID from its outbound PRNG, forwarding the tuple 
+{nextTunnelID, nextMessageID, nextPreIV, encrypted tunnel message} to the next hop.</p>
+
+<p>Each participant also maintains a bloom filter of preIV values used for the 
+lifetime of the tunnel at their hop, allowing them to drop any messages with 
+duplicate preIVs.  <i>The details of the hash functions used in the bloom filter
+are not yet worked out.  Suggestions?</i></p>
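+
+<p>As one possible answer, a minimal bloom filter over preIV values might look
+like the following.  The hash construction (SHA256 over a one byte index plus 
+the preIV), the filter size, and the number of hashes are all assumptions made
+for illustration, not settled parameters.</p>

```python
import hashlib


class PreIVFilter:
    """Bloom filter of preIVs seen on one tunnel at one hop (sketch)."""

    def __init__(self, bits=1 << 20, hashes=4):
        if bits % 8:
            raise ValueError("bits must be a multiple of 8")
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits // 8)

    def _positions(self, pre_iv):
        # Assumed construction: one SHA256 per index, salted with the index.
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + pre_iv).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def seen_before(self, pre_iv):
        """Record pre_iv and return True if it was (probably) seen already."""
        present = True
        for pos in self._positions(pre_iv):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.array[byte] & mask:
                present = False
                self.array[byte] |= mask
        return present
```

+<p>A participant would drop any message for which <code>seen_before</code>
+returns true.  False positives are possible, so a small fraction of legitimate
+messages would be dropped as well; the filter is discarded with the tunnel.</p>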
+
+<h3>2.4) <a name="tunnel.endpoint">Endpoint processing</a></h3>
+
+<p>After receiving and validating a tunnel message at the last hop in the tunnel,
+how the endpoint recovers the data encoded by the gateway depends upon whether 
+the tunnel is an inbound or an outbound tunnel.  For outbound tunnels, the 
+endpoint encrypts the message with its layer key just like any other participant, 
+exposing the preprocessed data.  For inbound tunnels, the endpoint is also the 
+tunnel creator so they can merely iteratively decrypt the preIV and message, using the 
+layer keys of each step in reverse order.</p>
+
+<p>At this point, the tunnel endpoint has the preprocessed data sent by the gateway,
+which it may then parse out into the included I2NP messages and forward them as
+requested in their delivery instructions.</p>
+
+<h3>2.5) <a name="tunnel.padding">Padding</a></h3>
+
+<p>Several tunnel padding strategies are possible, each with their own merits:</p>
+
+<ul>
+<li>No padding</li>
+<li>Padding to a random size</li>
+<li>Padding to a fixed size</li>
+<li>Padding to the closest KB</li>
+<li>Padding to the closest exponential size (2^n bytes)</li>
+</ul>
+
+<p><i>Which to use?  No padding is most efficient, random padding is what
+we have now, fixed size would either be an extreme waste or force us to
+implement fragmentation.  Padding to the closest exponential size (a la Freenet)
+seems promising.  Perhaps we should gather some stats on the net as to what size
+messages are, then see what costs and benefits would arise from different 
+strategies?  <b>See <a href="http://dev.i2p.net/~jrandom/messageSizes/">gathered
+stats</a></b></i></p>
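+
+<p>To compare the deterministic strategies on concrete message sizes, a small
+helper such as the following could be used.  It assumes "closest" means rounding
+up (padding cannot shrink a message), leaves out random padding, and uses 
+hypothetical strategy names.</p>

```python
def padded_size(length, strategy, fixed=1024):
    """Return the padded size in bytes of a `length` byte message.

    Strategy names are illustrative: "none", "fixed", "kb", "exponential".
    """
    if strategy == "none":
        return length
    if strategy == "fixed":
        if length > fixed:
            # A fixed size smaller than the message forces fragmentation.
            raise ValueError("message needs fragmentation")
        return fixed
    if strategy == "kb":
        # Round up to the next multiple of 1024, with at least 1KB.
        return max(1, -(-length // 1024)) * 1024
    if strategy == "exponential":
        size = 1
        while size < length:
            size *= 2
        return size
    raise ValueError("unknown strategy: " + strategy)
```

+<p>A 1500 byte message pads to 2048 bytes under both the KB and the exponential
+strategies, while a 3000 byte message pads to 3072 and 4096 bytes respectively.</p>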
+
+<h3>2.6) <a name="tunnel.fragmentation">Tunnel fragmentation</a></h3>
+
+<p>To prevent adversaries from tagging the messages along the path by adjusting
+the message size, all tunnel messages are a fixed 1KB in size.  To accommodate 
+larger I2NP messages as well as to support smaller ones more efficiently, the
+gateway splits up the larger I2NP messages into fragments contained within each
+tunnel message.  The endpoint will attempt to rebuild the I2NP message from the
+fragments for a short period of time, but will discard them as necessary.</p>
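+
+<p>A rough sketch of the split and rebuild steps, using the message ID, fragment
+number, and "is last" flag from the delivery instructions.  The names 
+(<code>fragment</code>, <code>Reassembler</code>) are hypothetical, and the 
+timeout after which partial messages are discarded is left out.</p>

```python
def fragment(message_id, data, max_fragment_size):
    """Split one I2NP message into (message_id, number, is_last, chunk)."""
    chunks = [data[i:i + max_fragment_size]
              for i in range(0, len(data), max_fragment_size)] or [b""]
    if len(chunks) > 128:
        # The fragment number field is only 7 bits wide.
        raise ValueError("too many fragments for a 7 bit fragment number")
    last = len(chunks) - 1
    return [(message_id, n, n == last, chunk)
            for n, chunk in enumerate(chunks)]


class Reassembler:
    """Endpoint side: collect fragments until a message is complete."""

    def __init__(self):
        self._partial = {}   # message_id -> {"chunks": {...}, "total": n}

    def add(self, message_id, number, is_last, chunk):
        """Store one fragment; return the full message once complete."""
        state = self._partial.setdefault(
            message_id, {"chunks": {}, "total": None})
        state["chunks"][number] = chunk
        if is_last:
            state["total"] = number + 1
        total = state["total"]
        if total is not None and len(state["chunks"]) == total:
            del self._partial[message_id]
            return b"".join(state["chunks"][i] for i in range(total))
        return None
```

+<p>Fragments may arrive in any order; <code>add</code> returns 
+<code>None</code> until the last missing piece shows up.</p>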
+
+<h3>2.7) <a name="tunnel.prng">PRNG pairs</a></h3>
+
+<p>To minimize the damage from a DoS attack created by looped tunnels, a series
+of synchronized PRNGs are used across the tunnel - the gateway has one, the 
+endpoint has one, and every participant has two.  These in turn are broken down 
+into the inbound and outbound PRNG for each tunnel - the outbound PRNG is 
+synchronized with the inbound PRNG of the peer after you (obvious exception being
+the endpoint, which has no peer after it).  Apart from the PRNG with which each
+is synchronized, there is no relationship between any of the other PRNGs.  
+This is accomplished by using a common PRNG algorithm <i>[tbd, perhaps 
+<a href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Random.html">java.util.Random</a>?]</i>, 
+seeded with the values delivered with the tunnel creation request.  Each peer 
+prefetches the next few values out of the inbound PRNG so that it can handle 
+lost or briefly out of order delivery, using these values to compare against the
+received message IDs.</p>
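+
+<p>The prefetch window can be sketched as follows.  Since the PRNG algorithm is
+still undecided, Python's <code>random.Random</code> is used purely as a 
+stand-in (it is not cryptographically secure), and the window size of 8 and the
+class name are assumptions.</p>

```python
import random


class InboundWindow:
    """Inbound PRNG with a prefetched window of acceptable message IDs."""

    def __init__(self, seed, window=8):
        # Stand-in PRNG; the real algorithm is still to be decided.
        self._rng = random.Random(seed)
        self._pending = [self._next_id() for _ in range(window)]

    def _next_id(self):
        return self._rng.getrandbits(32)

    def accept(self, message_id):
        """Return True and slide the window if message_id is expected."""
        if message_id not in self._pending:
            return False                      # unknown or replayed: drop
        self._pending.remove(message_id)      # each ID is usable only once
        self._pending.append(self._next_id())
        return True
```

+<p>The previous hop draws its outgoing message IDs from a PRNG seeded with the
+same value, so messages arriving slightly out of order still match an entry in
+the window.  In this simple version the IDs of lost messages stay in the window
+forever, so a real implementation would need to age them out.</p>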
+
+<p>An adversary can still build loops within the tunnels, but the damage done is
+minimized in two ways.  First, if there is a loop created by providing a later
+hop with its next hop pointing at a previous peer, that loop will need to be 
+seeded with the right value so that its PRNG stays synchronized with the previous
+peer's inbound PRNG.  While some messages would go into the loop, as they start
+to actually loop back, one of two things would happen.  Either they would be accepted
+by that peer, thereby breaking the synchronization with the other PRNG which is
+really "earlier" in the tunnel, or the messages would be rejected if the real
+"earlier" peer sent enough messages into the loop to break the synchronization.</p>
+
+<p>If the adversary is very well coordinated and is colluding with several 
+participants, they could still build a functioning loop, though that loop would
+expire when the tunnel does.  This still allows an expansion of their work factor
+against the overall network load, but with tunnel throttling this could even
+be a useful positive tool for mitigating active traffic analysis.</p>
+
+<h3>2.8) <a name="tunnel.alternatives">Alternatives</a></h3>
+
+<h4>2.8.1) <a name="tunnel.reroute">Adjust tunnel processing midstream</a></h4>
+
+<p>While the simple tunnel routing algorithm should be sufficient for most cases,
+there are three alternatives that can be explored:</p>
+<ul>
+<li>Have a peer other than the endpoint temporarily act as the termination 
+point for a tunnel by adjusting the encryption used at the gateway to give them
+the plaintext of the preprocessed I2NP messages.  Each peer could check to see 
+whether it had the plaintext, processing the message as an endpoint would if it
+did.</li>
+<li>Allow routers participating in a tunnel to remix the message before 
+forwarding it on - bouncing it through one of that peer's own outbound tunnels,
+bearing instructions for delivery to the next hop.</li>
+<li>Implement code for the tunnel creator to redefine a peer's "next hop" in
+the tunnel, allowing further dynamic redirection.</li>
+</ul>
+
+<h4>2.8.2) <a name="tunnel.bidirectional">Use bidirectional tunnels</a></h4>
+
+<p>The current strategy of using two separate tunnels for inbound and outbound
+communication is not the only technique available, and it does have anonymity
+implications.  On the positive side, by using separate tunnels it lessens the
+traffic data exposed for analysis to participants in a tunnel - for instance,
+peers in an outbound tunnel from a web browser would only see the traffic of
+an HTTP GET, while the peers in an inbound tunnel would see the payload 
+delivered along the tunnel.  With bidirectional tunnels, all participants would
+have access to the fact that e.g. 1KB was sent in one direction, then 100KB
+in the other.  On the negative side, using unidirectional tunnels means that
+there are two sets of peers which need to be profiled and accounted for, and
+additional care must be taken to address the increased speed of predecessor
+attacks.  The tunnel pooling and building process outlined below should
+minimize the worries of the predecessor attack, though if it were desired,
+it wouldn't be much trouble to build both the inbound and outbound tunnels
+along the same peers.</p>
+
+<h2>3) <a name="tunnel.building">Tunnel building</a></h2>
+
+<p>When building a tunnel, the creator must send a request with the necessary
+configuration data to each of the hops, then wait for the potential participant
+to reply stating that they either agree or do not agree.  These tunnel request
+messages and their replies are garlic wrapped so that only the router that knows
+the key can decrypt them, and the path taken in both directions is tunnel routed
+as well.  There are three important dimensions to keep in mind when producing
+the tunnels: what peers are used (and where), how the requests are sent (and 
+replies received), and how they are maintained.</p>
+
+<h3>3.1) <a name="tunnel.peerselection">Peer selection</a></h3>
+
+<p>Beyond the two types of tunnels - inbound and outbound - there are two styles
+of peer selection used for different tunnels - exploratory and client.
+Exploratory tunnels are used for both network database maintenance and tunnel
+maintenance, while client tunnels are used for end to end client messages.  </p>
+
+<h4>3.1.1) <a name="tunnel.selection.exploratory">Exploratory tunnel peer selection</a></h4>
+
+<p>Exploratory tunnels are built out of a random selection of peers from a subset
+of the network.  The particular subset varies depending on the local router and on what its
+tunnel routing needs are.  In general, the exploratory tunnels are built out of
+randomly selected peers who are in the peer's "not failing but active" profile
+category.  The secondary purpose of the tunnels, beyond merely tunnel routing,
+is to find underutilized high capacity peers so that they can be promoted for
+use in client tunnels.</p>
+
+<h4>3.1.2) <a name="tunnel.selection.client">Client tunnel peer selection</a></h4>
+
+<p>Client tunnels are built with a more stringent set of requirements - the local
+router will select peers out of its "fast and high capacity" profile category so
+that performance and reliability will meet the needs of the client application.
+However, there are several important details beyond that basic selection that 
+should be adhered to, depending upon the client's anonymity needs.</p>
+  
+<p>For some clients who are worried about adversaries mounting a predecessor 
+attack, the tunnel selection can keep the peers selected in a strict order -
+if A, B, and C are in a tunnel, the hop after A is always B, and the hop after
+B is always C.  A less strict ordering is also possible, assuring that while
+the hop after A may be B, B may never be before A.  Other configuration options
+include the ability for just the inbound tunnel gateways and outbound tunnel
+endpoints to be fixed, or rotated on an MTBF rate.</p>
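+
+<p>One way to obtain a strict ordering, offered only as an illustration since 
+the text does not say how the order is chosen: sort the selected peers by a 
+keyed hash of their identity, using a secret that the client keeps for as long 
+as the ordering should hold.  The function name and the use of HMAC-SHA256 are 
+assumptions.</p>

```python
import hashlib
import hmac


def order_peers(peers, ordering_secret):
    """Return peers (byte strings, e.g. router hashes) in a fixed order.

    The order depends only on ordering_secret, so any two peers always
    appear in the same relative position in every tunnel built with it.
    """
    return sorted(
        peers,
        key=lambda peer: hmac.new(ordering_secret, peer,
                                  hashlib.sha256).digest())
```

+<p>Because the sort key of a peer does not depend on which other peers were 
+selected, this gives the less strict guarantee for arbitrary subsets (B never 
+appears before A) and the strict one when the same set of peers is reused.</p>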
+
+<h3>3.2) <a name="tunnel.request">Request delivery</a></h3>
+
+<p>As mentioned above, once the tunnel creator knows what peers should go into
+a tunnel and in what order, the creator builds a series of tunnel request 
+messages, each containing the necessary information for that peer.  For instance,
+participating peers will be given the 4 byte tunnel ID on which they are to
+receive messages, the 4 byte tunnel ID on which they are to send out the messages,
+the 32 byte hash of the next hop's identity, the pair of PRNG seeds for the inbound
+and outbound PRNG, and the 32 byte layer key used to
+remove a layer from the tunnel.  Of course, outbound tunnel endpoints are not 
+given any "next hop" or "next tunnel ID" information, and neither the inbound 
+tunnel gateways nor the outbound tunnel endpoints need both PRNG seeds.  To allow 
+replies, the request contains a random session tag and a random session key with 
+which the peer may garlic encrypt their decision, as well as the tunnel to which
+that garlic should be sent.  In addition to the above information, various client
+specific options may be included, such as what throttling to place on the tunnel,
+what padding or batch strategies to use, etc.</p>
+
+<p>After building all of the request messages, they are garlic wrapped for the
+target router and sent out an exploratory tunnel.  Upon receipt, that peer 
+determines whether they can or will participate, creating a reply message and
+both garlic wrapping and tunnel routing the response with the supplied 
+information.  Upon receipt of the reply at the tunnel creator, the tunnel is
+considered valid on that hop (if accepted).  Once all peers have accepted, the
+tunnel is active.</p>
+
+<h3>3.3) <a name="tunnel.pooling">Pooling</a></h3>
+
+<p>To allow efficient operation, the router maintains a series of tunnel pools,
+each managing a group of tunnels used for a specific purpose with their own
+configuration.  When a tunnel is needed for that purpose, the router selects one
+out of the appropriate pool at random.  Overall, there are two exploratory tunnel
+pools - one inbound and one outbound - each using the router's exploration 
+defaults.  In addition, there is a pair of pools for each local destination -
+one for inbound tunnels and one for outbound tunnels.  Those pools use the configuration specified
+when the local destination connected to the router, or the router's defaults if
+not specified.</p>
+
+<p>Each pool has within its configuration a few key settings, defining how many
+tunnels to keep active, how many backup tunnels to maintain in case of failure,
+how frequently to test the tunnels, how long the tunnels should be, whether those
+lengths should be randomized, how often replacement tunnels should be built, as 
+well as any of the other settings allowed when configuring individual tunnels.</p>
+
+<h3>3.4) <a name="tunnel.building.alternatives">Alternatives</a></h3>
+
+<h4>3.4.1) <a name="tunnel.building.telescoping">Telescopic building</a></h4>
+
+<p>One question that may arise regarding the use of the exploratory tunnels for
+sending and receiving tunnel creation messages is how that impacts the tunnel's 
+vulnerability to predecessor attacks.  While the endpoints and gateways of 
+those tunnels will be randomly distributed across the network (perhaps even 
+including the tunnel creator in that set), another alternative is to use the
+tunnel pathways themselves to pass along the request and response, as is done
+in <a href="http://tor.eff.org/">TOR</a>.  This, however, may lead to leaks 
+during tunnel creation, allowing peers to discover how many hops there are later
+on in the tunnel by monitoring the timing or packet count as the tunnel is
+built.  Techniques could be used to minimize this issue, such as using each of 
+the hops as endpoints (per <a href="#tunnel.reroute">2.8.1</a>) for a random
+number of messages before continuing on to build the next hop.</p>
+
+<h4>3.4.2) <a name="tunnel.building.nonexploratory">Non-exploratory tunnels for management</a></h4>
+
+<p>A second alternative to the tunnel building process is to give the router 
+an additional set of non-exploratory inbound and outbound pools, using those for
+the tunnel request and response.  Assuming the router has a well integrated view
+of the network, this should not be necessary, but if the router was partitioned
+in some way, using non-exploratory pools for tunnel management would reduce the
+leakage of information about what peers are in the router's partition.</p>
+
+<h2>4) <a name="tunnel.throttling">Tunnel throttling</a></h2>
+
+<p>Even though the tunnels within I2P bear a resemblance to a circuit switched
+network, everything within I2P is strictly message based - tunnels are merely
+accounting tricks to help organize the delivery of messages.  No assumptions are
+made regarding reliability or ordering of messages, and retransmissions are left
+to higher levels (e.g. I2P's client layer streaming library).  This allows I2P
+to take advantage of throttling techniques available to both packet switched and
+circuit switched networks.  For instance, each router may keep track of the 
+moving average of how much data each tunnel is using, combine that with all of 
+the averages used by other tunnels the router is participating in, and be able
+to accept or reject additional tunnel participation requests based on its 
+capacity and utilization.  On the other hand, each router can simply drop 
+messages that are beyond its capacity, exploiting the research used on the 
+normal internet.</p>
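+
+<p>The admission side of this can be sketched with an exponential moving 
+average per tunnel.  The smoothing factor, the units (bytes per second), and
+the class and method names are assumptions for the sake of the example.</p>

```python
class TunnelThrottle:
    """Track per-tunnel bandwidth averages and gate new tunnel requests."""

    def __init__(self, capacity_bps, alpha=0.2):
        self.capacity = capacity_bps
        self.alpha = alpha          # weight given to the newest sample
        self.averages = {}          # tunnel_id -> moving average in bytes/s

    def record(self, tunnel_id, bytes_per_second):
        """Fold a new usage sample into the tunnel's moving average."""
        previous = self.averages.get(tunnel_id, bytes_per_second)
        self.averages[tunnel_id] = (
            (1 - self.alpha) * previous + self.alpha * bytes_per_second)

    def utilization(self):
        """Combined average usage across all participating tunnels."""
        return sum(self.averages.values())

    def accept_request(self, expected_bps):
        """Accept a new tunnel only if it fits within remaining capacity."""
        return self.utilization() + expected_bps <= self.capacity
```

+<p>A router at 900 of 1000 bytes per second would accept a request expected to 
+use 100 and reject one expected to use more.  Dropping individual messages when
+over capacity is a separate mechanism not shown here.</p>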
+
+<h2>5) <a name="tunnel.mixing">Mixing/batching</a></h2>
+
+<p>What strategies should be used at the gateway and at each hop for delaying,
+reordering, rerouting, or padding messages?  To what extent should this be done
+automatically, how much should be configured as a per tunnel or per hop setting,
+and how should the tunnel's creator (and in turn, user) control this operation?
+All of this is left as unknown, to be worked out for 
+<a href="http://www.i2p.net/roadmap#3.0">I2P 3.0</a>.</p>
\ No newline at end of file