Race between first-time local leaseset lookup, and storing the leaseset locally.
When looking up a leaseset for the first time, the lookup can fail even when the search job was successful. This issue does not occur on subsequent searches for the leaseset.
There seems to be an ordering issue between threads. The lookup job is triggered as completed before the leaseset is stored locally. The completion job for the lookup then fails because the leaseset isn't stored locally yet.
Here's an example from the logs I've been getting on my testing network, showing that search succeeds, but the lookup reports as failed, followed closely by the PersistentDataStore storing the leaseset afterwards.
Aug 5, 2022 6:46:35 PM DEBUG [P reader 1/4] demlia.IterativeLookupSelector: 2611:
DSM match IL Selector for [Hash: jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=]
Aug 5, 2022 6:46:35 PM INFO [P reader 1/4] demlia.FloodOnlyLookupMatchJob: 2611:
got a DSM for jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=
Aug 5, 2022 6:46:35 PM INFO [JobQueue 4/5] demlia.FloodOnlyLookupMatchJob: 2611:
search match for [Hash: jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=]
Aug 5, 2022 6:46:35 PM INFO [JobQueue 2/5] oodfillDatabaseStoreMessageJob:
Handling dbStore of leaseset [DatabaseStoreMessage:
Expiration: Fri Aug 05 23:46:42 GMT 2022
Unique ID: 23016875
Key: rvwc2c4cozu7jet2mux4fygjaiwjai7mnd5m3jvqfx6pfuaj5wvq.b32.i2p
Entry: [LeaseSet2:
Destination: [Destination:
Hash: jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=
Certificate: [Certificate: type: Null payload: null]
PublicKey: [PublicKey ELGAMAL_2048 size: 256]
SigningPublicKey: [SigningPublicKey DSA_SHA1 size: 128]]
B32: rvwc2c4cozu7jet2mux4fygjaiwjai7mnd5m3jvqfx6pfuaj5wvq.b32.i2p
Encryption Keys: 2
Encryption Key 0: [PublicKey ECIES_X25519 07bUfhxNxxgCvuUPXgWHHAinIWl1xWQ1xh2saEEvnxs=]
Encryption Key 1: [PublicKey ELGAMAL_2048 size: 256]
Options: 0
Unpublished? false
Blinded? false
Signature: [Signature DSA_SHA1: size: 40]
Published: Fri Aug 05 23:45:22 GMT 2022
Expires: Fri Aug 05 23:55:20 GMT 2022
Leases: #3
[Lease:
End Date: Aug 5, 2022 6:55 PM
Gateway: [Hash: qms1707Mq55PrEZJ~rPpmwH5HxRxIczCAH8P7K26APE=]
TunnelId: 3837809383]
[Lease:
End Date: Aug 5, 2022 6:52 PM
Gateway: [Hash: MWyrUXpQ8PAvY9LScoAIVz3tRcVSohDynh4vaPHESg0=]
TunnelId: 2980884198]
[Lease:
End Date: Aug 5, 2022 6:52 PM
Gateway: [Hash: Ll1seM7vKpuELH3vxYOTXvLnnfjVfPKbMzgjj7W3VeU=]
TunnelId: 1112020207]]]
Aug 5, 2022 6:46:35 PM INFO [JobQueue 4/5] db.kademlia.IterativeSearchJob:
2611: ISJ for [Hash: jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=] successful after 744, peers queried: 1
Aug 5, 2022 6:46:35 PM DEBUG [JobQueue 1/1] 2p.router.client.LookupDestJob:
Failed hash lookup [Hash: jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=]
Aug 5, 2022 6:46:35 PM DEBUG [nal Reader 4] lient.impl.I2PSessionMuxedImpl:
[I2PSnark #40191(OPEN)]: Message received of type 39 to be handled by HostReplyMessageHandler
Aug 5, 2022 6:46:35 PM DEBUG [nal Reader 4] t.impl.HostReplyMessageHandler: Handle message [HostReplyMessage:
[SessionId: 40191]
ReqID: 1
Result: 1]
Aug 5, 2022 6:46:35 PM WARN [520cd9650 #1] ient.streaming.I2PSocketEepGet: ERR: doFetch failed
java.net.UnknownHostException: Unknown or non-i2p host: rvwc2c4cozu7jet2mux4fygjaiwjai7mnd5m3jvqfx6pfuaj5wvq.b32.i2p
at net.i2p.client.streaming.I2PSocketEepGet.sendRequest(I2PSocketEepGet.java:187)
at net.i2p.util.EepGet.fetch(EepGet.java:693)
at net.i2p.client.streaming.I2PSocketEepGet.fetch(I2PSocketEepGet.java:76)
at net.i2p.util.EepGet.fetch(EepGet.java:649)
at org.klomp.snark.I2PSnarkUtil.get(I2PSnarkUtil.java:559)
at org.klomp.snark.TrackerClient.doRequest(TrackerClient.java:885)
at org.klomp.snark.TrackerClient.getPeersFromTrackers(TrackerClient.java:522)
at org.klomp.snark.TrackerClient.loop(TrackerClient.java:420)
at org.klomp.snark.TrackerClient.run(TrackerClient.java:259)
at java.lang.Thread.run(Thread.java:750)
at net.i2p.util.I2PThread.run(I2PThread.java:103)
Aug 5, 2022 6:46:35 PM WARN [520cd9650 #1] ient.streaming.I2PSocketEepGet: All attempts failed for http://rvwc2c4cozu7jet2mux4fygjaiwjai7mnd5m3jvqfx6pfuaj5wvq.b32.i2p/a?info_hash=%d1%a9%0ac%a4j%b5%aa%86%85%09%f9%98%f0%e5%05%20%cd%96P&peer_id=%00%00%00%00%00%00%00%00%00%03%03%03%88%1b%d2%c01%88%d1y&port=6881&ip=U~4o6kwGBFquivF4HWw6pDu7IsnzN2-S-2SZbY1QhtsIuCXwM8LLK6sZTK6Hig2xhGQPNHqLUuNLfuuL3RsORxNosnRnPDDmh8ukTpvIGdq5Xr-PY-gK1K05Fd9xLXLgQ2600PoLoBWOyeUwp1jU1LWS6PQ9yMviw1SW~T1iEmptP5Wyzz9isjD4OUk7sIL2ZRNRwBP6MY3EPG4L5jGlKaqBjXynBkyZBbkhN0Rl7s36UpQe71B9AeD8x0lFxGx2WFEDQSa9RKtTcLaS7GY-GCu4NL-GIIMwENZMzKSeTG6k-KWj40AsAFSSeqguIt~X4aijFE2s91aap3rP6Pycm18ZooNL7Fm-YEStTm0qobS2ms27mIoVTHN2MsistoQMJSbdnWoutL9bPpMC1kGaqVbIYOQ4MOE7SSpUJLWO1Ple-EEUf5b8S9GG~xHMOnbVYmxzv2btgozCNT-WvPNrvKzgVJCtDsci3rk4OUTWbaubNoezDyN79dQ1R3vm51FuBQAEAAcAAA==.i2p&uploaded=0&downloaded=0&left=96937463&compact=1&event=started&numwant=24
Aug 5, 2022 6:46:35 PM WARN [520cd9650 #1] org.klomp.snark.I2PSnarkUtil : Fetch failed [http://rvwc2c4cozu7jet2mux4fygjaiwjai7mnd5m3jvqfx6pfuaj5wvq.b32.i2p/a?info_hash=%d1%a9%0ac%a4j%b5%aa%86%85%09%f9%98%f0%e5%05%20%cd%96P&peer_id=%00%00%00%00%00%00%00%00%00%03%03%03%88%1b%d2%c01%88%d1y&port=6881&ip=U~4o6kwGBFquivF4HWw6pDu7IsnzN2-S-2SZbY1QhtsIuCXwM8LLK6sZTK6Hig2xhGQPNHqLUuNLfuuL3RsORxNosnRnPDDmh8ukTpvIGdq5Xr-PY-gK1K05Fd9xLXLgQ2600PoLoBWOyeUwp1jU1LWS6PQ9yMviw1SW~T1iEmptP5Wyzz9isjD4OUk7sIL2ZRNRwBP6MY3EPG4L5jGlKaqBjXynBkyZBbkhN0Rl7s36UpQe71B9AeD8x0lFxGx2WFEDQSa9RKtTcLaS7GY-GCu4NL-GIIMwENZMzKSeTG6k-KWj40AsAFSSeqguIt~X4aijFE2s91aap3rP6Pycm18ZooNL7Fm-YEStTm0qobS2ms27mIoVTHN2MsistoQMJSbdnWoutL9bPpMC1kGaqVbIYOQ4MOE7SSpUJLWO1Ple-EEUf5b8S9GG~xHMOnbVYmxzv2btgozCNT-WvPNrvKzgVJCtDsci3rk4OUTWbaubNoezDyN79dQ1R3vm51FuBQAEAAcAAA==.i2p&uploaded=0&downloaded=0&left=96937463&compact=1&event=started&numwant=24]
Aug 5, 2022 6:46:35 PM DEBUG [JobQueue 2/5] b.kademlia.PersistentDataStore: Storing key [Hash: jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=]
Aug 5, 2022 6:46:35 PM INFO [JobQueue 2/5] b.kademlia.PersistentDataStore: New leaseset for [Hash: jWwtC4J2afSSemUvwuDJAiyQI-xo-s2msC388tAJ7as=]: expires Fri Aug 05 23:52:32 GMT 2022
Aug 5, 2022 6:46:35 PM DEBUG [JobQueue 2/5] b.kademlia.PersistentDataStore: RAP? false RAR? true
Aug 5, 2022 6:46:35 PM DEBUG [520cd9650 #1] lient.impl.I2CPMessageProducer: New period after idle, 579 bytes
I've found myself stumped trying to reorder the flow in order to insure completion is triggered after PersistentDataStore has a chance to store the leaseset, but I'm stumped.
I have formed a solution that mitigates the race by performing a single retry if the first lookup fails.
From: obscuratus <obscuratus@mail.i2p>
Date: Sat, 6 Aug 2022 13:05:41 -0500
Subject: [PATCH] [LookupDestJob] Retry once to prevent race with Leaseset
storage.
When a leaseset is stored for the first time, it can result
in a race between the local storage of the leaseset and the
ability to look the leaseset up.
Provide for a single retry after a one second delay.
Signed-off-by: obscuratus <obscuratus@mail.i2p>
---
router/java/src/net/i2p/router/client/LookupDestJob.java | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/router/java/src/net/i2p/router/client/LookupDestJob.java b/router/java/src/net/i2p/router/client/LookupDestJob.java
index e983c4e60..8be1ad52c 100644
--- a/router/java/src/net/i2p/router/client/LookupDestJob.java
+++ b/router/java/src/net/i2p/router/client/LookupDestJob.java
@@ -199,6 +199,15 @@ class LookupDestJob extends JobImpl {
public String getName() { return "LeaseSet Lookup Reply to Client"; }
public void runJob() {
Destination dest = getContext().netDb().lookupDestinationLocally(_hash);
+ // Attempt a single retry if dest==null to allow
+ // for a possible race with the storing of the leaseset
+ // on new destination.
+ if (dest == null) {
+ try { Thread.sleep(1000); } catch (InterruptedException ie) {}
+ dest = getContext().netDb().lookupDestinationLocally(_hash);
+ if (_log.shouldDebug() && dest != null)
+ _log.debug("Retry required for lookup of destination: " + dest);
+ }
if (dest == null && _blindData != null) {
// TODO store and lookup original hash instead
LeaseSet ls = getContext().netDb().lookupLeaseSetLocally(_hash);
--
2.35.1
When running this patch and the actual I2P network, I see thousands of successful retries, that previously has reported as failures.
This large amount of first-time lookups happens frequently when running I2PSnark with DHT enabled. The DHT table can generate a prodigious amount of first-time lookups.