#1903 describes a way to reproduce, where the per-torrent config files under i2psnark.config.d are deleted. Those config files contain state information including completion and whether to remap filenames to "safe" characters. When the files are deleted, we forget that information. Torrents will be rescanned but we dont guess what the file mapping will be in the absence of a config file.
I don't understand why the config files would be deleted or corrupted on shutdown, I'll try to come up with a theory.
Please provide the info requested over in the forum thread: quote:
Please provide the version information from the top of /logs in the console, and the number of torrents you have running.
Installed new i2p machine, 0% participating traffic, pointed Snark at data dir with 72 torrents already successfully created and seeded on a different machine.
On first start all torrents were scanned. Files with "unsafe" characters were recreated and redownloaded as described in the forum. Torrents without "unsafe" characters were checked 100% complete.
First restart after some hours: All torrents rescanned - obviously all config files corrupted.
Second restart after some more hours: 19 torrents directly loaded, rest rescanned - obviously most config files corrupted.
Third restart after some more hours: 7 torrents directly loaded, rest rescanned - obviously most config files corrupted.
This has been seen on all my i2p machines on wheezy, jessie, stretch and macOS with 4 different HW platforms, all with journaled file systems. Others also have this problem as there are often leechers appearing being 80 - 95% complete on torrents that have been inactive for weeks. This must be previous seeders exposed to the same problem.
Conclusion 1: The "safe" file name mapping causes leechers to create file names different from the original torrent. If the config file of the original torrent gets corrupted, it can only properly be recreated by recreating the .torrent file. Recreating by rescan loses the original names. Possible solution: "Unsafe" file names get renamed before torrent creation.
Conclusion 2: The config file corruption problem caused by crash or by restart without participating tunnels really exists on a variety of platforms and i2p versions. Possible solution: If a torrent is rescanned, check file content against the torrent file, do not trust the config file. Introduce a way of integrity checking for the config files themselves.
I2P version: 0.9.28-0
Java version: Oracle Corporation 1.8.0_25 (Java™ SE Runtime Environment 1.8.0_25-b17)
Wrapper version: 3.5.30
Server version: 8.1.21.v20160908
Servlet version: Jasper JSP 2.1 Engine
JSTL version: standard-taglib 1.2.0
Platform: Mac OS X x86_64 10.12.2
Jcpuid version: 0
Processor: uninitialized (unrecognized)
Jbigi: Native BigInteger? library jbigi not loaded - using pure Java - poor performance may result - see http://i2p-projekt.i2p/jbigi for help
I think I was confused by your use of the word "corrupted". If a torrent gets rescanned in this scenario, it's probably not that the config file was corrupted or deleted - it probably just didn't get updated (i.e. rewritten) on shutdown. Since the torrent data will then be more recent than the config file, the torrent will be rescanned at the next startup.
In addition, the config file contains the setting on mapping of unsafe characters. Without that file, we make some assumption. We set mapping if we downloaded the torrent; we don't map if we created the torrent. If the torrent just appears in the directory, that's where we don't know what to do. The code may be able to handle this better, not sure. In any case, it's unrelated to the shutdown issue, and I may make a separate ticket for it.
I'll be looking at ways to speed up and parallelize the shutdown, and wait longer for it to complete.
In 4f859c6acce391a4c0b3bee3bbe96cded077c60f to be 0.9.28-5:
Parallelize the context shutdown tasks, so the jetty shutdown starts sooner
Increase max time for shutdown tasks from 10 sec to 60 sec (120 on ARM)
Change 20 ms delay between unannounces to every 8th unannounce
Even with the previous 20 ms delay in the loop, we should have been able to get almost 500 torrents to shut down cleanly (maybe 450… I think we have Jetty configured to wait a second), so I still don't fully understand what's going on. With this change we should be able to do a clean shutdown with several thousand torrents.
The only thing that's threaded is the unannounces themselves, so even if some of those aren't completed before shutdown, it shouldn't affect the config file.
The torrents are all closing quickly for me, perhaps though it takes longer if a lot of peers are connected.
Opening ticket #1938 for the unsafe character mapping issue.
Logs after change:
xx:50:21.519 WARN [le Jetty-122] net.i2p.router.Router : Router state change from RUNNING to GRACEFUL_SHUTDOWNxx:50:21.519 CRIT [ShutdownHook] net.i2p.router.Router : Graceful shutdown progress: No more tunnels, starting final shutdownxx:50:23.519 WARN [ShutdownHook] net.i2p.router.Router : Router state change from GRACEFUL_SHUTDOWN to FINAL_SHUTDOWN_1xx:50:23.519 WARN [ShutdownHook] net.i2p.router.Router : Router state change from FINAL_SHUTDOWN_1 to FINAL_SHUTDOWN_2xx:50:23.519 CRIT [ShutdownHook] net.i2p.router.Router : Starting final shutdown(2)xx:50:23.521 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.data.SDSCache$Shutdownxx:50:23.521 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.data.SDSCache$Shutdownxx:50:23.521 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.data.SDSCache$Shutdownxx:50:23.522 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.522 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.util.SimpleTimer2$Shutdownxx:50:23.522 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.time.RouterTimestamper$Shutdownxx:50:23.522 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.524 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.525 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.web.ConfigServiceHandler$UpdateWrapperOrRekeyTaskxx:50:23.525 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.client.naming.BlockfileNamingService$Shutdownxx:50:23.525 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.525 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.web.StatSummarizer$Shutdownxx:50:23.533 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.startup.RouterAppManager$Shutdownxx:50:23.533 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.533 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.534 WARN [ShutdownHook] net.i2p.router.Router : Running shutdown task class net.i2p.util.SimpleTimer2$Shutdownxx:50:23.535 WARN [ger$Shutdown] outer.startup.RouterAppManager: Shutting down client Router Consolexx:50:23.535 INFO [ger$Shutdown] outer.startup.RouterAppManager: Client Router Console is now STOPPINGxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.data.SDSCache$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.data.SDSCache$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.data.SDSCache$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.util.SimpleTimer2$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.time.RouterTimestamper$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.web.ConfigServiceHandler$UpdateWrapperOrRekeyTaskxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.client.naming.BlockfileNamingService$Shutdownxx:50:23.536 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:23.570 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.web.StatSummarizer$Shutdownxx:50:24.563 WARN [ger$Shutdown] org.klomp.snark.SnarkManager : Snark stop() beginjava.lang.Exception: I did it at org.klomp.snark.SnarkManager.stop(SnarkManager.java:307) at org.klomp.snark.web.I2PSnarkServlet.destroy(I2PSnarkServlet.java:102) at org.eclipse.jetty.servlet.ServletHolder.destroyInstance(ServletHolder.java:395) at org.eclipse.jetty.servlet.ServletHolder.doStop(ServletHolder.java:368) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at org.eclipse.jetty.servlet.ServletHandler.doStop(ServletHandler.java:240) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107) at org.eclipse.jetty.security.SecurityHandler.doStop(SecurityHandler.java:392) at org.eclipse.jetty.security.ConstraintSecurityHandler.doStop(ConstraintSecurityHandler.java:464) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107) at org.eclipse.jetty.server.session.SessionHandler.doStop(SessionHandler.java:136) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107) at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:815) at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:160) at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:518) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at org.eclipse.jetty.server.handler.HandlerCollection.doStop(HandlerCollection.java:250) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at org.eclipse.jetty.server.handler.HandlerCollection.doStop(HandlerCollection.java:250) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107) at org.eclipse.jetty.server.Server.doStop(Server.java:343) at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) at net.i2p.router.web.RouterConsoleRunner.shutdown(RouterConsoleRunner.java:233) at net.i2p.router.startup.RouterAppManager.shutdown(RouterAppManager.java:190) at net.i2p.router.startup.RouterAppManager$Shutdown.run(RouterAppManager.java:201) at java.lang.Thread.run(Thread.java:745) at net.i2p.util.I2PThread.run(I2PThread.java:103)xx:50:24.566 WARN [ger$Shutdown] org.klomp.snark.SnarkManager : SnarkManager final shutdownxx:50:30.485 WARN [ger$Shutdown] org.klomp.snark.SnarkManager : Snark stop() endxx:50:30.544 INFO [ger$Shutdown] outer.startup.RouterAppManager: Client Router Console is now STOPPEDxx:50:30.544 WARN [ger$Shutdown] outer.startup.RouterAppManager: Shutting down client i2ptunnelxx:50:30.544 INFO [ger$Shutdown] outer.startup.RouterAppManager: Client i2ptunnel is now STOPPINGxx:50:30.545 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.startup.RouterAppManager$Shutdownxx:50:30.545 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:30.545 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdownxx:50:30.545 INFO [ShutdownHook] net.i2p.router.Router : Shutdown task complete: Shutdown task net.i2p.util.SimpleTimer2$Shutdownxx:50:35.196 WARN [ShutdownHook] net.i2p.router.Router : Router state change from FINAL_SHUTDOWN_2 to FINAL_SHUTDOWN_3xx:50:35.197 CRIT [ShutdownHook] net.i2p.router.Router : Shutdown(2) complete
However you are clearly describing that a clean shutdown is necessary to avoid torrent rescans at startup. That corresponds to my reports that this bug produces the same symptoms as a machine crash. I propose a real fix for the issue that will survive a "kill -9" by
either writing out the config files frequently with every change
or separating the config files into two (metadata and status) with one that is never touched once the torrent is complete and thus allows a clean reload after a restart.
Thanks for your comments. Unfortunately, if you think about it, it's impossible to guarantee a consistent, clean state after a kill -9. If you stop code at an arbitrary time, you're going to get arbitrary results. The best we can do is recognize that state on restart and do a rescan. With certain changes such as rewriting the config file after every change, we might make it better… or could we make it worse, by increasing the chance that the config file partially written? I'll think about possible further improvements but no-rescan-ever isn't possible.
The fix described above does not work in all cases.
I did a graceful restart upgrading from 0.9.29-3 to 4 and all config files were dumped out successfully at that time.
Subsequently, config files were written upon creation, deletion and completion of torrents.
I then did a graceful restart in order to upgrade from from 0.9.29-4 to 5 (seeding 99 torrents at that time). Not a single config file was written out. I lost the progress data at least.
Again no config files written upon a graceful restart updating to 0.9.29-6.
Webapp shutdown was broken by Jetty 9 (0.9.29-4) as reported in comment 8. Good catch!
After that was fixed, all config files for running torrents were being written at shutdown, whether changed or not. This was broken in late 2015, possibly as a temporary thing. Fixed to only write the necessary config files, which will speed up the shutdown. Only write when the storage (i.e. the bitfield) changed, or we uploaded something, so the upload byte count changed.
Also, not directly related to OP's issue, since he's on a Mac, but writing the config files is very slow on Raspberry Pi, made a change in DataHelper? to not sync the file system, which speeds it up by about 2x. I did tests on my macbook, and syncs are very fast, shouldn't be an issue.
All in 0.9.29-14 b2f40838a26c79ba642a60232f1505588783cc12
"gets lost" as in the config file disappears completely, or is overwritten with a fresh config, or is only partially stored, or some particular field is gone, or?
"gets lost" as in the config file disappears completely, or is overwritten with a fresh config, or is only partially stored, or some particular field is gone, or?
I haven't been monitoring the config files, but I'm assuming the config is overwritten with a new file. When metadata disappears, it doesn't disappear for all torrents at the same time.. sometimes the metadata will disappear for a couple of torrents, leaving the metadata intact for a several others. When the metadata does disappear, everything disappears including any comments associated with the torrent.