Torrents wrecked by inconsistent handling of "unsafe" characters
Opened 3 years ago
Last modified 3 years ago
#2304 new defect
Reported by: jogger
Owned by: zzz
Priority: minor
Milestone: undecided
Component: apps/i2psnark
Version: 0.9.36
Keywords:
Cc:
Parent Tickets:
Sensitive: no
Description
The bug surfaces with filenames written by some Mac applications that contain characters whose bytes have the high bit set (non-ASCII characters). Example: hexdump "EFBC8F", which is the UTF-8 encoding of U+FF0F (FULLWIDTH SOLIDUS) and is displayed as " / ". Many such sequences exist.
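To make the example sequence concrete, here is a minimal Python sketch that decodes the three bytes from the hexdump. It only illustrates the encoding; it is not taken from i2psnark (which is written in Java), and the variable names are chosen for this illustration.

```python
import unicodedata

# The three bytes from the hexdump in this report.
raw = bytes.fromhex("EFBC8F")

# Decoded as UTF-8 they form a single character, not three.
ch = raw.decode("utf-8")

print(len(raw), "bytes ->", len(ch), "character")
print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Every byte of the sequence has its high bit set, so a filter that
# treats such bytes as "unsafe" will touch the whole sequence.
print(all(b & 0x80 for b in raw))
```

Because the character is not the ASCII slash (0x2F), it is a legal part of a file name on both Linux and Mac.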
The torrent is created correctly, with these characters preserved inside the torrent file, on Linux and Mac with Java 9 and 10.
The torrent downloads with filenames unchanged on Linux with Java 9 and 10. The downloaded torrent checks clean when moved to another instance on Linux or after a crash. The same behaviour was observed on Mac with 0.9.35 and Java 9.
On Mac with 0.9.36 and Java 10, the above sequence is changed to a single underscore. Torrents do not check clean after a crash, or when moved in after being downloaded on Linux. As a consequence, one cannot be sure that a downloaded torrent can be seeded later or on a different machine.
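The failure mode described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not i2psnark's actual code: `sanitize_inconsistent` is a hypothetical filter that replaces each non-ASCII character with an underscore, and the file name is invented for the example.

```python
def sanitize_inconsistent(name):
    """Hypothetical filter: replace every non-ASCII character with '_'."""
    return "".join(c if ord(c) < 0x80 else "_" for c in name)


def can_locate(expected_name, directory_listing):
    """Return True if the name the client expects exists on disk."""
    return expected_name in directory_listing


# Invented example name containing U+FF0F, as stored in the torrent file.
name_in_torrent = "Album\uff0fTrack 01.flac"

# An instance that keeps the name unchanged writes it as-is.
files_from_linux = {name_in_torrent}

# An instance that filters the name looks for a different file.
expected_on_mac = sanitize_inconsistent(name_in_torrent)

print(expected_on_mac)                                # Album_Track 01.flac
print(can_locate(name_in_torrent, files_from_linux))  # True
print(can_locate(expected_on_mac, files_from_linux))  # False
```

The data is intact in both cases; only the name lookup differs, which is why the same files check clean on one instance and not on another.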
Note about the standard for testing this kind of issue:
Kernighan & Pike said as much in The Practice of Programming, Chapter 6 (Testing), §6.5 Stress Tests:
When Steve Bourne was writing his Unix shell (which came to be known as the Bourne shell), he made a directory of 254 files with one-character names, one for each byte value except '\0' and slash, the two characters that cannot appear in Unix file names. He used that directory for all manner of tests of pattern-matching and tokenization. (The test directory was of course created by a program.) For years afterwards, that directory was the bane of file-tree-walking programs; it tested them to destruction.
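A test directory in the spirit of the one described in the quote can be generated with a short Python sketch. The function name `make_stress_dir` and the temporary directory prefix are assumptions made for this example. Not every name can be created on every system: "." already refers to the directory itself, and some filesystems reject names that are not valid UTF-8, so the sketch records names it could not create instead of failing.

```python
import os
import tempfile


def make_stress_dir(root):
    """Try to create one empty file per single-byte name under root.

    root must be a bytes path. Returns (created, skipped) lists of names.
    """
    created, skipped = [], []
    for value in range(1, 256):      # every byte value except '\0'
        if value == 0x2F:            # ... and except slash
            continue
        name = bytes([value])
        try:
            with open(os.path.join(root, name), "wb"):
                pass
            created.append(name)
        except OSError:
            # For example "." (the directory itself), or a name the
            # filesystem refuses to store.
            skipped.append(name)
    return created, skipped


root = os.fsencode(tempfile.mkdtemp(prefix="stress-"))
created, skipped = make_stress_dir(root)
print(len(created), "created,", len(skipped), "skipped")
```

Pointing a torrent client at such a directory, creating a torrent from it, and re-checking the result on each platform would exercise the name handling this ticket is about.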