
Good observations!

Indeed: ZIP files are stream-writable, and some ZIP files are stream-readable, but not both: ZIP files that were stream-written are not stream-readable.

Also, streaming unzipping always requires that, by the time you arrive at the central directory, you delete any so-far-unzipped files that don't have entries in it, as those were "deleted".
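To make the "deleted" case concrete, here is a rough Python sketch (my own illustration, not something from the spec). It concatenates two small archives, then compares what a naive front-to-back scan for local file headers sees with what the central directory lists. The helper names are made up, and the signature scan is deliberately simplistic: it assumes stored (uncompressed) entries and would give false positives if the signature bytes happened to occur inside file data.

```python
import io
import struct
import zipfile

LOCAL_HEADER_SIG = b"PK\x03\x04"
# Fixed 30-byte part of a local file header (little-endian).
LOCAL_HEADER = struct.Struct("<4s5H3I2H")


def make_zip(name, data):
    """Build a one-entry, uncompressed ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


def scan_local_headers(blob):
    """Names a naive streaming reader would encounter, front to back."""
    names = []
    pos = blob.find(LOCAL_HEADER_SIG)
    while pos != -1 and pos + LOCAL_HEADER.size <= len(blob):
        fields = LOCAL_HEADER.unpack_from(blob, pos)
        name_len = fields[9]  # file name length field
        start = pos + LOCAL_HEADER.size
        names.append(blob[start:start + name_len].decode("utf-8"))
        pos = blob.find(LOCAL_HEADER_SIG, pos + 4)
    return names


def central_directory_names(blob):
    """Names listed in the (last) central directory, via the zipfile module."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.namelist()


# "old.txt" only survives as leading bytes; the final central directory
# belongs to the second archive and does not mention it.
combined = make_zip("old.txt", b"old contents") + make_zip("new.txt", b"new contents")

streamed = scan_local_headers(combined)
listed = central_directory_names(combined)
stale = [name for name in streamed if name not in listed]
print(streamed, listed, stale)
```

In this sketch a streaming extractor would already have written old.txt by the time it reaches the central directory, and `stale` is the list of files it would then have to remove.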

> Then in 4.3.6 it describes the file format, which seems to be fundamentally incompatible to altering zip files by appending data, as the resulting file would not conform to this format.

My interpretation of the spec and specifically of 4.3.6 is that it is informational for how ZIP files usually look, and that you may store arbitrary data in between files; such data then doesn't count as "files". This reading then does allow appending and concatenation.

Unfortunately 4.3.6 does not have MUST/MAY wording so we don't really know if this reading was intended by the authors (maybe they will clarify it in the future), but allowing this reading seems rather useful because it permits append-only modification of existing ZIP files. (The wording /'A ZIP file MUST have only one "end of central directory record"'/ somewhat suggests that the authors didn't intend this, but again one could argue that this is not necessary to state, that there is only one EOCDR by definition, and that any previous ones are just garbage data that is to be ignored.)
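For what it's worth, here is a small Python sketch of the "only the last EOCDR counts" reading (again my own illustration; the function names are hypothetical and ZIP64 is ignored). It looks for the last end-of-central-directory signature whose comment length runs exactly to the end of the file, then infers how many opaque bytes precede the archive that record describes.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"
# End of central directory record, fixed 22-byte part (little-endian):
# signature, disk no., CD start disk, entries on disk, total entries,
# CD size, CD offset, comment length.
EOCD = struct.Struct("<4s4H2IH")


def make_zip(names):
    """Build a small uncompressed ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name in names:
            zf.writestr(name, b"data for " + name.encode("utf-8"))
    return buf.getvalue()


def find_last_eocd(blob):
    """Return (position, total_entries, cd_size, cd_offset) of the last EOCDR."""
    pos = blob.rfind(EOCD_SIG)
    while pos != -1:
        if pos + EOCD.size <= len(blob):
            fields = EOCD.unpack_from(blob, pos)
            total_entries, cd_size, cd_offset, comment_len = fields[4:8]
            # Accept only a record whose comment ends exactly at end of file.
            if pos + EOCD.size + comment_len == len(blob):
                return pos, total_entries, cd_size, cd_offset
        if pos == 0:
            break
        pos = blob.rfind(EOCD_SIG, 0, pos)
    raise ValueError("no end of central directory record found")


first = make_zip(["a.txt"])
second = make_zip(["b.txt", "c.txt"])
blob = first + second

eocd_count = blob.count(EOCD_SIG)  # two records are physically present
pos, total_entries, cd_size, cd_offset = find_last_eocd(blob)
# The offsets in the record are relative to the start of the second archive,
# so the gap is the amount of opaque data in front of it.
leading_bytes = pos - cd_size - cd_offset
print(eocd_count, total_entries, leading_bytes == len(first))
```

Under this reading the first archive, including its own EOCDR, is just leading data: the reader only acts on the final record and the two entries it describes.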



> Unfortunately 4.3.6 does not have MUST/MAY wording so we don't really know if this reading was intended by the authors

It's not explicit in the APPNOTE, but that's not the same as saying "we don't really know". We do know: islands of opaque data are allowed, and that's the entire reason the format is designed the way it is. Katz designed ZIP for use on machines with floppy drives, and append-only modification of archives spanning multiple media was therefore baked into the scheme from the very roots. He also produced an implementation that we happen to be able to check, and we know it works that way.

The only way to arrive at another interpretation is to look at APPNOTE in isolation and devoid of context.

> one could argue that this is not necessary to state, that there is only one EOCDR by definition, and that any previous ones are just garbage data that is to be ignored

That is the correct interpretation; the wording doesn't suggest otherwise.



