[Bug 57342] Writing very large file via SXSSF leads to corrupt file

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=57342

--- Comment #23 from [hidden email] ---
I did an in-depth analysis of this issue. It turns out the problem is not with
the OOXML data generated by POI but with the ZIP format, specifically the ZIP64
extension. That's why everything is OK until sheet1.xml grows past 4GB
(uncompressed).
I have all the details written up in a blog post:
https://rzymek.github.io/post/excel-zip64/
Short story: Excel will want to repair the file if the uncompressed size of a
zip entry exceeds 4GB and the entry's Local File Header (LFH) does not specify
zip spec version 4.5.
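
To check a generated file for this condition, the "version needed to extract" field of the first LFH can be read directly. The following is a minimal sketch using only the JDK; the class and method names (LfhInspector, versionNeeded, isZip64Version) are made up for illustration and are not part of POI or commons-compress. It assumes the file starts with an LFH, which is the usual layout for an .xlsx, and relies on the standard LFH layout: a 4-byte signature 0x04034b50 followed by the 2-byte version field at offset 4, where a value of 45 means spec version 4.5.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of POI or commons-compress.
class LfhInspector {
    static final int LFH_SIGNATURE = 0x04034b50; // "PK\3\4", stored little-endian
    static final int LFH_FIXED_SIZE = 30;        // fixed part of a Local File Header

    /** Returns the 16-bit "version needed to extract" field (45 means spec 4.5). */
    static int versionNeeded(byte[] lfh) {
        if (lfh.length < LFH_FIXED_SIZE) {
            throw new IllegalArgumentException("need at least 30 bytes of LFH");
        }
        ByteBuffer buf = ByteBuffer.wrap(lfh).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != LFH_SIGNATURE) {
            throw new IllegalArgumentException("not a Local File Header");
        }
        return buf.getShort(4) & 0xFFFF;
    }

    /** True if the header declares spec version 4.5 or later. */
    static boolean isZip64Version(byte[] lfh) {
        return versionNeeded(lfh) >= 45;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LfhInspector <file.xlsx>");
            return;
        }
        byte[] head = new byte[LFH_FIXED_SIZE];
        try (DataInputStream in =
                 new DataInputStream(Files.newInputStream(Paths.get(args[0])))) {
            in.readFully(head); // throws EOFException if the file is too short
        }
        System.out.println("version needed to extract: " + versionNeeded(head));
    }
}
```

This only looks at the first entry; checking sheet1.xml specifically would mean walking the entries until its header is found.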

A minimal fix would go like this:
1. Switch the commons-compress ZipArchiveOutputStream to Zip64Mode.Always
(Apache POI uses commons-compress, not java.util.zip).
2. Modify commons-compress to put 0 in the 32-bit size fields when the size is
not known, that is, when the LFH is created in streaming mode. Currently, with
Zip64Mode.Always and streaming zip creation, commons-compress stores
FF FF FF FF in the 32-bit size field and 00 00 00 00 00 00 00 00 in the 64-bit
field of the LFH. Excel expects 00 00 00 00 in the 32-bit size field; this
applies only to the LFH.
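
To make the two byte layouts from step 2 concrete, here is a rough sketch that builds an LFH by hand with the JDK only. It is not commons-compress code; Lfh64Sketch and localFileHeader are hypothetical names. It assumes that both 32-bit size fields (compressed and uncompressed) get the same placeholder and that the ZIP64 extra field with zeroed 64-bit sizes is kept in both variants; neither detail is confirmed above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the LFH bytes, not the commons-compress implementation.
class Lfh64Sketch {
    /**
     * Builds a streaming-mode LFH with version 4.5 and a ZIP64 extra field.
     * zeroSizes = false: 32-bit size fields hold FF FF FF FF (behaviour described above).
     * zeroSizes = true:  32-bit size fields hold 00 00 00 00 (what Excel is said to expect).
     */
    static byte[] localFileHeader(String name, boolean zeroSizes) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        int sizePlaceholder = zeroSizes ? 0 : 0xFFFFFFFF;
        int extraLen = 4 + 16; // extra field header + two 64-bit sizes

        ByteBuffer buf = ByteBuffer.allocate(30 + nameBytes.length + extraLen)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0x04034b50);       // LFH signature
        buf.putShort((short) 45);     // version needed to extract: 4.5
        buf.putShort((short) 0x0808); // flags: bit 3 (data descriptor), bit 11 (UTF-8 names)
        buf.putShort((short) 8);      // compression method: deflate
        buf.putShort((short) 0);      // last mod time
        buf.putShort((short) 0x21);   // last mod date: 1980-01-01
        buf.putInt(0);                // CRC-32, unknown while streaming
        buf.putInt(sizePlaceholder);  // 32-bit compressed size
        buf.putInt(sizePlaceholder);  // 32-bit uncompressed size
        buf.putShort((short) nameBytes.length);
        buf.putShort((short) extraLen);
        buf.put(nameBytes);
        buf.putShort((short) 0x0001); // ZIP64 extended information extra field ID
        buf.putShort((short) 16);     // extra field data size
        buf.putLong(0L);              // 64-bit uncompressed size, unknown
        buf.putLong(0L);              // 64-bit compressed size, unknown
        return buf.array();
    }
}
```

In both variants the real sizes follow the entry data in the data descriptor; the only difference is the eight bytes at offsets 18 to 25 of the header.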
