The tl;dr of this issue is: JSZip cannot write valid ZIP files that are larger than 4GB.
If this is an anticipated limitation, then this issue can be closed immediately (but it should then be pointed out more prominently on the "Limitations" page).
The central directory entries of ZIP files store a "relative offset of local file header". This is a 32-bit value by default. When the offset is larger than what can be represented in 32 bits, this field has to be set to 0xFFFFFFFF
to indicate that the ZIP64 extensions are used (see the Wikipedia article on ZIP64 as an entry point to the relevant specifications).
The following program creates two archives (filled with dummy data). One of them is just below 4 GB; the other is just above 4 GB:
```js
const fs = require("fs");
const JSZip = require("jszip");

function createFile(numFiles, fileSize, fileName) {
  const zip = new JSZip();
  const content = Buffer.alloc(fileSize);
  for (let i = 0; i < numFiles; i++) {
    zip.file("file" + i, content);
  }
  const generateOptions = {
    type: "nodebuffer",
    compression: "STORE",
    streamFiles: true,
  };
  zip
    .generateNodeStream(generateOptions)
    .pipe(fs.createWriteStream(fileName))
    .on("finish", function () {
      console.log(fileName + " written.");
    });
}

const defaultFileSize = 1000 * 1000 * 100;
createFile(40, defaultFileSize, "jsZip_small.zip");
createFile(45, defaultFileSize, "jsZip_large.zip");
```
The `jsZip_small.zip` can be opened as usual, e.g. with the Windows Explorer. The `jsZip_large.zip` cannot be opened. (It can be opened with 7-Zip, for example, but not with some other tools.) The reason is most likely that the information in the central directory is invalid. The following is the information stored in the last central directory entry of that archive (the one for `file44`):
```
indexFileDirectoryEntry:
    headerSignature: 33639248
    versionMadeBy: 20
    versionNeeded: 10
    generalPurposeBitFlag: 8
    compressionMethod: 0
    lastModificationTime: 27148
    lastModificationDate: 21873
    uncompressedCrc32: 557995341
    compressedSize: 100000000
    uncompressedSize: 100000000
    fileNameLength: 6
    extraFieldLength: 0
    fileCommentLength: 0
    fileStartDiskNumber: 0
    internalFileAttributes: 0
    externalFileAttributes: 0
    relativeLocalFileHeaderOffset: 105034982
    fileName: file44
    extraField: java.nio.HeapByteBuffer[pos=0 lim=0 cap=0]
    fileComment:
```
The "relative local file header offset" is 105034982.
Looking at this entry with a tool that can show the "real" offset, one can see that the actual offset is 4400002278. Noticing that 4400002278 = 105034982 + 2^32, this strongly suggests that the wrong value is the result of a plain 32-bit overflow.
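This can be reproduced directly in Node.js. The snippet below (an illustration, not part of JSZip) truncates the true offset to 32 bits and recovers exactly the value found in the central directory entry:

```js
// Illustration: truncating the real offset to 32 bits yields exactly the
// (wrong) value stored in the central directory entry above.
const actualOffset = 4400002278; // where the local file header really is
const storedOffset = actualOffset % 2 ** 32; // simulate a 32-bit overflow
console.log(storedOffset); // 105034982
```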
The solution will likely have to happen somewhere near that line in `ZipFileWorker.js`: it should detect when the offset is larger than what fits in 32 bits, and then use the ZIP64 form of the entry. As very high-level pseudocode:
```js
if (offset > maxValueFor32bit) {
  dirRecord.offset = 0xFFFFFFFF;      // ZIP64 sentinel, per the specification
  dirRecord.extras.setOffset(offset); // set the "real" offset in the ZIP64
                                      // "extras", as an 8-byte value at offset 20
}
```
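To make the "extras" part concrete: what would have to be written is the ZIP64 "extended information" extra field (header ID 0x0001) defined in PKWARE's APPNOTE.TXT. The sketch below builds such a field in Node.js, assuming only the offset has overflowed; the function name and structure are illustrative, not JSZip's internal API:

```js
// Sketch (not JSZip's actual API): build a minimal ZIP64 "extended
// information" extra field that carries only the 8-byte local header offset.
// Per APPNOTE.TXT, the field has header ID 0x0001 and contains 8-byte values
// only for those fields whose 32-bit counterparts are set to 0xFFFFFFFF.
function buildZip64OffsetExtraField(realOffset) {
  const buf = Buffer.alloc(2 + 2 + 8);            // header ID + data size + offset
  buf.writeUInt16LE(0x0001, 0);                   // ZIP64 extra field header ID
  buf.writeUInt16LE(8, 2);                        // data size: one 8-byte value
  buf.writeBigUInt64LE(BigInt(realOffset), 4);    // the real 64-bit offset
  return buf;
}

const extra = buildZip64OffsetExtraField(4400002278);
console.log(extra.toString("hex"));
```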
For comparison, from my local tests: this is the same program implemented with `archiver` (based on `zip-stream`):
```js
const fs = require("fs");
const archiver = require("archiver");

function createFile(numFiles, fileSize, fileName) {
  const outputStream = fs.createWriteStream(fileName);
  const archive = archiver("zip", {
    store: true,
  });
  archive.pipe(outputStream);
  outputStream.on("close", () => {
    console.log(fileName + " written.");
  });
  const content = Buffer.alloc(fileSize);
  for (let i = 0; i < numFiles; i++) {
    archive.append(content, { name: "file" + i });
  }
  archive.finalize();
}

const defaultFileSize = 1000 * 1000 * 100;
createFile(40, defaultFileSize, "archiver_small.zip");
createFile(45, defaultFileSize, "archiver_large.zip");
```
In this case, the "large" file is also valid, and looking at it more closely shows that it uses the proper ZIP64 format for the central directory entries whose offsets lie above the 4 GB limit.
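As a sanity check on generated archives, one can resolve an entry's real offset by honoring the 0xFFFFFFFF sentinel. A sketch (field offsets per the central directory layout in APPNOTE.TXT, assuming only the offset has overflowed; not part of either library):

```js
// Sketch: given a Buffer holding one central directory entry (starting at
// its PK\x01\x02 signature), return the local file header offset, resolving
// the ZIP64 sentinel if present.
function readLocalHeaderOffset(entry) {
  const fileNameLength = entry.readUInt16LE(28);
  const extraFieldLength = entry.readUInt16LE(30);
  const offset32 = entry.readUInt32LE(42);
  if (offset32 !== 0xffffffff) return offset32; // plain 32-bit offset

  // Walk the extra field's ID/size/data records looking for ZIP64 (0x0001).
  let pos = 46 + fileNameLength;
  const end = pos + extraFieldLength;
  while (pos + 4 <= end) {
    const id = entry.readUInt16LE(pos);
    const size = entry.readUInt16LE(pos + 2);
    if (id === 0x0001) {
      // Assumes only the offset overflowed, so the 8-byte offset is the
      // first (and only) value in the ZIP64 record.
      return Number(entry.readBigUInt64LE(pos + 4));
    }
    pos += 4 + size;
  }
  throw new Error("ZIP64 sentinel set, but no ZIP64 extra field found");
}
```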