pixelglow edited this page Sep 4, 2014 · 31 revisions

Easy-to-use interface

Archive classes

We use the familiar Foundation idiom of a container class ZZArchive.

ZZArchive keeps an array of ZZArchiveEntry.

You can replace the entries of a ZZArchive with a different array, which then writes them out to disk. This array can contain any combination of existing and new ZZArchiveEntry objects.

Archive entry classes

The ZZArchiveEntry class is a class cluster that serves as a facade for both existing ( ZZOldArchiveEntry ) and new ( ZZNewArchiveEntry ) entries. You can inspect each zip entry field through its corresponding property.

Streaming support

Mac OS X and iOS developers typically use NSData, NSInputStream / NSOutputStream or CGDataProviderRef / CGDataConsumerRef to move data in and out of disk files. We support all of these.

Efficient implementation

Reading

Instead of explicitly reading in portions of the zip file, we make a read-only memory map of the entire content. So when memory runs low, the operating system can just discard parts of the zip file in memory instead of paging it to the swap file (Mac OS X) or running out of usable memory (iOS).

  • Existing deflated entries either use, wrap or just return a ZZInflateInputStream instance that inflates the entry data from the memory map.
  • Existing stored entries either use, wrap or just return the raw entry data from the memory map.
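As a rough illustration of the memory-mapped approach, here is a minimal C sketch using POSIX mmap. It is not zipzap's actual code: the function names map_readonly and find_end_of_central_directory are hypothetical, and the backwards scan for the end-of-central-directory signature is simply the usual way a zip reader locates its starting point.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only.
   Returns NULL on failure, including for empty files. */
const uint8_t *map_readonly(const char *path, size_t *length) {
    assert(path != NULL && length != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    /* The pages are backed by the file itself, so the kernel can drop
       them under memory pressure and fault them back in later. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;
    *length = (size_t)st.st_size;
    return (const uint8_t *)p;
}

/* Hypothetical helper: scan backwards for the end-of-central-directory
   signature "PK\x05\x06". Returns its offset, or -1 if not found. */
long find_end_of_central_directory(const uint8_t *data, size_t length) {
    static const uint8_t signature[4] = {0x50, 0x4b, 0x05, 0x06};
    if (length < 22) return -1; /* the record is at least 22 bytes long */
    for (size_t i = length - 22 + 1; i-- > 0;) {
        if (memcmp(data + i, signature, 4) == 0) return (long)i;
    }
    return -1;
}
```

Only the pages that the scan actually touches need to be resident, which is why mapping the whole file costs little even for large archives.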

Writing

You supply new entry content just-in-time while writing out the zip file, instead of all-at-once before writing. This technique uses blocks that are called back while writing to the open zip file handle. In particular, you can use streaming APIs like NSStream and CGDataConsumerRef to save even more memory.

  • New deflated entries either use, wrap or just provide a ZZDeflateOutputStream instance to your callback that deflates to the open zip file handle.
  • New stored entries either wrap or just provide a ZZStoreOutputStream instance to your callback that stores to the open zip file handle.
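The just-in-time idea can be sketched in plain C with function pointers standing in for blocks. This is only an illustration under stated assumptions: struct new_entry, content_writer, write_entries and write_string are hypothetical names, no zip headers are emitted, and nothing here is zipzap's API. The point is that each entry's content is produced only when the writer reaches it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: writes one entry's content to the open
   file and returns 0 on success. */
typedef int (*content_writer)(FILE *out, void *context);

/* Hypothetical description of a new entry: a name plus a callback,
   rather than a buffer holding the content up front. */
struct new_entry {
    const char *file_name;
    content_writer write_content;
    void *context;
};

/* Calls each entry's callback in turn while the output file is open.
   Returns the total number of content bytes written, or -1 on error. */
long write_entries(FILE *out, const struct new_entry *entries, size_t count) {
    assert(out != NULL);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        long start = ftell(out);
        if (start < 0) return -1;
        /* A real zip writer would emit the local file header here. */
        if (entries[i].write_content(out, entries[i].context) != 0) return -1;
        long end = ftell(out);
        if (end < start) return -1;
        /* Only now is the entry's size known. */
        total += end - start;
    }
    return total;
}

/* Example callback: writes a NUL-terminated string passed as context. */
int write_string(FILE *out, void *context) {
    const char *text = context;
    size_t n = strlen(text);
    return fwrite(text, 1, n, out) == n ? 0 : -1;
}
```

Because the writer never holds more than the one entry currently being produced, peak memory depends on the callback rather than on the size of the archive.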

Rewriting

If you are updating an existing zip file, we skip reading and writing any initial unchanged entries. This reduces disk thrashing for document formats that keep large unchanging files toward the zip file start, and small changing files toward the zip file end.

For example, an HTML editor could keep a self-contained website's images at zip file start and its actual HTML at zip file end, and only have to pay for writing out the HTML when that changes.

If the entries at the end are compressed, the zip file write may end up faster than writing the entries as separate files.
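The skipping logic amounts to finding the longest common prefix of the old and new entry arrays. A minimal C sketch, assuming each entry can be identified by a hypothetical integer ID (how zipzap itself compares entries is not described here):

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many leading entries are identical in both arrays.
   Entries before this index can stay on disk untouched; writing
   resumes at the first entry that differs. */
size_t unchanged_prefix_length(const unsigned *old_ids, size_t old_count,
                               const unsigned *new_ids, size_t new_count) {
    assert(old_ids != NULL || old_count == 0);
    assert(new_ids != NULL || new_count == 0);
    size_t limit = old_count < new_count ? old_count : new_count;
    size_t i = 0;
    while (i < limit && old_ids[i] == new_ids[i]) i++;
    return i;
}
```

In the HTML editor example, two image entries followed by a changed HTML entry give a prefix length of 2, so only the final entry and the central directory after it need rewriting.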

File format compatibility

zipzap closely follows the zip file format specification. In particular, we support reading version 1.0 and above. However, any data descriptors present need to be prefaced with a signature. We currently don't support Zip64 or encrypted zip files (but we could, in the future).
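To make the data descriptor requirement concrete, here is a hedged C sketch of recognizing a signed, non-Zip64 data descriptor. The signature value 0x08074b50 and the field order come from the zip specification; the struct and function names are hypothetical and not part of zipzap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a non-Zip64 data descriptor, in file order after the signature. */
struct data_descriptor {
    uint32_t crc32;
    uint32_t compressed_size;
    uint32_t uncompressed_size;
};

/* Reads a 32-bit little-endian value, as used throughout the zip format. */
uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 1 and fills *out if a signed data descriptor starts at p,
   otherwise returns 0. An unsigned descriptor is rejected, since it
   cannot be told apart from arbitrary data by inspection alone. */
int parse_signed_data_descriptor(const uint8_t *p, size_t remaining,
                                 struct data_descriptor *out) {
    assert(p != NULL && out != NULL);
    if (remaining < 16) return 0;              /* signature + three fields */
    if (read_le32(p) != 0x08074b50u) return 0; /* bytes "PK\x07\x08" */
    out->crc32 = read_le32(p + 4);
    out->compressed_size = read_le32(p + 8);
    out->uncompressed_size = read_le32(p + 12);
    return 1;
}
```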

zipzap writes version 3.0 zip files with UNIX file compatibility, just like the Mac OS X GUI and CLI zip tools do.
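For reference, "version 3.0 with UNIX file compatibility" corresponds to the two-byte "version made by" field in the zip specification: the upper byte is the host system (3 for UNIX) and the lower byte is the specification version times ten. A small C sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

enum { ZIP_HOST_UNIX = 3 }; /* host system code for UNIX in the zip spec */

/* Packs host system and specification version into "version made by". */
uint16_t make_version_made_by(unsigned host_system, unsigned major, unsigned minor) {
    assert(host_system <= 0xff && minor <= 9 && major * 10 + minor <= 0xff);
    return (uint16_t)((host_system << 8) | (major * 10 + minor));
}

/* Unpacks the individual parts again. */
unsigned version_host_system(uint16_t v) { return v >> 8; }
unsigned version_major(uint16_t v) { return (v & 0xff) / 10; }
unsigned version_minor(uint16_t v) { return (v & 0xff) % 10; }
```

With these helpers, make_version_made_by(ZIP_HOST_UNIX, 3, 0) yields 0x031E.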

zipzap features an extensive unit test suite that tests the library against the CLI zip, unzip and zipinfo tools.

Alternatives

File formats

  • document package. Both Mac OS X and iOS support this format well with an extensive Objective-C API e.g. NSFileWrapper. Easy to casually inspect. Since a package appears as a directory on other platforms, it may be harder to treat as one unit e.g. iTunes file sharing, email attachments. No compression. When reading or writing, can process individual parts. Can stream large files.
  • binary plist. Both Mac OS X and iOS support this format well with an extensive Objective-C API. Somewhat easy to casually inspect. Little support on other platforms. No compression. When reading or writing, need to process entire file. Cannot stream large files.
  • sqlite. Both Mac OS X and iOS support this format, but with a C API. Somewhat hard to casually inspect. Some support on other platforms. Not a good fit for heterogeneous data with a flat organization. No compression. When reading or writing, need only process relevant parts. Cannot stream large files.
  • CoreData (sqlite store). Both Mac OS X and iOS support this format with an extensive Objective-C API. Hard to inspect. Little support on other platforms. Not a good fit for heterogeneous data with a flat organization. No compression. When reading or writing, need only process relevant parts. Cannot stream large files.

Libraries

  • minizip. Part of the zlib distribution, very popular. C API. Good support for Zip64. Browsing entries is through first/next iterating functions. When rewriting, only supports appending; no inserting, replacing or deleting. Can stream large files.
  • ZipKit. Objective-C API. Good support for Zip64. Many public classes to support file vs. data archives, extraction to files/data, resource forks etc. Uses multiple threads and notifications. When rewriting, only supports appending; no inserting, replacing or deleting. Cannot stream large files. No test suite.
  • objective-zip. Objective-C API, wrapper around minizip. Few public classes. Browsing entries is through first/next iterating functions. When rewriting, only supports appending; no inserting, replacing or deleting. Can stream large files through non-NSStream streams. GUI test suite.
  • ZipFile. Objective-C API, wrapper around minizip. One public class. Cannot write or rewrite. Cannot stream large files. No test suite.