+------------------------------------------------------------------------------+
|               Elastic, Compressed, Content-Addressed Container               |
|               ................................................               |
|                           File Format Specification                          |
+------------------------------------------------------------------------------+

version: 1.0

1 Introduction
==============

   This section provides a brief introduction to the goals that EC3 is intended
   to fulfill.


   1.1 File Format Purpose and Design Goals
   ----------------------------------------

   The primary goals of the EC3 image format can be found in its name:

      * Elastic: The format should be adaptable and useful in a wide range of
        use cases.

      * Compressed: The format should support compression to reduce file size
        and increase efficiency, without compromising random access to file
        data.

      * Content-Addressed: The format should support data de-duplication to
        further increase storage efficiency.

      * Container: The format should support storing multiple independent
        filesystems.

   At a low level, EC3 is designed to be a format for storing multiple
   independent streams of data in a single image, with support for optional
   features such as encryption and compression.

   On top of this base, EC3 provides facilities for storing multiple whole
   filesystems within an image file. With support for extended attributes,
   a directory (or whole filesystem) can be accurately captured within an
   EC3 image, while compression and cluster-based data de-duplication greatly
   reduce the amount of disk space required.


   1.2 Document Scope
   ------------------

   This document describes the general layout of an EC3 image, and all of the
   data structures contained within. It provides all of the information required
   to read and write fully-featured container images.

   This document does not describe how to implement any software that can
   read or write containers, with the exception of describing any algorithms
   that are used.


   1.3 Terminology
   ---------------

   Several terms have particular meaning in the context of EC3. Those terms
   and their meanings are listed here.


   1.3.1 Image
      An Image is any EC3 file. An Image contains one or more Tags containing
      binary data.

   1.3.2 Tag
      A Tag is a contiguous range of binary data, with an associated type and
      identifier. The type of a Tag determines the format of the data and how
      it should be interpreted, while the identifier can be used to distinguish
      one Tag from another.

   1.3.3 Container
      A Container refers to an EC3 file that contains one or more Volumes. It
      is analogous to a storage device that contains one or more formatted
      partitions. Containers represent a subset of Images: while all Containers
      are Images, not all Images are Containers.

   1.3.4 Volume
      A Volume is a structured collection of logical files and directories
      stored within a Container. It is analogous to a partition of a storage
      device. The data that makes up a Volume is stored across a set of Tags
      within an Image.

   1.3.5 Image Key
      The Image Key is the symmetric cryptographic key used to encrypt and
      decrypt data within an Image.

   1.3.6 Image Certificate
      The Image Certificate is a cryptographic public key and certificate that
      is embedded within an Image, and is used for digital signature
      verification.

   1.3.7 Image Signature
      The Image Signature is the cryptographic signature that is calculated
      from the data stored in the Image, and stored in a dedicated Tag.


2 Overview
==========

   This section provides a general overview of what an EC3 image is, how it
   works, and a preview of some of the internal data structures.


   2.1 What Is An EC3 Image?
   -------------------------

   An EC3 image is a data file that can contain, among other things, a set of
   zero or more logical filesystems, called volumes. Each volume has its own
   distinct tree of directories and files, while the actual file data is shared
   across all volumes within the container.

   An EC3 image is analogous to a traditional disk image containing a logical
   volume management (LVM) partition. Under an LVM partition scheme, a disk
   can have multiple "logical" partitions contained within a single "physical"
   partition. The logical partitions are separate, just like traditional
   partitions, but they all make use of the same contiguous range of sectors on
   the disk. Because of this, resizing partitions within an LVM group is as
   simple as changing the quota of blocks that a particular logical partition
   is allowed to allocate, and doesn't require physically moving any sectors
   around.

   EC3 builds upon this concept by employing cross-volume data de-duplication.
   Every file that is stored within an EC3 image is split into a set of fixed-
   size, content-addressed clusters. The size of these clusters is constant
   within a container. A typical cluster size would be 32KB. So, if two files
   within a container have the same contents, even if those files are in
   different volumes, the files will reference the same range of clusters. Only
   one copy of the file data is stored within the container. Even if the two
   files vary to some degree, as long as at least one cluster's worth of data is
   identical, some data can still be shared between the files.
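
   The cluster sharing described above can be sketched as follows. This is an
   illustration only: the add_file() function and the in-memory dictionary are
   hypothetical names invented for the example, and a short final cluster is
   simply hashed as-is because its handling is not described here. The content
   hash is the Slow Hash from Section 4.2.

```python
import hashlib

CLUSTER_SIZE = 32 * 1024  # 32KB, the typical cluster size mentioned above


def add_file(store, data, cluster_size=CLUSTER_SIZE):
    """Split data into fixed-size clusters and return their content hashes.

    `store` is a hypothetical dict mapping SHA3-256 digest -> cluster bytes.
    A cluster whose digest is already present is not stored a second time.
    """
    refs = []
    for start in range(0, len(data), cluster_size):
        cluster = data[start:start + cluster_size]
        digest = hashlib.sha3_256(cluster).digest()
        store.setdefault(digest, cluster)
        refs.append(digest)
    return refs


store = {}
file_a = b"A" * CLUSTER_SIZE + b"B" * CLUSTER_SIZE
file_b = b"A" * CLUSTER_SIZE + b"C" * CLUSTER_SIZE
refs_a = add_file(store, file_a)
refs_b = add_file(store, file_b)
```

   The two files differ in their second cluster only, so the store ends up
   holding three clusters rather than four.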

   Clusters can also be compressed to further reduce file size. The clustering
   system provides some additional benefits when compression is in use. Seeking
   through a file is more performant, as you don't have to decompress the entire
   file to reach the target offset. You can simply skip to the cluster that
   corresponds to the offset you're looking for. Editing files within a volume
   is also easier as, again, you only have to decompress and re-write the
   cluster that has changed.
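
   Because clusters have a fixed size, finding the cluster that holds a given
   file offset is a single division. The locate() helper below is a
   hypothetical name used only to show the arithmetic, and the 32KB default is
   the typical cluster size mentioned earlier, not a requirement.

```python
def locate(offset, cluster_size=32 * 1024):
    """Return (cluster index, offset within that cluster) for a file offset."""
    return divmod(offset, cluster_size)


index, within = locate(100_000)
```

   Only the cluster at the returned index needs to be decompressed to read or
   modify the byte at that offset.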

   Alongside volumes, EC3 images can contain a range of other data, including:
      * Manifests.
      * Arbitrary binary blobs.
      * Executable files.
      * Digital signatures.
      * Certificates for digital signature verification.

   In contrast to volumes, these other data types are much simpler. An
   application can wrap its own binary data within an EC3 image and
   immediately make use of features like compression, encryption, and digital
   signature verification.


   2.2 Tags: The Core Unit Of Data
   -------------------------------

   At its most basic level, an EC3 image is just a set of one or more tags.
   A tag is a contiguous segment of binary data with an associated type and
   identifier. The contents of a tag can be optionally encrypted and signed.
   With the exception of the image header and tag table, all data contained
   within an EC3 image can be found in a tag. The tag table contains
   information about all of the tags in the image.


3 Types & Units
===============

   This section describes the fundamental data types used within EC3 data
   structures, as well as some of the units used throughout this document.

   3.1 Integral Types
   ------------------

   All integer values are stored in big-endian format. All signed integer values
   are stored in two's-complement format. The following integer types are used:

      Name         Size                    Sign
      -----------------------------------------------
      uint8        8 bits (1 byte)         Unsigned
      uint16       16 bits (2 bytes)       Unsigned
      uint32       32 bits (4 bytes)       Unsigned
      uint64       64 bits (8 bytes)       Unsigned
      int8         8 bits (1 byte)         Signed
      int16        16 bits (2 bytes)       Signed
      int32        32 bits (4 bytes)       Signed
      int64        64 bits (8 bytes)       Signed
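
   As a rough illustration of these types, the sketch below maps each name
   onto a format string for Python's standard struct module. The FORMATS table
   and the encode()/decode() helpers are hypothetical names chosen for the
   example.

```python
import struct

# ">" selects big-endian byte order with standard sizes and no padding.
FORMATS = {
    "uint8": ">B", "uint16": ">H", "uint32": ">I", "uint64": ">Q",
    "int8": ">b", "int16": ">h", "int32": ">i", "int64": ">q",
}


def encode(type_name, value):
    """Encode an integer as the named EC3 integral type."""
    return struct.pack(FORMATS[type_name], value)


def decode(type_name, raw):
    """Decode bytes holding exactly one value of the named type."""
    return struct.unpack(FORMATS[type_name], raw)[0]


encoded = encode("uint32", 0x45433358)  # most significant byte first
minus_two = encode("int16", -2)         # two's-complement representation
```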


   3.2 String Types
   ----------------

   All strings are stored in UTF-8 Unicode format with a trailing null
   terminator byte.
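
   A minimal sketch of reading and writing such strings is given here. The
   helper names are hypothetical, and rejecting embedded null bytes is an
   assumption that follows from using a null terminator rather than a stated
   rule.

```python
def encode_string(text):
    """Encode text as UTF-8 followed by a single null terminator byte."""
    raw = text.encode("utf-8")
    if b"\x00" in raw:
        raise ValueError("embedded null bytes cannot be represented")
    return raw + b"\x00"


def decode_string(buffer, offset=0):
    """Decode the null-terminated string starting at `offset`.

    Returns the decoded text and the offset of the byte after the terminator.
    """
    end = buffer.index(b"\x00", offset)
    return buffer[offset:end].decode("utf-8"), end + 1


# "caf\u00e9" takes five bytes in UTF-8, plus one terminator byte.
packed = encode_string("caf\u00e9") + encode_string("docs")
first, next_offset = decode_string(packed)
second, _ = decode_string(packed, next_offset)
```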


   3.3 Storage Size Units
   ----------------------

   Throughout this document, any reference to kilobytes, megabytes, etc. refers
   to the base-2 units, rather than the base-10 units. For example, 1 kilobyte
   (or 1 KB) is equal to 1024 bytes (rather than 1000 bytes).


4 Algorithms
============

   EC3 uses a range of algorithms. A selection of hashing algorithms is used
   for fast data lookup and for ensuring data integrity.


   4.1 Fast Hash
   -------------

   The Fast Hash algorithm is optimised for hashing string data. It is intended
   for use in string-based hashmaps. The algorithm used for this purpose is
   the Fowler-Noll-Vo FNV-1 hashing algorithm, with a 64-bit digest size.

   The implementation of this algorithm can be found elsewhere, but the integer
   constants used to calculate hashes used by EC3 are provided here:

      * Offset Basis: 0xCBF29CE484222325
      * Prime: 0x100000001B3
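
   For reference, a minimal sketch of 64-bit FNV-1 using these constants is
   given here. The fast_hash() name is hypothetical, and hashing the UTF-8
   bytes of a string without its null terminator is an assumption, as this
   document does not say whether the terminator is included.

```python
FNV_OFFSET_BASIS = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fast_hash(data):
    """64-bit FNV-1: multiply by the prime, then XOR in each byte."""
    digest = FNV_OFFSET_BASIS
    for byte in data:
        digest = (digest * FNV_PRIME) & MASK_64  # keep the result in 64 bits
        digest ^= byte
    return digest


name_hash = fast_hash("readme.txt".encode("utf-8"))
```

   Note the order of operations: FNV-1 multiplies before the XOR, whereas the
   related FNV-1a variant performs the XOR first and produces different
   digests.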


   4.2 Slow Hash
   -------------

   The Slow Hash function is optimised for minimal chance of hash collisions.
   It is intended to generate the content hashes used to uniquely identify data
   clusters. The algorithm used for this purpose is the SHA-3 algorithm with a
   256-bit digest size.


   4.3 Checksum
   ------------

   The Checksum algorithm is used to validate the contents of an EC3 image
   and detect any corruption. The algorithm used for this purpose is the CRC32
   algorithm with a 32-bit digest size.

   Note that it is not intended to defend against intentional modification of an
   image, as this can be easily hidden by re-calculating the checksum. EC3
   provides other features to defend against malicious modifications.
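
   A sketch of computing and checking such a checksum follows. It assumes the
   common CRC-32 variant implemented by zlib, since this document does not name
   a specific polynomial, and the helper names are hypothetical.

```python
import struct
import zlib


def checksum(data):
    """CRC-32 of data as an unsigned 32-bit integer (zlib's variant)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def checksum_bytes(data):
    """The checksum encoded as a big-endian uint32, as in Section 3.1."""
    return struct.pack(">I", checksum(data))


def is_intact(data, expected):
    """Compare freshly computed CRC-32 against a stored value."""
    return checksum(data) == expected


value = checksum(b"123456789")
```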


5 Image Header
==============

   The Image Header can be found at the beginning of every EC3 image file.
   It provides critical information about the rest of the file, including the
   version of the file format that the file uses, and the location and size of
   the tag table. The header also includes two magic numbers:

      * A signature to validate that the file is in fact an EC3 image. This
        must have the value 0x45433358 ('EC3X' in ASCII).
      * An application magic number that is reserved for use by the creator of
        the image.


   5.1 Image Header Layout
   -----------------------

      Offset    Description             Type
      ----------------------------------------
      0x00      Signature               uint32
      0x04      Format Version          uint16
      0x06      Cluster Size            uint16
      0x08      Tag Table Offset        uint64
      0x10      Tag Count               uint64
      0x18      Application Magic       uint64

   5.1.1 Signature
      The Signature is found at the very beginning of the image file. It, like
      all integer types, is stored in big-endian. It always has the value
      0x45433358 (or 'EC3X' in ASCII).

   5.1.2 Format Version
      This specifies which version of the EC3 Image file format
      the rest of the file conforms to. Only the Signature and Format Version
      header items are guaranteed to be the same across all format versions.
      The format version is encoded as a 16-bit integer, with the following
      format:
                           0              1
                           0              6
                           XXXXXXXXYYYYYYYY

      Where X encodes the major version number and Y encodes the minor version
      number. For example, version 3.2 would be encoded as 0x0302.
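
      The encoding can be sketched with two small helpers. The function names
      are hypothetical and exist only to illustrate the bit layout.

```python
def encode_version(major, minor):
    """Pack a major.minor pair into the 16-bit Format Version value."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor


def decode_version(value):
    """Split a 16-bit Format Version value into (major, minor)."""
    return value >> 8, value & 0xFF


encoded = encode_version(3, 2)
```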

   5.1.3 Cluster Size
      This specifies the size of all data clusters stored within the image,
      before any transformation operations such as compression or encryption are
      applied.

      The following cluster size values are defined:

         Header Value      Cluster Size (bytes)      Cluster Size (kilobytes)
         --------------------------------------------------------------------
         0x00              4,096                     4
         0x01              8,192                     8
         0x02              16,384                    16
         0x03              32,768                    32
         0x04              65,536                    64

   5.1.4 Tag Table Offset
      This specifies the offset in bytes from the beginning of the image file
      to the beginning of the tag table.

   5.1.5 Tag Count
      This specifies the number of entries in the tag table.

   5.1.6 Application Magic
      This is an application-defined value. The creator of an EC3 image can
      set this to any arbitrary value. Any generic EC3 manipulation tools should
      preserve the value of this field and, if the tool supports creating EC3
      images, allow the user to specify the value to store in this field.
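
   Putting the layout in Section 5.1 together, a reader might decode the header
   roughly as follows. This is a sketch rather than a reference parser: the
   parse_header() function, the dictionary keys, and the error handling are
   all assumptions made for the example.

```python
import struct

HEADER_FORMAT = ">IHHQQQ"   # big-endian, fields in the order listed in 5.1
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes
SIGNATURE = 0x45433358      # 'EC3X'
CLUSTER_SIZES = {0x00: 4096, 0x01: 8192, 0x02: 16384, 0x03: 32768, 0x04: 65536}


def parse_header(raw):
    """Decode the fixed-size Image Header into a plain dict."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("image is too short to contain a header")
    fields = struct.unpack_from(HEADER_FORMAT, raw)
    signature, version, cluster_code, table_offset, tag_count, magic = fields
    if signature != SIGNATURE:
        raise ValueError("not an EC3 image")
    if cluster_code not in CLUSTER_SIZES:
        raise ValueError("undefined cluster size value")
    return {
        "version": (version >> 8, version & 0xFF),
        "cluster_size": CLUSTER_SIZES[cluster_code],
        "tag_table_offset": table_offset,
        "tag_count": tag_count,
        "application_magic": magic,
    }


# A hand-built header: version 1.0, 32KB clusters, two tags, table at byte 32.
sample = struct.pack(HEADER_FORMAT, SIGNATURE, 0x0100, 0x03, 32, 2, 0)
header = parse_header(sample)
```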


6 Tags
======

   Tags are the fundamental units of data storage in an EC3 image. Every image
   contains one or more tags. A tag is essentially a contiguous range of data
   within an image, with an associated type, identifier, and flags. Various
   data processing layers can be applied to the contents of a tag, such as
   encryption or compression. Every tag within an image can be referenced either
   by its index within the tag table or by an optional 64-bit identifier.


   6.1 The Tag Table
   -----------------

   The Tag Table describes all of the tags in an image. Its location and size
   can be found by parsing the Image Header. The Tag Table consists of a number
   of entries, one for each tag in the image.

   Each entry in the Tag Table has the following layout:

      Offset    Description             Type
      ----------------------------------------
      0x00      Tag Type                uint32
      0x04      Flags                   uint32
      0x08      Checksum                uint32
      0x1C      Reserved                uint32
      0x20      Identifier              uint64
      0x28      Offset                  uint64
      0x30      Size                    uint64
      0x38      Reserved                uint64

   6.1.1 Tag Type
      A 32-bit integer indicating the type of the tag. EC3 defines a range
      of different tag types, which can be found in Section 6.2.

   6.1.2 Flags
      Flags describing certain attributes of a tag, such as whether the tag
      is compressed, encrypted, or signed. The full set of flags can be found
      in Section 6.3.

   6.1.3 Checksum
      A checksum of the tag data, calculated on the raw data as it appears
      on disk, after any Data Filters (compression, encryption, etc.) have
      been applied. This checksum should be checked before the tag data is
      processed any further. The checksum is calculated using the algorithm
      described in Section 4.3.

   6.1.4 Identifier
      An arbitrary 64-bit integer that can be used to identify a tag. Every tag
      within an image must have a unique identifier. The only exception is the
      identifier value 0x00, which indicates that a tag has no identifier and
      can therefore be shared by any number of tags.

   6.1.5 Offset and Size
      The offset from the beginning of the image file to the beginning of the
      tag data, and the length of the tag data. Both values are measured in
      bytes.
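
   A Tag Table entry could be decoded along the following lines. The field
   offsets are copied verbatim from the layout table in Section 6.1; the bytes
   that the table does not describe are skipped. The 64-byte entry size is an
   assumption derived from the last listed field (0x38 plus 8 bytes), and the
   names used are hypothetical.

```python
import struct

# (name, byte offset, struct format) copied from the layout table above.
TAG_FIELDS = [
    ("type", 0x00, ">I"),
    ("flags", 0x04, ">I"),
    ("checksum", 0x08, ">I"),
    ("identifier", 0x20, ">Q"),
    ("offset", 0x28, ">Q"),
    ("size", 0x30, ">Q"),
]
TAG_ENTRY_SIZE = 0x40  # assumption: the last listed field ends at 0x38 + 8


def parse_tag_entry(table, index):
    """Decode the entry at `index` from the raw bytes of the Tag Table."""
    base = index * TAG_ENTRY_SIZE
    if base + TAG_ENTRY_SIZE > len(table):
        raise ValueError("tag index is outside the table")
    return {name: struct.unpack_from(fmt, table, base + off)[0]
            for name, off, fmt in TAG_FIELDS}


# Build a two-entry table by hand to exercise the parser.
table = bytearray(2 * TAG_ENTRY_SIZE)
struct.pack_into(">I", table, TAG_ENTRY_SIZE + 0x04, 0x00000002)  # flags
struct.pack_into(">Q", table, TAG_ENTRY_SIZE + 0x28, 4096)        # offset
struct.pack_into(">Q", table, TAG_ENTRY_SIZE + 0x30, 512)         # size
entry = parse_tag_entry(table, 1)
```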


   6.2 Tag Types
   -------------

   The type of a tag determines the format of the data contained within it.

   6.2.1 VOLU: Volume
      Volume tags contain the filesystem tree and file/directory metadata for a
      single volume within the container.

   6.2.2 CTAB: Cluster Table
      The Cluster Table contains the file data clusters for all volumes within
      the container.

   6.2.3 XATR: Extended Attributes Table
      The Extended Attributes table contains any extended attributes referenced
      by any file or directory stored in any of the volumes in the container.

   6.2.4 STAB: String Table
      The String Table contains all of the strings used as file/directory names
      for all files and directories stored in the container.

   6.2.5 MFST: Manifest
      The manifest is a key-value data store that holds information describing
      the container. Apart from a few required keys, any arbitrary keys and
      values can be stored in the manifest.

   6.2.6 BLOB: Binary Data
      Binary blobs are contiguous buffers of arbitrary binary data. EC3 places
      no requirements on the length or layout of this data, so these tags can
      be used for any application-defined purpose.

   6.2.7 EXEC: Executable
      Executable tags are used to store embedded executable files. For certain
      executable file formats, these tags can also include auxiliary information
      about the executable file to allow readers to load and run the executable
      without having to implement a parser for the executable file format.

   6.2.8 CERT: Digital Certificate
      If any part of the image is digitally signed, it will also contain one or
      more Digital Certificate tags. These tags contain either:

         a) the certificate used to sign the container; or
         b) (optionally) any intermediate certificates needed to link the
            signing certificate back to a trusted root certificate.

   6.2.9 CSIG: Digital Signature
      If any part of the image is digitally signed, this tag contains the actual
      signature data.
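
   The four-letter names above suggest that a Tag Type value is the ASCII code
   of its name, in the same way that the image signature 0x45433358 spells
   'EC3X'. This document does not state that explicitly, so the following is
   only a sketch under that assumption, using hypothetical helper names.

```python
def tag_type_value(code):
    """Assumed encoding: four ASCII characters read as a big-endian uint32."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("tag type codes are four characters long")
    return int.from_bytes(raw, "big")


def tag_type_code(value):
    """Inverse of tag_type_value()."""
    return value.to_bytes(4, "big").decode("ascii")


volume_type = tag_type_value("VOLU")
```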


   6.3 Tag Flags
   -------------

   A Tag can have a number of different flags set. A full list of these flags,
   including their values and meanings, is provided here.

   6.3.1 0x00000001: Signed
      The data in this Tag is included in the Image's digital
      signature.

   6.3.2 0x00000002: Compressed
      The data in this Tag is compressed. Note that, in most cases, this flag
      will not be enabled on the Cluster Table, as each Cluster is compressed
      separately.

   6.3.3 0x00000004: Encrypted
      The data in this Tag is encrypted using the Image Key.
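
   These values are single bits, so they can be combined and tested with
   bitwise operations. One possible representation is sketched here; the
   TagFlags class name is hypothetical.

```python
from enum import IntFlag


class TagFlags(IntFlag):
    SIGNED = 0x00000001
    COMPRESSED = 0x00000002
    ENCRYPTED = 0x00000004


# A Flags field of 0x00000006 means compressed and encrypted, but not signed.
flags = TagFlags(0x00000006)
is_compressed = TagFlags.COMPRESSED in flags
is_signed = TagFlags.SIGNED in flags
```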


   6.4 Tag Identifiers
   -------------------

   Every Tag in an Image must have a unique Identifier. The Identifier is a
   64-bit integer value, which can optionally be interpreted as a string of no
   more than 8 ASCII characters.

   If no Identifier is specified for a Tag, a sequential Identifier should be
   assigned automatically.
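
   One way to interpret an Identifier as text is sketched here. The byte order
   and padding are assumptions: the sketch places the first character in the
   most significant byte and pads short strings with zero bytes on the right,
   neither of which is specified by this document. The helper names are
   hypothetical.

```python
def identifier_from_text(text):
    """Assumed packing: ASCII bytes, zero-padded on the right, big-endian."""
    raw = text.encode("ascii")
    if not 0 < len(raw) <= 8:
        raise ValueError("identifier text must be 1 to 8 ASCII characters")
    return int.from_bytes(raw.ljust(8, b"\x00"), "big")


def identifier_to_text(value):
    """Inverse of identifier_from_text(); strips the zero padding."""
    return value.to_bytes(8, "big").rstrip(b"\x00").decode("ascii")


ident = identifier_from_text("rootfs")
```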


   6.5 Data Filtering
   ------------------

   The different types of processing that can be performed on a Tag's data, such
   as encryption and compression, are called Filters. Filters are applied to a
   Tag's data as it is being written, and are applied in reverse order when the
   data is being read.

   To facilitate multiple Filters being used together, the order in which
   Filters are applied to a particular Tag's data is strictly defined.

   It is critical that Filters are applied in the correct order to maximise
   effectiveness. For example, Tag data must be compressed BEFORE it is
   encrypted. Encrypting data greatly increases its entropy and "randomness",
   making it essentially incompressible.

   The types of Filters supported by EC3 are listed below, in the order they are
   applied when writing data to a Tag. When reading Tag data, the filters are
   applied in the reverse order.

   6.5.1 Compression
      Tag data is compressed before being written to the Image to reduce
      file size. This is the only Filter that changes the amount of data that
      is written to a file.

      Note that this Filter will reduce I/O performance and require that data
      is read sequentially from the Tag. Random access to compressed Tag data
      is not supported.

   6.5.2 Encryption
      Tag data is encrypted using the specified encryption key before being
      written to disk.

   6.5.3 Digital Signature
      Tag data is included in the set of data that makes up the Image's digital
      signature. Unlike the other Filters, this one does not modify the Tag
      data that is written to the Image, but rather specifies that the data is
      included as part of the whole Image's digital signature hash.

      More information about how the Image Signature is calculated and verified
      can be found in Section 12.
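
   The write and read ordering of Filters can be illustrated with the sketch
   below. It is not a description of the actual algorithms: zlib stands in for
   the compression Filter, and a trivial XOR routine stands in for encryption
   purely to show ordering (it offers no security). All function names are
   hypothetical; the flag values are those listed in Section 6.3.

```python
import zlib

COMPRESSED = 0x00000002
ENCRYPTED = 0x00000004


def xor_placeholder(data, key):
    """Stand-in for a real cipher; NOT secure, used only to show ordering."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def write_filters(data, flags, key):
    """Apply Filters in write order: compression first, then encryption."""
    if flags & COMPRESSED:
        data = zlib.compress(data)
    if flags & ENCRYPTED:
        data = xor_placeholder(data, key)
    return data


def read_filters(data, flags, key):
    """Undo the Filters in reverse order: decryption, then decompression."""
    if flags & ENCRYPTED:
        data = xor_placeholder(data, key)
    if flags & COMPRESSED:
        data = zlib.decompress(data)
    return data


payload = b"example tag data " * 64
stored = write_filters(payload, COMPRESSED | ENCRYPTED, b"demo-key")
restored = read_filters(stored, COMPRESSED | ENCRYPTED, b"demo-key")
```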


7 String Table
==============


8 Manifest
==========


9 Volumes
=========

   9.1 Filesystem Tree
   -------------------


   9.2 Clusters
   ------------


   9.3 Extended Attributes
   -----------------------


10 Binary Blobs
===============


11 Embedded Executables
=======================


12 Signature Verification
=========================


13 Encryption
=============


vim: shiftwidth=3 expandtab
