# Encrypted Archive Binary Format Specification

**Version:** 1.1
**Date:** 2026-02-26
**Status:** Normative

---

## Table of Contents

1. [Overview and Design Goals](#1-overview-and-design-goals)
2. [Notation Conventions](#2-notation-conventions)
3. [Archive Structure Diagram](#3-archive-structure-diagram)
4. [Archive Header Definition](#4-archive-header-definition)
5. [Table of Contents (TOC) Entry Definition](#5-table-of-contents-toc-entry-definition)
6. [Data Block Layout](#6-data-block-layout)
7. [Encryption and Authentication Details](#7-encryption-and-authentication-details)
8. [Compression Details](#8-compression-details)
9. [Obfuscation Features](#9-obfuscation-features)
10. [Decode Order of Operations](#10-decode-order-of-operations)
11. [Version Compatibility Rules](#11-version-compatibility-rules)
12. [Worked Example](#12-worked-example)
13. [Appendix: Shell Decoder Reference](#13-appendix-shell-decoder-reference)

---

## 1. Overview and Design Goals

This document specifies the binary format for `encrypted_archive` -- a custom archive container designed to be **unrecognizable by standard tools**. Standard utilities (`file`, `binwalk`, `7z`, `tar`, `unzip`) must not be able to identify or extract the contents of an archive produced in this format.

### Target Decoders

Three independent implementations will build against this specification:

1. **Rust CLI archiver** (`encrypted_archive pack`/`unpack`) -- the reference encoder and primary decoder, runs on Linux/macOS.
2. **Kotlin Android decoder** -- runs on Android 13 (Qualcomm SoC) using only `javax.crypto` and `java.util.zip`. Primary extraction path on the target device.
3. **Busybox shell decoder** -- a fallback shell script using only standard busybox commands: `dd`, `xxd`, `openssl`, `gunzip`, and `sh`. Must work without external dependencies.

### Core Constraint

The shell decoder must be able to parse the archive format using `dd` (for byte extraction), `xxd` (for hex conversion), and `openssl enc` (for AES-CBC decryption with raw key mode: `-K`/`-iv`/`-nosalt`). This constraint drives several design choices:

- Fixed-size header at a known offset (no variable-length preamble before the TOC pointer)
- Absolute offsets (no relative offset chains that require cumulative addition)
- IVs stored in the file table, not embedded in data blocks (single `dd` call per extraction)
- Little-endian integers (native byte order on ARM and x86)

---

## 2. Notation Conventions

| Convention | Meaning |
|------------|---------|
| **LE** | Little-endian byte order |
| **u8** | Unsigned 8-bit integer (1 byte) |
| **u16** | Unsigned 16-bit integer (2 bytes) |
| **u32** | Unsigned 32-bit integer (4 bytes) |
| **bytes** | Raw byte sequence (no endianness) |
| Offset `0xNN` | Absolute byte offset from archive byte 0 |
| Size | Always in bytes unless stated otherwise |
| `\|\|` | Concatenation of byte sequences |

- All multi-byte integers are **little-endian (LE)**.
- All sizes are in **bytes** unless stated otherwise.
- All offsets are **absolute** from archive byte 0 (the first byte of the file).
- Entry names are **UTF-8 encoded** relative paths using `/` as the path separator (e.g., `dir/subdir/file.txt`). Names MUST NOT start with `/` or contain `..` components. For top-level files, the name is just the filename (e.g., `readme.txt`). Names are length-prefixed with a u16 byte count (NOT null-terminated).
- Reserved fields are **zero-filled** and MUST be written as `0x00` bytes.

---

## 3. Archive Structure Diagram

```
+=======================================+
|           ARCHIVE HEADER              |  Fixed 40 bytes
|  magic(4) | ver(1) | flags(1)         |
|  entry_count(2) | toc_offset(4)       |
|  toc_size(4) | toc_iv(16)             |
|  reserved(8)                          |
+=======================================+
|         FILE TABLE (TOC)              |  Variable size
|  Entry 1: name, type, perms,          |  Optionally encrypted
|           sizes, offset, iv, hmac,    |  Files AND directories
|           sha256, flags               |  (see Section 9.2)
|  Entry 2: ...                         |
|  ...                                  |
|  Entry N: ...                         |
+=======================================+
|          DATA BLOCK 1                 |  encrypted_size bytes
|  [ciphertext]                         |
+---------------------------------------+
|  [DECOY PADDING 1]                    |  Optional (see Section 9.3)
+---------------------------------------+
|          DATA BLOCK 2                 |  encrypted_size bytes
|  [ciphertext]                         |
+---------------------------------------+
|  [DECOY PADDING 2]                    |  Optional (see Section 9.3)
+---------------------------------------+
|  ...                                  |
+=======================================+
```

The archive consists of three contiguous regions:

1. **Header** (fixed 40 bytes) -- contains magic bytes, version, flags, and a pointer to the file table.
2. **File Table (TOC)** (variable size) -- contains one entry per archived file or directory with all metadata needed for extraction.
3. **Data Blocks** (variable size) -- contains the encrypted (and optionally compressed) file contents, one block per file entry (directory entries have no data block), optionally separated by decoy padding.

---

## 4. Archive Header Definition

The header is a fixed-size 40-byte structure at offset 0x00.

| Offset | Size | Type | Endian | Field | Description |
|--------|------|------|--------|-------|-------------|
| `0x00` | 4 | bytes | - | `magic` | Custom magic bytes: `0x00 0xEA 0x72 0x63`. The leading `0x00` signals binary content; the remaining bytes (`0xEA 0x72 0x63`) do not match any known file signature. |
| `0x04` | 1 | u8 | - | `version` | Format version. Value `2` for this specification (v1.1). Value `1` for legacy v1.0 (no directory support). |
| `0x05` | 1 | u8 | - | `flags` | Feature flags bitfield (see below). |
| `0x06` | 2 | u16 | LE | `entry_count` | Number of entries (files and directories) stored in the archive. |
| `0x08` | 4 | u32 | LE | `toc_offset` | Absolute byte offset of the entry table from archive start. |
| `0x0C` | 4 | u32 | LE | `toc_size` | Size of the entry table in bytes (if TOC encryption is on, this is the encrypted size including PKCS7 padding). |
| `0x10` | 16 | bytes | - | `toc_iv` | Initialization vector for encrypted TOC. Zero-filled (`0x00` x 16) when TOC encryption flag (bit 1) is off. |
| `0x20` | 8 | bytes | - | `reserved` | Reserved for future use. MUST be zero-filled. |

**Total header size: 40 bytes (0x28).**

### Flags Bitfield

| Bit | Mask | Name | Description |
|-----|------|------|-------------|
| 0 | `0x01` | `compression` | Per-file compression enabled. When set, files MAY be individually gzip-compressed (per-file `compression_flag` controls each file). When clear, all files are stored raw. |
| 1 | `0x02` | `toc_encrypted` | File table is encrypted with AES-256-CBC using `toc_iv`. When clear, file table is stored as plaintext. |
| 2 | `0x04` | `xor_header` | Header bytes are XOR-obfuscated (see Section 9.1). When clear, header is stored as-is. |
| 3 | `0x08` | `decoy_padding` | Random decoy bytes are inserted after data blocks (see Section 9.3). When clear, `padding_after` in every file table entry is 0. |
| 4-7 | `0xF0` | reserved | Reserved. MUST be `0`. |

---

## 5. Table of Contents (TOC) Entry Definition

The file table (TOC) is a contiguous sequence of variable-length entries, one per file or directory. Entries are stored so that directory entries appear before any files within them (parent-before-child ordering). There is no per-entry delimiter; entries are read sequentially using the `name_length` field to determine where each entry's variable-length name ends.

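Before the TOC can be read, a decoder needs `entry_count`, `toc_offset`, and `toc_size` from the header defined in Section 4. The following is a minimal Rust sketch of that parsing step, not the reference implementation: the `Header` struct and `parse_header` function are hypothetical names, and the sketch assumes the header is not XOR-obfuscated (Section 9.1) and that the decoder supports only `version = 2`.

```rust
/// Magic bytes and fixed header size from Section 4.
pub const MAGIC: [u8; 4] = [0x00, 0xEA, 0x72, 0x63];
pub const HEADER_SIZE: usize = 40;

/// Header fields a decoder needs (field names follow Section 4).
/// Hypothetical type, introduced only for this illustration.
#[derive(Debug)]
pub struct Header {
    pub version: u8,
    pub flags: u8,
    pub entry_count: u16,
    pub toc_offset: u32,
    pub toc_size: u32,
    pub toc_iv: [u8; 16],
}

impl Header {
    pub fn compression(&self) -> bool { self.flags & 0x01 != 0 }
    pub fn toc_encrypted(&self) -> bool { self.flags & 0x02 != 0 }
    pub fn decoy_padding(&self) -> bool { self.flags & 0x08 != 0 }
}

/// Parse a header that is NOT XOR-obfuscated (assumption of this sketch).
pub fn parse_header(buf: &[u8]) -> Result<Header, String> {
    if buf.len() < HEADER_SIZE {
        return Err(format!("header too short: {} bytes", buf.len()));
    }
    if &buf[..4] != &MAGIC[..] {
        return Err("bad magic".to_string());
    }
    let version = buf[4];
    if version != 2 {
        return Err(format!("unsupported version {}", version));
    }
    let flags = buf[5];
    // Reserved bits 4-7 must be zero; unknown flags are rejected, not ignored.
    if flags & 0xF0 != 0 {
        return Err(format!("reserved flag bits set: 0x{:02X}", flags));
    }
    // All multi-byte integers are little-endian.
    let entry_count = u16::from_le_bytes([buf[6], buf[7]]);
    let toc_offset = u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]);
    let toc_size = u32::from_le_bytes([buf[12], buf[13], buf[14], buf[15]]);
    let mut toc_iv = [0u8; 16];
    toc_iv.copy_from_slice(&buf[16..32]);
    Ok(Header { version, flags, entry_count, toc_offset, toc_size, toc_iv })
}
```

Calling `parse_header` on the first 40 bytes of an archive yields the values needed to seek to the TOC; the sketch returns an error rather than guessing when the magic, version, or reserved flag bits are unexpected.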
### Entry Field Table

| Field | Size | Type | Endian | Description |
|-------|------|------|--------|-------------|
| `name_length` | 2 | u16 | LE | Entry name length in bytes (UTF-8 encoded byte count). |
| `name` | `name_length` | bytes | - | Entry name as UTF-8 bytes. NOT null-terminated. Relative path using `/` as separator (see Entry Name Semantics below). |
| `entry_type` | 1 | u8 | - | Entry type: `0x00` = regular file, `0x01` = directory. Directories have `original_size`, `compressed_size`, and `encrypted_size` all set to 0 and no corresponding data block. |
| `permissions` | 2 | u16 | LE | Unix permission bits (lower 12 bits of POSIX `mode_t`). Bit layout: `[suid(1)][sgid(1)][sticky(1)][owner_rwx(3)][group_rwx(3)][other_rwx(3)]`. Example: `0o755` = `0x01ED` = owner rwx, group r-x, other r-x. Stored as u16 LE. |
| `original_size` | 4 | u32 | LE | Original file size in bytes (before compression). For directories: 0. |
| `compressed_size` | 4 | u32 | LE | Size after gzip compression. Equals `original_size` if `compression_flag` is 0 (no compression). For directories: 0. |
| `encrypted_size` | 4 | u32 | LE | Size after AES-256-CBC encryption with PKCS7 padding. Formula: `((compressed_size / 16) + 1) * 16`. For directories: 0. |
| `data_offset` | 4 | u32 | LE | Absolute byte offset of this entry's data block from archive start. For directories: 0. |
| `iv` | 16 | bytes | - | Random AES-256-CBC initialization vector for this file. For directories: zero-filled. |
| `hmac` | 32 | bytes | - | HMAC-SHA-256 over `iv || ciphertext`. See Section 7 for details. For directories: zero-filled. |
| `sha256` | 32 | bytes | - | SHA-256 hash of the original file content (before compression and encryption). For directories: zero-filled. |
| `compression_flag` | 1 | u8 | - | `0` = raw (no compression), `1` = gzip compressed. For directories: 0. |
| `padding_after` | 2 | u16 | LE | Number of decoy padding bytes after this file's data block. Always `0` when flags bit 3 (decoy_padding) is off. |

### Entry Type Values

| Value | Name | Description |
|-------|------|-------------|
| `0x00` | File | Regular file. Has associated data block with ciphertext. All size fields and data_offset are meaningful. |
| `0x01` | Directory | Directory entry. `original_size`, `compressed_size`, `encrypted_size` are all 0. `data_offset` is 0. `iv` is zero-filled. `hmac` is zero-filled. `sha256` is zero-filled. `compression_flag` is 0. No data block exists for this entry. |

### Permission Bits Layout

| Bits | Mask | Name | Description |
|------|------|------|-------------|
| 11 | `0o4000` | setuid | Set user ID on execution |
| 10 | `0o2000` | setgid | Set group ID on execution |
| 9 | `0o1000` | sticky | Sticky bit |
| 8-6 | `0o0700` | owner | Owner read(4)/write(2)/execute(1) |
| 5-3 | `0o0070` | group | Group read(4)/write(2)/execute(1) |
| 2-0 | `0o0007` | other | Other read(4)/write(2)/execute(1) |

Common examples: `0o755` (rwxr-xr-x) = `0x01ED`, `0o644` (rw-r--r--) = `0x01A4`, `0o700` (rwx------) = `0x01C0`.

### Entry Name Semantics

- Names are relative paths from the archive root, using `/` as separator.
- Example: a file at `project/src/main.rs` has name `project/src/main.rs`.
- A directory entry for `project/src/` has name `project/src` (no trailing slash).
- Names MUST NOT start with `/` (no absolute paths).
- Names MUST NOT contain `..` components (no directory traversal).
- The encoder MUST sort entries so that directory entries appear before any files within them (parent-before-child ordering). This allows the decoder to `mkdir -p` or create directories in a single sequential pass.

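The field table and name rules above translate fairly directly into a sequential parser. The sketch below is one possible Rust rendering under stated assumptions: `TocEntry`, `parse_entry`, `validate_name`, and the small cursor helpers are hypothetical names, the input is the plaintext (already decrypted) TOC, and rejecting unknown `entry_type` values is a choice of this sketch rather than a rule stated in the specification.

```rust
/// One TOC entry; field names follow the Entry Field Table.
#[derive(Debug)]
pub struct TocEntry {
    pub name: String,
    pub entry_type: u8,
    pub permissions: u16,
    pub original_size: u32,
    pub compressed_size: u32,
    pub encrypted_size: u32,
    pub data_offset: u32,
    pub iv: [u8; 16],
    pub hmac: [u8; 32],
    pub sha256: [u8; 32],
    pub compression_flag: u8,
    pub padding_after: u16,
}

/// Take `n` bytes at `*pos` and advance the cursor; fails on truncated input.
fn take<'a>(buf: &'a [u8], pos: &mut usize, n: usize) -> Result<&'a [u8], String> {
    let end = pos.checked_add(n).ok_or("offset overflow")?;
    let slice = buf.get(*pos..end).ok_or("truncated TOC entry")?;
    *pos = end;
    Ok(slice)
}

fn u16_le(buf: &[u8], pos: &mut usize) -> Result<u16, String> {
    let b = take(buf, pos, 2)?;
    Ok(u16::from_le_bytes([b[0], b[1]]))
}

fn u32_le(buf: &[u8], pos: &mut usize) -> Result<u32, String> {
    let b = take(buf, pos, 4)?;
    Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn array<const N: usize>(buf: &[u8], pos: &mut usize) -> Result<[u8; N], String> {
    let mut out = [0u8; N];
    out.copy_from_slice(take(buf, pos, N)?);
    Ok(out)
}

/// Enforce the two MUST NOT rules from Entry Name Semantics.
pub fn validate_name(name: &str) -> Result<(), String> {
    if name.starts_with('/') {
        return Err(format!("absolute path not allowed: {}", name));
    }
    if name.split('/').any(|component| component == "..") {
        return Err(format!("'..' component not allowed: {}", name));
    }
    Ok(())
}

/// Parse one entry starting at `*pos`; on success `*pos` points at the next entry.
pub fn parse_entry(buf: &[u8], pos: &mut usize) -> Result<TocEntry, String> {
    let name_length = u16_le(buf, pos)? as usize;
    let name = String::from_utf8(take(buf, pos, name_length)?.to_vec())
        .map_err(|e| e.to_string())?;
    validate_name(&name)?;
    let entry_type = take(buf, pos, 1)?[0];
    if entry_type > 0x01 {
        return Err(format!("unknown entry_type 0x{:02X}", entry_type));
    }
    // Struct fields are evaluated in the order written, matching the on-disk order.
    Ok(TocEntry {
        name,
        entry_type,
        permissions: u16_le(buf, pos)?,
        original_size: u32_le(buf, pos)?,
        compressed_size: u32_le(buf, pos)?,
        encrypted_size: u32_le(buf, pos)?,
        data_offset: u32_le(buf, pos)?,
        iv: array::<16>(buf, pos)?,
        hmac: array::<32>(buf, pos)?,
        sha256: array::<32>(buf, pos)?,
        compression_flag: take(buf, pos, 1)?[0],
        padding_after: u16_le(buf, pos)?,
    })
}
```

A decoder would call `parse_entry` exactly `entry_count` times with a shared cursor. Because there is no per-entry delimiter, the cursor position after each call is the only indication of where the next entry begins.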
### Entry Size Formula

Each TOC entry has a total size of:

```
entry_size = 2 + name_length + 1 + 2 + 4 + 4 + 4 + 4 + 16 + 32 + 32 + 1 + 2
           = 104 + name_length bytes
```

### File Table Total Size

The total file table size is the sum of all entry sizes:

```
toc_size = SUM(104 + name_length_i) for i in 0..entry_count-1
```

When TOC encryption (flags bit 1) is active, the encrypted TOC size includes PKCS7 padding:

```
encrypted_toc_size = ((toc_size / 16) + 1) * 16
```

The `toc_size` field in the header stores the **actual size on disk** (encrypted size if TOC encryption is on, plaintext size if off).

---

## 6. Data Block Layout

Each file entry has a single contiguous data block containing **only the ciphertext** (the AES-256-CBC encrypted output). Directory entries (`entry_type = 0x01`) have no data block. The decoder MUST skip directory entries when processing data blocks.

```
[ciphertext: encrypted_size bytes]
```

**Important design decisions:**

- The **IV is stored only in the file table entry**, not duplicated at the start of the data block. The data block contains only ciphertext. This simplifies `dd` extraction in the shell decoder: a single `dd` call with the correct offset and size extracts the complete ciphertext.
- The **HMAC is stored only in the file table entry**, not appended to the data block. The decoder reads the HMAC from the TOC, then verifies against the data block contents.
- If decoy padding is enabled (flags bit 3), `padding_after` bytes of random data follow the ciphertext. The decoder MUST skip these bytes. The next file's data block starts at offset `data_offset + encrypted_size + padding_after`.

### Data Block Ordering

Data blocks appear in the same order as file table entries. For file entry `i`:

```
data_offset_0 = toc_offset + toc_size
data_offset_i = data_offset_{i-1} + encrypted_size_{i-1} + padding_after_{i-1}
```

---

## 7. Encryption and Authentication Details

### Pipeline

Each file is processed through the following pipeline, in order:

```
original_file
    |
    v
[1. SHA-256 checksum] --> stored in file table entry as `sha256`
    |
    v
[2. Gzip compress]    (if compression_flag = 1) --> compressed_data
    |                                     (size = compressed_size)
    v
[3. PKCS7 pad]        --> padded_data
    |                                     (size = encrypted_size)
    v
[4. AES-256-CBC encrypt] (with random IV) --> ciphertext
    |                                     (size = encrypted_size)
    v
[5. HMAC-SHA-256]     (over IV || ciphertext) --> stored in file table entry as `hmac`
```

### AES-256-CBC

- **Key:** 32 bytes (256 bits), hardcoded and shared across all three decoders.
- **IV:** 16 bytes, randomly generated for each file. Stored in the file table entry `iv` field.
- **Block size:** 16 bytes.
- **Mode:** CBC (Cipher Block Chaining).
- The same 32-byte key is used for all files in the archive.

### PKCS7 Padding

PKCS7 padding is applied to the compressed (or raw) data before encryption. PKCS7 **always adds at least 1 byte** of padding. If the input length is already a multiple of 16, a full 16-byte padding block is added.

**Formula:**

```
encrypted_size = ((compressed_size / 16) + 1) * 16
```

Where `/` is integer division (floor).

**Examples:**

| `compressed_size` | Padding bytes | `encrypted_size` |
|-------------------|---------------|------------------|
| 0 | 16 | 16 |
| 1 | 15 | 16 |
| 15 | 1 | 16 |
| 16 | 16 | 32 |
| 17 | 15 | 32 |
| 31 | 1 | 32 |
| 32 | 16 | 48 |
| 100 | 12 | 112 |

### HMAC-SHA-256

- **Key:** The same 32-byte key used for AES-256-CBC encryption. (v1 uses a single key for both encryption and authentication. v2 will derive separate subkeys using HKDF.)
- **Input:** The concatenation of the 16-byte IV and the ciphertext:

  ```
  HMAC_input = IV (16 bytes) || ciphertext (encrypted_size bytes)
  Total HMAC input length = 16 + encrypted_size bytes
  ```

- **Output:** 32 bytes, stored in the file table entry `hmac` field.

### Encrypt-then-MAC

This format uses the **Encrypt-then-MAC** construction:

1. The HMAC is computed **after** encryption, over the IV and ciphertext.
2. The decoder **MUST verify the HMAC before attempting decryption**. If the HMAC does not match, the decoder MUST reject the file without decrypting. This prevents padding oracle attacks and avoids processing tampered data.

### SHA-256 Integrity Checksum

- **Input:** The original file content (before compression, before encryption).
- **Output:** 32 bytes, stored in the file table entry `sha256` field.
- **Verification:** After the decoder decrypts and decompresses a file, it computes SHA-256 of the result and compares it to the stored `sha256`. A mismatch indicates data corruption or an incorrect key.

---

## 8. Compression Details

- **Algorithm:** Standard gzip (RFC 1952), which wraps a DEFLATE stream (RFC 1951).
- **Granularity:** Per-file. Each file has its own `compression_flag` in the file table entry.
- **Global flag:** The header flags bit 0 (`compression`) enables per-file compression. When this bit is clear, ALL files are stored raw regardless of individual `compression_flag` values.
- **Recommendation:** Already-compressed files (APK, ZIP, PNG, JPEG) should use `compression_flag = 0` (raw) to avoid size inflation.

### Size Tracking

- `original_size`: Size of the file before any processing.
- `compressed_size`: Size after gzip compression. If `compression_flag = 0`, then `compressed_size = original_size`.
- `encrypted_size`: Size after AES-256-CBC with PKCS7 padding. Always `> compressed_size`, because PKCS7 adds at least 1 byte.

### Decompression in Each Decoder

| Decoder | Library/Command |
|---------|-----------------|
| Rust | `flate2` crate (`GzDecoder`) |
| Kotlin | `java.util.zip.GZIPInputStream` |
| Shell | `gunzip` (busybox) |

---

## 9. Obfuscation Features

These features are defined fully in this v1 specification but are intended for implementation in Phase 6 (after all three decoders work without obfuscation). Each feature is controlled by a flag bit in the header and can be activated independently.

### 9.1 XOR Header Obfuscation (flags bit 2, mask `0x04`)

When flags bit 2 is set, the entire 40-byte header is XOR-obfuscated with a fixed repeating 8-byte key.

**XOR Key:** `0xA5 0x3C 0x96 0x0F 0xE1 0x7B 0x4D 0xC8` (8 bytes, repeating)

**XOR Range:** Bytes `0x00` through `0x27` (the entire 40-byte header).

**Application:**

- XOR is applied **after** the header is fully constructed (all fields written).
- The 8-byte key repeats cyclically across the 40 bytes: byte `i` of the header is XORed with `key[i % 8]`.

**Decoding:**

- The decoder reads the first 40 bytes and XORs them with the same repeating key (XOR is its own inverse).
- After de-XOR, the decoder reads header fields normally.

**Bootstrapping problem:** When XOR obfuscation is active, the flags byte itself is XORed. The decoder MUST:

1. Always attempt de-XOR on the first 40 bytes.
2. Read the flags byte from the de-XORed header.
3. Check if bit 2 is set. If it is, the de-XOR was correct. If it is not, re-read the header from the original (un-XORed) bytes.

Alternatively, the decoder can check the magic bytes: if the first 4 bytes are `0x00 0xEA 0x72 0x63`, the header is not XOR-obfuscated. If they are not, attempt de-XOR and re-check.

**When flags bit 2 is 0:** The header is stored as-is (no XOR).

### 9.2 TOC Encryption (flags bit 1, mask `0x02`)

When flags bit 1 is set, the entire file table is encrypted with AES-256-CBC.

- **Key:** The same 32-byte key used for file encryption.
- **IV:** The `toc_iv` field in the header (16 bytes, randomly generated).
- **Input:** The serialized file table (all entries concatenated).
- **Padding:** PKCS7 padding is applied to the entire serialized TOC.
- **`toc_size` in header:** Stores the **encrypted** TOC size (including PKCS7 padding), not the plaintext size.

**Decoding:**

1. Read `toc_offset`, `toc_size`, and `toc_iv` from the (de-XORed) header.
2. Read `toc_size` bytes starting at `toc_offset`.
3. Decrypt with AES-256-CBC using `toc_iv` and the 32-byte key.
4. Remove PKCS7 padding.
5. Parse file table entries from the decrypted plaintext.

**When flags bit 1 is 0:** The file table is stored as plaintext. `toc_iv` is zero-filled but unused.

### 9.3 Decoy Padding (flags bit 3, mask `0x08`)

When flags bit 3 is set, random bytes are inserted after each file's data block.

- The number of random padding bytes for each file is stored in the file table entry `padding_after` field (u16 LE).
- Padding bytes are cryptographically random and carry no meaningful data.
- The decoder MUST skip `padding_after` bytes after reading the ciphertext of each file.
- The padding disrupts size-based analysis: an observer cannot determine individual file sizes from the data block layout.

**Next data block offset:**

```
next_data_offset = data_offset + encrypted_size + padding_after
```

**When flags bit 3 is 0:** `padding_after` is `0` for every file table entry. No padding bytes exist between data blocks.

---

## 10. Decode Order of Operations

The following steps MUST be followed in order by all decoders:

```
1. Read 40 bytes from offset 0x00.

2. Attempt XOR de-obfuscation:
   a. Check if bytes 0x00-0x03 equal magic (0x00 0xEA 0x72 0x63).
   b. If YES: header is not XOR-obfuscated. Use as-is.
   c. If NO: XOR bytes 0x00-0x27 with key (0xA5 0x3C 0x96 0x0F 0xE1 0x7B 0x4D 0xC8),
      repeating cyclically. Re-check magic. If still wrong, reject archive.

3. Parse header fields:
   - Verify magic == 0x00 0xEA 0x72 0x63
   - Read version (must be 2 for v1.1)
   - Read flags
   - Check for unknown flag bits (bits 4-7 must be 0; reject if not)
   - Read entry_count
   - Read toc_offset, toc_size, toc_iv

4. Read TOC:
   a. Seek to toc_offset.
   b. Read toc_size bytes.
   c. If flags bit 1 (toc_encrypted) is set:
      - Decrypt TOC with AES-256-CBC using toc_iv and the 32-byte key.
      - Remove PKCS7 padding.
   d. Parse entry_count entries sequentially from the (decrypted) TOC bytes.

5. For each entry (i = 0 to entry_count - 1):
   a. Check entry_type. If 0x01 (directory): create the directory using the entry
      name as a relative path, apply permissions from the `permissions` field,
      and skip to the next entry (no ciphertext to read).

   b. Read ciphertext (file entries only):
      - Seek to data_offset.
      - Read encrypted_size bytes.

   c. Verify HMAC:
      - Compute HMAC-SHA-256(key, iv || ciphertext).
      - Compare with stored hmac (32 bytes).
      - If mismatch: REJECT this file. Do NOT attempt decryption.

   d. Decrypt:
      - Decrypt ciphertext with AES-256-CBC using entry's iv and the 32-byte key.
      - Remove PKCS7 padding.
      - Result = compressed_data (or raw data if compression_flag = 0).

   e. Decompress (if compression_flag = 1):
      - Decompress with gzip.
      - Result = original file content.

   f. Verify integrity:
      - Compute SHA-256 of the decompressed/raw result.
      - Compare with stored sha256 (32 bytes).
      - If mismatch: WARN (data corruption or wrong key).

   g. Write to output:
      - Create parent directories as needed (using the path components of the entry name).
      - Create output file using stored name.
      - Write the verified content.
      - Apply permissions from the entry's `permissions` field.
```

---

## 11. Version Compatibility Rules

1. **Version field:** The `version` field at offset `0x04` identifies the format version. This specification defines version `2` (v1.1). Version `1` was the original v1.0 format (no directory support, no entry_type/permissions fields).

2. **Version 2 changes from version 1:**
   - TOC entries now include `entry_type` (1 byte) and `permissions` (2 bytes) fields after `name` and before `original_size`.
   - Entry size formula changed from `101 + name_length` to `104 + name_length`.
   - `file_count` header field renamed to `entry_count` (same offset, same type; directories count as entries).
   - Entry names are relative paths with `/` separator (not filename-only).
   - Entries are ordered parent-before-child (directories before their contents).

3. **Forward compatibility:** Decoders MUST reject archives with `version` greater than their supported version. A v2 decoder encountering `version = 3` MUST fail with a clear error message.

4. **Unknown flags:** Decoders MUST reject archives that have any reserved flag bits (bits 4-7) set to `1`. Unknown flags indicate features the decoder does not understand and cannot safely skip. Silent ignoring of unknown flags is prohibited.

5. **Future versions:** Version 3+ MAY:
   - Add fields after the `reserved` bytes in the header (growing header size).
   - Define new flag bits (bits 4-7).
   - Change the `reserved` field to carry metadata.
   - Introduce HKDF-derived per-file keys (replacing single shared key).

6. **Backward compatibility:** Future versions SHOULD maintain the same magic bytes and the same position of the `version` field (offset `0x04`) so that decoders can read the version before deciding how to proceed.

---

## 12. Worked Example

This section constructs a complete 3-entry directory archive byte by byte, demonstrating the v1.1 format with entry types, permissions, and relative paths. All offsets, field sizes, and hex values are internally consistent and can be verified by summing field sizes. This example serves as a **golden reference** for implementation testing.

### 12.1 Input Structure

```
project/
project/src/           (directory, mode 0755)
project/src/main.rs    (file, mode 0644, content: "fn main() {}\n" = 14 bytes)
project/empty/         (empty directory, mode 0755)
```

This demonstrates:

- A nested directory (`project/src/`)
- A file inside a nested directory (`project/src/main.rs`)
- An empty directory (`project/empty/`)
- Three TOC entries total: 2 directories + 1 file

| # | Entry Name | Type | Permissions | Content | Size |
|---|------------|------|-------------|---------|------|
| 1 | `project/src` | directory | `0o755` | (none) | 0 bytes |
| 2 | `project/src/main.rs` | file | `0o644` | `fn main() {}\n` | 14 bytes |
| 3 | `project/empty` | directory | `0o755` | (none) | 0 bytes |

Entries are ordered parent-before-child: `project/src` appears before `project/src/main.rs`.

### 12.2 Parameters

- **Key:** 32 bytes: `00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F`
- **Flags:** `0x01` (compression enabled, no obfuscation)
- **Version:** `2`

### 12.3 Per-Entry Pipeline Walkthrough

#### Entry 1: `project/src` (directory)

Directory entries have no data. All crypto fields are zero-filled:

- `entry_type`: `0x01`
- `permissions`: `0o755` = `0x01ED` (LE: `ED 01`)
- `original_size`: 0
- `compressed_size`: 0
- `encrypted_size`: 0
- `data_offset`: 0
- `iv`: zero-filled (16 bytes of `0x00`)
- `hmac`: zero-filled (32 bytes of `0x00`)
- `sha256`: zero-filled (32 bytes of `0x00`)
- `compression_flag`: 0

#### Entry 2: `project/src/main.rs` (file)

**Step 1: SHA-256 checksum of original content**

```
SHA-256("fn main() {}\n") = 536e506bb90914c243a12b397b9a998f85ae2cbd9ba02dfd03a9e155ca5ca0f4
```

As bytes:

```
53 6E 50 6B B9 09 14 C2 43 A1 2B 39 7B 9A 99 8F
85 AE 2C BD 9B A0 2D FD 03 A9 E1 55 CA 5C A0 F4
```

**Step 2: Gzip compression**

Gzip output is implementation-dependent (timestamps, OS flags vary). For this example, we use a representative compressed size of **30 bytes**. The actual gzip output will differ between implementations, but the pipeline and sizes are computed from this value.

- `compressed_size = 30`

**Step 3: Compute encrypted_size (PKCS7 padding)**

```
encrypted_size = ((30 / 16) + 1) * 16 = ((1) + 1) * 16 = 32 bytes
```

PKCS7 padding adds `32 - 30 = 2` bytes of value `0x02`.

**Step 4: AES-256-CBC encryption**

- IV (randomly chosen for this example): `AA BB CC DD EE FF 00 11 22 33 44 55 66 77 88 99`
- Ciphertext: 32 bytes (actual value depends on the gzip output and IV; representative bytes used in the hex dump below)

**Step 5: HMAC-SHA-256**

```
HMAC_input = IV (16 bytes) || ciphertext (32 bytes) = 48 bytes total
HMAC-SHA-256(key, HMAC_input) = <32 bytes>
```

The HMAC value depends on the actual ciphertext; representative bytes (`0xC1` repeated) are used in the hex dump. In a real implementation, this MUST be computed from the actual IV and ciphertext.

- `entry_type`: `0x00`
- `permissions`: `0o644` = `0x01A4` (LE: `A4 01`)

#### Entry 3: `project/empty` (directory)

Directory entries have no data. All crypto fields are zero-filled (identical pattern to Entry 1):

- `entry_type`: `0x01`
- `permissions`: `0o755` = `0x01ED` (LE: `ED 01`)
- All size fields, data_offset, iv, hmac, sha256: zero-filled.

### 12.4 Archive Layout

| Region | Start Offset | End Offset | Size | Description |
|--------|-------------|------------|------|-------------|
| Header | `0x0000` | `0x0027` | 40 bytes | Fixed header (version 2) |
| TOC Entry 1 | `0x0028` | `0x009A` | 115 bytes | `project/src` directory metadata |
| TOC Entry 2 | `0x009B` | `0x0115` | 123 bytes | `project/src/main.rs` file metadata |
| TOC Entry 3 | `0x0116` | `0x018A` | 117 bytes | `project/empty` directory metadata |
| Data Block 1 | `0x018B` | `0x01AA` | 32 bytes | `project/src/main.rs` ciphertext |
| **Total** | | | **427 bytes** | |

**Note:** Only 1 data block exists because 2 of the 3 entries are directories (no data).

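The layout arithmetic in the table above can be reproduced mechanically from the formulas in Sections 5 to 7. The Rust sketch below is illustrative only: `entry_size`, `encrypted_size`, `Layout`, and `layout` are hypothetical names, and it assumes an unencrypted TOC placed directly after the 40-byte header with no decoy padding, as in this example. Overflow checks are omitted for brevity.

```rust
/// Fixed header size (Section 4) and fixed part of a TOC entry (Section 5).
pub const HEADER_SIZE: u32 = 40;
pub const ENTRY_FIXED_SIZE: u32 = 104;

/// PKCS7-padded size: always adds between 1 and 16 bytes (Section 7).
pub fn encrypted_size(compressed_size: u32) -> u32 {
    (compressed_size / 16 + 1) * 16
}

/// Size of one TOC entry: 104 fixed bytes plus the UTF-8 name length.
pub fn entry_size(name: &str) -> u32 {
    ENTRY_FIXED_SIZE + name.len() as u32
}

/// Computed layout of an archive (hypothetical type for this sketch).
pub struct Layout {
    pub toc_size: u32,
    /// One value per entry; 0 for directories, which have no data block.
    pub data_offsets: Vec<u32>,
    pub archive_end: u32,
}

/// `entries` holds (name, Some(compressed_size)) for files and (name, None)
/// for directories, in TOC order.
pub fn layout(entries: &[(&str, Option<u32>)]) -> Layout {
    let toc_size: u32 = entries.iter().map(|(name, _)| entry_size(name)).sum();
    // The first data block starts right after the TOC (toc_offset = 40 here).
    let mut next = HEADER_SIZE + toc_size;
    let mut data_offsets = Vec::with_capacity(entries.len());
    for (_, compressed) in entries {
        match compressed {
            Some(size) => {
                data_offsets.push(next);
                next += encrypted_size(*size);
            }
            None => data_offsets.push(0),
        }
    }
    Layout { toc_size, data_offsets, archive_end: next }
}
```

For the three entries of this example (`project/src`, `project/src/main.rs` with `compressed_size = 30`, and `project/empty`), the function yields a `toc_size` of 355, a single data offset of 395, and an archive end of 427, matching the table.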
**Entry size verification:**

```
Entry 1: 104 + 11 ("project/src")         = 115 bytes   CHECK
Entry 2: 104 + 19 ("project/src/main.rs") = 123 bytes   CHECK
Entry 3: 104 + 13 ("project/empty")       = 117 bytes   CHECK
```

**Offset verification:**

```
TOC offset    = header_size                          = 40  (0x28)    CHECK
TOC size      = 115 + 123 + 117                      = 355 (0x163)   CHECK
Data Block 1  = toc_offset + toc_size = 40 + 355     = 395 (0x18B)   CHECK
Archive end   = data_offset_1 + encrypted_size_1 = 395 + 32 = 427 (0x1AB)   CHECK
```

### 12.5 Header (Bytes 0x0000 - 0x0027)

| Offset | Hex | Field | Value |
|--------|-----|-------|-------|
| `0x0000` | `00 EA 72 63` | magic | Custom magic bytes |
| `0x0004` | `02` | version | 2 (v1.1) |
| `0x0005` | `01` | flags | `0x01` = compression enabled |
| `0x0006` | `03 00` | entry_count | 3 (LE) |
| `0x0008` | `28 00 00 00` | toc_offset | 40 (LE) |
| `0x000C` | `63 01 00 00` | toc_size | 355 (LE) |
| `0x0010` | `00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00` | toc_iv | Zero-filled (TOC not encrypted) |
| `0x0020` | `00 00 00 00 00 00 00 00` | reserved | Zero-filled |

### 12.6 TOC Entry 1: `project/src` -- directory (Bytes 0x0028 - 0x009A)

| Offset | Hex | Field | Value |
|--------|-----|-------|-------|
| `0x0028` | `0B 00` | name_length | 11 (LE) |
| `0x002A` | `70 72 6F 6A 65 63 74 2F 73 72 63` | name | "project/src" (UTF-8) |
| `0x0035` | `01` | entry_type | `0x01` = directory |
| `0x0036` | `ED 01` | permissions | `0o755` = `0x01ED` (LE) |
| `0x0038` | `00 00 00 00` | original_size | 0 (directory) |
| `0x003C` | `00 00 00 00` | compressed_size | 0 (directory) |
| `0x0040` | `00 00 00 00` | encrypted_size | 0 (directory) |
| `0x0044` | `00 00 00 00` | data_offset | 0 (directory -- no data block) |
| `0x0048` | `00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00` | iv | Zero-filled (directory) |
| `0x0058` | `00 ... (32 bytes of 0x00)` | hmac | Zero-filled (directory) |
| `0x0078` | `00 ... (32 bytes of 0x00)` | sha256 | Zero-filled (directory) |
| `0x0098` | `00` | compression_flag | 0 (directory) |
| `0x0099` | `00 00` | padding_after | 0 |

**Entry size verification:** `2 + 11 + 1 + 2 + 4 + 4 + 4 + 4 + 16 + 32 + 32 + 1 + 2 = 115 bytes`. Offset range: `0x0028` to `0x009A` = 115 bytes. CHECK.

### 12.7 TOC Entry 2: `project/src/main.rs` -- file (Bytes 0x009B - 0x0115)

| Offset | Hex | Field | Value |
|--------|-----|-------|-------|
| `0x009B` | `13 00` | name_length | 19 (LE) |
| `0x009D` | `70 72 6F 6A 65 63 74 2F 73 72 63 2F 6D 61 69 6E 2E 72 73` | name | "project/src/main.rs" (UTF-8) |
| `0x00B0` | `00` | entry_type | `0x00` = file |
| `0x00B1` | `A4 01` | permissions | `0o644` = `0x01A4` (LE) |
| `0x00B3` | `0E 00 00 00` | original_size | 14 (LE) |
| `0x00B7` | `1E 00 00 00` | compressed_size | 30 (LE) |
| `0x00BB` | `20 00 00 00` | encrypted_size | 32 (LE) |
| `0x00BF` | `8B 01 00 00` | data_offset | 395 = 0x18B (LE) |
| `0x00C3` | `AA BB CC DD EE FF 00 11 22 33 44 55 66 77 88 99` | iv | Example IV for this file |
| `0x00D3` | `C1 C1 C1 ... (32 bytes)` | hmac | Representative HMAC (actual depends on ciphertext) |
| `0x00F3` | `53 6E 50 6B B9 09 14 C2 43 A1 2B 39 7B 9A 99 8F 85 AE 2C BD 9B A0 2D FD 03 A9 E1 55 CA 5C A0 F4` | sha256 | SHA-256 of "fn main() {}\n" |
| `0x0113` | `01` | compression_flag | 1 (gzip) |
| `0x0114` | `00 00` | padding_after | 0 (no decoy padding) |

**Entry size verification:** `2 + 19 + 1 + 2 + 4 + 4 + 4 + 4 + 16 + 32 + 32 + 1 + 2 = 123 bytes`. Offset range: `0x009B` to `0x0115` = 123 bytes. CHECK.

### 12.8 TOC Entry 3: `project/empty` -- directory (Bytes 0x0116 - 0x018A)

| Offset | Hex | Field | Value |
|--------|-----|-------|-------|
| `0x0116` | `0D 00` | name_length | 13 (LE) |
| `0x0118` | `70 72 6F 6A 65 63 74 2F 65 6D 70 74 79` | name | "project/empty" (UTF-8) |
| `0x0125` | `01` | entry_type | `0x01` = directory |
| `0x0126` | `ED 01` | permissions | `0o755` = `0x01ED` (LE) |
| `0x0128` | `00 00 00 00` | original_size | 0 (directory) |
| `0x012C` | `00 00 00 00` | compressed_size | 0 (directory) |
| `0x0130` | `00 00 00 00` | encrypted_size | 0 (directory) |
| `0x0134` | `00 00 00 00` | data_offset | 0 (directory -- no data block) |
| `0x0138` | `00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00` | iv | Zero-filled (directory) |
| `0x0148` | `00 ... (32 bytes of 0x00)` | hmac | Zero-filled (directory) |
| `0x0168` | `00 ... (32 bytes of 0x00)` | sha256 | Zero-filled (directory) |
| `0x0188` | `00` | compression_flag | 0 (directory) |
| `0x0189` | `00 00` | padding_after | 0 |

**Entry size verification:** `2 + 13 + 1 + 2 + 4 + 4 + 4 + 4 + 16 + 32 + 32 + 1 + 2 = 117 bytes`. Offset range: `0x0116` to `0x018A` = 117 bytes. CHECK.

### 12.9 Data Block (Bytes 0x018B - 0x01AA)

Only one data block exists in this archive -- for `project/src/main.rs` (the only file entry). Both directory entries have no data blocks.

**Data Block 1** (bytes `0x018B` - `0x01AA`, 32 bytes):

Ciphertext of gzip-compressed `"fn main() {}\n"`, encrypted with AES-256-CBC. Actual bytes depend on the gzip output (which includes timestamps) and the IV. Representative value: 32 bytes of ciphertext (`0xE7` repeated).

### 12.10 Complete Annotated Hex Dump

The following hex dump shows the full 427-byte archive. HMAC values (`C1...`) and ciphertext (`E7...`) are representative placeholders. The SHA-256 hash is a real computed value.

```
Offset  | Hex                                              | ASCII            | Annotation
--------|--------------------------------------------------|------------------|------------------------------------------
0x0000  | 00 EA 72 63 02 01 03 00 28 00 00 00 63 01 00 00  | ..rc....(...c... | Header: magic, ver=2, flags=0x01, count=3, toc_off=40, toc_sz=355
0x0010  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Header: toc_iv (zero-filled, TOC not encrypted)
0x0020  | 00 00 00 00 00 00 00 00 0B 00 70 72 6F 6A 65 63  | ..........projec | Header: reserved | Entry 1: name_len=11, name="projec"
0x0030  | 74 2F 73 72 63 01 ED 01 00 00 00 00 00 00 00 00  | t/src........... | Entry 1: name="t/src", type=0x01(dir), perms=0o755, orig=0, comp=0
0x0040  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 1: enc=0, data_off=0, iv[0..7]
0x0050  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 1: iv[8..15], hmac[0..7]
0x0060  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 1: hmac[8..23]
0x0070  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 1: hmac[24..31], sha256[0..7]
0x0080  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 1: sha256[8..23]
0x0090  | 00 00 00 00 00 00 00 00 00 00 00 13 00 70 72 6F  | .............pro | Entry 1: sha256[24..31], comp=0, pad=0 | Entry 2: name_len=19, name="pro"
0x00A0  | 6A 65 63 74 2F 73 72 63 2F 6D 61 69 6E 2E 72 73  | ject/src/main.rs | Entry 2: name="ject/src/main.rs"
0x00B0  | 00 A4 01 0E 00 00 00 1E 00 00 00 20 00 00 00 8B  | ........... .... | Entry 2: type=0x00(file), perms=0o644, orig=14, comp=30, enc=32, data_off=
0x00C0  | 01 00 00 AA BB CC DD EE FF 00 11 22 33 44 55 66  | ..........."3DUf | Entry 2: =395(0x18B), iv[0..12]
0x00D0  | 77 88 99 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1  | w............... | Entry 2: iv[13..15], hmac[0..12]
0x00E0  | C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1  | ................ | Entry 2: hmac[13..28]
0x00F0  | C1 C1 C1 53 6E 50 6B B9 09 14 C2 43 A1 2B 39 7B  | ...SnPk....C.+9{ | Entry 2: hmac[29..31], sha256[0..12]
0x0100  | 9A 99 8F 85 AE 2C BD 9B A0 2D FD 03 A9 E1 55 CA  | .....,...-....U. | Entry 2: sha256[13..28]
0x0110  | 5C A0 F4 01 00 00 0D 00 70 72 6F 6A 65 63 74 2F  | \.......project/ | Entry 2: sha256[29..31], comp=1, pad=0 | Entry 3: name_len=13, name="project/"
0x0120  | 65 6D 70 74 79 01 ED 01 00 00 00 00 00 00 00 00  | empty........... | Entry 3: name="empty", type=0x01(dir), perms=0o755, orig=0, comp=0
0x0130  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 3: enc=0, data_off=0, iv[0..7]
0x0140  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 3: iv[8..15], hmac[0..7]
0x0150  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 3: hmac[8..23]
0x0160  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 3: hmac[24..31], sha256[0..7]
0x0170  | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  | ................ | Entry 3: sha256[8..23]
0x0180  | 00 00 00 00 00 00 00 00 00 00 00 E7 E7 E7 E7 E7  | ................ | Entry 3: sha256[24..31], comp=0, pad=0 | Data Block 1: ciphertext[0..4]
0x0190  | E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7  | ................ | Data Block 1: ciphertext[5..20]
0x01A0  | E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7                 | ...........      | Data Block 1: ciphertext[21..31]
```

**Total: 427 bytes (0x01AB).**

### 12.11 Step-by-Step Shell Decode Walkthrough

The following shell commands demonstrate decoding this archive with busybox tools: `dd` and `xxd` for parsing, `openssl` for HMAC verification and decryption, `gunzip` for decompression, and `sha256sum` for the final integrity check. They show how the decoder handles both directory and file entries. The `read_le_u16` and `read_le_u32` functions are defined in the Appendix (Section 13).
```sh
# -------------------------------------------------------
# Step 1: Read and verify magic bytes
# -------------------------------------------------------
dd if=archive.bin bs=1 skip=0 count=4 2>/dev/null | xxd -p
# Expected: 00ea7263

# -------------------------------------------------------
# Step 2: Read version
# -------------------------------------------------------
dd if=archive.bin bs=1 skip=4 count=1 2>/dev/null | xxd -p
# Expected: 02 (version 2 = v1.1 format)

# -------------------------------------------------------
# Step 3: Read flags
# -------------------------------------------------------
dd if=archive.bin bs=1 skip=5 count=1 2>/dev/null | xxd -p
# Expected: 01 (compression enabled)

# -------------------------------------------------------
# Step 4: Read entry count
# -------------------------------------------------------
read_le_u16 archive.bin 6
# Expected: 3

# -------------------------------------------------------
# Step 5: Read TOC offset and size
# -------------------------------------------------------
read_le_u32 archive.bin 8    # Expected: 40
read_le_u32 archive.bin 12   # Expected: 355

# -------------------------------------------------------
# Step 6: Parse TOC Entry 1 (offset 40)
# -------------------------------------------------------
NAME_LEN=$(read_le_u16 archive.bin 40)   # Expected: 11
dd if=archive.bin bs=1 skip=42 count=11 2>/dev/null
# Expected: project/src

# Read entry_type (1 byte after name)
ENTRY_TYPE=$(dd if=archive.bin bs=1 skip=53 count=1 2>/dev/null | xxd -p)
# Expected: 01 (directory)

# Read permissions (2 bytes, LE)
PERMS=$(read_le_u16 archive.bin 54)   # Expected: 493 (= 0o755 = 0x01ED)

# Directory entry: create directory and set permissions
mkdir -p "output/project/src"
chmod 755 "output/project/src"
# Skip to next entry (no ciphertext to process)

# -------------------------------------------------------
# Step 7: Parse TOC Entry 2 (offset 155 = 0x9B)
# -------------------------------------------------------
NAME_LEN=$(read_le_u16 archive.bin 155)   # Expected: 19
dd if=archive.bin bs=1 skip=157 count=19 2>/dev/null
# Expected: project/src/main.rs

# Read entry_type
ENTRY_TYPE=$(dd if=archive.bin bs=1 skip=176 count=1 2>/dev/null | xxd -p)
# Expected: 00 (file)

# Read permissions
PERMS=$(read_le_u16 archive.bin 177)   # Expected: 420 (= 0o644 = 0x01A4)

# Read sizes
ORIG_SIZE=$(read_le_u32 archive.bin 179)   # Expected: 14
COMP_SIZE=$(read_le_u32 archive.bin 183)   # Expected: 30
ENC_SIZE=$(read_le_u32 archive.bin 187)    # Expected: 32
DATA_OFF=$(read_le_u32 archive.bin 191)    # Expected: 395

# Read IV (16 bytes at offset 195)
IV_HEX=$(dd if=archive.bin bs=1 skip=195 count=16 2>/dev/null | xxd -p)
# Expected: aabbccddeeff00112233445566778899

KEY_HEX="000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f"

# Read HMAC (32 bytes at offset 211) for verification.
# xxd -p wraps its output after 30 bytes, so strip the newline to get one 64-char hex string.
STORED_HMAC=$(dd if=archive.bin bs=1 skip=211 count=32 2>/dev/null | xxd -p | tr -d '\n')

# Verify HMAC: HMAC-SHA-256(key, iv || ciphertext)
COMPUTED_HMAC=$({
  dd if=archive.bin bs=1 skip=195 count=16 2>/dev/null   # IV
  dd if=archive.bin bs=1 skip=395 count=32 2>/dev/null   # ciphertext
} | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${KEY_HEX}" -hex 2>/dev/null \
  | awk '{print $NF}')
# Compare COMPUTED_HMAC with STORED_HMAC

# Extract and decrypt ciphertext
dd if=archive.bin bs=1 skip=395 count=32 of=/tmp/file.enc 2>/dev/null
openssl enc -d -aes-256-cbc -nosalt \
  -K "${KEY_HEX}" \
  -iv "${IV_HEX}" \
  -in /tmp/file.enc -out /tmp/file.gz

# Decompress (compression_flag = 1)
gunzip -c /tmp/file.gz > "output/project/src/main.rs"

# Set permissions
chmod 644 "output/project/src/main.rs"

# Verify SHA-256
sha256sum "output/project/src/main.rs"
# Expected: 536e506bb90914c243a12b397b9a998f85ae2cbd9ba02dfd03a9e155ca5ca0f4

# -------------------------------------------------------
# Step 8: Parse TOC Entry 3 (offset 278 = 0x116)
# -------------------------------------------------------
NAME_LEN=$(read_le_u16 archive.bin 278)   # Expected: 13
dd if=archive.bin bs=1 skip=280 count=13 2>/dev/null
# Expected: project/empty

ENTRY_TYPE=$(dd if=archive.bin bs=1 skip=293 count=1 2>/dev/null | xxd -p)
# Expected: 01 (directory)

PERMS=$(read_le_u16 archive.bin 294)   # Expected: 493 (= 0o755)

# Directory entry: create directory and set permissions
mkdir -p "output/project/empty"
chmod 755 "output/project/empty"
# Done -- no ciphertext to process

# -------------------------------------------------------
# Result: output/ contains the full directory tree
# -------------------------------------------------------
# output/
#   project/
#     src/
#       main.rs   (14 bytes, mode 644)
#     empty/      (empty dir, mode 755)
```

---

## 13. Appendix: Shell Decoder Reference

This appendix provides reference shell functions for decoding archives using only standard busybox commands.

### 13.1 Little-Endian Integer Reading

```sh
# Read a little-endian u16 from a binary file at a byte offset.
# Usage: read_le_u16 <file> <offset>
# Output: decimal integer value
read_le_u16() {
  local file="$1" offset="$2"
  local hex=$(dd if="$file" bs=1 skip="$offset" count=2 2>/dev/null | xxd -p)
  local b0=${hex:0:2} b1=${hex:2:2}
  printf '%d' "0x${b1}${b0}"
}

# Read a little-endian u32 from a binary file at a byte offset.
# Usage: read_le_u32 <file> <offset>
# Output: decimal integer value
read_le_u32() {
  local file="$1" offset="$2"
  local hex=$(dd if="$file" bs=1 skip="$offset" count=4 2>/dev/null | xxd -p)
  local b0=${hex:0:2} b1=${hex:2:2} b2=${hex:4:2} b3=${hex:6:2}
  printf '%d' "0x${b3}${b2}${b1}${b0}"
}
```

**Busybox compatibility note:** If `xxd` is not available, use `od` as a fallback:

```sh
# Fallback using od instead of xxd
read_le_u32_od() {
  local file="$1" offset="$2"
  local bytes=$(dd if="$file" bs=1 skip="$offset" count=4 2>/dev/null \
    | od -A n -t x1 | tr -d ' \n')
  local b0=${bytes:0:2} b1=${bytes:2:2} b2=${bytes:4:2} b3=${bytes:6:2}
  printf '%d' "0x${b3}${b2}${b1}${b0}"
}
```

### 13.2 Read Raw Bytes as Hex

```sh
# Read N bytes from file at offset as hex string (no spaces)
# Usage: read_hex <file> <offset> <count>
read_hex() {
  local file="$1" offset="$2" count="$3"
  dd if="$file" bs=1 skip="$offset" count="$count" 2>/dev/null | xxd -p | tr -d '\n'
}
```

### 13.3 HMAC-SHA-256 Verification

```sh
# Verify HMAC-SHA-256 of IV || ciphertext.
# Usage: verify_hmac <file> <iv_offset> <iv_length> <data_offset> <data_length> <expected_hex> <key_hex>
# Returns: 0 if HMAC matches, 1 if not
verify_hmac() {
  local file="$1"
  local iv_offset="$2" iv_length="$3"
  local data_offset="$4" data_length="$5"
  local expected="$6" key="$7"

  local actual=$(
    {
      dd if="$file" bs=1 skip="$iv_offset" count="$iv_length" 2>/dev/null
      dd if="$file" bs=1 skip="$data_offset" count="$data_length" 2>/dev/null
    } | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${key}" -hex 2>/dev/null \
      | awk '{print $NF}'
  )

  [ "$actual" = "$expected" ]
}
```

**Graceful degradation:** If the target busybox `openssl` does not support `-mac HMAC -macopt`, the shell decoder MAY skip HMAC verification. In this case, print a warning:

```sh
# Check if openssl HMAC is available
if ! echo -n "test" | openssl dgst -sha256 -mac HMAC -macopt hexkey:00 >/dev/null 2>&1; then
  echo "WARNING: openssl HMAC not available, skipping integrity verification"
  SKIP_HMAC=1
fi
```

### 13.4 Single-File Decryption

```sh
# Decrypt a single file from the archive.
# Usage: decrypt_file <archive> <data_offset> <encrypted_size> <iv_hex> <key_hex> <output> <is_compressed>
decrypt_file() {
  local archive="$1"
  local data_offset="$2" encrypted_size="$3"
  local iv_hex="$4" key_hex="$5"
  local output="$6" is_compressed="$7"

  # Extract ciphertext and decrypt it into a temporary file
  dd if="$archive" bs=1 skip="$data_offset" count="$encrypted_size" 2>/dev/null \
    | openssl enc -d -aes-256-cbc -nosalt -K "$key_hex" -iv "$iv_hex" \
    > /tmp/_decrypted_$$

  # Decompress if needed
  if [ "$is_compressed" = "1" ]; then
    gunzip -c /tmp/_decrypted_$$ > "$output"
  else
    mv /tmp/_decrypted_$$ "$output"
  fi

  rm -f /tmp/_decrypted_$$
}
```

### 13.5 SHA-256 Verification

```sh
# Verify SHA-256 of an extracted file.
# Usage: verify_sha256 <file> <expected_hex>
# Returns: 0 if matches, 1 if not
verify_sha256() {
  local file="$1" expected="$2"
  local actual=$(sha256sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}
```

### 13.6 Kotlin Decoder Reference

For Android implementations using `javax.crypto`:

```kotlin
import java.io.ByteArrayInputStream
import java.security.MessageDigest
import java.util.zip.GZIPInputStream
import javax.crypto.Cipher
import javax.crypto.Mac
import javax.crypto.spec.IvParameterSpec
import javax.crypto.spec.SecretKeySpec

/**
 * Decrypt a single file entry from the archive.
 *
 * @param ciphertext The encrypted data (encrypted_size bytes from the data block)
 * @param iv The 16-byte IV from the file table entry
 * @param key The 32-byte AES key
 * @return Decrypted data (after PKCS7 unpadding, which is automatic)
 */
fun decryptFileEntry(ciphertext: ByteArray, iv: ByteArray, key: ByteArray): ByteArray {
    val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
    // Note: PKCS5Padding in Java/Android == PKCS7 for 16-byte blocks
    val secretKey = SecretKeySpec(key, "AES")
    val ivSpec = IvParameterSpec(iv)
    cipher.init(Cipher.DECRYPT_MODE, secretKey, ivSpec)
    return cipher.doFinal(ciphertext)
}

/**
 * Verify HMAC-SHA-256 of IV || ciphertext.
 *
 * @param iv The 16-byte IV
 * @param ciphertext The encrypted data
 * @param key The 32-byte key (same as AES key in v1)
 * @param expectedHmac The 32-byte HMAC from the file table entry
 * @return true if HMAC matches
 */
fun verifyHmac(iv: ByteArray, ciphertext: ByteArray, key: ByteArray, expectedHmac: ByteArray): Boolean {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    mac.update(iv)
    mac.update(ciphertext)
    val computed = mac.doFinal()
    // MessageDigest.isEqual compares in constant time, unlike contentEquals,
    // so the comparison does not leak how many leading bytes matched.
    return MessageDigest.isEqual(computed, expectedHmac)
}

/**
 * Decompress gzip data.
 *
 * @param compressed Gzip-compressed data
 * @return Decompressed data
 */
fun decompressGzip(compressed: ByteArray): ByteArray {
    return GZIPInputStream(ByteArrayInputStream(compressed)).readBytes()
}

/**
 * Verify SHA-256 checksum of extracted content.
 *
 * @param data The decompressed file content
 * @param expectedSha256 The 32-byte SHA-256 from the file table entry
 * @return true if checksum matches
 */
fun verifySha256(data: ByteArray, expectedSha256: ByteArray): Boolean {
    val digest = MessageDigest.getInstance("SHA-256")
    val computed = digest.digest(data)
    return computed.contentEquals(expectedSha256)
}
```

**Full decode flow in Kotlin:**

```kotlin
// For each file entry:
// 1. Read ciphertext from data_offset (encrypted_size bytes)
// 2. Verify HMAC BEFORE decryption
if (!verifyHmac(entry.iv, ciphertext, key, entry.hmac)) {
    throw SecurityException("HMAC verification failed for ${entry.name}")
}
// 3. Decrypt
val compressed = decryptFileEntry(ciphertext, entry.iv, key)
// 4. Decompress if needed
val original = if (entry.compressionFlag == 1) decompressGzip(compressed) else compressed
// 5. Verify SHA-256
if (!verifySha256(original, entry.sha256)) {
    throw SecurityException("SHA-256 verification failed for ${entry.name}")
}
// 6. Write to file
File(outputDir, entry.name).writeBytes(original)
```
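
Before an entry name is used to build an output path (step 6 above, or the `mkdir`/redirect steps in the shell walkthrough), a decoder would likely want to enforce the naming rules from Section 2: names MUST NOT start with `/` or contain `..` components. The sketch below shows one way to express that check in portable shell. `is_safe_entry_name` is a hypothetical helper, not part of the reference decoder, and rejecting an empty name is an assumption made here rather than a rule stated by the format.

```sh
# Hypothetical helper: return 0 if NAME is acceptable under the Section 2
# naming rules, 1 otherwise.
is_safe_entry_name() {
    case "$1" in
        # Assumption: an empty name is treated as invalid.
        "") return 1 ;;
        # Absolute paths are not allowed.
        /*) return 1 ;;
        # Reject any path component that is exactly "..".
        ..|../*|*/..|*/../*) return 1 ;;
    esac
    return 0
}

for name in "project/src/main.rs" "../escape" "/etc/passwd"; do
    if is_safe_entry_name "$name"; then
        echo "ok: $name"
    else
        echo "rejected: $name"
    fi
done
```

The patterns match whole components only, so a name such as `..hidden` or `a..b/c` is still accepted, while `a/../b` is rejected.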