diff --git a/docs/FORMAT.md b/docs/FORMAT.md
index 60dd26e..a4b745e 100644
--- a/docs/FORMAT.md
+++ b/docs/FORMAT.md
@@ -462,3 +462,570 @@ The following steps MUST be followed in order by all decoders:
    - Introduce HKDF-derived per-file keys (replacing single shared key).
 5. **Backward compatibility:** Future versions SHOULD maintain the same magic bytes and the same position of the `version` field (offset `0x04`) so that decoders can read the version before deciding how to proceed.
+
+---
+
+## 12. Worked Example
+
+This section constructs a complete 2-file archive byte by byte. All offsets, field sizes, and hex values are internally consistent and can be verified by summing field sizes. This example serves as a **golden reference** for the archive layout in implementation testing; the ciphertext and HMAC bytes are representative placeholders, as noted in each step.
+
+### 12.1 Input Files
+
+| File | Name | Content | Size |
+|------|------|---------|------|
+| 1 | `hello.txt` | ASCII string `Hello` (bytes: `48 65 6C 6C 6F`) | 5 bytes |
+| 2 | `data.bin` | 32 bytes of `0x01` repeated | 32 bytes |
+
+### 12.2 Parameters
+
+- **Key:** 32 bytes: `00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F`
+- **Flags:** `0x01` (compression enabled, no obfuscation)
+- **Version:** `1`
+
+### 12.3 Per-File Pipeline Walkthrough
+
+#### File 1: `hello.txt`
+
+**Step 1: SHA-256 checksum of original content**
+
+```
+SHA-256("Hello") = 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
+```
+
+As bytes:
+```
+18 5F 8D B3 22 71 FE 25 F5 61 A6 FC 93 8B 2E 26
+43 06 EC 30 4E DA 51 80 07 D1 76 48 26 38 19 69
+```
+
+**Step 2: Gzip compression**
+
+Gzip output is implementation-dependent (timestamps and OS flags vary). For this example, we use a representative compressed size of **25 bytes**. The actual gzip output may differ between implementations, but every size in the rest of the pipeline is computed from this value.
+
+- `compressed_size = 25`
+
+**Step 3: Compute encrypted_size (PKCS7 padding)**
+
+```
+encrypted_size = ((25 / 16) + 1) * 16 = ((1) + 1) * 16 = 32 bytes
+```
+
+Here `/` denotes integer division. PKCS7 padding adds `32 - 25 = 7` bytes of value `0x07`.
+
+**Step 4: AES-256-CBC encryption**
+
+- IV (randomly chosen for this example): `AA BB CC DD EE FF 00 11 22 33 44 55 66 77 88 99`
+- Ciphertext: 32 bytes (actual value depends on the gzip output and IV; representative bytes used in the hex dump below)
+
+**Step 5: HMAC-SHA-256**
+
+```
+HMAC_input = IV (16 bytes) || ciphertext (32 bytes) = 48 bytes total
+HMAC-SHA-256(key, HMAC_input) = <32 bytes>
+```
+
+The HMAC value depends on the actual ciphertext; representative bytes (`0xC1` repeated) are used in the hex dump. In a real implementation, this MUST be computed from the actual IV and ciphertext.
+
+#### File 2: `data.bin`
+
+**Step 1: SHA-256 checksum of original content**
+
+```
+SHA-256(0x01 * 32) = 72cd6e8422c407fb6d098690f1130b7ded7ec2f7f5e1d30bd9d521f015363793
+```
+
+As bytes:
+```
+72 CD 6E 84 22 C4 07 FB 6D 09 86 90 F1 13 0B 7D
+ED 7E C2 F7 F5 E1 D3 0B D9 D5 21 F0 15 36 37 93
+```
+
+**Step 2: Gzip compression**
+
+Thirty-two identical bytes compress well. Representative compressed size: **22 bytes**.
+
+- `compressed_size = 22`
+
+**Step 3: Compute encrypted_size (PKCS7 padding)**
+
+```
+encrypted_size = ((22 / 16) + 1) * 16 = ((1) + 1) * 16 = 32 bytes
+```
+
+PKCS7 padding adds `32 - 22 = 10` bytes of value `0x0A`.
+
+**Step 4: AES-256-CBC encryption**
+
+- IV (randomly chosen for this example): `11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 00`
+- Ciphertext: 32 bytes (representative)
+
+**Step 5: HMAC-SHA-256**
+
+```
+HMAC_input = IV (16 bytes) || ciphertext (32 bytes) = 48 bytes total
+HMAC-SHA-256(key, HMAC_input) = <32 bytes>
+```
+
+Representative bytes (`0xD2` repeated) are used in the hex dump.
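The Step 3 arithmetic for both files can be reproduced with a one-line shell helper. This is a minimal sketch, not part of the format itself: the function name `pkcs7_padded_size` is hypothetical, and it assumes a 16-byte AES block and integer division, as in the formulas above.

```sh
# Hypothetical helper (not defined by the format): size of the AES-CBC
# ciphertext after PKCS7 padding, for an input of $1 bytes.
pkcs7_padded_size() {
    # Shell arithmetic uses integer division. PKCS7 always adds at least
    # one byte, so a block-aligned input grows by a full 16-byte block.
    echo $(( ($1 / 16 + 1) * 16 ))
}

pkcs7_padded_size 25   # file 1 (hello.txt): prints 32
pkcs7_padded_size 22   # file 2 (data.bin):  prints 32
pkcs7_padded_size 32   # block-aligned input: prints 48
```

The third call shows the case the two example files do not exercise: an input that is already a multiple of 16 still gains a full block of padding.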
+
+### 12.4 Archive Layout
+
+| Region | Start Offset | End Offset | Size | Description |
+|--------|-------------|------------|------|-------------|
+| Header | `0x0000` | `0x0027` | 40 bytes | Fixed header |
+| TOC Entry 1 | `0x0028` | `0x0095` | 110 bytes | `hello.txt` metadata |
+| TOC Entry 2 | `0x0096` | `0x0102` | 109 bytes | `data.bin` metadata |
+| Data Block 1 | `0x0103` | `0x0122` | 32 bytes | `hello.txt` ciphertext |
+| Data Block 2 | `0x0123` | `0x0142` | 32 bytes | `data.bin` ciphertext |
+| **Total** | | | **323 bytes** | |
+
+**Offset verification:**
+
+```
+TOC offset   = header_size                      = 40              (0x28)   CHECK
+TOC size     = entry1_size + entry2_size        = 110 + 109 = 219 (0xDB)   CHECK
+Data Block 1 = toc_offset + toc_size            = 40 + 219 = 259  (0x103)  CHECK
+Data Block 2 = data_offset_1 + encrypted_size_1 = 259 + 32 = 291  (0x123)  CHECK
+Archive end  = data_offset_2 + encrypted_size_2 = 291 + 32 = 323  (0x143)  CHECK
+```
+
+### 12.5 Header (Bytes 0x0000 - 0x0027)
+
+| Offset | Hex | Field | Value |
+|--------|-----|-------|-------|
+| `0x0000` | `00 EA 72 63` | magic | Custom magic bytes |
+| `0x0004` | `01` | version | 1 |
+| `0x0005` | `01` | flags | `0x01` = compression enabled |
+| `0x0006` | `02 00` | file_count | 2 (LE) |
+| `0x0008` | `28 00 00 00` | toc_offset | 40 (LE) |
+| `0x000C` | `DB 00 00 00` | toc_size | 219 (LE) |
+| `0x0010` | `00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00` | toc_iv | Zero-filled (TOC not encrypted) |
+| `0x0020` | `00 00 00 00 00 00 00 00` | reserved | Zero-filled |
+
+### 12.6 File Table Entry 1: `hello.txt` (Bytes 0x0028 - 0x0095)
+
+| Offset | Hex | Field | Value |
+|--------|-----|-------|-------|
+| `0x0028` | `09 00` | name_length | 9 (LE) |
+| `0x002A` | `68 65 6C 6C 6F 2E 74 78 74` | name | "hello.txt" (UTF-8) |
+| `0x0033` | `05 00 00 00` | original_size | 5 (LE) |
+| `0x0037` | `19 00 00 00` | compressed_size | 25 (LE) |
+| `0x003B` | `20 00 00 00` | encrypted_size | 32 (LE) |
+| `0x003F` | `03 01 00 00` | data_offset | 259 = 0x103 (LE) |
+| `0x0043` | `AA BB CC DD EE FF 00 11 22 33 44 55 66 77 88 99` | iv | Example IV for file 1 |
+| `0x0053` | `C1 C1 C1 ... (32 bytes)` | hmac | Representative HMAC (actual depends on ciphertext) |
+| `0x0073` | `18 5F 8D B3 22 71 FE 25 F5 61 A6 FC 93 8B 2E 26 43 06 EC 30 4E DA 51 80 07 D1 76 48 26 38 19 69` | sha256 | SHA-256 of "Hello" |
+| `0x0093` | `01` | compression_flag | 1 (gzip) |
+| `0x0094` | `00 00` | padding_after | 0 (no decoy padding) |
+
+**Entry size verification:** `2 + 9 + 4 + 4 + 4 + 4 + 16 + 32 + 32 + 1 + 2 = 110 bytes`. Offset range: `0x0028` to `0x0095` = 110 bytes. CHECK.
+
+### 12.7 File Table Entry 2: `data.bin` (Bytes 0x0096 - 0x0102)
+
+| Offset | Hex | Field | Value |
+|--------|-----|-------|-------|
+| `0x0096` | `08 00` | name_length | 8 (LE) |
+| `0x0098` | `64 61 74 61 2E 62 69 6E` | name | "data.bin" (UTF-8) |
+| `0x00A0` | `20 00 00 00` | original_size | 32 (LE) |
+| `0x00A4` | `16 00 00 00` | compressed_size | 22 (LE) |
+| `0x00A8` | `20 00 00 00` | encrypted_size | 32 (LE) |
+| `0x00AC` | `23 01 00 00` | data_offset | 291 = 0x123 (LE) |
+| `0x00B0` | `11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 00` | iv | Example IV for file 2 |
+| `0x00C0` | `D2 D2 D2 ... (32 bytes)` | hmac | Representative HMAC (actual depends on ciphertext) |
+| `0x00E0` | `72 CD 6E 84 22 C4 07 FB 6D 09 86 90 F1 13 0B 7D ED 7E C2 F7 F5 E1 D3 0B D9 D5 21 F0 15 36 37 93` | sha256 | SHA-256 of 32 x 0x01 |
+| `0x0100` | `01` | compression_flag | 1 (gzip) |
+| `0x0101` | `00 00` | padding_after | 0 (no decoy padding) |
+
+**Entry size verification:** `2 + 8 + 4 + 4 + 4 + 4 + 16 + 32 + 32 + 1 + 2 = 109 bytes`. Offset range: `0x0096` to `0x0102` = 109 bytes. CHECK.
+
+### 12.8 Data Blocks (Bytes 0x0103 - 0x0142)
+
+**Data Block 1** (bytes `0x0103` - `0x0122`, 32 bytes):
+
+Ciphertext of gzip-compressed "Hello", encrypted with AES-256-CBC. Actual bytes depend on the gzip output (which includes timestamps) and the IV. The hex dump uses 32 placeholder bytes (`0xE7` repeated).
+
+**Data Block 2** (bytes `0x0123` - `0x0142`, 32 bytes):
+
+Ciphertext of gzip-compressed `0x01 * 32`, encrypted with AES-256-CBC. The hex dump uses 32 placeholder bytes (`0xF8` repeated).
+
+### 12.9 Complete Annotated Hex Dump
+
+The following hex dump shows the full 323-byte archive. HMAC values (`C1...` and `D2...`) and ciphertext (`E7...` and `F8...`) are representative placeholders. SHA-256 hashes are real computed values.
+
+```
+Offset | Hex                                             | ASCII            | Annotation
+-------|-------------------------------------------------|------------------|------------------------------------------
+0x0000 | 00 EA 72 63 01 01 02 00 28 00 00 00 DB 00 00 00 | ..rc....(....... | Header: magic, ver=1, flags=0x01, count=2, toc_off=40, toc_sz=219
+0x0010 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ | Header: toc_iv (zero-filled, TOC not encrypted)
+0x0020 | 00 00 00 00 00 00 00 00 09 00 68 65 6C 6C 6F 2E | ..........hello. | Header: reserved | TOC Entry 1: name_len=9, name="hello."
+0x0030 | 74 78 74 05 00 00 00 19 00 00 00 20 00 00 00 03 | txt........ .... | Entry 1: "txt", orig=5, comp=25, enc=32, data_off=
+0x0040 | 01 00 00 AA BB CC DD EE FF 00 11 22 33 44 55 66 | ..........."3DUf | Entry 1: =259(0x103), iv[0..12]
+0x0050 | 77 88 99 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 | w............... | Entry 1: iv[13..15], hmac[0..12]
+0x0060 | C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 | ................ | Entry 1: hmac[13..28]
+0x0070 | C1 C1 C1 18 5F 8D B3 22 71 FE 25 F5 61 A6 FC 93 | ...._.."q.%.a... | Entry 1: hmac[29..31], sha256[0..12]
+0x0080 | 8B 2E 26 43 06 EC 30 4E DA 51 80 07 D1 76 48 26 | ..&C..0N.Q...vH& | Entry 1: sha256[13..28]
+0x0090 | 38 19 69 01 00 00 08 00 64 61 74 61 2E 62 69 6E | 8.i.....data.bin | Entry 1: sha256[29..31], comp=1, pad=0 | Entry 2: name_len=8, name="data.bin"
+0x00A0 | 20 00 00 00 16 00 00 00 20 00 00 00 23 01 00 00 |  ....... ...#... | Entry 2: orig=32, comp=22, enc=32, data_off=291(0x123)
+0x00B0 | 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 00 | ."3DUfw......... | Entry 2: iv[0..15]
+0x00C0 | D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 | ................ | Entry 2: hmac[0..15]
+0x00D0 | D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 D2 | ................ | Entry 2: hmac[16..31]
+0x00E0 | 72 CD 6E 84 22 C4 07 FB 6D 09 86 90 F1 13 0B 7D | r.n."...m......} | Entry 2: sha256[0..15]
+0x00F0 | ED 7E C2 F7 F5 E1 D3 0B D9 D5 21 F0 15 36 37 93 | .~........!..67. | Entry 2: sha256[16..31]
+0x0100 | 01 00 00 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 | ................ | Entry 2: comp=1, pad=0 | Data Block 1: ciphertext[0..12]
+0x0110 | E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 E7 | ................ | Data Block 1: ciphertext[13..28]
+0x0120 | E7 E7 E7 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 | ................ | Data Block 1: ciphertext[29..31] | Data Block 2: ciphertext[0..12]
+0x0130 | F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 | ................ | Data Block 2: ciphertext[13..28]
+0x0140 | F8 F8 F8                                        | ...              | Data Block 2: ciphertext[29..31]
+```
+
+**Total: 323 bytes (0x143).**
+
+### 12.10 Step-by-Step Shell Decode Walkthrough
+
+The following shell commands walk through decoding this archive. `dd` and `xxd` parse the header and TOC, while `openssl`, `gunzip`, and `sha256sum` handle verification, decryption, and decompression. Because the hex dump above uses placeholder HMAC and ciphertext bytes, Steps 16 to 19 succeed only against an archive produced by a real encoder with the same layout. The `read_le_u16` and `read_le_u32` functions are defined in the Appendix (Section 13).
+
+```sh
+# -------------------------------------------------------
+# Step 1: Read and verify magic bytes
+# -------------------------------------------------------
+dd if=archive.bin bs=1 skip=0 count=4 2>/dev/null | xxd -p
+# Expected: 00ea7263
+
+# -------------------------------------------------------
+# Step 2: Read version
+# -------------------------------------------------------
+dd if=archive.bin bs=1 skip=4 count=1 2>/dev/null | xxd -p
+# Expected: 01
+
+# -------------------------------------------------------
+# Step 3: Read flags
+# -------------------------------------------------------
+dd if=archive.bin bs=1 skip=5 count=1 2>/dev/null | xxd -p
+# Expected: 01 (compression enabled)
+
+# -------------------------------------------------------
+# Step 4: Read file count
+# -------------------------------------------------------
+read_le_u16 archive.bin 6
+# Expected: 2
+
+# -------------------------------------------------------
+# Step 5: Read TOC offset
+# -------------------------------------------------------
+read_le_u32 archive.bin 8
+# Expected: 40
+
+# -------------------------------------------------------
+# Step 6: Read TOC size
+# -------------------------------------------------------
+read_le_u32 archive.bin 12
+# Expected: 219
+
+# -------------------------------------------------------
+# Step 7: Read TOC Entry 1 -- name_length
+# -------------------------------------------------------
+read_le_u16 archive.bin 40
+# Expected: 9
+
+# -------------------------------------------------------
+# Step 8: Read TOC Entry 1 -- filename
+# -------------------------------------------------------
+dd if=archive.bin bs=1 skip=42 count=9 2>/dev/null
+# Expected: hello.txt
+
+# -------------------------------------------------------
+# Step 9: Read TOC Entry 1 -- original_size
+# -------------------------------------------------------
+read_le_u32 archive.bin 51
+# Expected: 5
+
+# -------------------------------------------------------
+# Step 10: Read TOC Entry 1 -- compressed_size
+# -------------------------------------------------------
+read_le_u32 archive.bin 55
+# Expected: 25
+
+# -------------------------------------------------------
+# Step 11: Read TOC Entry 1 -- encrypted_size
+# -------------------------------------------------------
+read_le_u32 archive.bin 59
+# Expected: 32
+
+# -------------------------------------------------------
+# Step 12: Read TOC Entry 1 -- data_offset
+# -------------------------------------------------------
+read_le_u32 archive.bin 63
+# Expected: 259
+
+# -------------------------------------------------------
+# Step 13: Read TOC Entry 1 -- IV (16 bytes)
+# -------------------------------------------------------
+dd if=archive.bin bs=1 skip=67 count=16 2>/dev/null | xxd -p
+# Expected: aabbccddeeff00112233445566778899
+
+# -------------------------------------------------------
+# Step 14: Read TOC Entry 1 -- HMAC (32 bytes)
+# -------------------------------------------------------
+# xxd -p wraps its output after 30 bytes, so strip the newlines to get
+# a single 64-character hex string for the comparison in step 16.
+dd if=archive.bin bs=1 skip=83 count=32 2>/dev/null | xxd -p | tr -d '\n'
+# (32 bytes of HMAC for verification)
+
+# -------------------------------------------------------
+# Step 15: Extract ciphertext for file 1
+# -------------------------------------------------------
+dd if=archive.bin bs=1 skip=259 count=32 of=/tmp/file1.enc 2>/dev/null
+
+# -------------------------------------------------------
+# Step 16: Verify HMAC for file 1
+# -------------------------------------------------------
+# Create HMAC input: IV (16 bytes) || ciphertext (32 bytes)
+IV_HEX="aabbccddeeff00112233445566778899"
+KEY_HEX="000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f"
+
+# Extract IV and ciphertext, concatenate, compute HMAC
+{
+    dd if=archive.bin bs=1 skip=67 count=16 2>/dev/null    # IV
+    dd if=archive.bin bs=1 skip=259 count=32 2>/dev/null   # ciphertext
+} | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${KEY_HEX}" -hex 2>/dev/null \
+    | awk '{print $NF}'
+# Compare output with stored HMAC from step 14
+
+# -------------------------------------------------------
+# Step 17: Decrypt file 1
+# -------------------------------------------------------
+openssl enc -d -aes-256-cbc -nosalt \
+    -K "${KEY_HEX}" \
+    -iv "${IV_HEX}" \
+    -in /tmp/file1.enc -out /tmp/file1.gz
+
+# -------------------------------------------------------
+# Step 18: Decompress file 1
+# -------------------------------------------------------
+gunzip -c /tmp/file1.gz > /tmp/hello.txt
+
+# -------------------------------------------------------
+# Step 19: Verify SHA-256 of extracted file
+# -------------------------------------------------------
+sha256sum /tmp/hello.txt
+# Expected: 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
+```
+
+---
+
+## 13. Appendix: Shell Decoder Reference
+
+This appendix provides reference shell functions for decoding archives using standard busybox commands and `openssl`.
+
+### 13.1 Little-Endian Integer Reading
+
+```sh
+# Read a little-endian u16 from a binary file at a byte offset.
+# Usage: read_le_u16 FILE OFFSET
+# Output: decimal integer value
+read_le_u16() {
+    local file="$1" offset="$2"
+    local hex=$(dd if="$file" bs=1 skip="$offset" count=2 2>/dev/null | xxd -p)
+    local b0=${hex:0:2} b1=${hex:2:2}
+    printf '%d' "0x${b1}${b0}"
+}
+
+# Read a little-endian u32 from a binary file at a byte offset.
+# Usage: read_le_u32 FILE OFFSET
+# Output: decimal integer value
+read_le_u32() {
+    local file="$1" offset="$2"
+    local hex=$(dd if="$file" bs=1 skip="$offset" count=4 2>/dev/null | xxd -p)
+    local b0=${hex:0:2} b1=${hex:2:2} b2=${hex:4:2} b3=${hex:6:2}
+    printf '%d' "0x${b3}${b2}${b1}${b0}"
+}
+```
+
+**Busybox compatibility note:** If `xxd` is not available, use `od` as a fallback:
+
+```sh
+# Fallback using od instead of xxd
+read_le_u32_od() {
+    local file="$1" offset="$2"
+    local bytes=$(dd if="$file" bs=1 skip="$offset" count=4 2>/dev/null \
+        | od -A n -t x1 | tr -d ' \n')
+    local b0=${bytes:0:2} b1=${bytes:2:2} b2=${bytes:4:2} b3=${bytes:6:2}
+    printf '%d' "0x${b3}${b2}${b1}${b0}"
+}
+```
+
+### 13.2 Read Raw Bytes as Hex
+
+```sh
+# Read N bytes from file at offset as hex string (no spaces)
+# Usage: read_hex FILE OFFSET COUNT
+read_hex() {
+    local file="$1" offset="$2" count="$3"
+    dd if="$file" bs=1 skip="$offset" count="$count" 2>/dev/null | xxd -p | tr -d '\n'
+}
+```
+
+### 13.3 HMAC-SHA-256 Verification
+
+```sh
+# Verify HMAC-SHA-256 of IV || ciphertext.
+# Usage: verify_hmac FILE IV_OFFSET IV_LENGTH DATA_OFFSET DATA_LENGTH EXPECTED_HEX KEY_HEX
+# EXPECTED_HEX must be lowercase hex without separators (as produced by read_hex).
+# Returns: 0 if HMAC matches, 1 if not
+verify_hmac() {
+    local file="$1"
+    local iv_offset="$2" iv_length="$3"
+    local data_offset="$4" data_length="$5"
+    local expected="$6" key="$7"
+
+    local actual=$(
+        {
+            dd if="$file" bs=1 skip="$iv_offset" count="$iv_length" 2>/dev/null
+            dd if="$file" bs=1 skip="$data_offset" count="$data_length" 2>/dev/null
+        } | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${key}" -hex 2>/dev/null \
+          | awk '{print $NF}'
+    )
+
+    [ "$actual" = "$expected" ]
+}
+```
+
+**Graceful degradation:** If the target busybox `openssl` does not support `-mac HMAC -macopt`, the shell decoder MAY skip HMAC verification. In this case, print a warning:
+
+```sh
+# Check if openssl HMAC is available
+# (printf is used instead of `echo -n`, which is not portable across shells)
+if ! printf 'test' | openssl dgst -sha256 -mac HMAC -macopt hexkey:00 >/dev/null 2>&1; then
+    echo "WARNING: openssl HMAC not available, skipping integrity verification"
+    SKIP_HMAC=1
+fi
+```
+
+### 13.4 Single-File Decryption
+
+```sh
+# Decrypt a single file from the archive.
+# Usage: decrypt_file ARCHIVE DATA_OFFSET ENCRYPTED_SIZE IV_HEX KEY_HEX OUTPUT IS_COMPRESSED
+# Returns: 0 on success, non-zero if decryption or decompression fails
+decrypt_file() {
+    local archive="$1"
+    local data_offset="$2" encrypted_size="$3"
+    local iv_hex="$4" key_hex="$5"
+    local output="$6" is_compressed="$7"
+    local tmp="/tmp/_decrypted_$$"
+
+    # Extract ciphertext and decrypt it; stop if openssl reports an error
+    dd if="$archive" bs=1 skip="$data_offset" count="$encrypted_size" 2>/dev/null \
+        | openssl enc -d -aes-256-cbc -nosalt -K "$key_hex" -iv "$iv_hex" \
+        > "$tmp" || { rm -f "$tmp"; return 1; }
+
+    # Decompress if needed
+    if [ "$is_compressed" = "1" ]; then
+        gunzip -c "$tmp" > "$output"
+    else
+        mv "$tmp" "$output"
+    fi
+    local rc=$?
+
+    rm -f "$tmp"
+    return "$rc"
+}
+```
+
+### 13.5 SHA-256 Verification
+
+```sh
+# Verify SHA-256 of an extracted file.
+# Usage: verify_sha256 FILE EXPECTED_HEX
+# Returns: 0 if matches, 1 if not
+verify_sha256() {
+    local file="$1" expected="$2"
+    local actual=$(sha256sum "$file" | awk '{print $1}')
+    [ "$actual" = "$expected" ]
+}
+```
+
+### 13.6 Kotlin Decoder Reference
+
+For Android implementations using `javax.crypto`:
+
+```kotlin
+import java.io.ByteArrayInputStream
+import java.security.MessageDigest
+import java.util.zip.GZIPInputStream
+import javax.crypto.Cipher
+import javax.crypto.Mac
+import javax.crypto.spec.IvParameterSpec
+import javax.crypto.spec.SecretKeySpec
+
+/**
+ * Decrypt a single file entry from the archive.
+ *
+ * @param ciphertext The encrypted data (encrypted_size bytes from the data block)
+ * @param iv The 16-byte IV from the file table entry
+ * @param key The 32-byte AES key
+ * @return Decrypted data (after PKCS7 unpadding, which is automatic)
+ */
+fun decryptFileEntry(ciphertext: ByteArray, iv: ByteArray, key: ByteArray): ByteArray {
+    val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
+    // Note: PKCS5Padding in Java/Android == PKCS7 for 16-byte blocks
+    val secretKey = SecretKeySpec(key, "AES")
+    val ivSpec = IvParameterSpec(iv)
+    cipher.init(Cipher.DECRYPT_MODE, secretKey, ivSpec)
+    return cipher.doFinal(ciphertext)
+}
+
+/**
+ * Verify HMAC-SHA-256 of IV || ciphertext.
+ *
+ * @param iv The 16-byte IV
+ * @param ciphertext The encrypted data
+ * @param key The 32-byte key (same as AES key in v1)
+ * @param expectedHmac The 32-byte HMAC from the file table entry
+ * @return true if HMAC matches
+ */
+fun verifyHmac(iv: ByteArray, ciphertext: ByteArray, key: ByteArray, expectedHmac: ByteArray): Boolean {
+    val mac = Mac.getInstance("HmacSHA256")
+    mac.init(SecretKeySpec(key, "HmacSHA256"))
+    mac.update(iv)
+    mac.update(ciphertext)
+    val computed = mac.doFinal()
+    // Constant-time comparison, so timing does not reveal how many leading bytes matched
+    return MessageDigest.isEqual(computed, expectedHmac)
+}
+
+/**
+ * Decompress gzip data.
+ *
+ * @param compressed Gzip-compressed data
+ * @return Decompressed data
+ */
+fun decompressGzip(compressed: ByteArray): ByteArray {
+    // use {} closes the stream even if decompression throws
+    return GZIPInputStream(ByteArrayInputStream(compressed)).use { it.readBytes() }
+}
+
+/**
+ * Verify SHA-256 checksum of extracted content.
+ *
+ * @param data The decompressed file content
+ * @param expectedSha256 The 32-byte SHA-256 from the file table entry
+ * @return true if checksum matches
+ */
+fun verifySha256(data: ByteArray, expectedSha256: ByteArray): Boolean {
+    val digest = MessageDigest.getInstance("SHA-256")
+    val computed = digest.digest(data)
+    return computed.contentEquals(expectedSha256)
+}
+```
+
+**Full decode flow in Kotlin:**
+
+```kotlin
+// For each file entry:
+// 1. Read ciphertext from data_offset (encrypted_size bytes)
+// 2. Verify HMAC BEFORE decryption
+if (!verifyHmac(entry.iv, ciphertext, key, entry.hmac)) {
+    throw SecurityException("HMAC verification failed for ${entry.name}")
+}
+// 3. Decrypt
+val compressed = decryptFileEntry(ciphertext, entry.iv, key)
+// 4. Decompress if needed
+val original = if (entry.compressionFlag == 1) decompressGzip(compressed) else compressed
+// 5. Verify SHA-256
+if (!verifySha256(original, entry.sha256)) {
+    throw SecurityException("SHA-256 verification failed for ${entry.name}")
+}
+// 6. Write to file
+File(outputDir, entry.name).writeBytes(original)
+```
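Step 1 of the flow above relies on each entry's `data_offset`, so the layout arithmetic from Section 12.4 is worth checking before running a decoder against the worked example. The following is a minimal sketch under stated assumptions: the helper name `toc_entry_size` and the variable names are hypothetical, and the constant 101 is the sum of the fixed-width entry fields listed in Sections 12.6 and 12.7.

```sh
# Hypothetical helper (not defined by the format): size of one TOC entry
# for a file name of $1 bytes. Fixed-width fields sum to 101 bytes:
#   2 (name_length) + 4*4 (three sizes and data_offset)
#   + 16 (iv) + 32 (hmac) + 32 (sha256) + 1 (compression_flag) + 2 (padding_after)
toc_entry_size() {
    echo $(( 101 + $1 ))
}

header_size=40
entry1=$(toc_entry_size 9)    # hello.txt
entry2=$(toc_entry_size 8)    # data.bin
toc_size=$(( entry1 + entry2 ))
data_offset_1=$(( header_size + toc_size ))
data_offset_2=$(( data_offset_1 + 32 ))   # encrypted_size of file 1
archive_end=$(( data_offset_2 + 32 ))     # encrypted_size of file 2

printf 'toc_size=%d data1=0x%X data2=0x%X end=0x%X\n' \
    "$toc_size" "$data_offset_1" "$data_offset_2" "$archive_end"
# prints: toc_size=219 data1=0x103 data2=0x123 end=0x143
```

The printed values match the `toc_size` header field, the two `data_offset` fields, and the 323-byte total in the worked example.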