feat(01-01): write format specification with byte-level field definitions

- Archive header definition (40 bytes) with complete field table
- File table entry definition (11 fields, variable-length per entry)
- AES-256-CBC + HMAC-SHA-256 encryption pipeline with encrypt-then-MAC
- PKCS7 padding formula with 8 worked examples
- Gzip compression details with per-file flag
- Obfuscation features: XOR header, encrypted TOC, decoy padding
- Decode order of operations (full step-by-step)
- Version compatibility rules

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
docs/FORMAT.md
@@ -0,0 +1,464 @@
# Encrypted Archive Binary Format Specification

**Version:** 1.0
**Date:** 2026-02-24
**Status:** Normative

---

## Table of Contents

1. [Overview and Design Goals](#1-overview-and-design-goals)
2. [Notation Conventions](#2-notation-conventions)
3. [Archive Structure Diagram](#3-archive-structure-diagram)
4. [Archive Header Definition](#4-archive-header-definition)
5. [File Table Entry Definition](#5-file-table-entry-definition)
6. [Data Block Layout](#6-data-block-layout)
7. [Encryption and Authentication Details](#7-encryption-and-authentication-details)
8. [Compression Details](#8-compression-details)
9. [Obfuscation Features](#9-obfuscation-features)
10. [Decode Order of Operations](#10-decode-order-of-operations)
11. [Version Compatibility Rules](#11-version-compatibility-rules)
12. [Worked Example](#12-worked-example)
13. [Appendix: Shell Decoder Reference](#13-appendix-shell-decoder-reference)

---

## 1. Overview and Design Goals

This document specifies the binary format for `encrypted_archive` -- a custom archive container designed to be **unrecognizable by standard tools**. Standard utilities (`file`, `binwalk`, `7z`, `tar`, `unzip`) must not be able to identify or extract the contents of an archive produced in this format.

### Target Decoders

Three independent implementations will build against this specification:

1. **Rust CLI archiver** (`encrypted_archive pack`/`unpack`) -- the reference encoder and primary decoder, runs on Linux/macOS.
2. **Kotlin Android decoder** -- runs on Android 13 (Qualcomm SoC) using only `javax.crypto` and `java.util.zip`. Primary extraction path on the target device.
3. **Busybox shell decoder** -- a fallback shell script using only the busybox applets `dd`, `xxd`, `gunzip`, and `sh`, plus the `openssl` command-line tool (which is not a busybox applet and is assumed to be present on the device). Must work without any other external dependencies.

### Core Constraint

The shell decoder must be able to parse the archive format using `dd` (for byte extraction), `xxd` (for hex conversion), and `openssl enc` (for AES-CBC decryption with raw key mode: `-K`/`-iv`/`-nosalt`). This constraint drives several design choices:

- Fixed-size header at a known offset (no variable-length preamble before the TOC pointer)
- Absolute offsets (no relative offset chains that require cumulative addition)
- IVs stored in the file table, not embedded in data blocks (single `dd` call per extraction)
- Little-endian integers (native byte order on x86 and on ARM as commonly configured)

---

## 2. Notation Conventions

| Convention | Meaning |
|------------|---------|
| **LE** | Little-endian byte order |
| **u8** | Unsigned 8-bit integer (1 byte) |
| **u16** | Unsigned 16-bit integer (2 bytes) |
| **u32** | Unsigned 32-bit integer (4 bytes) |
| **bytes** | Raw byte sequence (no endianness) |
| Offset `0xNN` | Absolute byte offset from archive byte 0 |
| Size | Always in bytes unless stated otherwise |
| `\|\|` | Concatenation of byte sequences |

- All multi-byte integers are **little-endian (LE)**.
- All sizes are in **bytes** unless stated otherwise.
- All offsets are **absolute** from archive byte 0 (the first byte of the file).
- Filenames are **UTF-8 encoded**, length-prefixed with a u16 byte count (NOT null-terminated).
- Reserved fields are **zero-filled** and MUST be written as `0x00` bytes.

---

## 3. Archive Structure Diagram

```
+=======================================+
|           ARCHIVE HEADER              |  Fixed 40 bytes
|  magic(4) | ver(1) | flags(1)         |
|  file_count(2) | toc_offset(4)        |
|  toc_size(4) | toc_iv(16)             |
|  reserved(8)                          |
+=======================================+
|           FILE TABLE (TOC)            |  Variable size
|  Entry 1: name, sizes, offset,        |  Optionally encrypted
|           iv, hmac, sha256, flags     |  (see Section 9.2)
|  Entry 2: ...                         |
|  ...                                  |
|  Entry N: ...                         |
+=======================================+
|           DATA BLOCK 1                |  encrypted_size bytes
|  [ciphertext]                         |
+---------------------------------------+
|           [DECOY PADDING 1]           |  Optional (see Section 9.3)
+---------------------------------------+
|           DATA BLOCK 2                |  encrypted_size bytes
|  [ciphertext]                         |
+---------------------------------------+
|           [DECOY PADDING 2]           |  Optional (see Section 9.3)
+---------------------------------------+
|  ...                                  |
+=======================================+
```

The archive consists of three contiguous regions:

1. **Header** (fixed 40 bytes) -- contains magic bytes, version, flags, and a pointer to the file table.
2. **File Table (TOC)** (variable size) -- contains one entry per archived file with all metadata needed for extraction.
3. **Data Blocks** (variable size) -- contains the encrypted (and optionally compressed) file contents, one block per file, optionally separated by decoy padding.

---

## 4. Archive Header Definition

The header is a fixed-size 40-byte structure at offset 0x00.

| Offset | Size | Type | Endian | Field | Description |
|--------|------|------|--------|-------|-------------|
| `0x00` | 4 | bytes | - | `magic` | Custom magic bytes: `0x00 0xEA 0x72 0x63`. The leading `0x00` signals binary content; the remaining bytes (`0xEA 0x72 0x63`) do not match any known file signature. |
| `0x04` | 1 | u8 | - | `version` | Format version. Value `1` for this specification (v1). |
| `0x05` | 1 | u8 | - | `flags` | Feature flags bitfield (see below). |
| `0x06` | 2 | u16 | LE | `file_count` | Number of files stored in the archive. |
| `0x08` | 4 | u32 | LE | `toc_offset` | Absolute byte offset of the file table from archive start. |
| `0x0C` | 4 | u32 | LE | `toc_size` | Size of the file table in bytes (if TOC encryption is on, this is the encrypted size including PKCS7 padding). |
| `0x10` | 16 | bytes | - | `toc_iv` | Initialization vector for encrypted TOC. Zero-filled (`0x00` x 16) when TOC encryption flag (bit 1) is off. |
| `0x20` | 8 | bytes | - | `reserved` | Reserved for future use. MUST be zero-filled. |

**Total header size: 40 bytes (0x28).**

### Flags Bitfield

| Bit | Mask | Name | Description |
|-----|------|------|-------------|
| 0 | `0x01` | `compression` | Per-file compression enabled. When set, files MAY be individually gzip-compressed (per-file `compression_flag` controls each file). When clear, all files are stored raw. |
| 1 | `0x02` | `toc_encrypted` | File table is encrypted with AES-256-CBC using `toc_iv`. When clear, file table is stored as plaintext. |
| 2 | `0x04` | `xor_header` | Header bytes are XOR-obfuscated (see Section 9.1). When clear, header is stored as-is. |
| 3 | `0x08` | `decoy_padding` | Random decoy bytes are inserted after data blocks (see Section 9.3). When clear, `padding_after` in every file table entry is 0. |
| 4-7 | `0xF0` | reserved | Reserved. MUST be `0`. |
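
The header table maps directly onto a fixed little-endian layout. As a minimal sketch (not the reference implementation), the 40 bytes could be unpacked in Python as follows. The names `ArchiveHeader`, `parse_header`, and the `FLAG_*` constants are illustrative and not defined by this specification; the input is assumed to be already de-XORed (Section 9.1).

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 40
MAGIC = bytes([0x00, 0xEA, 0x72, 0x63])

# Flag masks taken from the Flags Bitfield table (constant names are illustrative).
FLAG_COMPRESSION = 0x01
FLAG_TOC_ENCRYPTED = 0x02
FLAG_XOR_HEADER = 0x04
FLAG_DECOY_PADDING = 0x08

# "<" selects little-endian with no alignment padding:
# magic(4s) version(B) flags(B) file_count(H) toc_offset(I) toc_size(I) toc_iv(16s) reserved(8s)
HEADER_FORMAT = "<4sBBHII16s8s"


class ArchiveHeader(NamedTuple):
    magic: bytes
    version: int
    flags: int
    file_count: int
    toc_offset: int
    toc_size: int
    toc_iv: bytes
    reserved: bytes


def parse_header(raw: bytes) -> ArchiveHeader:
    """Unpack the 40-byte header; assumes any XOR obfuscation was already removed."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("header truncated")
    header = ArchiveHeader(*struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE]))
    if header.magic != MAGIC:
        raise ValueError("bad magic")
    return header
```

Individual features can then be tested with a bitwise AND, for example `header.flags & FLAG_TOC_ENCRYPTED`.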

---

## 5. File Table Entry Definition

The file table (TOC) is a contiguous sequence of variable-length entries, one per file. Entries are stored in the order files were added to the archive. There is no per-entry delimiter; entries are read sequentially using the `name_length` field to determine where each entry's variable-length name ends.

### Entry Field Table

| Field | Size | Type | Endian | Description |
|-------|------|------|--------|-------------|
| `name_length` | 2 | u16 | LE | Filename length in bytes (UTF-8 encoded byte count). |
| `name` | `name_length` | bytes | - | Filename as UTF-8 bytes. NOT null-terminated. May contain path separators (`/`). |
| `original_size` | 4 | u32 | LE | Original file size in bytes (before compression). |
| `compressed_size` | 4 | u32 | LE | Size after gzip compression. Equals `original_size` if `compression_flag` is 0 (no compression). |
| `encrypted_size` | 4 | u32 | LE | Size after AES-256-CBC encryption with PKCS7 padding. Formula: `((compressed_size / 16) + 1) * 16`, where `/` is integer division. |
| `data_offset` | 4 | u32 | LE | Absolute byte offset of this file's data block from archive start. |
| `iv` | 16 | bytes | - | Random AES-256-CBC initialization vector for this file. |
| `hmac` | 32 | bytes | - | HMAC-SHA-256 over `iv \|\| ciphertext`. See Section 7 for details. |
| `sha256` | 32 | bytes | - | SHA-256 hash of the original file content (before compression and encryption). |
| `compression_flag` | 1 | u8 | - | `0` = raw (no compression), `1` = gzip compressed. |
| `padding_after` | 2 | u16 | LE | Number of decoy padding bytes after this file's data block. Always `0` when flags bit 3 (decoy_padding) is off. |

### Entry Size Formula

Each file table entry has a total size of:

```
entry_size = 2 + name_length + 4 + 4 + 4 + 4 + 16 + 32 + 32 + 1 + 2
           = 101 + name_length bytes
```

### File Table Total Size

The plaintext (serialized) file table size is the sum of all entry sizes:

```
plaintext_toc_size = SUM(101 + name_length_i) for i in 0..file_count-1
```

When TOC encryption (flags bit 1) is active, the on-disk size also includes PKCS7 padding:

```
encrypted_toc_size = ((plaintext_toc_size / 16) + 1) * 16
```

Where `/` is integer division (floor). The `toc_size` field in the header stores the **actual size on disk**: `encrypted_toc_size` if TOC encryption is on, `plaintext_toc_size` if off.

---

## 6. Data Block Layout

Each file has a single contiguous data block containing **only the ciphertext** (the AES-256-CBC encrypted output).

```
[ciphertext: encrypted_size bytes]
```

**Important design decisions:**

- The **IV is stored only in the file table entry**, not duplicated at the start of the data block. The data block contains only ciphertext. This simplifies `dd` extraction in the shell decoder: a single `dd` call with the correct offset and size extracts the complete ciphertext.
- The **HMAC is stored only in the file table entry**, not appended to the data block. The decoder reads the HMAC from the TOC, then verifies against the data block contents.
- If decoy padding is enabled (flags bit 3), `padding_after` bytes of random data follow the ciphertext. The decoder MUST skip these bytes. The next file's data block starts at offset `data_offset + encrypted_size + padding_after`.

### Data Block Ordering

Data blocks appear in the same order as file table entries. For file entry `i`:

```
data_offset_0 = toc_offset + toc_size
data_offset_i = data_offset_{i-1} + encrypted_size_{i-1} + padding_after_{i-1}
```
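
The recurrence above is what an encoder would evaluate once, so that decoders only ever see absolute offsets. A small Python sketch, with the hypothetical helper name `layout_data_offsets` and made-up sizes:

```python
from typing import List, Tuple


def layout_data_offsets(toc_offset: int, toc_size: int,
                        blocks: List[Tuple[int, int]]) -> List[int]:
    """Return data_offset for each file, given (encrypted_size, padding_after) pairs."""
    offsets = []
    cursor = toc_offset + toc_size  # data_offset_0: first byte after the file table
    for encrypted_size, padding_after in blocks:
        offsets.append(cursor)
        # The next block starts after this block's ciphertext and its decoy padding.
        cursor += encrypted_size + padding_after
    return offsets


# Example: header at 0..39, a 224-byte TOC, three files, 7 decoy bytes after the second.
print(layout_data_offsets(40, 224, [(16, 0), (32, 7), (48, 0)]))  # [264, 280, 319]
```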

---

## 7. Encryption and Authentication Details

### Pipeline

Each file is processed through the following pipeline, in order:

```
original_file
    |
    v
[1. SHA-256 checksum] --> stored in file table entry as `sha256`
    |
    v
[2. Gzip compress] (if compression_flag = 1) --> compressed_data
    |                                             (size = compressed_size)
    v
[3. PKCS7 pad] --> padded_data
    |               (size = encrypted_size)
    v
[4. AES-256-CBC encrypt] (with random IV) --> ciphertext
    |                                          (size = encrypted_size)
    v
[5. HMAC-SHA-256] (over IV || ciphertext) --> stored in file table entry as `hmac`
```

### AES-256-CBC

- **Key:** 32 bytes (256 bits), hardcoded and shared across all three decoders.
- **IV:** 16 bytes, randomly generated for each file. Stored in the file table entry `iv` field.
- **Block size:** 16 bytes.
- **Mode:** CBC (Cipher Block Chaining).
- The same 32-byte key is used for all files in the archive.

### PKCS7 Padding

PKCS7 padding is applied to the compressed (or raw) data before encryption. PKCS7 **always adds at least 1 byte** of padding. If the input length is already a multiple of 16, a full 16-byte padding block is added.

**Formula:**

```
encrypted_size = ((compressed_size / 16) + 1) * 16
```

Where `/` is integer division (floor).

**Examples:**

| `compressed_size` | Padding bytes | `encrypted_size` |
|-------------------|---------------|------------------|
| 0 | 16 | 16 |
| 1 | 15 | 16 |
| 15 | 1 | 16 |
| 16 | 16 | 32 |
| 17 | 15 | 32 |
| 31 | 1 | 32 |
| 32 | 16 | 48 |
| 100 | 12 | 112 |
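
The rows above can be reproduced with a few lines of Python. This is only a sketch of the padding arithmetic; the helper names `encrypted_size`, `pkcs7_pad`, and `pkcs7_unpad` are illustrative. In practice the AES library normally applies and removes the padding itself.

```python
BLOCK_SIZE = 16


def encrypted_size(compressed_size: int) -> int:
    """Size after PKCS7 padding, using floor division as in the formula above."""
    return (compressed_size // BLOCK_SIZE + 1) * BLOCK_SIZE


def pkcs7_pad(data: bytes) -> bytes:
    """Append pad_len bytes, each with value pad_len (1..16, never 0)."""
    pad_len = BLOCK_SIZE - (len(data) % BLOCK_SIZE)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes) -> bytes:
    """Validate and strip PKCS7 padding."""
    if not padded or len(padded) % BLOCK_SIZE != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if not 1 <= pad_len <= BLOCK_SIZE or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS7 padding")
    return padded[:-pad_len]


for size in (0, 1, 15, 16, 17, 31, 32, 100):
    padded = pkcs7_pad(bytes(size))
    print(size, len(padded) - size, encrypted_size(size))
```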

### HMAC-SHA-256

- **Key:** The same 32-byte key used for AES-256-CBC encryption. (v1 uses a single key for both encryption and authentication. v2 will derive separate subkeys using HKDF.)
- **Input:** The concatenation of the 16-byte IV and the ciphertext:

  ```
  HMAC_input = IV (16 bytes) || ciphertext (encrypted_size bytes)
  Total HMAC input length = 16 + encrypted_size bytes
  ```

- **Output:** 32 bytes, stored in the file table entry `hmac` field.

### Encrypt-then-MAC

This format uses the **Encrypt-then-MAC** construction:

1. The HMAC is computed **after** encryption, over the IV and ciphertext.
2. The decoder **MUST verify the HMAC before attempting decryption**. If the HMAC does not match, the decoder MUST reject the file without decrypting. This prevents padding oracle attacks and avoids processing tampered data.
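
As a sketch of the verification step, Python's standard `hmac` and `hashlib` modules are sufficient. The function names are illustrative. The use of `hmac.compare_digest` (a constant-time comparison) is a choice made for this example; the specification only requires that the 32 bytes match.

```python
import hashlib
import hmac


def compute_hmac(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA-256 over iv || ciphertext, as defined in this section."""
    return hmac.new(key, iv + ciphertext, hashlib.sha256).digest()


def verify_hmac(key: bytes, iv: bytes, ciphertext: bytes, stored_hmac: bytes) -> bool:
    """Return True only if the stored tag matches; call this before any decryption."""
    return hmac.compare_digest(compute_hmac(key, iv, ciphertext), stored_hmac)
```

Because the IV is part of the HMAC input, a modified `iv` field in the file table entry is rejected in the same way as modified ciphertext.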

### SHA-256 Integrity Checksum

- **Input:** The original file content (before compression, before encryption).
- **Output:** 32 bytes, stored in the file table entry `sha256` field.
- **Verification:** After the decoder decrypts and decompresses a file, it computes SHA-256 of the result and compares it to the stored `sha256`. A mismatch indicates data corruption or an incorrect key.

---

## 8. Compression Details

- **Algorithm:** Standard gzip: the gzip container format (RFC 1952) wrapping a DEFLATE stream (RFC 1951).
- **Granularity:** Per-file. Each file has its own `compression_flag` in the file table entry.
- **Global flag:** The header flags bit 0 (`compression`) enables per-file compression. When this bit is clear, ALL files are stored raw regardless of individual `compression_flag` values.
- **Recommendation:** Already-compressed files (APK, ZIP, PNG, JPEG) should use `compression_flag = 0` (raw) to avoid size inflation.

### Size Tracking

- `original_size`: Size of the file before any processing.
- `compressed_size`: Size after gzip compression. If `compression_flag = 0`, then `compressed_size = original_size`.
- `encrypted_size`: Size after AES-256-CBC with PKCS7 padding. Always strictly greater than `compressed_size`, because PKCS7 adds at least 1 byte.
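
The relationship between the three sizes and the checksum can be sketched with Python's standard library. `Prepared` and `prepare` are hypothetical names for this illustration; the AES step is deliberately left out, and the exact gzip output length depends on the compressor and its settings.

```python
import gzip
import hashlib
from typing import NamedTuple


class Prepared(NamedTuple):
    original_size: int
    compressed_size: int
    encrypted_size: int
    sha256: bytes
    payload: bytes  # the bytes that would be PKCS7-padded and encrypted next


def prepare(content: bytes, compression_flag: int) -> Prepared:
    """Compute the size fields and checksum for one file (encryption not performed)."""
    digest = hashlib.sha256(content).digest()  # always over the original content
    payload = gzip.compress(content) if compression_flag == 1 else content
    compressed_size = len(payload)
    encrypted_size = (compressed_size // 16 + 1) * 16
    return Prepared(len(content), compressed_size, encrypted_size, digest, payload)
```

For a raw file (`compression_flag = 0`), `compressed_size` equals `original_size`, which is why a decoder can size its buffers from the file table alone.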

### Decompression in Each Decoder

| Decoder | Library/Command |
|---------|-----------------|
| Rust | `flate2` crate (`GzDecoder`) |
| Kotlin | `java.util.zip.GZIPInputStream` |
| Shell | `gunzip` (busybox) |

---

## 9. Obfuscation Features

These features are defined fully in this v1 specification but are intended for implementation in Phase 6 (after all three decoders work without obfuscation). Each feature is controlled by a flag bit in the header and can be activated independently.

### 9.1 XOR Header Obfuscation (flags bit 2, mask `0x04`)

When flags bit 2 is set, the entire 40-byte header is XOR-obfuscated with a fixed repeating 8-byte key.

**XOR Key:** `0xA5 0x3C 0x96 0x0F 0xE1 0x7B 0x4D 0xC8` (8 bytes, repeating)

**XOR Range:** Bytes `0x00` through `0x27` (the entire 40-byte header).

**Application:**

- XOR is applied **after** the header is fully constructed (all fields written).
- The 8-byte key repeats cyclically across the 40 bytes: byte `i` of the header is XORed with `key[i % 8]`.

**Decoding:**

- The decoder reads the first 40 bytes and XORs them with the same repeating key (XOR is its own inverse).
- After de-XOR, the decoder reads header fields normally.

**Bootstrapping problem:** When XOR obfuscation is active, the flags byte itself is XORed. The decoder MUST:

1. Always attempt de-XOR on the first 40 bytes.
2. Read the flags byte from the de-XORed header.
3. Check if bit 2 is set. If it is, the de-XOR was correct. If it is not, re-read the header from the original (un-XORed) bytes.

This check is unambiguous because the key byte applied to the flags byte (`key[5]` = `0x7B`) has bit 2 clear, so the XOR never changes bit 2 of the flags byte.

Alternatively, the decoder can check the magic bytes: if the first 4 bytes are `0x00 0xEA 0x72 0x63`, the header is not XOR-obfuscated. If they are not, attempt de-XOR and re-check. This magic-based check is the procedure used in Section 10, step 2.

**When flags bit 2 is 0:** The header is stored as-is (no XOR).
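
A minimal Python sketch of the magic-based detection follows. The function names `xor_header` and `deobfuscate_header` are illustrative; the key and magic values are the ones given in this section.

```python
XOR_KEY = bytes([0xA5, 0x3C, 0x96, 0x0F, 0xE1, 0x7B, 0x4D, 0xC8])
MAGIC = bytes([0x00, 0xEA, 0x72, 0x63])
HEADER_SIZE = 40


def xor_header(header: bytes) -> bytes:
    """XOR the first 40 bytes with the repeating key; applying it twice is a no-op."""
    return bytes(b ^ XOR_KEY[i % 8] for i, b in enumerate(header[:HEADER_SIZE]))


def deobfuscate_header(first_40: bytes) -> bytes:
    """Return the plain header, de-XORing only if the magic is not already present."""
    if first_40[:4] == MAGIC:
        return first_40[:HEADER_SIZE]
    candidate = xor_header(first_40)
    if candidate[:4] != MAGIC:
        raise ValueError("not an encrypted_archive header")
    return candidate
```

With this key, an obfuscated archive begins with the bytes `0xA5 0xD6 0xE4 0x6C` (the magic XORed with the first four key bytes).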

### 9.2 TOC Encryption (flags bit 1, mask `0x02`)

When flags bit 1 is set, the entire file table is encrypted with AES-256-CBC.

- **Key:** The same 32-byte key used for file encryption.
- **IV:** The `toc_iv` field in the header (16 bytes, randomly generated).
- **Input:** The serialized file table (all entries concatenated).
- **Padding:** PKCS7 padding is applied to the entire serialized TOC.
- **`toc_size` in header:** Stores the **encrypted** TOC size (including PKCS7 padding), not the plaintext size.

**Decoding:**

1. Read `toc_offset`, `toc_size`, and `toc_iv` from the (de-XORed) header.
2. Read `toc_size` bytes starting at `toc_offset`.
3. Decrypt with AES-256-CBC using `toc_iv` and the 32-byte key.
4. Remove PKCS7 padding.
5. Parse file table entries from the decrypted plaintext.

**When flags bit 1 is 0:** The file table is stored as plaintext. `toc_iv` is zero-filled but unused.

### 9.3 Decoy Padding (flags bit 3, mask `0x08`)

When flags bit 3 is set, random bytes are inserted after each file's data block.

- The number of random padding bytes for each file is stored in the file table entry `padding_after` field (u16 LE).
- Padding bytes are cryptographically random and carry no meaningful data.
- The decoder MUST skip `padding_after` bytes after reading the ciphertext of each file.
- The padding disrupts size-based analysis: an observer cannot determine individual file sizes from the data block layout.

**Next data block offset:**

```
next_data_offset = data_offset + encrypted_size + padding_after
```

**When flags bit 3 is 0:** `padding_after` is `0` for every file table entry. No padding bytes exist between data blocks.

---

## 10. Decode Order of Operations

The following steps MUST be followed in order by all decoders:

```
1. Read 40 bytes from offset 0x00.

2. Attempt XOR de-obfuscation:
   a. Check if bytes 0x00-0x03 equal magic (0x00 0xEA 0x72 0x63).
   b. If YES: header is not XOR-obfuscated. Use as-is.
   c. If NO: XOR bytes 0x00-0x27 with key (0xA5 0x3C 0x96 0x0F 0xE1 0x7B 0x4D 0xC8),
      repeating cyclically. Re-check magic. If still wrong, reject archive.

3. Parse header fields:
   - Verify magic == 0x00 0xEA 0x72 0x63
   - Read version (must be 1)
   - Read flags
   - Check for unknown flag bits (bits 4-7 must be 0; reject if not)
   - Read file_count
   - Read toc_offset, toc_size, toc_iv

4. Read TOC:
   a. Seek to toc_offset.
   b. Read toc_size bytes.
   c. If flags bit 1 (toc_encrypted) is set:
      - Decrypt TOC with AES-256-CBC using toc_iv and the 32-byte key.
      - Remove PKCS7 padding.
   d. Parse file_count entries sequentially from the (decrypted) TOC bytes.

5. For each file entry (i = 0 to file_count - 1):
   a. Read ciphertext:
      - Seek to data_offset.
      - Read encrypted_size bytes.

   b. Verify HMAC:
      - Compute HMAC-SHA-256(key, iv || ciphertext).
      - Compare with stored hmac (32 bytes).
      - If mismatch: REJECT this file. Do NOT attempt decryption.

   c. Decrypt:
      - Decrypt ciphertext with AES-256-CBC using entry's iv and the 32-byte key.
      - Remove PKCS7 padding.
      - Result = compressed_data (or raw data if compression_flag = 0).

   d. Decompress (if compression_flag = 1):
      - Decompress with gzip.
      - Result = original file content.

   e. Verify integrity:
      - Compute SHA-256 of the decompressed/raw result.
      - Compare with stored sha256 (32 bytes).
      - If mismatch: WARN (data corruption or wrong key).

   f. Write to output:
      - Create output file using stored name.
      - Write the verified content.
```
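
The ordering inside step 5 (HMAC check, then decrypt, then decompress, then SHA-256) can be sketched in Python as below. This is an illustration under stated assumptions, not a decoder: Python's standard library has no AES, so the cipher is passed in as a hypothetical `decrypt(key, iv, ciphertext)` callable that is assumed to return the plaintext with PKCS7 padding already removed. `extract_file` and its parameter names are likewise made up for this example.

```python
import gzip
import hashlib
import hmac
import warnings
from typing import Callable

# Hypothetical cipher hook: AES-256-CBC decryption plus PKCS7 removal, supplied by the caller.
Decrypt = Callable[[bytes, bytes, bytes], bytes]


def extract_file(archive: bytes, key: bytes, decrypt: Decrypt, *,
                 data_offset: int, encrypted_size: int, iv: bytes,
                 stored_hmac: bytes, stored_sha256: bytes,
                 compression_flag: int) -> bytes:
    # 5a. Read the ciphertext at its absolute offset.
    ciphertext = archive[data_offset:data_offset + encrypted_size]
    if len(ciphertext) != encrypted_size:
        raise ValueError("data block truncated")
    # 5b. Verify the HMAC over iv || ciphertext BEFORE touching the cipher.
    expected = hmac.new(key, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, stored_hmac):
        raise ValueError("HMAC mismatch: file rejected without decryption")
    # 5c. Decrypt (and unpad) via the injected cipher.
    data = decrypt(key, iv, ciphertext)
    # 5d. Decompress only when the entry says the payload is gzip.
    if compression_flag == 1:
        data = gzip.decompress(data)
    # 5e. Integrity check; step 5e says WARN, so this sketch warns instead of raising.
    if hashlib.sha256(data).digest() != stored_sha256:
        warnings.warn("SHA-256 mismatch: data corruption or wrong key")
    return data
```

Injecting the cipher keeps the sketch independent of any particular AES library; a real decoder would plug in its platform's AES-256-CBC implementation at that point.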

---

## 11. Version Compatibility Rules

1. **Version field:** The `version` field at offset `0x04` identifies the format version. This specification defines version `1`.

2. **Forward compatibility:** Decoders MUST reject archives with `version` greater than their supported version. A v1 decoder encountering `version = 2` MUST fail with a clear error message.

3. **Unknown flags:** Decoders MUST reject archives that have any reserved flag bits (bits 4-7) set to `1`. Unknown flags indicate features the decoder does not understand and cannot safely skip. Silent ignoring of unknown flags is prohibited.

4. **Future versions:** Version 2+ MAY:
   - Add fields after the `reserved` bytes in the header (growing header size).
   - Define new flag bits (bits 4-7).
   - Change the `reserved` field to carry metadata.
   - Introduce HKDF-derived per-file keys (replacing single shared key).

5. **Backward compatibility:** Future versions SHOULD maintain the same magic bytes and the same position of the `version` field (offset `0x04`) so that decoders can read the version before deciding how to proceed.
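
Rules 2 and 3 amount to two checks on the parsed header. A hedged Python sketch for a v1 decoder follows; `check_compatibility` and the constant names are illustrative. It rejects any version other than 1, which covers rule 2 and also matches the "must be 1" check in Section 10, step 3.

```python
SUPPORTED_VERSION = 1
RESERVED_FLAG_MASK = 0xF0  # bits 4-7


def check_compatibility(version: int, flags: int) -> None:
    """Raise ValueError with a clear message if a v1 decoder must reject the archive."""
    if version != SUPPORTED_VERSION:
        raise ValueError(
            f"unsupported archive version {version} (this decoder supports version {SUPPORTED_VERSION})"
        )
    unknown = flags & RESERVED_FLAG_MASK
    if unknown:
        raise ValueError(f"reserved flag bits set: 0x{unknown:02X}")
```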