# Phase 2: Core Archiver - Research

**Researched:** 2026-02-24
**Domain:** Rust CLI binary with custom binary format, AES-256-CBC encryption, gzip compression, HMAC-SHA-256 authentication
**Confidence:** HIGH

## Summary

Phase 2 implements the core Rust CLI archiver from scratch (greenfield -- no existing source code). The tool must produce archives matching the FORMAT.md specification (v1) exactly: 40-byte fixed header, variable-length TOC with per-file metadata, and encrypted data blocks. The pipeline for each file is: SHA-256 hash -> gzip compress (optional) -> PKCS7 pad -> AES-256-CBC encrypt -> HMAC-SHA-256 authenticate.

The Rust ecosystem has mature, well-tested crates for every component: `aes` + `cbc` for encryption, `hmac` + `sha2` for authentication and hashing, `flate2` for gzip, `clap` for CLI, `rand` for IV generation. All stable versions are compatible and compile together (verified). The full crypto pipeline (compress -> encrypt -> HMAC -> verify -> decrypt -> decompress -> verify SHA-256) was validated as a working Rust program during this research.

**Primary recommendation:** Use stable RustCrypto crates (aes 0.8, cbc 0.1, hmac 0.12, sha2 0.10) rather than the 0.9/0.2/0.13/0.11 release candidates. The stable versions are battle-tested, have extensive documentation, and all compile together with Rust 1.93. Structure the project with clear module separation: `cli`, `format`, `crypto`, `compression`, `archive` (pack/unpack/inspect logic).

## Phase Requirements

| ID | Description | Research Support |
|----|-------------|-----------------|
| FMT-01 | Custom binary format with non-standard magic bytes (not recognized by binwalk/file/7z) | Magic bytes `0x00 0xEA 0x72 0x63` defined in FORMAT.md; leading null byte prevents `file` recognition. Binary serialization uses Rust std `to_le_bytes()`/`from_le_bytes()` -- no external crate needed. |
| FMT-02 | Version field (1 byte) for forward compatibility | Simple u8 at offset 0x04; reject version != 1. Trivial to implement. |
| FMT-03 | File table with metadata: name, sizes, offset, IV, HMAC, SHA-256 | Variable-length TOC entries (101 + name_length bytes each). UTF-8 filenames, length-prefixed. All field types are standard Rust primitives. |
| FMT-04 | Little-endian for all multi-byte fields | Rust std: `u16::to_le_bytes()`, `u32::to_le_bytes()`, `u16::from_le_bytes()`, `u32::from_le_bytes()`. No external crate needed. |
| ENC-01 | AES-256-CBC encryption per file | `aes 0.8.4` + `cbc 0.1.2` crates. Type alias: `type Aes256CbcEnc = cbc::Encryptor<aes::Aes256>`. Verified working. |
| ENC-02 | HMAC-SHA-256 authentication (encrypt-then-MAC) per file | `hmac 0.12.1` + `sha2 0.10.9`. HMAC input = IV (16 bytes) \|\| ciphertext. Verified working. |
| ENC-03 | Random 16-byte IV per file, stored in cleartext TOC | `rand 0.9.2`: `rand::rng().fill(&mut iv)` (needs `use rand::Rng;` in scope). ThreadRng is cryptographically secure (ChaCha-based with OS seeding). |
| ENC-04 | Hardcoded 32-byte key | Const array `const KEY: [u8; 32] = [...]` in source. Same key for AES and HMAC in v1. |
| ENC-05 | PKCS7 padding for AES-CBC | `cbc` crate handles PKCS7 via `encrypt_padded_mut::<Pkcs7>()`. Formula: `encrypted_size = ((compressed_size / 16) + 1) * 16`. Verified. |
| CMP-01 | Gzip compression per file before encryption | `flate2 1.1.9`: `GzEncoder::new(Vec::new(), Compression::default())`. Use `GzBuilder::new().mtime(0)` for reproducible output in tests. |
| CMP-02 | Per-file compression flag (skip for already-compressed files) | CLI `--no-compress` flag + extension-based auto-detection for `.apk`, `.zip`, `.png`, `.jpg`, `.jpeg`, `.gz`, `.bz2`, `.xz`, `.mp4`, `.mp3`. |
| INT-01 | SHA-256 checksum per file (verify after decompression) | `sha2 0.10.9`: `Sha256::digest(&original_data)`. Computed BEFORE compression. Stored in TOC entry. |
| CLI-01 | Rust CLI utility for archive creation (Linux/macOS) | `clap 4.5.60` with derive API. Binary target in `src/main.rs`. Standard cargo build. |
| CLI-02 | Pack multiple files (text + APK) into one archive | `pack` subcommand accepts `Vec<PathBuf>` input files + `-o` output path. Reads files into memory (per Out of Scope: no streaming). |
| CLI-03 | Subcommands: pack, unpack, inspect | Three subcommands via clap `#[derive(Subcommand)]`. `inspect` reads header + TOC only, displays metadata without decrypting data blocks. |

## Standard Stack

### Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `aes` | 0.8.4 | AES-256 block cipher | RustCrypto official. 96M+ downloads. Pure Rust with hardware acceleration (AES-NI). |
| `cbc` | 0.1.2 | CBC mode of operation | RustCrypto official. Handles PKCS7 padding natively via `block_padding::Pkcs7`. |
| `hmac` | 0.12.1 | HMAC-SHA-256 computation | RustCrypto official. Constant-time comparison via `verify_slice()`. |
| `sha2` | 0.10.9 | SHA-256 hashing | RustCrypto official. Both one-shot (`Sha256::digest()`) and streaming APIs. |
| `flate2` | 1.1.9 | Gzip compression/decompression | De facto standard. Uses miniz_oxide (pure Rust) by default. |
| `clap` | 4.5.60 | CLI argument parsing | Industry standard. Derive API for subcommands. |
| `rand` | 0.9.2 | Cryptographic random IV generation | `rand::rng()` returns ChaCha-based CSPRNG with OS seeding. |
| `anyhow` | 1.0.102 | Error handling | Ergonomic `Result<T>` with context. Standard for CLI apps. |

### Supporting

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| (none -- std lib) | - | Little-endian serialization | `u16::to_le_bytes()`, `u32::from_le_bytes()` etc. Built into Rust std. |

### Alternatives Considered

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| `aes` 0.8 + `cbc` 0.1 (stable) | `aes` 0.9-rc + `cbc` 0.2-rc (RC) | RC versions have newer API but are pre-release. Stable versions are battle-tested and fully compatible. Use stable. |
| `byteorder` crate | Rust std `to_le_bytes()`/`from_le_bytes()` | std is sufficient since Rust 1.32. No external crate needed. |
| `ring` (Google) | RustCrypto stack | `ring` does not expose AES-CBC. It focuses on AEAD modes (AES-GCM). Not suitable for this format. |
| `openssl` crate | RustCrypto stack | Links to C library. RustCrypto is pure Rust, no system dependencies. Simpler cross-compilation. |
| `serde` + `bincode` | Manual binary serialization | Format spec requires exact byte layout. Manual serialization gives precise control over every byte. Serde/bincode add unnecessary abstraction for a fixed binary format. |

**Installation:**
```bash
cargo init --name encrypted_archive
cargo add aes@0.8 cbc@0.1 hmac@0.12 sha2@0.10 flate2@1.1 clap@4.5 --features clap/derive rand@0.9 anyhow@1.0
```

## Architecture Patterns

### Recommended Project Structure
```
encrypted_archive/
├── Cargo.toml
├── src/
│   ├── main.rs          # Entry point: clap CLI parsing, dispatch to commands
│   ├── cli.rs           # Clap derive structs (Cli, Commands enum)
│   ├── format.rs        # Binary format constants, header/TOC structs, serialization/deserialization
│   ├── crypto.rs        # encrypt_file(), decrypt_file(), compute_hmac(), verify_hmac()
│   ├── compression.rs   # compress(), decompress(), should_compress()
│   ├── archive.rs       # pack(), unpack(), inspect() -- orchestration logic
│   └── key.rs           # Hardcoded 32-byte key constant
├── docs/
│   └── FORMAT.md        # Binary format specification (already exists)
└── tests/               # Integration tests (Phase 3)
```

### Pattern 1: Pipeline Processing per File
**What:** Each file goes through a sequential pipeline: hash -> compress -> pad+encrypt -> HMAC
**When to use:** Always during `pack` operation
**Example:**
```rust
// Source: Verified working pipeline from research validation
use aes::cipher::{block_padding::Pkcs7, BlockEncryptMut, KeyIvInit};
use hmac::{Hmac, Mac};
use rand::Rng; // brings `fill()` into scope for `rand::rng()`
use sha2::{Sha256, Digest};
use flate2::write::GzEncoder;
use flate2::Compression;
use std::io::Write;

type Aes256CbcEnc = cbc::Encryptor<aes::Aes256>;
type HmacSha256 = Hmac<Sha256>;

struct ProcessedFile {
    name: String,
    original_size: u32,
    compressed_size: u32,
    encrypted_size: u32,
    iv: [u8; 16],
    hmac: [u8; 32],
    sha256: [u8; 32],
    compression_flag: u8,
    ciphertext: Vec<u8>,
}

fn process_file(name: &str, data: &[u8], key: &[u8; 32], compress: bool) -> ProcessedFile {
    // Step 1: SHA-256 of original
    let sha256: [u8; 32] = Sha256::digest(data).into();

    // Step 2: Compress (optional)
    let compressed = if compress {
        let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
        encoder.write_all(data).unwrap();
        encoder.finish().unwrap()
    } else {
        data.to_vec()
    };

    // Step 3: Generate random IV
    let mut iv = [0u8; 16];
    rand::rng().fill(&mut iv);

    // Step 4: Encrypt with PKCS7 padding
    let encrypted_size = ((compressed.len() / 16) + 1) * 16;
    let mut buf = vec![0u8; encrypted_size];
    buf[..compressed.len()].copy_from_slice(&compressed);
    let ciphertext = Aes256CbcEnc::new(key.into(), &iv.into())
        .encrypt_padded_mut::<Pkcs7>(&mut buf, compressed.len())
        .unwrap()
        .to_vec();

    // Step 5: HMAC-SHA-256 over IV || ciphertext
    let mut mac = HmacSha256::new_from_slice(key).unwrap();
    mac.update(&iv);
    mac.update(&ciphertext);
    let hmac: [u8; 32] = mac.finalize().into_bytes().into();

    ProcessedFile {
        name: name.to_string(),
        original_size: data.len() as u32,
        compressed_size: compressed.len() as u32,
        encrypted_size: encrypted_size as u32,
        iv,
        hmac,
        sha256,
        compression_flag: if compress { 1 } else { 0 },
        ciphertext,
    }
}
```

### Pattern 2: Two-Pass Archive Writing
**What:** First pass processes all files to compute sizes and offsets; second pass writes the archive sequentially.
**When to use:** Always during `pack`. The TOC must contain `data_offset` for each file, but data blocks come after the TOC. You must know TOC size before writing data blocks.
**Example:**
```rust
// Returns the absolute data offset of each file, in archive order.
// `ProcessedFile` has no offset field, so the offsets are returned
// separately and passed to `write_toc_entry()` one by one.
fn compute_offsets(files: &[ProcessedFile]) -> Vec<u32> {
    let header_size: u32 = 40;

    // Compute TOC size
    let toc_size: u32 = files.iter()
        .map(|f| 101 + f.name.len() as u32)
        .sum();

    let toc_offset = header_size;
    let mut data_offset = toc_offset + toc_size;

    // Assign data offsets
    let mut offsets = Vec::with_capacity(files.len());
    for file in files {
        offsets.push(data_offset);
        data_offset += file.encrypted_size;
        // padding_after = 0 in Phase 2 (no decoy padding)
    }
    offsets
}
```

### Pattern 3: CLI Subcommand Dispatch
**What:** Use clap derive API with an enum of subcommands
**When to use:** Always for the CLI entry point
**Example:**
```rust
// Source: Verified working clap derive pattern from research validation
use clap::{Parser, Subcommand};
use std::path::PathBuf;

#[derive(Parser)]
#[command(name = "encrypted_archive")]
#[command(about = "Custom encrypted archive tool")]
struct Cli {
    #[command(subcommand)]
    command: Commands,
}

#[derive(Subcommand)]
enum Commands {
    /// Pack files into an encrypted archive
    Pack {
        /// Input files to archive
        #[arg(required = true)]
        files: Vec<PathBuf>,
        /// Output archive file
        #[arg(short, long)]
        output: PathBuf,
        /// Disable compression for specified files
        #[arg(long)]
        no_compress: Vec<String>,
    },
    /// Unpack an encrypted archive (for testing)
    Unpack {
        /// Archive file to unpack
        archive: PathBuf,
        /// Output directory
        #[arg(short, long, default_value = ".")]
        output_dir: PathBuf,
    },
    /// Inspect archive metadata without decrypting
    Inspect {
        /// Archive file to inspect
        archive: PathBuf,
    },
}
```

### Anti-Patterns to Avoid
- **Streaming writes without knowing offsets:** The TOC contains `data_offset` for each file. You MUST compute all offsets before writing the TOC. Process all files first, then serialize.
- **Using serde/bincode for binary format:** The format spec requires exact byte-level control. Manual serialization with `to_le_bytes()` is correct and simpler.
- **Single large buffer for entire archive:** Process and encrypt files individually, write them sequentially. Each file should be processed independently.
- **Reusing IVs:** Each file MUST have a unique random IV. Never reuse IVs across files or archive creations.
- **MAC-then-encrypt:** The spec mandates encrypt-then-MAC. HMAC MUST be computed over `IV || ciphertext`, NOT over plaintext.

## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| AES-256-CBC encryption | Custom AES implementation | `aes 0.8` + `cbc 0.1` crates | Side-channel resistance, hardware acceleration, audited |
| PKCS7 padding | Manual padding logic | `cbc` crate's `Pkcs7` padding (via `block_padding`) | Off-by-one errors in padding are security-critical |
| HMAC-SHA-256 | Manual HMAC construction | `hmac 0.12` crate | Constant-time comparison, correct key scheduling |
| SHA-256 hashing | Custom hash | `sha2 0.10` crate | Correctness, performance, hardware acceleration |
| Gzip compression | Custom deflate | `flate2 1.1` crate | RFC 1952 compliance, performance, battle-tested |
| CLI argument parsing | Manual arg parsing | `clap 4.5` with derive | Validation, help text, error messages, subcommands |
| Random IV generation | Custom RNG | `rand 0.9` with `rand::rng()` | CSPRNG with OS seeding, no bias |
| Little-endian serialization | Manual byte shifting | Rust std `to_le_bytes()`/`from_le_bytes()` | Built-in, zero-cost, correct |

**Key insight:** Every component in the encryption pipeline is security-sensitive. Using audited, well-tested crates for crypto operations is not optional -- hand-rolled crypto is the single highest-risk anti-pattern in this domain.

## Common Pitfalls

### Pitfall 1: Buffer Sizing for `encrypt_padded_mut`
**What goes wrong:** `PadError` at runtime because the buffer is too small for PKCS7-padded output.
**Why it happens:** PKCS7 ALWAYS adds at least 1 byte. When input is a multiple of 16, a full 16-byte padding block is added. Formula: `((input_len / 16) + 1) * 16`.
**How to avoid:** Always allocate `encrypted_size = ((compressed_size / 16) + 1) * 16` bytes for the encryption buffer. Copy compressed data to the start, then call `encrypt_padded_mut` with `compressed_size` as the plaintext length.
**Warning signs:** `PadError` or `unwrap()` panic during encryption.

### Pitfall 2: Gzip Non-Determinism in Tests
**What goes wrong:** Gzip output varies between runs (different `compressed_size`), making golden tests impossible.
**Why it happens:** Gzip headers contain a timestamp (`mtime`) and OS byte that vary.
**How to avoid:** Use `GzBuilder::new().mtime(0).write(Vec::new(), Compression::default())` to zero out the timestamp. The OS byte defaults to the build platform but is consistent on the same machine.
**Warning signs:** `compressed_size` changes between test runs for identical input.

### Pitfall 3: Incorrect HMAC Scope
**What goes wrong:** HMAC computed over wrong data (just ciphertext, or including TOC metadata).
**Why it happens:** Ambiguity about what "encrypt-then-MAC" covers.
**How to avoid:** FORMAT.md is explicit: `HMAC_input = IV (16 bytes) || ciphertext (encrypted_size bytes)`. Nothing else. The input is the IV from the TOC entry, concatenated with the ciphertext from the data block.
**Warning signs:** HMAC verification failures in other decoders (Kotlin, shell).

### Pitfall 4: TOC Offset Calculation Errors
**What goes wrong:** Data blocks written at wrong offsets; decoders read garbage.
**Why it happens:** Variable-length filename fields make TOC entry sizes differ. Off-by-one in offset arithmetic.
**How to avoid:** Use the formula from FORMAT.md: `entry_size = 101 + name_length`. Total TOC size = sum of all entry sizes. First data block offset = `toc_offset + toc_size`. Each subsequent data block offset = previous offset + previous `encrypted_size`.
**Warning signs:** `inspect` command shows corrupted filenames or impossible sizes.

### Pitfall 5: Endianness Errors
**What goes wrong:** Multi-byte fields written in big-endian or native-endian instead of little-endian.
**Why it happens:** Forgetting to convert, or using the wrong conversion function.
**How to avoid:** Always use `value.to_le_bytes()` when writing and `u32::from_le_bytes([b0, b1, b2, b3])` when reading. Never use `to_ne_bytes()` or `to_be_bytes()`.
**Warning signs:** Values look "swapped" when inspecting hex dump. Shell decoder reads wrong numbers.

### Pitfall 6: UTF-8 Filename Length vs. Character Count
**What goes wrong:** `name_length` field stores character count instead of byte count.
**Why it happens:** Confusion between `str.len()` (byte count, correct) and `str.chars().count()` (character count, wrong).
**How to avoid:** FORMAT.md specifies `name_length` as "Filename length in bytes (UTF-8 encoded byte count)". In Rust, `String::len()` returns byte count, which is correct.
**Warning signs:** Non-ASCII filenames (Cyrillic) cause parsing errors in decoders.

### Pitfall 7: Forgetting Flags Byte
**What goes wrong:** Archive header has wrong flags, decoders misinterpret format features.
**Why it happens:** Phase 2 uses only bit 0 (compression). Archives written in Phase 2 must leave bits 1-7 at zero.
**How to avoid:** Set `flags = 0x01` when any file uses compression (global flag), `flags = 0x00` when no files use compression. Bits 1-3 are reserved for Phase 6 obfuscation features and stay zero in Phase 2. Bits 4-7 MUST always be zero.
**Warning signs:** Decoders reject archive due to unknown flags.
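The arithmetic behind Pitfalls 1, 4, 5, and 6 can be checked with the standard library alone. The following is a minimal sketch, not code from the archiver: the helper names `padded_len`, `toc_entry_size`, and `name_length_field` are hypothetical and exist only for illustration; the formulas themselves are the ones quoted above.

```rust
/// PKCS7-padded length for a 16-byte block size (Pitfall 1).
/// Padding adds between 1 and 16 bytes, so a block-aligned input
/// grows by one full block.
fn padded_len(input_len: usize) -> usize {
    (input_len / 16 + 1) * 16
}

/// Size of one TOC entry: 101 fixed bytes plus the filename's
/// UTF-8 byte length (Pitfall 4).
fn toc_entry_size(name: &str) -> u32 {
    101 + name.len() as u32
}

/// Bytes for the `name_length` field: UTF-8 byte count (not the
/// character count), serialized little-endian (Pitfalls 5 and 6).
/// Assumes the name fits in a u16; a real writer should validate this.
fn name_length_field(name: &str) -> [u8; 2] {
    (name.len() as u16).to_le_bytes()
}
```

For example, a 16-byte input pads to 32 bytes, and the Cyrillic name `файл.txt` is 8 characters but 12 bytes, so its `name_length` field serializes as `0C 00` and its TOC entry occupies 113 bytes.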
## Code Examples

Verified patterns from official sources and research validation:

### Binary Format Serialization (Header)
```rust
// Source: FORMAT.md Section 4 + Rust std library
fn write_header(
    writer: &mut impl std::io::Write,
    file_count: u16,
    toc_offset: u32,
    toc_size: u32,
    flags: u8,
) -> std::io::Result<()> {
    // Magic bytes
    writer.write_all(&[0x00, 0xEA, 0x72, 0x63])?;
    // Version
    writer.write_all(&[0x01])?;
    // Flags
    writer.write_all(&[flags])?;
    // File count (LE)
    writer.write_all(&file_count.to_le_bytes())?;
    // TOC offset (LE)
    writer.write_all(&toc_offset.to_le_bytes())?;
    // TOC size (LE)
    writer.write_all(&toc_size.to_le_bytes())?;
    // TOC IV (zero-filled, TOC not encrypted in Phase 2)
    writer.write_all(&[0u8; 16])?;
    // Reserved
    writer.write_all(&[0u8; 8])?;
    Ok(())
}
```

### TOC Entry Serialization
```rust
// Source: FORMAT.md Section 5
fn write_toc_entry(
    writer: &mut impl std::io::Write,
    file: &ProcessedFile,
    data_offset: u32,
) -> std::io::Result<()> {
    let name_bytes = file.name.as_bytes();
    writer.write_all(&(name_bytes.len() as u16).to_le_bytes())?;
    writer.write_all(name_bytes)?;
    writer.write_all(&file.original_size.to_le_bytes())?;
    writer.write_all(&file.compressed_size.to_le_bytes())?;
    writer.write_all(&file.encrypted_size.to_le_bytes())?;
    writer.write_all(&data_offset.to_le_bytes())?;
    writer.write_all(&file.iv)?;
    writer.write_all(&file.hmac)?;
    writer.write_all(&file.sha256)?;
    writer.write_all(&[file.compression_flag])?;
    writer.write_all(&0u16.to_le_bytes())?; // padding_after = 0
    Ok(())
}
```

### Inspect Command (Read Header + TOC Only)
```rust
// Source: FORMAT.md Section 10, steps 1-4
use std::io::Read;

// Parsed header fields. The struct definition is assumed here;
// only its construction appears in the research notes.
struct Header {
    version: u8,
    flags: u8,
    file_count: u16,
    toc_offset: u32,
    toc_size: u32,
}

fn read_header(reader: &mut impl Read) -> anyhow::Result<Header> {
    let mut buf = [0u8; 40];
    reader.read_exact(&mut buf)?;

    // Verify magic
    anyhow::ensure!(
        buf[0..4] == [0x00, 0xEA, 0x72, 0x63],
        "Invalid magic bytes"
    );

    let version = buf[4];
    anyhow::ensure!(version == 1, "Unsupported version: {}", version);

    // Bits 4-7 must be zero
    let flags = buf[5];
    anyhow::ensure!(flags & 0xF0 == 0, "Unknown flags set: 0x{:02X}", flags);

    let file_count = u16::from_le_bytes([buf[6], buf[7]]);
    let toc_offset = u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]);
    let toc_size = u32::from_le_bytes([buf[12], buf[13], buf[14], buf[15]]);

    Ok(Header { version, flags, file_count, toc_offset, toc_size })
}
```

### Compression Decision Heuristic
```rust
// Source: FORMAT.md Section 8 recommendation
fn should_compress(filename: &str, no_compress_list: &[String]) -> bool {
    // Explicit exclusion from CLI
    if no_compress_list.iter().any(|nc| filename.ends_with(nc) || filename == nc) {
        return false;
    }

    // Auto-detect already-compressed formats.
    // `rsplit_once` yields no extension for names without a dot, so a file
    // literally named "zip" is not mistaken for a `.zip` archive.
    let ext = filename
        .rsplit_once('.')
        .map(|(_, e)| e.to_lowercase())
        .unwrap_or_default();
    !matches!(
        ext.as_str(),
        "apk" | "zip" | "gz" | "bz2" | "xz" | "zst"
            | "png" | "jpg" | "jpeg" | "gif" | "webp"
            | "mp4" | "mp3" | "aac" | "ogg" | "flac"
            | "7z" | "rar" | "jar"
    )
}
```

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| `block-modes 0.8` crate | `cbc 0.1` crate (separate crate per mode) | 2022 | `block-modes` is deprecated. Use `cbc` directly. |
| `rand::thread_rng()` | `rand::rng()` | rand 0.9 (2025) | Function renamed. Same underlying ChaCha CSPRNG. |
| `GenericArray` for keys/IVs | `.into()` conversion from `[u8; N]` | aes/cbc 0.8/0.1 | Can pass `&key.into()` directly from fixed arrays. |
| `byteorder` crate | Rust std `to_le_bytes()`/`from_le_bytes()` | Rust 1.32 (2018) | No external crate needed for endian conversion. |

**Deprecated/outdated:**
- `block-modes` crate: Replaced by individual mode crates (`cbc`, `ecb`, `cfb`, `ofb`). Do NOT use `block-modes`.
- `rand::thread_rng()`: Renamed to `rand::rng()` in 0.9. The old name still compiles but is marked deprecated; use `rand::rng()` in new code.
- `crypto-mac` crate: Merged into `digest` 0.10. Use `hmac 0.12` which uses `digest 0.10` internally.

## Open Questions

1. **Hardcoded key value**
   - What we know: The key is 32 bytes, hardcoded, shared across all decoders.
   - What's unclear: The specific key bytes are not defined in FORMAT.md (only the worked example uses `00 01 02 ... 1F`).
   - Recommendation: Define a non-trivial key constant in `src/key.rs`. The planner should decide the actual key bytes or generate them randomly once. The worked example key is fine for testing but should be replaced for production.

2. **Error handling strategy for `unpack`**
   - What we know: FORMAT.md says "MUST reject" on HMAC failure, "MUST fail" on bad version.
   - What's unclear: Should `unpack` abort on first file error, or continue extracting other files?
   - Recommendation: Abort on header/TOC errors. For per-file errors (HMAC mismatch, SHA-256 mismatch), report the error but continue extracting remaining files (with a non-zero exit code at the end).

3. **Maximum file size constraint (u32)**
   - What we know: `original_size`, `compressed_size`, `encrypted_size` are all u32 (max ~4 GB).
   - What's unclear: Should the archiver check and reject files > 4 GB?
   - Recommendation: Yes, validate file sizes during `pack` and produce a clear error if any file exceeds `u32::MAX`. This is acceptable given the Out of Scope note ("files fit in memory").

## Sources

### Primary (HIGH confidence)
- `docs/FORMAT.md` v1.0 -- The normative specification for the binary format. All byte offsets, field sizes, and pipeline steps are from this document.
- `docs.rs/aes/0.8.4` -- AES crate API documentation
- `docs.rs/cbc/0.1.2` -- CBC mode crate API documentation and usage examples
- `docs.rs/hmac/0.12.1` -- HMAC crate API documentation and usage examples
- `docs.rs/sha2/0.10.9` -- SHA-2 crate API documentation
- `docs.rs/flate2/1.1.9` -- flate2 crate API documentation (GzEncoder, GzDecoder, GzBuilder)
- `docs.rs/clap/4.5.60` -- Clap CLI crate documentation
- `docs.rs/rand/0.9.2` -- Rand crate documentation
- **Research validation:** Full pipeline (compress -> encrypt -> HMAC -> verify -> decrypt -> decompress -> verify) was compiled and executed successfully as a Rust program during this research.

### Secondary (MEDIUM confidence)
- `crates.io` version listings -- Latest stable versions verified via `cargo search` and crates.io API
- `rust-random.github.io/book` -- Rand book confirming ThreadRng is ChaCha-based CSPRNG

### Tertiary (LOW confidence)
- None. All findings are verified against official documentation and compilation tests.

## Metadata

**Confidence breakdown:**
- Standard stack: HIGH -- All crates verified via `cargo check`, full pipeline compiled and executed
- Architecture: HIGH -- Follows standard Rust CLI patterns; FORMAT.md provides exact byte-level specification
- Pitfalls: HIGH -- Common issues identified from official docs, GitHub issues, and practical validation

**Research date:** 2026-02-24
**Valid until:** 2026-04-24 (stable crates, slow-moving ecosystem)