docs(02-core-archiver): create phase plan
@@ -45,11 +45,11 @@ Plans:
|
||||
3. Each file in the archive is independently compressed (gzip) and encrypted (AES-256-CBC) with a unique random IV
|
||||
4. HMAC-SHA256 is computed over IV+ciphertext for each file (encrypt-then-MAC)
|
||||
5. Running `encrypted_archive inspect archive.bin` shows file count, names, and sizes without decrypting content
|
||||
-**Plans**: TBD
+**Plans**: 2 plans
|
||||
|
||||
Plans:
|
||||
-- [ ] 02-01: TBD
-- [ ] 02-02: TBD
+- [ ] 02-01-PLAN.md -- Project scaffolding, binary format types, crypto pipeline, and compression module
+- [ ] 02-02-PLAN.md -- Pack, inspect, and unpack commands with full archive orchestration
|
||||
|
||||
### Phase 3: Round-Trip Verification
|
||||
**Goal**: Proven byte-identical round-trips through the Rust unpack command, backed by golden test vectors
|
||||
|
||||
288
.planning/phases/02-core-archiver/02-01-PLAN.md
Normal file
@@ -0,0 +1,288 @@
|
||||
---
phase: 02-core-archiver
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - Cargo.toml
  - src/main.rs
  - src/cli.rs
  - src/key.rs
  - src/format.rs
  - src/crypto.rs
  - src/compression.rs
autonomous: true
requirements: [FMT-01, FMT-02, FMT-03, FMT-04, ENC-01, ENC-02, ENC-03, ENC-04, ENC-05, CMP-01, CMP-02, INT-01, CLI-01]

must_haves:
  truths:
    - "cargo build compiles the project with zero errors and zero warnings"
    - "All binary format types (Header, TocEntry) match FORMAT.md byte-for-byte field definitions"
    - "encrypt then decrypt with known key/IV produces original data"
    - "compress then decompress produces original data"
    - "HMAC computed over IV||ciphertext matches recomputation"
    - "SHA-256 of original data is correctly computed and stored"
  artifacts:
    - path: "Cargo.toml"
      provides: "Project manifest with all dependencies"
      contains: "aes"
    - path: "src/main.rs"
      provides: "CLI entry point with clap dispatch"
      contains: "clap"
    - path: "src/cli.rs"
      provides: "Clap derive structs for Pack/Unpack/Inspect subcommands"
      exports: ["Cli", "Commands"]
    - path: "src/key.rs"
      provides: "Hardcoded 32-byte encryption key"
      exports: ["KEY"]
    - path: "src/format.rs"
      provides: "Header and TocEntry structs with serialize/deserialize"
      exports: ["Header", "TocEntry", "MAGIC", "VERSION"]
    - path: "src/crypto.rs"
      provides: "AES-256-CBC encrypt/decrypt, HMAC-SHA-256, SHA-256"
      exports: ["encrypt_data", "decrypt_data", "compute_hmac", "verify_hmac", "sha256_hash"]
    - path: "src/compression.rs"
      provides: "Gzip compress/decompress and should_compress heuristic"
      exports: ["compress", "decompress", "should_compress"]
  key_links:
    - from: "src/crypto.rs"
      to: "src/key.rs"
      via: "imports KEY constant"
      pattern: "use crate::key::KEY"
    - from: "src/format.rs"
      to: "docs/FORMAT.md"
      via: "byte-for-byte field layout match"
      pattern: "0x00.*0xEA.*0x72.*0x63"
    - from: "src/main.rs"
      to: "src/cli.rs"
      via: "clap parse and dispatch"
      pattern: "Cli::parse"
---
|
||||
|
||||
<objective>
|
||||
Create the Rust project foundation with all library modules: CLI skeleton, binary format types, crypto pipeline, and compression.
|
||||
|
||||
Purpose: Establish the complete module structure and all building-block functions that the pack/unpack/inspect commands will orchestrate. Every individual operation (encrypt, decrypt, compress, decompress, HMAC, SHA-256, serialize header, serialize TOC entry) must work correctly in isolation before being wired together.
|
||||
|
||||
Output: A compiling Rust project with 7 source files covering CLI parsing, binary format types, cryptographic operations, compression, and the hardcoded key.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@/home/nick/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@/home/nick/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/PROJECT.md
|
||||
@.planning/ROADMAP.md
|
||||
@.planning/STATE.md
|
||||
@.planning/phases/02-core-archiver/02-RESEARCH.md
|
||||
@docs/FORMAT.md
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 1: Project scaffolding with Cargo, CLI skeleton, and key module</name>
|
||||
<files>Cargo.toml, src/main.rs, src/cli.rs, src/key.rs</files>
|
||||
<action>
|
||||
1. Initialize the Rust project:
|
||||
```
cargo init --name encrypted_archive /home/nick/Projects/Rust/encrypted_archive
```
|
||||
If Cargo.toml already exists, just update it.
|
||||
|
||||
2. Set up Cargo.toml with exact dependency versions from research:
|
||||
```toml
[package]
name = "encrypted_archive"
version = "0.1.0"
edition = "2021"

[dependencies]
aes = "0.8"
cbc = "0.1"
hmac = "0.12"
sha2 = "0.10"
flate2 = "1.1"
clap = { version = "4.5", features = ["derive"] }
rand = "0.9"
anyhow = "1.0"
```
|
||||
|
||||
3. Create `src/cli.rs` with clap derive structs matching the research pattern:
|
||||
- `Cli` struct with `#[command(subcommand)]`
|
||||
- `Commands` enum with three variants:
|
||||
- `Pack { files: Vec<PathBuf>, output: PathBuf, no_compress: Vec<String> }`
|
||||
- `Unpack { archive: PathBuf, output_dir: PathBuf }` (output_dir defaults to ".")
|
||||
- `Inspect { archive: PathBuf }`
|
||||
- Use `#[arg(required = true)]` for files in Pack
|
||||
- Use `#[arg(short, long)]` for output paths
|
||||
- Use `#[arg(long)]` for no_compress (Vec<String> of filename patterns to skip compression for)
|
||||
|
||||
4. Create `src/key.rs` with the hardcoded 32-byte key:
|
||||
```rust
/// Hardcoded 32-byte AES-256 key.
/// Same key is used for AES-256-CBC encryption and HMAC-SHA-256 authentication (v1).
/// v2 will derive separate subkeys using HKDF.
pub const KEY: [u8; 32] = [
    0x7A, 0x35, 0xC1, 0xD9, 0x4F, 0xE8, 0x2B, 0x6A,
    0x91, 0x0D, 0xF3, 0x58, 0xBC, 0x74, 0xA6, 0x1E,
    0x42, 0x8F, 0xD0, 0x63, 0xE5, 0x17, 0x9B, 0x2C,
    0xFA, 0x84, 0x06, 0xCD, 0x3E, 0x79, 0xB5, 0x50,
];
```
|
||||
Use a non-trivial key (not the example key 00 01 02 ... 1F from FORMAT.md worked example).
|
||||
|
||||
5. Create `src/main.rs`:
|
||||
- Parse CLI with `Cli::parse()`
|
||||
- Match on `Commands` variants, calling placeholder functions that print "not implemented yet" and return `Ok(())`
|
||||
- Use `anyhow::Result<()>` as the return type for main
|
||||
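- Declare modules: `mod cli; mod key; mod format; mod crypto; mod compression; mod archive;`. Every declared module needs at least an empty stub file at this point (`src/format.rs`, `src/crypto.rs`, `src/compression.rs`, `src/archive.rs`); otherwise `cargo build` fails because the module file cannot be found.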
|
||||
|
||||
IMPORTANT: Use Context7 to verify clap 4.5 derive API before writing cli.rs. Call `mcp__context7__resolve-library-id` for "clap" and then `mcp__context7__query-docs` for the derive subcommand pattern.
|
||||
|
||||
Do NOT use `rand::thread_rng()` -- it was renamed to `rand::rng()` in rand 0.9.
|
||||
Do NOT use `block-modes` crate -- it is deprecated; use `cbc` directly.
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/nick/Projects/Rust/encrypted_archive && cargo build 2>&1</automated>
|
||||
<manual>Verify Cargo.toml has correct dependencies, src/main.rs declares all modules, CLI help text shows pack/unpack/inspect</manual>
|
||||
<sampling_rate>run after this task commits</sampling_rate>
|
||||
</verify>
|
||||
<done>
|
||||
- cargo build succeeds with no errors
|
||||
- `cargo run -- --help` shows three subcommands: pack, unpack, inspect
|
||||
- `cargo run -- pack --help` shows files (required), --output, --no-compress arguments
|
||||
- All 7 module files exist (main.rs, cli.rs, key.rs, format.rs, crypto.rs, compression.rs, archive.rs) even if some are stubs
|
||||
</done>
|
||||
</task>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 2: Format types, crypto pipeline, and compression module</name>
|
||||
<files>src/format.rs, src/crypto.rs, src/compression.rs</files>
|
||||
<action>
|
||||
1. Create `src/format.rs` implementing the binary format from FORMAT.md:
|
||||
|
||||
Constants:
|
||||
```rust
pub const MAGIC: [u8; 4] = [0x00, 0xEA, 0x72, 0x63];
pub const VERSION: u8 = 1;
pub const HEADER_SIZE: u32 = 40;
```
|
||||
|
||||
Structs:
|
||||
- `Header` with fields: version (u8), flags (u8), file_count (u16), toc_offset (u32), toc_size (u32), toc_iv ([u8; 16]), reserved ([u8; 8])
|
||||
- `TocEntry` with fields: name (String), original_size (u32), compressed_size (u32), encrypted_size (u32), data_offset (u32), iv ([u8; 16]), hmac ([u8; 32]), sha256 ([u8; 32]), compression_flag (u8), padding_after (u16)
|
||||
|
||||
Serialization (write functions):
|
||||
- `write_header(writer: &mut impl Write, header: &Header) -> anyhow::Result<()>`:
|
||||
Writes all 40 bytes in exact FORMAT.md order. Magic bytes first, then version, flags, file_count (LE), toc_offset (LE), toc_size (LE), toc_iv (16 bytes), reserved (8 bytes of zero).
|
||||
- `write_toc_entry(writer: &mut impl Write, entry: &TocEntry) -> anyhow::Result<()>`:
|
||||
Writes: name_length (u16 LE) + name bytes + original_size (u32 LE) + compressed_size (u32 LE) + encrypted_size (u32 LE) + data_offset (u32 LE) + iv (16 bytes) + hmac (32 bytes) + sha256 (32 bytes) + compression_flag (u8) + padding_after (u16 LE).
|
||||
Entry size = 101 + name.len() bytes.
|
||||
|
||||
Deserialization (read functions):
|
||||
- `read_header(reader: &mut impl Read) -> anyhow::Result<Header>`:
|
||||
Reads 40 bytes, verifies magic == MAGIC, verifies version == 1, checks flags bits 4-7 are zero (reject if not). Returns parsed Header.
|
||||
- `read_toc_entry(reader: &mut impl Read) -> anyhow::Result<TocEntry>`:
|
||||
Reads name_length (u16 LE), then name_length bytes as UTF-8 string, then all fixed fields. Uses `from_le_bytes()` for all multi-byte integers.
|
||||
- `read_toc(reader: &mut impl Read, file_count: u16) -> anyhow::Result<Vec<TocEntry>>`:
|
||||
Reads file_count entries sequentially.
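
A minimal sketch of the header write/read pair described above, using only the standard library. It assumes the byte offsets that follow from the field order and sizes listed in this plan (FORMAT.md itself is not reproduced here), and it returns `std::io::Result` instead of `anyhow::Result` so the snippet has no external dependencies. The `invalid` helper is a hypothetical name introduced for illustration.

```rust
use std::io::{self, Read, Write};

pub const MAGIC: [u8; 4] = [0x00, 0xEA, 0x72, 0x63];
pub const VERSION: u8 = 1;
pub const HEADER_SIZE: u32 = 40;

#[derive(Debug, Clone, PartialEq)]
pub struct Header {
    pub version: u8,
    pub flags: u8,
    pub file_count: u16,
    pub toc_offset: u32,
    pub toc_size: u32,
    pub toc_iv: [u8; 16],
    pub reserved: [u8; 8],
}

/// Writes 4 + 1 + 1 + 2 + 4 + 4 + 16 + 8 = 40 bytes, integers little-endian.
pub fn write_header(w: &mut impl Write, h: &Header) -> io::Result<()> {
    w.write_all(&MAGIC)?;
    w.write_all(&[h.version, h.flags])?;
    w.write_all(&h.file_count.to_le_bytes())?;
    w.write_all(&h.toc_offset.to_le_bytes())?;
    w.write_all(&h.toc_size.to_le_bytes())?;
    w.write_all(&h.toc_iv)?;
    w.write_all(&h.reserved)?;
    Ok(())
}

// Hypothetical helper: builds an InvalidData error for a rejected header.
fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg.to_string())
}

/// Reads exactly 40 bytes and rejects bad magic, unknown versions,
/// and any of the reserved flag bits 4-7.
pub fn read_header(r: &mut impl Read) -> io::Result<Header> {
    let mut buf = [0u8; HEADER_SIZE as usize];
    r.read_exact(&mut buf)?;
    if buf[0..4] != MAGIC[..] {
        return Err(invalid("bad magic bytes"));
    }
    if buf[4] != VERSION {
        return Err(invalid("unsupported version"));
    }
    if buf[5] & 0xF0 != 0 {
        return Err(invalid("reserved flag bits 4-7 are set"));
    }
    let mut toc_iv = [0u8; 16];
    toc_iv.copy_from_slice(&buf[16..32]);
    let mut reserved = [0u8; 8];
    reserved.copy_from_slice(&buf[32..40]);
    Ok(Header {
        version: buf[4],
        flags: buf[5],
        file_count: u16::from_le_bytes([buf[6], buf[7]]),
        toc_offset: u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]),
        toc_size: u32::from_le_bytes([buf[12], buf[13], buf[14], buf[15]]),
        toc_iv,
        reserved,
    })
}
```

Writing into a `Vec<u8>` and reading back from `&bytes[..]` gives a quick round-trip check without touching the filesystem.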
|
||||
|
||||
Helper:
|
||||
- `entry_size(entry: &TocEntry) -> u32`: Returns `101 + entry.name.len() as u32`
|
||||
- `compute_toc_size(entries: &[TocEntry]) -> u32`: Sum of all entry_size values
|
||||
|
||||
ALL multi-byte fields MUST use `to_le_bytes()` for writing and `from_le_bytes()` for reading. Do NOT use `to_ne_bytes()` or `to_be_bytes()`.
|
||||
Filenames: Use `name.len()` (byte count) NOT `name.chars().count()` (character count). FORMAT.md specifies byte count.
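
To make the byte-count rule concrete, here is a hedged sketch of `write_toc_entry` and `entry_size` under the field order given above. It uses `std::io::Result` rather than `anyhow::Result` to stay dependency-free, and the explicit check against `u16::MAX` is an addition of this sketch, not something the plan specifies.

```rust
use std::io::{self, Write};

pub struct TocEntry {
    pub name: String,
    pub original_size: u32,
    pub compressed_size: u32,
    pub encrypted_size: u32,
    pub data_offset: u32,
    pub iv: [u8; 16],
    pub hmac: [u8; 32],
    pub sha256: [u8; 32],
    pub compression_flag: u8,
    pub padding_after: u16,
}

/// 2 (name_length) + 16 (four u32 fields) + 16 + 32 + 32 + 1 + 2 = 101 fixed bytes,
/// plus the UTF-8 byte length of the name.
pub fn entry_size(entry: &TocEntry) -> u32 {
    101 + entry.name.len() as u32
}

pub fn write_toc_entry(w: &mut impl Write, e: &TocEntry) -> io::Result<()> {
    let name = e.name.as_bytes();
    // Assumption of this sketch: names that do not fit in a u16 are rejected.
    if name.len() > u16::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "file name exceeds 65535 bytes",
        ));
    }
    w.write_all(&(name.len() as u16).to_le_bytes())?;
    w.write_all(name)?;
    w.write_all(&e.original_size.to_le_bytes())?;
    w.write_all(&e.compressed_size.to_le_bytes())?;
    w.write_all(&e.encrypted_size.to_le_bytes())?;
    w.write_all(&e.data_offset.to_le_bytes())?;
    w.write_all(&e.iv)?;
    w.write_all(&e.hmac)?;
    w.write_all(&e.sha256)?;
    w.write_all(&[e.compression_flag])?;
    w.write_all(&e.padding_after.to_le_bytes())?;
    Ok(())
}
```

For a name such as `café.txt`, `chars().count()` is 8 but `len()` is 9, so the serialized entry is 110 bytes; using the character count would leave every later offset one byte short.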
|
||||
|
||||
2. Create `src/crypto.rs` implementing the encryption pipeline:
|
||||
|
||||
Type aliases:
|
||||
```rust
use aes::cipher::{block_padding::Pkcs7, BlockDecryptMut, BlockEncryptMut, KeyIvInit};

type Aes256CbcEnc = cbc::Encryptor<aes::Aes256>;
type Aes256CbcDec = cbc::Decryptor<aes::Aes256>;
type HmacSha256 = hmac::Hmac<sha2::Sha256>;
```
|
||||
|
||||
Functions:
|
||||
- `generate_iv() -> [u8; 16]`:
|
||||
Uses `rand::rng().fill(&mut iv)` with the `rand::Rng` trait in scope (NOT thread_rng -- renamed in rand 0.9).
|
||||
- `encrypt_data(plaintext: &[u8], key: &[u8; 32], iv: &[u8; 16]) -> Vec<u8>`:
|
||||
Computes encrypted_size = ((plaintext.len() / 16) + 1) * 16.
|
||||
Allocates buffer of encrypted_size, copies plaintext to start.
|
||||
Calls `Aes256CbcEnc::new(key.into(), iv.into()).encrypt_padded_mut::<Pkcs7>(&mut buf, plaintext.len())`.
|
||||
Returns the buffer (full encrypted_size bytes).
|
||||
- `decrypt_data(ciphertext: &[u8], key: &[u8; 32], iv: &[u8; 16]) -> anyhow::Result<Vec<u8>>`:
|
||||
Allocates mutable buffer from ciphertext.
|
||||
Calls `Aes256CbcDec::new(key.into(), iv.into()).decrypt_padded_mut::<Pkcs7>(&mut buf)`.
|
||||
Returns decrypted data as Vec<u8>.
|
||||
- `compute_hmac(key: &[u8; 32], iv: &[u8; 16], ciphertext: &[u8]) -> [u8; 32]`:
|
||||
Creates HmacSha256, updates with iv then ciphertext. Returns `finalize().into_bytes().into()`.
|
||||
HMAC input = IV (16 bytes) || ciphertext (encrypted_size bytes). Nothing else.
|
||||
- `verify_hmac(key: &[u8; 32], iv: &[u8; 16], ciphertext: &[u8], expected: &[u8; 32]) -> bool`:
|
||||
Creates HmacSha256, updates with iv then ciphertext. Uses `verify_slice(expected)` for constant-time comparison. Returns true on success.
|
||||
- `sha256_hash(data: &[u8]) -> [u8; 32]`:
|
||||
Returns `sha2::Sha256::digest(data).into()`.
|
||||
|
||||
CRITICAL: Use `hmac::Mac` trait for `new_from_slice()`, `update()`, `finalize()`, `verify_slice()`.
|
||||
CRITICAL: encrypted_size formula: `((input_len / 16) + 1) * 16` -- PKCS7 ALWAYS adds at least 1 byte.
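
The size formula can be checked in isolation. The sketch below is illustrative only: `pkcs7_pad` is a hypothetical helper that shows what the padding looks like, whereas the real pipeline would rely on the `cbc` crate's `Pkcs7` padding rather than padding by hand.

```rust
/// Ciphertext length for AES-CBC with PKCS7 padding (16-byte blocks).
/// Input that is already block-aligned still gains one full block of padding.
pub fn encrypted_size(input_len: usize) -> usize {
    (input_len / 16 + 1) * 16
}

/// Hypothetical helper: pads `data` by hand so the rule is visible.
/// Each of the `pad` appended bytes has the value `pad` (1..=16).
pub fn pkcs7_pad(data: &[u8]) -> Vec<u8> {
    let pad = 16 - data.len() % 16;
    let mut out = data.to_vec();
    out.extend(std::iter::repeat(pad as u8).take(pad));
    out
}
```

A 16-byte input therefore encrypts to 32 bytes, which is the case most often missed when sizing the buffer passed to `encrypt_padded_mut`.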
|
||||
|
||||
3. Create `src/compression.rs`:
|
||||
|
||||
Functions:
|
||||
- `compress(data: &[u8]) -> anyhow::Result<Vec<u8>>`:
|
||||
Uses `flate2::write::GzEncoder` with `Compression::default()`.
|
||||
IMPORTANT: Use `GzBuilder::new().mtime(0)` to zero the gzip timestamp for reproducible output in tests.
|
||||
Writes all data, finishes encoder, returns compressed bytes.
|
||||
- `decompress(data: &[u8]) -> anyhow::Result<Vec<u8>>`:
|
||||
Uses `flate2::read::GzDecoder`. Reads all bytes to a Vec<u8>.
|
||||
- `should_compress(filename: &str, no_compress_list: &[String]) -> bool`:
|
||||
Returns false if filename matches any entry in no_compress_list (by suffix or exact match).
|
||||
Returns false for known compressed extensions: apk, zip, gz, bz2, xz, zst, png, jpg, jpeg, gif, webp, mp4, mp3, aac, ogg, flac, 7z, rar, jar.
|
||||
Returns true otherwise.
|
||||
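Uses `filename.rsplit('.').next()` for extension extraction. Note that this returns the whole name when it contains no dot, so a file literally named `zip` would be treated as compressed; guard that case (for example with `rsplit_once('.')`).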
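
One possible shape for `should_compress`, as a sketch rather than the definitive implementation. Two choices here are assumptions of the sketch and not stated in the plan: extensions are compared case-insensitively, and `rsplit_once('.')` is used instead of `rsplit('.').next()` so that a name without a dot is treated as having no extension.

```rust
/// Extensions listed in the plan as already compressed.
const COMPRESSED_EXTENSIONS: &[&str] = &[
    "apk", "zip", "gz", "bz2", "xz", "zst", "png", "jpg", "jpeg", "gif",
    "webp", "mp4", "mp3", "aac", "ogg", "flac", "7z", "rar", "jar",
];

pub fn should_compress(filename: &str, no_compress_list: &[String]) -> bool {
    // User-supplied patterns: exact match or suffix match.
    if no_compress_list
        .iter()
        .any(|p| filename == p.as_str() || filename.ends_with(p.as_str()))
    {
        return false;
    }
    match filename.rsplit_once('.') {
        // Assumption: case-insensitive comparison, so "PHOTO.JPG" is skipped too.
        Some((_, ext)) => !COMPRESSED_EXTENSIONS
            .iter()
            .any(|known| known.eq_ignore_ascii_case(ext)),
        // No dot means no extension, so compress.
        None => true,
    }
}
```

With this shape, `should_compress("app.apk", &[])` is false while `should_compress("notes.txt", &[])` is true, and passing `--no-compress .raw` would skip any file whose name ends in `.raw`.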
|
||||
|
||||
IMPORTANT: Use Context7 to verify the `aes`, `cbc`, `hmac`, `sha2`, `flate2`, and `rand` crate APIs before writing. Resolve library IDs and query docs for encrypt/decrypt patterns, HMAC usage, and GzEncoder/GzDecoder usage.
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/nick/Projects/Rust/encrypted_archive && cargo build 2>&1 && cargo run -- --help 2>&1</automated>
|
||||
<manual>Review format.rs field order against FORMAT.md sections 4 and 5 to confirm byte-level match</manual>
|
||||
<sampling_rate>run after this task commits</sampling_rate>
|
||||
</verify>
|
||||
<done>
|
||||
- cargo build succeeds with no errors
|
||||
- format.rs exports Header, TocEntry, MAGIC, VERSION, HEADER_SIZE, and all read/write functions
|
||||
- crypto.rs exports encrypt_data, decrypt_data, compute_hmac, verify_hmac, sha256_hash, generate_iv
|
||||
- compression.rs exports compress, decompress, should_compress
|
||||
- Header serialization writes exactly 40 bytes with correct field order per FORMAT.md Section 4
|
||||
- TocEntry serialization writes exactly (101 + name_length) bytes per FORMAT.md Section 5
|
||||
- All multi-byte integers use little-endian encoding
|
||||
- encrypt_data output size matches formula: ((input_len / 16) + 1) * 16
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
- `cargo build` succeeds with zero errors
|
||||
- `cargo run -- --help` shows pack, unpack, inspect subcommands
|
||||
- All 7 source files exist: main.rs, cli.rs, key.rs, format.rs, crypto.rs, compression.rs, archive.rs
|
||||
- format.rs Header struct has fields matching FORMAT.md Section 4 (magic, version, flags, file_count, toc_offset, toc_size, toc_iv, reserved)
|
||||
- format.rs TocEntry struct has fields matching FORMAT.md Section 5 (name, original_size, compressed_size, encrypted_size, data_offset, iv, hmac, sha256, compression_flag, padding_after)
|
||||
- crypto.rs uses `cbc::Encryptor<aes::Aes256>` (NOT deprecated block-modes)
|
||||
- crypto.rs uses `rand::rng()` (NOT thread_rng)
|
||||
- crypto.rs HMAC input is IV || ciphertext only
|
||||
- compression.rs uses `GzBuilder::new().mtime(0)` for reproducible gzip
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
A compiling Rust project with complete module structure where every building-block operation (format read/write, encrypt/decrypt, HMAC compute/verify, SHA-256 hash, compress/decompress) is implemented and ready for the pack/unpack/inspect commands to orchestrate.
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/02-core-archiver/02-01-SUMMARY.md`
|
||||
</output>
|
||||
225
.planning/phases/02-core-archiver/02-02-PLAN.md
Normal file
@@ -0,0 +1,225 @@
|
||||
---
phase: 02-core-archiver
plan: 02
type: execute
wave: 2
depends_on: ["02-01"]
files_modified:
  - src/archive.rs
  - src/main.rs
autonomous: true
requirements: [CLI-02, CLI-03]

must_haves:
  truths:
    - "Running `encrypted_archive pack file1 file2 -o out.bin` produces a valid archive file"
    - "The output archive file starts with magic bytes 0x00 0xEA 0x72 0x63"
    - "Running `encrypted_archive inspect out.bin` shows file count, names, and sizes without decryption"
    - "Running `encrypted_archive unpack out.bin -o outdir/` extracts files identical to originals"
    - "Each file in the archive has a unique random IV (no IV reuse)"
    - "HMAC is verified before decryption during unpack (reject tampered files)"
    - "SHA-256 is verified after decompression during unpack (detect corruption)"
    - "The output file is not recognized by `file` command (no standard signatures)"
    - "Already-compressed files (APK) are stored without gzip compression"
  artifacts:
    - path: "src/archive.rs"
      provides: "pack(), unpack(), inspect() orchestration functions"
      exports: ["pack", "unpack", "inspect"]
      min_lines: 150
    - path: "src/main.rs"
      provides: "CLI dispatch wiring commands to archive functions"
      contains: "archive::pack"
  key_links:
    - from: "src/archive.rs"
      to: "src/format.rs"
      via: "writes header and TOC entries using format module"
      pattern: "format::write_header|format::write_toc_entry|format::read_header"
    - from: "src/archive.rs"
      to: "src/crypto.rs"
      via: "encrypts/decrypts file data and computes/verifies HMAC"
      pattern: "crypto::encrypt_data|crypto::decrypt_data|crypto::compute_hmac|crypto::verify_hmac"
    - from: "src/archive.rs"
      to: "src/compression.rs"
      via: "compresses/decompresses file data based on should_compress"
      pattern: "compression::compress|compression::decompress|compression::should_compress"
    - from: "src/main.rs"
      to: "src/archive.rs"
      via: "dispatches CLI commands to archive functions"
      pattern: "archive::pack|archive::unpack|archive::inspect"
---
|
||||
|
||||
<objective>
|
||||
Implement the three archive commands (pack, unpack, inspect) and wire them to the CLI, producing a fully functional encrypted archive tool.
|
||||
|
||||
Purpose: This is the core deliverable of Phase 2 -- the working archiver. Pack produces archives conforming to FORMAT.md, inspect reads metadata without decryption, and unpack extracts files with full HMAC and SHA-256 verification.
|
||||
|
||||
Output: A working `encrypted_archive` binary with pack, unpack, and inspect commands that produce and consume archives matching the FORMAT.md specification.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@/home/nick/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@/home/nick/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/PROJECT.md
|
||||
@.planning/ROADMAP.md
|
||||
@.planning/STATE.md
|
||||
@.planning/phases/02-core-archiver/02-RESEARCH.md
|
||||
@.planning/phases/02-core-archiver/02-01-SUMMARY.md
|
||||
@docs/FORMAT.md
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 1: Implement pack, inspect, and unpack commands</name>
|
||||
<files>src/archive.rs, src/main.rs</files>
|
||||
<action>
|
||||
1. Implement `pack()` in `src/archive.rs`:
|
||||
|
||||
Signature: `pub fn pack(files: &[PathBuf], output: &Path, no_compress: &[String]) -> anyhow::Result<()>`
|
||||
|
||||
Algorithm (two-pass archive writing per research Pattern 2):
|
||||
|
||||
**Pass 1 -- Process all files:**
|
||||
For each input file:
|
||||
a. Read entire file into memory (per Out of Scope: no streaming).
|
||||
b. Validate file size <= u32::MAX. If exceeded, return error: "File too large: {name} ({size} bytes exceeds 4 GB limit)".
|
||||
c. Compute SHA-256 of original data: `crypto::sha256_hash(&data)`.
|
||||
d. Determine compression: `compression::should_compress(&filename, no_compress)`.
|
||||
e. If compressing: `compressed = compression::compress(&data)?`, set compression_flag = 1. Else: `compressed = data.clone()`, set compression_flag = 0.
|
||||
f. Generate random IV: `crypto::generate_iv()`.
|
||||
g. Encrypt: `ciphertext = crypto::encrypt_data(&compressed, &KEY, &iv)`.
|
||||
h. Compute HMAC: `hmac = crypto::compute_hmac(&KEY, &iv, &ciphertext)`.
|
||||
i. Store results in a `ProcessedFile` struct (or equivalent) with: name (the file's base name, which is what the `diff` paths in the verify step assume), original_size, compressed_size, encrypted_size, iv, hmac, sha256, compression_flag, ciphertext.
|
||||
|
||||
**Pass 2 -- Compute offsets and write archive:**
|
||||
a. Compute TOC size: sum of (101 + name.len()) for each file.
|
||||
b. Set toc_offset = HEADER_SIZE (40).
|
||||
c. Compute data offsets:
|
||||
- First file: data_offset = toc_offset + toc_size
|
||||
- Each subsequent: previous data_offset + previous encrypted_size
|
||||
d. Determine flags byte: if ANY file has compression_flag == 1, set bit 0 (0x01). Bits 1-3 are 0 (no obfuscation in Phase 2). Bits 4-7 MUST be 0.
|
||||
e. Create Header: version=1, flags, file_count, toc_offset, toc_size, toc_iv=[0u8;16], reserved=[0u8;8].
|
||||
f. Open output file for writing.
|
||||
g. Write header using `format::write_header()`.
|
||||
h. Write TOC entries using `format::write_toc_entry()` for each file (with computed data_offset).
|
||||
i. Write data blocks: for each file, write ciphertext bytes.
|
||||
j. Print summary: "Packed {N} files into {output} ({total_bytes} bytes)".
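
The offset arithmetic in Pass 2 (steps a through d) can be sketched as a pure function. `FileMeta`, `Layout`, and `compute_layout` are hypothetical names for illustration, and the checked arithmetic is an addition of this sketch: the plan only validates individual file sizes against `u32::MAX`, but the running total of offsets can also overflow.

```rust
pub const HEADER_SIZE: u32 = 40;

/// Hypothetical per-file summary: only what the layout step needs.
pub struct FileMeta {
    pub name_len: u32,       // byte length of the stored name
    pub encrypted_size: u32, // ciphertext length
    pub compression_flag: u8,
}

pub struct Layout {
    pub flags: u8,
    pub toc_offset: u32,
    pub toc_size: u32,
    pub data_offsets: Vec<u32>,
}

/// Returns None if any offset would overflow u32.
pub fn compute_layout(files: &[FileMeta]) -> Option<Layout> {
    let mut toc_size = 0u32;
    let mut flags = 0u8;
    for f in files {
        toc_size = toc_size.checked_add(101)?.checked_add(f.name_len)?;
        if f.compression_flag == 1 {
            flags |= 0x01; // bit 0: at least one file is compressed
        }
    }
    let toc_offset = HEADER_SIZE;
    // The first data block starts right after the TOC.
    let mut next = toc_offset.checked_add(toc_size)?;
    let mut data_offsets = Vec::with_capacity(files.len());
    for f in files {
        data_offsets.push(next);
        next = next.checked_add(f.encrypted_size)?;
    }
    Some(Layout { flags, toc_offset, toc_size, data_offsets })
}
```

For two files with 17-byte and 18-byte names, the TOC is 237 bytes, so the first data block sits at offset 277 and the second follows immediately after the first ciphertext.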
|
||||
|
||||
Import KEY from `crate::key::KEY`.
|
||||
|
||||
2. Implement `inspect()` in `src/archive.rs`:
|
||||
|
||||
Signature: `pub fn inspect(archive: &Path) -> anyhow::Result<()>`
|
||||
|
||||
Algorithm:
|
||||
a. Open archive file for reading.
|
||||
b. Read header using `format::read_header()`. This validates magic, version, flags.
|
||||
c. Read TOC entries using `format::read_toc()`.
|
||||
d. Print header info:
|
||||
```
Archive: {filename}
Version: {version}
Flags: 0x{flags:02X}
Files: {file_count}
TOC offset: {toc_offset}
TOC size: {toc_size}
```
|
||||
e. For each file entry, print:
|
||||
```
[{i}] {name}
Original: {original_size} bytes
Compressed: {compressed_size} bytes
Encrypted: {encrypted_size} bytes
Offset: {data_offset}
Compression: {yes/no}
IV: {hex}
HMAC: {hex}
SHA-256: {hex}
```
|
||||
f. Print total: "Total original size: {sum} bytes".
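
The `{hex}` placeholders in the inspect output need a small formatter, since the dependency list in Plan 01 does not include a hex crate. A minimal sketch follows; `to_hex` is a hypothetical helper name, and lowercase output is an assumption because the plan does not specify the case.

```rust
/// Hypothetical helper: renders bytes as lowercase hex, two digits per byte.
pub fn to_hex(bytes: &[u8]) -> String {
    let mut s = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        s.push_str(&format!("{:02x}", b));
    }
    s
}
```

A 16-byte IV renders as 32 characters, and the 32-byte HMAC and SHA-256 values as 64 characters each.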
|
||||
|
||||
3. Implement `unpack()` in `src/archive.rs`:
|
||||
|
||||
Signature: `pub fn unpack(archive: &Path, output_dir: &Path) -> anyhow::Result<()>`
|
||||
|
||||
Algorithm (follows FORMAT.md Section 10 decode order):
|
||||
a. Open archive file for reading.
|
||||
b. Read header using `format::read_header()`. Validates magic, version, flags.
|
||||
c. Read TOC entries using `format::read_toc()`.
|
||||
d. Create output directory if it doesn't exist.
|
||||
e. Track error count for per-file failures.
|
||||
f. For each file entry:
|
||||
- Seek to data_offset. Read encrypted_size bytes (ciphertext).
|
||||
- **Verify HMAC FIRST**: `crypto::verify_hmac(&KEY, &entry.iv, &ciphertext, &entry.hmac)`.
|
||||
If HMAC fails: print error "HMAC verification failed for {name}, skipping", increment error count, continue to next file.
|
||||
- Decrypt: `decrypted = crypto::decrypt_data(&ciphertext, &KEY, &entry.iv)?`.
|
||||
- Decompress if compression_flag == 1: `decompressed = compression::decompress(&decrypted)?`.
|
||||
Else: `decompressed = decrypted`.
|
||||
- **Verify SHA-256**: Compare `crypto::sha256_hash(&decompressed)` with `entry.sha256`.
|
||||
If mismatch: print warning "SHA-256 mismatch for {name} (data may be corrupted)", increment error count. Still write the file.
|
||||
- Create parent directories if name contains `/`.
|
||||
- Write decompressed data to `output_dir/entry.name`.
|
||||
- Print "Extracted: {name} ({original_size} bytes)".
|
||||
g. Print summary: "Extracted {success_count}/{file_count} files".
|
||||
h. If error_count > 0: return Err with message "{error_count} file(s) had verification errors".
|
||||
|
||||
Per research open question #2: Abort on header/TOC errors. For per-file errors (HMAC/SHA-256), report and continue with non-zero exit code.
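
The first step of the per-file loop, seeking to `data_offset` and reading `encrypted_size` bytes, could look like the sketch below. `read_data_block` is a hypothetical helper name. Note that the buffer size comes from the TOC, which is not authenticated at this point, so a real implementation may want to compare it against the archive length before allocating.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper: returns the ciphertext for one TOC entry.
/// Fails with UnexpectedEof if the archive is shorter than the TOC claims.
pub fn read_data_block<R: Read + Seek>(
    reader: &mut R,
    data_offset: u32,
    encrypted_size: u32,
) -> io::Result<Vec<u8>> {
    reader.seek(SeekFrom::Start(data_offset as u64))?;
    let mut buf = vec![0u8; encrypted_size as usize];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

Because the function is generic over `Read + Seek`, it works with a `std::fs::File` in the tool and with a `std::io::Cursor<Vec<u8>>` in tests. Under the error policy above, a truncated data block would be a reasonable candidate for the per-file "report and continue" path rather than an abort, though the plan does not say so explicitly.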
|
||||
|
||||
4. Wire up `src/main.rs`:
|
||||
Replace placeholder functions with actual calls:
|
||||
```rust
match cli.command {
    Commands::Pack { files, output, no_compress } => {
        archive::pack(&files, &output, &no_compress)?;
    }
    Commands::Unpack { archive, output_dir } => {
        archive::unpack(&archive, &output_dir)?;
    }
    Commands::Inspect { archive } => {
        archive::inspect(&archive)?;
    }
}
```
|
||||
|
||||
IMPORTANT: Use `std::io::Seek` and `SeekFrom::Start(offset as u64)` for seeking to data_offset during unpack.
|
||||
IMPORTANT: Use `std::fs::create_dir_all()` for creating output directories.
|
||||
IMPORTANT: filename from TocEntry may contain path separators -- sanitize to prevent directory traversal (reject names starting with `/` or containing `..`).
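
A sketch of the traversal check, under the assumption that a component-based test is acceptable. `safe_output_path` is a hypothetical helper name. It differs slightly from the literal rule above: it rejects `..` only when it appears as a whole path component, so a name like `notes..txt` is still accepted, and it also rejects empty names.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: joins `name` onto `output_dir`, or returns None
/// if the name is absolute, empty, or contains a `..` component.
pub fn safe_output_path(output_dir: &Path, name: &str) -> Option<PathBuf> {
    let mut out = output_dir.to_path_buf();
    let mut pushed = false;
    for component in Path::new(name).components() {
        match component {
            Component::Normal(part) => {
                out.push(part);
                pushed = true;
            }
            // A leading "./" is harmless and simply skipped.
            Component::CurDir => {}
            // Absolute paths, drive prefixes, and ".." could escape output_dir.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    if pushed {
        Some(out)
    } else {
        None
    }
}
```

In `unpack()`, a `None` result would be treated like any other per-file error: report the name, increment the error count, and continue with the next entry.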
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/nick/Projects/Rust/encrypted_archive && cargo build 2>&1 && echo "--- Build OK ---" && echo "Hello, World!" > /tmp/ea_test_hello.txt && printf '\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10' > /tmp/ea_test_binary.bin && cargo run -- pack /tmp/ea_test_hello.txt /tmp/ea_test_binary.bin -o /tmp/ea_test_archive.bin 2>&1 && echo "--- Pack OK ---" && cargo run -- inspect /tmp/ea_test_archive.bin 2>&1 && echo "--- Inspect OK ---" && mkdir -p /tmp/ea_test_out && cargo run -- unpack /tmp/ea_test_archive.bin -o /tmp/ea_test_out 2>&1 && echo "--- Unpack OK ---" && diff /tmp/ea_test_hello.txt /tmp/ea_test_out/ea_test_hello.txt && diff /tmp/ea_test_binary.bin /tmp/ea_test_out/ea_test_binary.bin && echo "--- Round-trip OK: files identical ---" && file /tmp/ea_test_archive.bin 2>&1 && echo "--- file command check (should say 'data' not a known format) ---" && rm -rf /tmp/ea_test_*</automated>
|
||||
<manual>Verify inspect output shows correct file metadata, verify round-trip produces identical files</manual>
|
||||
<sampling_rate>run after this task commits, before declaring plan complete</sampling_rate>
|
||||
</verify>
|
||||
<done>
|
||||
- `encrypted_archive pack` accepts multiple input files and produces a single archive
|
||||
- `encrypted_archive inspect` displays file metadata (names, sizes, offsets, IVs, HMACs, SHA-256) without decryption
|
||||
- `encrypted_archive unpack` extracts all files, verifying HMAC before decryption and SHA-256 after decompression
|
||||
- Round-trip test: pack 2 files, unpack, diff shows files are byte-identical
|
||||
- Archive file is not recognized by `file` command (shows "data" not a known format)
|
||||
- HMAC verification happens before decryption (encrypt-then-MAC correctly implemented)
|
||||
- Files with compressed extensions (apk, zip, etc.) are stored without gzip compression
|
||||
- Error handling: HMAC failure skips file and continues, SHA-256 mismatch warns but still writes
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
- `cargo build` succeeds
|
||||
- Pack 2 test files (text + binary), unpack to different directory, diff shows byte-identical
|
||||
- `file archive.bin` does NOT show a recognized format (should show "data")
|
||||
- `inspect` shows correct file count, names, sizes, and hex-encoded IVs/HMACs/SHA-256
|
||||
- Compression auto-detection: `.apk` file should have compression_flag=0 in inspect output
|
||||
- Unpack with tampered archive (flip a byte in ciphertext) should report HMAC failure but continue
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
A fully functional `encrypted_archive` binary where `pack` creates archives matching FORMAT.md, `inspect` displays metadata, and `unpack` extracts with full HMAC and SHA-256 verification. Round-trip fidelity: packed files are byte-identical after unpacking.
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/02-core-archiver/02-02-SUMMARY.md`
|
||||
</output>
|
||||