android-encrypted-archiver/.planning/phases/02-core-archiver/02-02-PLAN.md
2026-02-24 23:49:46 +03:00


---
phase: 02-core-archiver
plan: 02
type: execute
wave: 2
depends_on: ["02-01"]
files_modified:
- src/archive.rs
- src/main.rs
autonomous: true
requirements: [CLI-02, CLI-03]
must_haves:
  truths:
    - "Running `encrypted_archive pack file1 file2 -o out.bin` produces a valid archive file"
    - "The output archive file starts with magic bytes 0x00 0xEA 0x72 0x63"
    - "Running `encrypted_archive inspect out.bin` shows file count, names, and sizes without decryption"
    - "Running `encrypted_archive unpack out.bin -o outdir/` extracts files identical to originals"
    - "Each file in the archive has a unique random IV (no IV reuse)"
    - "HMAC is verified before decryption during unpack (reject tampered files)"
    - "SHA-256 is verified after decompression during unpack (detect corruption)"
    - "The output file is not recognized by `file` command (no standard signatures)"
    - "Already-compressed files (APK) are stored without gzip compression"
  artifacts:
    - path: "src/archive.rs"
      provides: "pack(), unpack(), inspect() orchestration functions"
      exports: ["pack", "unpack", "inspect"]
      min_lines: 150
    - path: "src/main.rs"
      provides: "CLI dispatch wiring commands to archive functions"
      contains: "archive::pack"
  key_links:
    - from: "src/archive.rs"
      to: "src/format.rs"
      via: "writes header and TOC entries using format module"
      pattern: "format::write_header|format::write_toc_entry|format::read_header"
    - from: "src/archive.rs"
      to: "src/crypto.rs"
      via: "encrypts/decrypts file data and computes/verifies HMAC"
      pattern: "crypto::encrypt_data|crypto::decrypt_data|crypto::compute_hmac|crypto::verify_hmac"
    - from: "src/archive.rs"
      to: "src/compression.rs"
      via: "compresses/decompresses file data based on should_compress"
      pattern: "compression::compress|compression::decompress|compression::should_compress"
    - from: "src/main.rs"
      to: "src/archive.rs"
      via: "dispatches CLI commands to archive functions"
      pattern: "archive::pack|archive::unpack|archive::inspect"
---
<objective>
Implement the three archive commands (pack, unpack, inspect) and wire them to the CLI, producing a fully functional encrypted archive tool.
Purpose: This is the core deliverable of Phase 2 -- the working archiver. Pack produces archives conforming to FORMAT.md, inspect reads metadata without decryption, and unpack extracts files with full HMAC and SHA-256 verification.
Output: A working `encrypted_archive` binary with pack, unpack, and inspect commands that produce and consume archives matching the FORMAT.md specification.
</objective>
<execution_context>
@/home/nick/.claude/get-shit-done/workflows/execute-plan.md
@/home/nick/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/02-core-archiver/02-RESEARCH.md
@.planning/phases/02-core-archiver/02-01-SUMMARY.md
@docs/FORMAT.md
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement pack, inspect, and unpack commands</name>
<files>src/archive.rs, src/main.rs</files>
<action>
1. Implement `pack()` in `src/archive.rs`:
Signature: `pub fn pack(files: &[PathBuf], output: &Path, no_compress: &[String]) -> anyhow::Result<()>`
Algorithm (two-pass archive writing per research Pattern 2):
**Pass 1 -- Process all files:**
For each input file:
a. Read entire file into memory (per Out of Scope: no streaming).
b. Validate file size <= u32::MAX. If exceeded, return error: "File too large: {name} ({size} bytes exceeds 4 GB limit)".
c. Compute SHA-256 of original data: `crypto::sha256_hash(&data)`.
d. Determine compression: `compression::should_compress(&filename, no_compress)`.
e. If compressing: `compressed = compression::compress(&data)?`, set compression_flag = 1. Else: `compressed = data.clone()`, set compression_flag = 0.
f. Generate random IV: `crypto::generate_iv()`.
g. Encrypt: `ciphertext = crypto::encrypt_data(&compressed, &KEY, &iv)`.
h. Compute HMAC: `hmac = crypto::compute_hmac(&KEY, &iv, &ciphertext)`.
i. Store results in a `ProcessedFile` struct (or equivalent) with: name, original_size, compressed_size, encrypted_size, iv, hmac, sha256, compression_flag, ciphertext.
**Pass 2 -- Compute offsets and write archive:**
a. Compute TOC size: sum of (101 + name.len()) for each file.
b. Set toc_offset = HEADER_SIZE (40).
c. Compute data offsets:
- First file: data_offset = toc_offset + toc_size
- Each subsequent: previous data_offset + previous encrypted_size
d. Determine flags byte: if ANY file has compression_flag == 1, set bit 0 (0x01). Bits 1-3 are 0 (no obfuscation in Phase 2). Bits 4-7 MUST be 0.
e. Create Header: version=1, flags, file_count, toc_offset, toc_size, toc_iv=[0u8;16], reserved=[0u8;8].
f. Open output file for writing.
g. Write header using `format::write_header()`.
h. Write TOC entries using `format::write_toc_entry()` for each file (with computed data_offset).
i. Write data blocks: for each file, write ciphertext bytes.
j. Print summary: "Packed {N} files into {output} ({total_bytes} bytes)".
Import KEY from `crate::key::KEY`.
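The offset arithmetic in Pass 2 (steps a through d) can be sketched as below. This is a minimal illustration, assuming the 40-byte header and the 101-byte fixed TOC entry overhead stated above. `Processed` and `layout` are hypothetical names introduced for this example, not the plan's actual types.

```rust
/// Hypothetical stand-in for the per-file results of Pass 1 (illustration only).
struct Processed {
    name: String,
    encrypted_size: u32,
    compression_flag: u8,
}

const HEADER_SIZE: u64 = 40; // fixed header size from the plan
const TOC_ENTRY_FIXED: u64 = 101; // fixed bytes per TOC entry, excluding the name

/// Returns (toc_size, flags, data_offsets), or None if the archive layout
/// would not fit in the format's u32 offset fields.
fn layout(files: &[Processed]) -> Option<(u32, u8, Vec<u32>)> {
    // TOC size: sum of (101 + name.len()) over all files.
    let toc_size: u64 = files
        .iter()
        .map(|f| TOC_ENTRY_FIXED + f.name.len() as u64)
        .sum();

    // Bit 0 is set if any file is compressed; all other bits stay 0.
    let flags: u8 = if files.iter().any(|f| f.compression_flag == 1) {
        0x01
    } else {
        0x00
    };

    // The first data block starts right after the TOC; each later block
    // starts where the previous one ends.
    let mut offsets = Vec::with_capacity(files.len());
    let mut next = HEADER_SIZE + toc_size;
    for f in files {
        if next > u64::from(u32::MAX) {
            return None;
        }
        offsets.push(next as u32);
        next += u64::from(f.encrypted_size);
    }
    if toc_size > u64::from(u32::MAX) {
        return None;
    }
    Some((toc_size as u32, flags, offsets))
}

fn main() {
    let files = vec![
        Processed { name: "hello.txt".into(), encrypted_size: 48, compression_flag: 1 },
        Processed { name: "app.apk".into(), encrypted_size: 32, compression_flag: 0 },
    ];
    let (toc_size, flags, offsets) = layout(&files).expect("layout fits in u32");
    println!("toc_size={} flags=0x{:02X} offsets={:?}", toc_size, flags, offsets);
}
```

Doing the sums in `u64` and checking against `u32::MAX` before narrowing is one way to avoid silent wraparound. Whether the real implementation should return an error here is left to the plan's error handling.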
2. Implement `inspect()` in `src/archive.rs`:
Signature: `pub fn inspect(archive: &Path) -> anyhow::Result<()>`
Algorithm:
a. Open archive file for reading.
b. Read header using `format::read_header()`. This validates magic, version, flags.
c. Read TOC entries using `format::read_toc()`.
d. Print header info:
```
Archive: {filename}
Version: {version}
Flags: 0x{flags:02X}
Files: {file_count}
TOC offset: {toc_offset}
TOC size: {toc_size}
```
e. For each file entry, print:
```
[{i}] {name}
Original: {original_size} bytes
Compressed: {compressed_size} bytes
Encrypted: {encrypted_size} bytes
Offset: {data_offset}
Compression: {yes/no}
IV: {hex}
HMAC: {hex}
SHA-256: {hex}
```
f. Print total: "Total original size: {sum} bytes".
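The `{hex}` fields in the inspect output need a byte-to-hex formatter. A minimal sketch using only the standard library follows. The helper name `to_hex` and the choice of lowercase digits are assumptions, since the plan does not specify either.

```rust
/// Formats a byte slice as hex, two lowercase digits per byte.
/// (Hypothetical helper; lowercase is an assumption, not a plan requirement.)
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

fn main() {
    // Example: a 16-byte IV as it might appear in a TOC entry.
    let iv = [0xA5u8; 16];
    println!("IV: {}", to_hex(&iv));
}
```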
3. Implement `unpack()` in `src/archive.rs`:
Signature: `pub fn unpack(archive: &Path, output_dir: &Path) -> anyhow::Result<()>`
Algorithm (follows FORMAT.md Section 10 decode order):
a. Open archive file for reading.
b. Read header using `format::read_header()`. Validates magic, version, flags.
c. Read TOC entries using `format::read_toc()`.
d. Create output directory if it doesn't exist.
e. Track error count for per-file failures.
f. For each file entry:
- Seek to data_offset. Read encrypted_size bytes (ciphertext).
- **Verify HMAC FIRST**: `crypto::verify_hmac(&KEY, &entry.iv, &ciphertext, &entry.hmac)`.
If HMAC fails: print error "HMAC verification failed for {name}, skipping", increment error count, continue to next file.
- Decrypt: `decrypted = crypto::decrypt_data(&ciphertext, &KEY, &entry.iv)?`.
- Decompress if compression_flag == 1: `decompressed = compression::decompress(&decrypted)?`.
Else: `decompressed = decrypted`.
- **Verify SHA-256**: Compare `crypto::sha256_hash(&decompressed)` with `entry.sha256`.
If mismatch: print warning "SHA-256 mismatch for {name} (data may be corrupted)", increment error count. Still write the file.
- Create parent directories if name contains `/`.
- Write decompressed data to `output_dir/entry.name`.
- Print "Extracted: {name} ({original_size} bytes)".
g. Print summary: "Extracted {success_count}/{file_count} files".
h. If error_count > 0: return Err with message "{error_count} file(s) had verification errors".
Per research open question #2: Abort on header/TOC errors. For per-file errors (HMAC/SHA-256), report and continue with non-zero exit code.
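The per-file error policy (steps e, g, and h) can be sketched as a small accounting function. `Outcome` and `summarize` are hypothetical names used only for this illustration. The sketch also assumes that `success_count` counts files that passed both checks, which the plan does not define explicitly.

```rust
/// Hypothetical per-file result, used only for this illustration.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Outcome {
    Verified,    // HMAC and SHA-256 both passed; file written
    HmacFailed,  // file skipped, nothing written
    ShaMismatch, // file written, but counted as an error
}

/// Prints the summary line and returns Err if any file had a verification error.
fn summarize(outcomes: &[Outcome]) -> Result<usize, String> {
    let error_count = outcomes
        .iter()
        .filter(|o| **o != Outcome::Verified)
        .count();
    // Assumption: success_count excludes both skipped and mismatched files.
    let success_count = outcomes.len() - error_count;
    println!("Extracted {}/{} files", success_count, outcomes.len());
    if error_count > 0 {
        Err(format!("{} file(s) had verification errors", error_count))
    } else {
        Ok(success_count)
    }
}

fn main() {
    let outcomes = [Outcome::Verified, Outcome::HmacFailed, Outcome::ShaMismatch];
    if let Err(msg) = summarize(&outcomes) {
        eprintln!("{}", msg);
    }
}
```

Returning `Err` only after the loop finishes means every file is processed, while the caller still exits with a non-zero code.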
4. Wire up `src/main.rs`:
Replace placeholder functions with actual calls:
```rust
match cli.command {
    Commands::Pack { files, output, no_compress } => {
        archive::pack(&files, &output, &no_compress)?;
    }
    Commands::Unpack { archive, output_dir } => {
        archive::unpack(&archive, &output_dir)?;
    }
    Commands::Inspect { archive } => {
        archive::inspect(&archive)?;
    }
}
```
IMPORTANT: Use `std::io::Seek` and `SeekFrom::Start(offset as u64)` for seeking to data_offset during unpack.
IMPORTANT: Use `std::fs::create_dir_all()` for creating output directories.
IMPORTANT: The filename from TocEntry may contain path separators. Sanitize it before joining it with output_dir to prevent directory traversal (reject names starting with `/` or containing `..`).
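One possible way to implement the directory-traversal check is to inspect path components rather than search the raw string. The sketch below assumes Unix-style `/` separators in entry names. `safe_join` is a hypothetical helper, not part of the plan. It rejects `..` only as a whole path component, so a name like `a..b` is allowed, which is slightly more permissive than a plain substring check.

```rust
use std::path::{Component, Path, PathBuf};

/// Joins an archive entry name onto `output_dir`, rejecting names that could
/// escape it. Returns None for empty names, absolute paths, and any name with
/// a `..` component. (Hypothetical helper for illustration.)
fn safe_join(output_dir: &Path, name: &str) -> Option<PathBuf> {
    let rel = Path::new(name);
    if name.is_empty() || rel.is_absolute() {
        return None;
    }
    // Accept only plain components; reject `..`, root, and drive prefixes.
    for component in rel.components() {
        match component {
            Component::Normal(_) | Component::CurDir => {}
            _ => return None,
        }
    }
    Some(output_dir.join(rel))
}

fn main() {
    let out = Path::new("outdir");
    for name in ["docs/readme.txt", "/etc/passwd", "../secret"] {
        match safe_join(out, name) {
            Some(path) => println!("{} -> {}", name, path.display()),
            None => println!("{} -> rejected", name),
        }
    }
}
```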
</action>
<verify>
<automated>cd /home/nick/Projects/Rust/encrypted_archive && cargo build 2>&1 && echo "--- Build OK ---" && echo "Hello, World!" > /tmp/ea_test_hello.txt && printf '\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10' > /tmp/ea_test_binary.bin && cargo run -- pack /tmp/ea_test_hello.txt /tmp/ea_test_binary.bin -o /tmp/ea_test_archive.bin 2>&1 && echo "--- Pack OK ---" && cargo run -- inspect /tmp/ea_test_archive.bin 2>&1 && echo "--- Inspect OK ---" && mkdir -p /tmp/ea_test_out && cargo run -- unpack /tmp/ea_test_archive.bin -o /tmp/ea_test_out 2>&1 && echo "--- Unpack OK ---" && diff /tmp/ea_test_hello.txt /tmp/ea_test_out/ea_test_hello.txt && diff /tmp/ea_test_binary.bin /tmp/ea_test_out/ea_test_binary.bin && echo "--- Round-trip OK: files identical ---" && file /tmp/ea_test_archive.bin 2>&1 && echo "--- file command check (should say 'data' not a known format) ---" && rm -rf /tmp/ea_test_*</automated>
<manual>Verify inspect output shows correct file metadata, verify round-trip produces identical files</manual>
<sampling_rate>run after this task commits, before declaring plan complete</sampling_rate>
</verify>
<done>
- `encrypted_archive pack` accepts multiple input files and produces a single archive
- `encrypted_archive inspect` displays file metadata (names, sizes, offsets, IVs, HMACs, SHA-256) without decryption
- `encrypted_archive unpack` extracts all files, verifying HMAC before decryption and SHA-256 after decompression
- Round-trip test: pack 2 files, unpack, diff shows files are byte-identical
- Archive file is not recognized by `file` command (shows "data" not a known format)
- HMAC verification happens before decryption (encrypt-then-MAC correctly implemented)
- Files with compressed extensions (apk, zip, etc.) are stored without gzip compression
- Error handling: HMAC failure skips file and continues, SHA-256 mismatch warns but still writes
</done>
</task>
</tasks>
<verification>
- `cargo build` succeeds
- Pack 2 test files (text + binary), unpack to different directory, diff shows byte-identical
- `file archive.bin` does NOT show a recognized format (should show "data")
- `inspect` shows correct file count, names, sizes, and hex-encoded IVs/HMACs/SHA-256
- Compression auto-detection: `.apk` file should have compression_flag=0 in inspect output
- Unpack with tampered archive (flip a byte in ciphertext) should report HMAC failure but continue
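For the tamper check, a byte in the archive can be flipped in place with a small helper. This is a sketch using only the standard library. `flip_byte` is a hypothetical name, and the caller is assumed to pick an offset inside a ciphertext block (for example, a `data_offset` value reported by `inspect`).

```rust
use std::fs::OpenOptions;
use std::io::{Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Flips the lowest bit of the byte at `offset` in the file at `path`.
/// (Hypothetical test helper; fails if `offset` is past the end of the file.)
fn flip_byte(path: &Path, offset: u64) -> std::io::Result<()> {
    let mut file = OpenOptions::new().read(true).write(true).open(path)?;
    let mut byte = [0u8; 1];
    file.seek(SeekFrom::Start(offset))?;
    file.read_exact(&mut byte)?;
    byte[0] ^= 0x01;
    // Seek back, because the read advanced the cursor by one byte.
    file.seek(SeekFrom::Start(offset))?;
    file.write_all(&byte)?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    // Demonstration on a scratch file rather than a real archive.
    let path = std::env::temp_dir().join("ea_flip_demo.bin");
    std::fs::write(&path, [0x10u8, 0x20, 0x30])?;
    flip_byte(&path, 1)?;
    println!("{:02X?}", std::fs::read(&path)?);
    std::fs::remove_file(&path)?;
    Ok(())
}
```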
</verification>
<success_criteria>
A fully functional `encrypted_archive` binary where `pack` creates archives matching FORMAT.md, `inspect` displays metadata, and `unpack` extracts with full HMAC and SHA-256 verification. Round-trip fidelity: packed files are byte-identical after unpacking.
</success_criteria>
<output>
After completion, create `.planning/phases/02-core-archiver/02-02-SUMMARY.md`
</output>