diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index fa7c8d2..d3079bf 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -141,7 +141,10 @@ Plans:
 2. FORMAT.md defines the Unix permissions field (2 bytes, u16 little-endian) in TOC entries with bit layout matching POSIX mode_t lower 12 bits
 3. FORMAT.md specifies that entry names are relative paths using `/` as separator (e.g., `dir/subdir/file.txt`), replacing the previous filename-only convention
 4. FORMAT.md includes an updated worked example showing a directory archive with at least one nested directory, one file, and one empty directory
-**Plans**: TBD
+**Plans**: 1 plan
+
+Plans:
+- [ ] 07-01-PLAN.md -- Update TOC entry definition (entry_type, permissions, path semantics) and worked example with directory archive
 
 ### Phase 8: Rust Directory Archiver
 **Goal**: `pack` accepts directories and recursively archives them with full path hierarchy and permissions; `unpack` restores the complete directory tree
@@ -202,7 +205,7 @@ Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -> 10
 | 4. Kotlin Decoder | v1.0 | 1/1 | Complete | 2026-02-25 |
 | 5. Shell Decoder | v1.0 | 2/2 | Complete | 2026-02-25 |
 | 6. Obfuscation Hardening | v1.0 | 2/2 | Complete | 2026-02-25 |
-| 7. Format Spec Update | v1.1 | 0/TBD | Not started | - |
+| 7. Format Spec Update | v1.1 | 0/1 | Planned | - |
 | 8. Rust Directory Archiver | v1.1 | 0/TBD | Not started | - |
 | 9. Kotlin Decoder Update | v1.1 | 0/TBD | Not started | - |
 | 10. Shell Decoder Update | v1.1 | 0/TBD | Not started | - |
diff --git a/.planning/phases/07-format-spec-update/07-01-PLAN.md b/.planning/phases/07-format-spec-update/07-01-PLAN.md
new file mode 100644
index 0000000..9e4c029
--- /dev/null
+++ b/.planning/phases/07-format-spec-update/07-01-PLAN.md
@@ -0,0 +1,273 @@
+---
+phase: 07-format-spec-update
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified: [docs/FORMAT.md]
+autonomous: true
+requirements: [FMT-09, FMT-10, FMT-11, FMT-12]
+
+must_haves:
+  truths:
+    - "FORMAT.md defines entry_type field (1 byte, u8) in File Table Entry: 0x00=file, 0x01=directory"
+    - "FORMAT.md defines permissions field (2 bytes, u16 LE) in File Table Entry with POSIX mode_t lower 12 bits"
+    - "FORMAT.md specifies entry names are relative paths using / separator (e.g. dir/subdir/file.txt)"
+    - "FORMAT.md worked example includes a directory archive with nested directory, file inside it, and empty directory"
+    - "FORMAT.md version field is bumped to 2 reflecting the v1.1 format changes"
+    - "Entry size formula is updated to include entry_type (1 byte) and permissions (2 bytes)"
+  artifacts:
+    - path: "docs/FORMAT.md"
+      provides: "Complete v1.1 binary format specification"
+      contains: "entry_type.*u8"
+  key_links:
+    - from: "docs/FORMAT.md Section 5 (File Table Entry)"
+      to: "docs/FORMAT.md Section 12 (Worked Example)"
+      via: "New TOC fields (entry_type, permissions) appear in both definition and worked example"
+      pattern: "entry_type.*permissions"
+---
+
+
+Update FORMAT.md to fully document the v1.1 TOC entry layout with entry type, permission bits, and relative path semantics.
+
+Purpose: All three decoders (Rust, Kotlin, Shell) need an unambiguous specification to build their v1.1 directory support against. This phase updates the normative format document before any code changes.
+
+Output: Updated `docs/FORMAT.md` with v1.1 TOC entry fields and a new worked example showing a directory archive.
+
+
+
+@/home/nick/.claude/get-shit-done/workflows/execute-plan.md
+@/home/nick/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@docs/FORMAT.md
+
+Key decisions from STATE.md:
+- v1.1: No backward compatibility with v1.0 archives (format version bump to 2)
+- v1.1: Only mode bits (no uid/gid, no timestamps, no symlinks)
+- v1.0: Filename-only entry names -- v1.1 changes this to relative paths with `/` separator
+
+Existing FORMAT.md patterns (from Phase 1):
+- Field table pattern: offset, size, type, endian, field name, description for every binary structure
+- Worked example pattern: concrete inputs -> pipeline walkthrough -> hex dump -> shell decode commands
+- Entry size formula: `101 + name_length bytes` per entry
+- All offsets absolute from archive byte 0
+
+
+
+
+
+  Task 1: Update TOC entry definition with entry_type, permissions, and path semantics
+  docs/FORMAT.md
+
+Update docs/FORMAT.md with the following changes. Preserve the existing document structure and style conventions (field tables, notation, etc.).
+
+**1. Version bump (Section 1 and header):**
+- Change document version from "1.0" to "1.1" in the front matter
+- Note that the format version field in archives is now `2` (header byte at offset 0x04)
+- In Section 11 (Version Compatibility), add that v2 introduces entry_type and permissions fields
+
+**2. Section 2 (Notation Conventions):**
+- Update the filenames note: change "Filenames are UTF-8 encoded" to "Entry names are UTF-8 encoded relative paths using `/` as the path separator (e.g., `dir/subdir/file.txt`). Names MUST NOT start with `/` or contain `..` components. For top-level files, the name is just the filename (e.g., `readme.txt`)."
+
+**3. Section 3 (Archive Structure Diagram):**
+- Update the TOC description comment: entries now represent files AND directories
+
+**4. Section 4 (Archive Header):**
+- Change version field description: "Format version. Value `2` for this specification (v1.1). Value `1` for legacy v1.0 (no directory support)."
+- Rename the `file_count` field to `entry_count` and update its description: "Number of entries (files and directories) stored in the archive."
+- Update the `toc_offset` and `toc_size` field descriptions to reference "entry table" where they say "file table"
+
+**5. Section 5 (File Table Entry Definition) -- the core change:**
+
+Rename section title to "Table of Contents (TOC) Entry Definition" for clarity.
+
+Add two new fields to the Entry Field Table AFTER `name` and BEFORE `original_size`:
+
+| Field | Size | Type | Endian | Description |
+|-------|------|------|--------|-------------|
+| `entry_type` | 1 | u8 | - | Entry type: `0x00` = regular file, `0x01` = directory. Directories have `original_size`, `compressed_size`, and `encrypted_size` all set to 0 and no corresponding data block. |
+| `permissions` | 2 | u16 | LE | Unix permission bits (lower 12 bits of POSIX `mode_t`). Bit layout: `[suid(1)][sgid(1)][sticky(1)][owner_rwx(3)][group_rwx(3)][other_rwx(3)]`. Example: `0o755` = `0x01ED` = owner rwx, group r-x, other r-x. Stored as u16 LE. |
+
+Add a subsection "### Entry Type Values" with a table:
+
+| Value | Name | Description |
+|-------|------|-------------|
+| `0x00` | File | Regular file. Has associated data block with ciphertext. All size fields and data_offset are meaningful. |
+| `0x01` | Directory | Directory entry. `original_size`, `compressed_size`, `encrypted_size` are all 0. `data_offset` is 0. `iv` is zero-filled. `hmac` is zero-filled. `sha256` is zero-filled. `compression_flag` is 0. No data block exists for this entry. |
+
+Add a subsection "### Permission Bits Layout" with a table:
+
+| Bits | Mask | Name | Description |
+|------|------|------|-------------|
+| 11 | `0o4000` | setuid | Set user ID on execution |
+| 10 | `0o2000` | setgid | Set group ID on execution |
+| 9 | `0o1000` | sticky | Sticky bit |
+| 8-6 | `0o0700` | owner | Owner read(4)/write(2)/execute(1) |
+| 5-3 | `0o0070` | group | Group read(4)/write(2)/execute(1) |
+| 2-0 | `0o0007` | other | Other read(4)/write(2)/execute(1) |
+
+Common examples: `0o755` (rwxr-xr-x) = `0x01ED`, `0o644` (rw-r--r--) = `0x01A4`, `0o700` (rwx------) = `0x01C0`.
+
+Add a subsection "### Entry Name Semantics" explaining:
+- Names are relative paths from the archive root, using `/` as separator
+- Example: a file at `project/src/main.rs` has name `project/src/main.rs`
+- A directory entry for `project/src/` has name `project/src` (no trailing slash)
+- Names MUST NOT start with `/` (no absolute paths)
+- Names MUST NOT contain `..` components (no directory traversal)
+- The encoder MUST sort entries so that directory entries appear before any files within them (parent-before-child ordering). This allows the decoder to `mkdir -p` or create directories in a single sequential pass.
+
+**6. Update Entry Size Formula:**
+- Old: `entry_size = 101 + name_length bytes`
+- New: `entry_size = 104 + name_length bytes` (added 1 byte entry_type + 2 bytes permissions = +3)
+
+**7. Section 6 (Data Block Layout):**
+- Add note: "Directory entries (entry_type = 0x01) have no data block. The decoder MUST skip directory entries when processing data blocks."
+
+**8. Section 10 (Decode Order of Operations):**
+- In step 3, update version check: "Read version (must be 2 for v1.1)"
+- In step 5, add substep before reading ciphertext: "Check entry_type. If 0x01 (directory): create the directory using the entry name as a relative path, apply permissions, and skip to the next entry (no ciphertext to read)."
+- In step 5f (Write to output), add: "Create parent directories as needed (using the path components of the entry name). Apply permissions from the entry's `permissions` field."
+
+
+  grep -c "entry_type" docs/FORMAT.md | xargs test 5 -le
+
+
+- Section 5 has entry_type (u8) and permissions (u16 LE) fields in the Entry Field Table
+- Entry type values table documents 0x00=file, 0x01=directory
+- Permission bits layout table with POSIX mode_t lower 12 bits
+- Entry name semantics subsection specifies relative paths with `/` separator
+- Entry size formula updated to 104 + name_length
+- Decode order updated for directory handling
+- Version bumped to 2
+
+
+
+
+  Task 2: Write updated worked example with directory archive
+  docs/FORMAT.md
+
+Replace the entire Section 12 (Worked Example) in docs/FORMAT.md with a new worked example that demonstrates the v1.1 directory archive format. Do not retain the v1.0 example: it is no longer valid (version field changed, entry format changed), and keeping it alongside the new one would cause confusion.
+
+**New Worked Example: Directory Archive**
+
+Use the following input structure:
+
+```
+project/
+  project/src/          (directory, mode 0755)
+  project/src/main.rs   (file, mode 0644, content: "fn main() {}\n" = 13 bytes)
+  project/empty/        (empty directory, mode 0755)
+```
+
+This demonstrates:
+- A nested directory (`project/src/`)
+- A file inside a nested directory (`project/src/main.rs`)
+- An empty directory (`project/empty/`)
+- Three entries total: 2 directories + 1 file
+
+**Parameters:**
+- Key: same 32 bytes as v1.0 example (00 01 02 ... 1F)
+- Flags: `0x01` (compression enabled, no obfuscation -- keep example simple)
+- Version: `2`
+
+**Per-entry walkthrough:**
+
+Entry 1: `project/src` (directory)
+- entry_type: 0x01
+- permissions: 0o755 = 0x01ED (LE: ED 01)
+- name: "project/src" (11 bytes)
+- original_size: 0, compressed_size: 0, encrypted_size: 0
+- data_offset: 0, iv: zero-filled, hmac: zero-filled, sha256: zero-filled
+- compression_flag: 0, padding_after: 0
+
+Entry 2: `project/src/main.rs` (file)
+- entry_type: 0x00
+- permissions: 0o644 = 0x01A4 (LE: A4 01)
+- name: "project/src/main.rs" (19 bytes)
+- original_size: 13
+- SHA-256 of "fn main() {}\n": compute the real hash
+- compressed_size: representative (e.g., 30 bytes for small gzip output)
+- encrypted_size: ((30/16)+1)*16 = 32
+- IV: representative (e.g., AA BB CC DD EE FF 00 11 22 33 44 55 66 77 88 99)
+- hmac: representative, sha256: real value
+- compression_flag: 1, padding_after: 0
+
+Entry 3: `project/empty` (directory)
+- entry_type: 0x01
+- permissions: 0o755 = 0x01ED (LE: ED 01)
+- name: "project/empty" (13 bytes)
+- All sizes 0, data_offset 0, iv/hmac/sha256 zero-filled
+- compression_flag: 0, padding_after: 0
+
+**Layout table:**
+Compute all offsets using the new entry size formula (104 + name_length per entry):
+- Header: 40 bytes (0x00 - 0x27)
+- TOC Entry 1: 104 + 11 = 115 bytes
+- TOC Entry 2: 104 + 19 = 123 bytes
+- TOC Entry 3: 104 + 13 = 117 bytes
+- TOC total: 115 + 123 + 117 = 355 bytes
+- Data block 1 (only file entry): starts at 40 + 355 = 395, size = 32 bytes
+- Archive total: 395 + 32 = 427 bytes
+
+**Include:**
+1. Input description table (entries, types, permissions, content)
+2. Parameters (key, flags, version)
+3. Per-entry pipeline walkthrough (SHA-256 for the file, show directory entries have all-zero crypto fields)
+4. Archive layout offset table with CHECK verification
+5. Header hex table (version=2, entry_count=3)
+6. Each TOC entry hex table showing entry_type and permissions fields
+7. Data block hex (only 1 block for the single file)
+8. Complete annotated hex dump
+9. Updated shell decode walkthrough showing directory handling: "if entry_type is 0x01, mkdir -p and chmod, then skip to next entry"
+
+**Style:** Follow exact same conventions as v1.0 worked example -- field tables, offset verification formulas, annotated hex dump format, shell decode walkthrough.
+
+
+  grep -c "project/src/main.rs" docs/FORMAT.md | xargs test 3 -le
+
+
+- Worked example shows 3 entries: 2 directories (project/src, project/empty) and 1 file (project/src/main.rs)
+- Each entry shows entry_type and permissions fields in hex tables
+- Directory entries show all-zero crypto fields (iv, hmac, sha256, sizes)
+- File entry shows full crypto pipeline (SHA-256, gzip, PKCS7, AES-CBC, HMAC)
+- Archive layout table has internally consistent offsets verified by formulas
+- Annotated hex dump covers all bytes
+- Shell decode walkthrough handles directory entries (mkdir -p + chmod)
+
+
+
+
+
+
+After both tasks complete, verify:
+
+1. `grep -c "entry_type" docs/FORMAT.md` returns >= 5 (field table + entry type values + worked example + decode order)
+2. `grep -c "permissions" docs/FORMAT.md` returns >= 5 (field table + permission bits layout + worked example entries)
+3. `grep "entry_size = 104" docs/FORMAT.md` returns the updated formula
+4. `grep "project/src/main.rs" docs/FORMAT.md` returns matches in the worked example
+5. `grep "project/empty" docs/FORMAT.md` returns matches showing the empty directory entry
+6. `grep "version.*2" docs/FORMAT.md` returns the bumped version
+7. No stale v1.0 references (check that entry_size formula no longer says 101)
+
+
+
+1. FORMAT.md Section 5 defines entry_type (1 byte, u8) and permissions (2 bytes, u16 LE) fields in the TOC entry
+2. Entry type values table distinguishes files (0x00) from directories (0x01) with clear rules for zero-filled fields on directories
+3. Permission bits table matches POSIX mode_t lower 12 bits with examples (0o755, 0o644)
+4. Entry names documented as relative paths with `/` separator, no leading `/`, no `..`
+5. Worked example includes nested directory, file, and empty directory with correct offsets
+6. Entry size formula is 104 + name_length (was 101 + name_length)
+7. Version bumped to 2
+8. Decode order of operations updated for directory entry handling
+
+
+
+After completion, create `.planning/phases/07-format-spec-update/07-01-SUMMARY.md`
+
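
Reviewer note: the layout arithmetic in Task 2 (entry sizes, TOC total, data block offset, archive total) and the little-endian permission bytes can be cross-checked with a short script. This is only a sketch under the plan's stated assumptions: a 40-byte header, the `104 + name_length` entry size formula, no padding between blocks, and a representative 30-byte gzip output. The function and constant names below are hypothetical and do not come from the project's code.

```python
# Sketch: recompute the worked-example layout from the plan's stated constants.
HEADER_SIZE = 40        # header occupies 0x00-0x27 per the plan
ENTRY_FIXED_SIZE = 104  # v1.1 formula: entry_size = 104 + name_length
AES_BLOCK = 16          # AES-CBC block size used for PKCS7 padding


def entry_size(name: str) -> int:
    """TOC entry size in bytes for a UTF-8 encoded entry name."""
    return ENTRY_FIXED_SIZE + len(name.encode("utf-8"))


def encrypted_size(compressed_size: int) -> int:
    """PKCS7 always adds 1..16 bytes, so round up to the next full block."""
    return (compressed_size // AES_BLOCK + 1) * AES_BLOCK


def permissions_le(mode: int) -> bytes:
    """Lower 12 bits of the mode, stored as a u16 in little-endian order."""
    return (mode & 0o7777).to_bytes(2, "little")


def layout(entries):
    """entries: list of (name, compressed_size), with None for directories.

    Returns (toc_size, {file_name: data_offset}, archive_total).
    Directories contribute a TOC entry but no data block.
    """
    toc_size = sum(entry_size(name) for name, _ in entries)
    offset = HEADER_SIZE + toc_size
    data_offsets = {}
    for name, compressed in entries:
        if compressed is None:
            continue  # directory entry: nothing to place in the data area
        data_offsets[name] = offset
        offset += encrypted_size(compressed)
    return toc_size, data_offsets, offset


entries = [
    ("project/src", None),
    ("project/src/main.rs", 30),  # representative gzip size from the plan
    ("project/empty", None),
]
toc_size, data_offsets, total = layout(entries)
print(toc_size, data_offsets, total)
print(permissions_le(0o755).hex(" "), permissions_le(0o644).hex(" "))
```

With the plan's numbers this reproduces a 355-byte TOC, a data block at offset 395, and a 427-byte archive, which matches the layout table in Task 2.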