| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | must_haves |
|---|---|---|---|---|---|---|---|---|
| 07-format-spec-update | 01 | execute | 1 | | | true | | |
Purpose: All three decoders (Rust, Kotlin, Shell) need an unambiguous specification to build their v1.1 directory support against. This phase updates the normative format document before any code changes.
Output: Updated docs/FORMAT.md with v1.1 TOC entry fields and a new worked example showing a directory archive.
<execution_context> @/home/nick/.claude/get-shit-done/workflows/execute-plan.md @/home/nick/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @docs/FORMAT.md
Key decisions from STATE.md:
- v1.1: No backward compatibility with v1.0 archives (format version bump to 2)
- v1.1: Only mode bits (no uid/gid, no timestamps, no symlinks)
- v1.0: Filename-only entry names -- v1.1 changes this to relative paths with `/` separator
Existing FORMAT.md patterns (from Phase 1):
- Field table pattern: offset, size, type, endian, field name, description for every binary structure
- Worked example pattern: concrete inputs -> pipeline walkthrough -> hex dump -> shell decode commands
- Entry size formula: `101 + name_length bytes` per entry
- All offsets absolute from archive byte 0
1. Version bump (Section 1 and header):
- Change document version from "1.0" to "1.1" in the front matter
- Note that format version field in archives is now `2` (header byte at offset 0x04)
- In Section 11 (Version Compatibility), add that v2 introduces entry_type and permissions fields
2. Section 2 (Notation Conventions):
- Update the filenames note: change "Filenames are UTF-8 encoded" to "Entry names are UTF-8 encoded relative paths using `/` as the path separator (e.g., `dir/subdir/file.txt`). Names MUST NOT start with `/` or contain `..` components. For top-level files, the name is just the filename (e.g., `readme.txt`)."
3. Section 3 (Archive Structure Diagram):
- Update the TOC description comment: entries now represent files AND directories
4. Section 4 (Archive Header):
- Change version field description: "Format version. Value `2` for this specification (v1.1). Value `1` for legacy v1.0 (no directory support)."
- Rename the `file_count` field to `entry_count` and update its description: "Number of entries (files and directories) stored in the archive."
- Update the toc_offset, toc_size field descriptions to reference "entry table" where they say "file table"
5. Section 5 (File Table Entry Definition) -- the core change:
Rename section title to "Table of Contents (TOC) Entry Definition" for clarity.
Add two new fields to the Entry Field Table AFTER name and BEFORE original_size:
| Field | Size | Type | Endian | Description |
|---|---|---|---|---|
| `entry_type` | 1 | u8 | - | Entry type: 0x00 = regular file, 0x01 = directory. Directories have original_size, compressed_size, and encrypted_size all set to 0 and no corresponding data block. |
| `permissions` | 2 | u16 | LE | Unix permission bits (lower 12 bits of POSIX mode_t). Bit layout: [suid(1)][sgid(1)][sticky(1)][owner_rwx(3)][group_rwx(3)][other_rwx(3)]. Example: 0o755 = 0x01ED = owner rwx, group r-x, other r-x. Stored as u16 LE. |
Add a subsection "### Entry Type Values" with a table:
| Value | Name | Description |
|---|---|---|
| `0x00` | File | Regular file. Has associated data block with ciphertext. All size fields and data_offset are meaningful. |
| `0x01` | Directory | Directory entry. original_size, compressed_size, encrypted_size are all 0. data_offset is 0. iv is zero-filled. hmac is zero-filled. sha256 is zero-filled. compression_flag is 0. No data block exists for this entry. |
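The directory rules in the table above lend themselves to a mechanical check. The following is a minimal sketch, assuming a hypothetical `TocEntry` dataclass as the in-memory form of a parsed entry; the class and the `directory_violations` helper are illustrative names, not part of FORMAT.md.

```python
from dataclasses import dataclass

ENTRY_TYPE_FILE = 0x00
ENTRY_TYPE_DIRECTORY = 0x01


@dataclass
class TocEntry:
    """Hypothetical parsed form of one TOC entry (not the on-disk layout)."""
    entry_type: int
    name: str
    original_size: int = 0
    compressed_size: int = 0
    encrypted_size: int = 0
    data_offset: int = 0
    compression_flag: int = 0
    iv: bytes = b""
    hmac: bytes = b""
    sha256: bytes = b""


def directory_violations(entry):
    """Return the names of fields that break the directory rules."""
    if entry.entry_type != ENTRY_TYPE_DIRECTORY:
        return []  # the zero-field rules apply to directories only
    violations = []
    for field in ("original_size", "compressed_size", "encrypted_size",
                  "data_offset", "compression_flag"):
        if getattr(entry, field) != 0:
            violations.append(field)
    for field in ("iv", "hmac", "sha256"):
        # any() is true if at least one byte is non-zero, whatever the field width.
        if any(getattr(entry, field)):
            violations.append(field)
    return violations
```

A decoder could treat a non-empty result as a malformed archive; whether to reject or merely warn is a policy choice this plan does not specify.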
Add a subsection "### Permission Bits Layout" with a table:
| Bits | Mask | Name | Description |
|---|---|---|---|
| 11 | `0o4000` | setuid | Set user ID on execution |
| 10 | `0o2000` | setgid | Set group ID on execution |
| 9 | `0o1000` | sticky | Sticky bit |
| 8-6 | `0o0700` | owner | Owner read(4)/write(2)/execute(1) |
| 5-3 | `0o0070` | group | Group read(4)/write(2)/execute(1) |
| 2-0 | `0o0007` | other | Other read(4)/write(2)/execute(1) |
Common examples: 0o755 (rwxr-xr-x) = 0x01ED, 0o644 (rw-r--r--) = 0x01A4, 0o700 (rwx------) = 0x01C0.
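As a cross-check of the hex values above, here is a small sketch of packing and unpacking the `permissions` field with Python's `struct` module. The function names are hypothetical, and `rwx_string` deliberately ignores the setuid/setgid/sticky bits.

```python
import struct

PERMISSION_MASK = 0o7777  # lower 12 bits of POSIX mode_t


def encode_permissions(mode):
    """Pack the permission bits as a little-endian u16, dropping file-type bits."""
    return struct.pack("<H", mode & PERMISSION_MASK)


def decode_permissions(raw):
    """Unpack a 2-byte little-endian field back into permission bits."""
    (bits,) = struct.unpack("<H", raw)
    return bits & PERMISSION_MASK


def rwx_string(bits):
    """Render the owner/group/other triplets, e.g. 0o755 -> 'rwxr-xr-x'."""
    out = []
    for shift in (6, 3, 0):
        triplet = (bits >> shift) & 0o7
        out.append("r" if triplet & 4 else "-")
        out.append("w" if triplet & 2 else "-")
        out.append("x" if triplet & 1 else "-")
    return "".join(out)
```

For example, `encode_permissions(0o755)` yields the bytes `ED 01`, matching the little-endian form used in the worked example.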
Add a subsection "### Entry Name Semantics" explaining:
- Names are relative paths from the archive root, using `/` as separator
- Example: a file at `project/src/main.rs` has name `project/src/main.rs`
- A directory entry for `project/src/` has name `project/src` (no trailing slash)
- Names MUST NOT start with `/` (no absolute paths)
- Names MUST NOT contain `..` components (no directory traversal)
- The encoder MUST sort entries so that directory entries appear before any files within them (parent-before-child ordering). This allows the decoder to `mkdir -p` or create directories in a single sequential pass.
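The naming rules above can be sketched as a validator plus an ordering check. This is an illustration under assumptions: the helper names are hypothetical, and sorting by path components is only one ordering that satisfies parent-before-child; the rule itself does not require a full sort.

```python
def validate_entry_name(name):
    """Raise ValueError if the name breaks the MUST NOT rules."""
    if not name:
        raise ValueError("entry name is empty")
    if name.startswith("/"):
        raise ValueError("absolute path: " + name)
    if ".." in name.split("/"):
        raise ValueError("directory traversal: " + name)


def parent_before_child(names):
    """One valid ordering: compare names component by component.

    A parent's component list is a strict prefix of its children's,
    so it always sorts first.
    """
    return sorted(names, key=lambda n: n.split("/"))


def violates_parent_before_child(names):
    """True if any entry appears before one of its own ancestor entries."""
    position = {name: i for i, name in enumerate(names)}
    for i, name in enumerate(names):
        parts = name.split("/")
        for depth in range(1, len(parts)):
            ancestor = "/".join(parts[:depth])
            # Ancestors without their own entry are ignored here.
            if position.get(ancestor, -1) > i:
                return True
    return False
```

Checking `..` per component rather than as a substring keeps legitimate names such as `notes..txt` valid while still rejecting `a/../b`.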
6. Update Entry Size Formula:
- Old: `entry_size = 101 + name_length bytes`
- New: `entry_size = 104 + name_length bytes` (added 1 byte entry_type + 2 bytes permissions = +3)
7. Section 6 (Data Block Layout):
- Add note: "Directory entries (entry_type = 0x01) have no data block. The decoder MUST skip directory entries when processing data blocks."
8. Section 10 (Decode Order of Operations):
- In step 3, update version check: "Read version (must be 2 for v1.1)"
- In step 5, add substep before reading ciphertext: "Check entry_type. If 0x01 (directory): create the directory using the entry name as a relative path, apply permissions, and skip to the next entry (no ciphertext to read)."
- In step 5f (Write to output), add: "Create parent directories as needed (using the path components of the entry name). Apply permissions from the entry's `permissions` field."

`grep -c "entry_type" docs/FORMAT.md | xargs test 5 -le`

- Section 5 has entry_type (u8) and permissions (u16 LE) fields in the Entry Field Table
- Entry type values table documents 0x00=file, 0x01=directory
- Permission bits layout table with POSIX mode_t lower 12 bits
- Entry name semantics subsection specifies relative paths with `/` separator
- Entry size formula updated to 104 + name_length
- Decode order updated for directory handling
- Version bumped to 2
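The directory handling added to step 5 of the decode order can be sketched as follows. This is a simplified illustration, not any of the three decoders: `apply_entry` is a hypothetical helper, the entry name is assumed to be already validated, and `plaintext` stands in for the output of the verify/decrypt/decompress steps.

```python
import os
import tempfile

ENTRY_TYPE_DIRECTORY = 0x01


def apply_entry(root, entry_type, name, permissions, plaintext=None):
    """Materialize one TOC entry under root and return the created path."""
    target = os.path.join(root, *name.split("/"))
    if entry_type == ENTRY_TYPE_DIRECTORY:
        # Directory: create it, apply permissions, no ciphertext to read.
        os.makedirs(target, exist_ok=True)
        os.chmod(target, permissions)
        return target
    # File: create parent directories as needed, then write and chmod.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as fh:
        fh.write(plaintext)
    os.chmod(target, permissions)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        apply_entry(root, 0x01, "project/src", 0o755)
        apply_entry(root, 0x00, "project/src/main.rs", 0o644, b"fn main() {}\n")
        apply_entry(root, 0x01, "project/empty", 0o755)
        print(sorted(os.listdir(os.path.join(root, "project"))))
```

One design point worth considering: applying a restrictive directory mode immediately could block later writes into that directory, so a decoder might choose to defer directory permissions until all entries are written. The plan does not mandate either approach.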
Actually, to avoid confusion, REPLACE the entire worked example with a new v1.1 example. The v1.0 example is no longer valid (version field changed, entry format changed).
New Worked Example: Directory Archive
Use the following input structure:
project/
project/src/ (directory, mode 0755)
project/src/main.rs (file, mode 0644, content: "fn main() {}\n" = 13 bytes)
project/empty/ (empty directory, mode 0755)
This demonstrates:
- A nested directory (`project/src/`)
- A file inside a nested directory (`project/src/main.rs`)
- An empty directory (`project/empty/`)
- Three entries total: 2 directories + 1 file
Parameters:
- Key: same 32 bytes as v1.0 example (00 01 02 ... 1F)
- Flags: `0x01` (compression enabled, no obfuscation -- keep example simple)
- Version: `2`
Per-entry walkthrough:
Entry 1: project/src (directory)
- entry_type: 0x01
- permissions: 0o755 = 0x01ED (LE: ED 01)
- name: "project/src" (11 bytes)
- original_size: 0, compressed_size: 0, encrypted_size: 0
- data_offset: 0, iv: zero-filled, hmac: zero-filled, sha256: zero-filled
- compression_flag: 0, padding_after: 0
Entry 2: project/src/main.rs (file)
- entry_type: 0x00
- permissions: 0o644 = 0x01A4 (LE: A4 01)
- name: "project/src/main.rs" (19 bytes)
- original_size: 13 (byte length of "fn main() {}\n")
- SHA-256 of "fn main() {}\n": compute the real hash
- compressed_size: representative (e.g., 30 bytes for small gzip output)
- encrypted_size: ((30/16)+1)*16 = 32
- IV: representative (e.g., AA BB CC DD EE FF 00 11 22 33 44 55 66 77 88 99)
- hmac: representative, sha256: real value
- compression_flag: 1, padding_after: 0
Entry 3: project/empty (directory)
- entry_type: 0x01
- permissions: 0o755 = 0x01ED (LE: ED 01)
- name: "project/empty" (13 bytes)
- All sizes 0, data_offset 0, iv/hmac/sha256 zero-filled
- compression_flag: 0, padding_after: 0
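The derived values for the file entry can be computed rather than typed by hand. The sketch below assumes PKCS7 padding to a 16-byte AES block, as implied by the `encrypted_size` formula; `compressed_size` is the representative figure from the walkthrough, not a real gzip result.

```python
import hashlib

AES_BLOCK_SIZE = 16


def pkcs7_padded_size(length, block=AES_BLOCK_SIZE):
    """Size after PKCS7 padding, which always adds between 1 and block bytes."""
    return (length // block + 1) * block


content = b"fn main() {}\n"
original_size = len(content)
sha256_hex = hashlib.sha256(content).hexdigest()  # the "real hash" for the TOC entry
compressed_size = 30  # representative value, not measured
encrypted_size = pkcs7_padded_size(compressed_size)

print(original_size, sha256_hex, encrypted_size)
```

Running this while writing the example avoids hand-computed sizes and hashes drifting from the stated content.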
Layout table: Compute all offsets using the new entry size formula (104 + name_length per entry):
- Header: 40 bytes (0x00 - 0x27)
- TOC Entry 1: 104 + 11 = 115 bytes
- TOC Entry 2: 104 + 19 = 123 bytes
- TOC Entry 3: 104 + 13 = 117 bytes
- TOC total: 115 + 123 + 117 = 355 bytes
- Data block 1 (only file entry): starts at 40 + 355 = 395, size = 32 bytes
- Archive total: 395 + 32 = 427 bytes
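The offset arithmetic above can be reproduced with a short script, which may help when filling in the CHECK column of the layout table. It is a sketch under the same assumptions as the list: the TOC immediately follows the 40-byte header, `padding_after` is 0 for every entry, and entries with an encrypted size of 0 have no data block. The function names are hypothetical.

```python
HEADER_SIZE = 40        # bytes 0x00-0x27
ENTRY_FIXED_SIZE = 104  # v1.1: 101 + 1 (entry_type) + 2 (permissions)


def entry_size(name):
    """TOC entry size in bytes; name length is measured in UTF-8 bytes."""
    return ENTRY_FIXED_SIZE + len(name.encode("utf-8"))


def archive_layout(entries):
    """entries: list of (name, encrypted_size); directories use 0."""
    toc_size = sum(entry_size(name) for name, _ in entries)
    offset = HEADER_SIZE + toc_size
    data_offsets = {}
    for name, encrypted_size in entries:
        if encrypted_size:
            data_offsets[name] = offset
            offset += encrypted_size
    return {"toc_size": toc_size, "data_offsets": data_offsets, "total": offset}


example = [("project/src", 0), ("project/src/main.rs", 32), ("project/empty", 0)]
print(archive_layout(example))
```

For the three example entries this reports a TOC size of 355, a data offset of 395 for `project/src/main.rs`, and an archive total of 427 bytes.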
Include:
- Input description table (entries, types, permissions, content)
- Parameters (key, flags, version)
- Per-entry pipeline walkthrough (SHA-256 for the file, show directory entries have all-zero crypto fields)
- Archive layout offset table with CHECK verification
- Header hex table (version=2, entry_count=3)
- Each TOC entry hex table showing entry_type and permissions fields
- Data block hex (only 1 block for the single file)
- Complete annotated hex dump
- Updated shell decode walkthrough showing directory handling: "if entry_type is 0x01, mkdir -p and chmod, then skip to next entry"
Style: Follow exact same conventions as v1.0 worked example -- field tables, offset verification formulas, annotated hex dump format, shell decode walkthrough.

`grep -c "project/src/main.rs" docs/FORMAT.md | xargs test 3 -le`

- Worked example shows 3 entries: 2 directories (project/src, project/empty) and 1 file (project/src/main.rs)
- Each entry shows entry_type and permissions fields in hex tables
- Directory entries show all-zero crypto fields (iv, hmac, sha256, sizes)
- File entry shows full crypto pipeline (SHA-256, gzip, PKCS7, AES-CBC, HMAC)
- Archive layout table has internally consistent offsets verified by formulas
- Annotated hex dump covers all bytes
- Shell decode walkthrough handles directory entries (mkdir -p + chmod)
- `grep -c "entry_type" docs/FORMAT.md` returns >= 5 (field table + entry type values + worked example + decode order)
- `grep -c "permissions" docs/FORMAT.md` returns >= 5 (field table + permission bits layout + worked example entries)
- `grep "entry_size = 104" docs/FORMAT.md` returns the updated formula
- `grep "project/src/main.rs" docs/FORMAT.md` returns matches in the worked example
- `grep "project/empty" docs/FORMAT.md` returns matches showing the empty directory entry
- `grep "version.*2" docs/FORMAT.md` returns the bumped version
- No stale v1.0 references (check that entry_size formula no longer says 101)
<success_criteria>
- FORMAT.md Section 5 defines entry_type (1 byte, u8) and permissions (2 bytes, u16 LE) fields in the TOC entry
- Entry type values table distinguishes files (0x00) from directories (0x01) with clear rules for zero-filled fields on directories
- Permission bits table matches POSIX mode_t lower 12 bits with examples (0o755, 0o644)
- Entry names documented as relative paths with `/` separator, no leading `/`, no `..`
- Worked example includes nested directory, file, and empty directory with correct offsets
- Entry size formula is 104 + name_length (was 101 + name_length)
- Version bumped to 2
- Decode order of operations updated for directory entry handling </success_criteria>