--- phase: 04-kotlin-decoder plan: 01 type: execute wave: 1 depends_on: [] files_modified: - kotlin/ArchiveDecoder.kt - kotlin/test_decoder.sh autonomous: true requirements: - KOT-01 - KOT-02 - KOT-03 - KOT-04 must_haves: truths: - "Kotlin decoder parses 40-byte archive header with magic byte verification, version check, and flags validation" - "Kotlin decoder parses variable-length TOC entries sequentially with correct little-endian integer decoding" - "Kotlin decoder verifies HMAC-SHA256 BEFORE attempting decryption for each file" - "Kotlin decoder decrypts AES-256-CBC ciphertext using javax.crypto with PKCS5Padding (auto-removes PKCS7)" - "Kotlin decoder decompresses gzip data only when compression_flag == 1" - "Kotlin decoder verifies SHA-256 checksum after decompression" - "Kotlin decoder uses the exact same 32-byte hardcoded key as src/key.rs" - "Cross-validation script creates archives with Rust CLI and verifies Kotlin decoder produces byte-identical output" artifacts: - path: "kotlin/ArchiveDecoder.kt" provides: "Complete archive decoder: header parsing, TOC parsing, HMAC verify, AES decrypt, gzip decompress, SHA-256 verify, CLI main()" min_lines: 200 - path: "kotlin/test_decoder.sh" provides: "Cross-validation script: pack with Rust, decode with Kotlin, SHA-256 comparison" min_lines: 50 key_links: - from: "kotlin/ArchiveDecoder.kt" to: "src/key.rs" via: "Identical 32-byte key constant" pattern: "0x7A.*0x35.*0xC1.*0xD9" - from: "kotlin/ArchiveDecoder.kt" to: "docs/FORMAT.md Section 4-5" via: "Binary format parsing matching exact field offsets and sizes" pattern: "MAGIC.*0x00.*0xEA.*0x72.*0x63" - from: "kotlin/test_decoder.sh" to: "kotlin/ArchiveDecoder.kt" via: "Compiles and runs decoder JAR" pattern: "kotlinc.*ArchiveDecoder" --- Implement the complete Kotlin archive decoder and cross-validation test script. Purpose: Provide the Android 13 extraction path (primary decoder on target device) using only javax.crypto and java.util.zip -- no native libraries or third-party dependencies. Also create a cross-validation script that proves byte-identical output against the Rust archiver. Output: `kotlin/ArchiveDecoder.kt` (standalone decoder with CLI main), `kotlin/test_decoder.sh` (cross-validation script) @/home/nick/.claude/get-shit-done/workflows/execute-plan.md @/home/nick/.claude/get-shit-done/templates/summary.md @.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/04-kotlin-decoder/04-RESEARCH.md # Critical references for binary format parsing @docs/FORMAT.md @src/key.rs @src/format.rs Task 1: Implement ArchiveDecoder.kt with full decode pipeline kotlin/ArchiveDecoder.kt Create `kotlin/ArchiveDecoder.kt` as a single self-contained Kotlin file with a `main()` function for CLI usage. **IMPORTANT: Before writing ANY code, the executor MUST use Context7 to verify current Kotlin/JVM API usage:** - `mcp__context7__resolve-library-id` for "kotlin" to confirm byte array handling, toByte() semantics - Do NOT rely on training data -- verify `ByteBuffer`, `Cipher`, `Mac`, `MessageDigest` API signatures **Structure (single file, top to bottom):** 1. **Imports** -- Only Android SDK / JVM standard library: - `java.io.ByteArrayInputStream`, `java.io.File`, `java.io.RandomAccessFile` - `java.nio.ByteBuffer`, `java.nio.ByteOrder` - `java.security.MessageDigest` - `java.util.zip.GZIPInputStream` - `javax.crypto.Cipher`, `javax.crypto.Mac` - `javax.crypto.spec.IvParameterSpec`, `javax.crypto.spec.SecretKeySpec` 2. **Constants:** - `MAGIC` = `byteArrayOf(0x00, 0xEA.toByte(), 0x72, 0x63)` -- match FORMAT.md Section 4 - `HEADER_SIZE` = 40 - `KEY` = exact 32-byte array from `src/key.rs`: ``` 0x7A, 0x35, 0xC1.toByte(), 0xD9.toByte(), 0x4F, 0xE8.toByte(), 0x2B, 0x6A, 0x91.toByte(), 0x0D, 0xF3.toByte(), 0x58, 0xBC.toByte(), 0x74, 0xA6.toByte(), 0x1E, 0x42, 0x8F.toByte(), 0xD0.toByte(), 0x63, 0xE5.toByte(), 0x17, 0x9B.toByte(), 0x2C, 0xFA.toByte(), 0x84.toByte(), 0x06, 0xCD.toByte(), 0x3E, 0x79, 0xB5.toByte(), 0x50, ``` - Note: ALL byte values > 0x7F need `.toByte()` cast (Kotlin signed bytes) 3. **Data classes:** - `ArchiveHeader` (version: Int, flags: Int, fileCount: Int, tocOffset: Long, tocSize: Long, tocIv: ByteArray) - `TocEntry` (name: String, originalSize: Long, compressedSize: Long, encryptedSize: Int, dataOffset: Long, iv: ByteArray, hmac: ByteArray, sha256: ByteArray, compressionFlag: Int, paddingAfter: Int) 4. **Helper functions for LE integer reading (using ByteBuffer):** - `readLeU16(data: ByteArray, offset: Int): Int` -- `ByteBuffer.wrap(data, offset, 2).order(LITTLE_ENDIAN).short.toInt() and 0xFFFF` - `readLeU32(data: ByteArray, offset: Int): Long` -- `ByteBuffer.wrap(data, offset, 4).order(LITTLE_ENDIAN).int.toLong() and 0xFFFFFFFFL` 5. **parseHeader(data: ByteArray): ArchiveHeader** -- FORMAT.md Section 4: - Verify magic bytes (data[0..3] vs MAGIC) - Read version at offset 4, require == 1 - Read flags at offset 5, reject if bits 4-7 are non-zero - Read file_count (u16 LE at offset 6) - Read toc_offset (u32 LE at offset 8) - Read toc_size (u32 LE at offset 12) - Read toc_iv (16 bytes at offset 16) - Use `require()` for all validation failures 6. **parseTocEntry(data: ByteArray, offset: Int): Pair** -- FORMAT.md Section 5: - Read name_length (u16 LE) - Read name (UTF-8 bytes, `String(data, pos, nameLength, Charsets.UTF_8)`) - Read fixed fields sequentially: original_size(u32), compressed_size(u32), encrypted_size(u32), data_offset(u32) - Read iv (16 bytes via copyOfRange) - Read hmac (32 bytes via copyOfRange) - Read sha256 (32 bytes via copyOfRange) - Read compression_flag (u8) - Read padding_after (u16 LE) - Return Pair(entry, newOffset) - Entry size formula: 101 + name_length bytes 7. **parseToc(data: ByteArray, fileCount: Int): List** -- parse all entries sequentially, assert cursor == data.size at end 8. **Crypto utility functions:** - `verifyHmac(iv: ByteArray, ciphertext: ByteArray, key: ByteArray, expectedHmac: ByteArray): Boolean` - `Mac.getInstance("HmacSHA256")`, init with `SecretKeySpec(key, "HmacSHA256")` - `mac.update(iv)`, `mac.update(ciphertext)`, `mac.doFinal()` - Compare with `contentEquals()` (NOT `==`) - `decryptAesCbc(ciphertext: ByteArray, iv: ByteArray, key: ByteArray): ByteArray` - `Cipher.getInstance("AES/CBC/PKCS5Padding")` -- NOT PKCS7Padding, NOT specify provider - `cipher.init(Cipher.DECRYPT_MODE, SecretKeySpec(key, "AES"), IvParameterSpec(iv))` - `cipher.doFinal(ciphertext)` -- auto-removes PKCS7 padding - `decompressGzip(compressed: ByteArray): ByteArray` - `GZIPInputStream(ByteArrayInputStream(compressed)).readBytes()` - `verifySha256(data: ByteArray, expectedSha256: ByteArray): Boolean` - `MessageDigest.getInstance("SHA-256").digest(data)` - Compare with `contentEquals()` 9. **decode(archivePath: String, outputDir: String)** -- main decode orchestration: - Open archive with `RandomAccessFile(archivePath, "r")` - Read 40-byte header, call parseHeader() - Seek to tocOffset, read tocSize bytes, call parseToc() - For each TocEntry: a. Seek to dataOffset, read encryptedSize bytes (ciphertext) b. **HMAC FIRST**: call verifyHmac(). If fails: print "HMAC failed for {name}, skipping" to stderr, continue c. Decrypt: call decryptAesCbc() d. Decompress: if compressionFlag == 1, call decompressGzip(); else use decrypted directly e. Verify SHA-256: call verifySha256(). If fails: print "WARNING: SHA-256 mismatch for {name}" to stderr (but still write) f. Write output: `File(outputDir, entry.name).writeBytes(original)` g. Print "Extracted: {name} ({originalSize} bytes)" to stdout - Close RandomAccessFile - Error handling: HMAC failure skips file (continue), SHA-256 mismatch warns but writes (matching Rust behavior from Phase 2 decisions) 10. **main(args: Array)** -- CLI entry: - Usage: `java -jar ArchiveDecoder.jar ` - Validate args.size == 2 - Create output directory if not exists: `File(args[1]).mkdirs()` - Call decode(args[0], args[1]) - Print summary: "Done: {N} files extracted" **Critical anti-patterns to avoid (from RESEARCH.md):** - Do NOT use BouncyCastle provider ("BC") -- deprecated on Android - Do NOT use `==` for ByteArray comparison -- use `contentEquals()` - Do NOT forget `.toByte()` for hex literals > 0x7F - Do NOT manually remove PKCS7 padding -- `cipher.doFinal()` handles it - Do NOT decrypt before HMAC verification -- verify HMAC FIRST - Do NOT try to decompress when `compressionFlag == 0` grep -c "contentEquals" kotlin/ArchiveDecoder.kt | xargs test 4 -le && grep -c "PKCS5Padding" kotlin/ArchiveDecoder.kt | xargs test 1 -le && grep -c "HmacSHA256" kotlin/ArchiveDecoder.kt | xargs test 1 -le && grep "0x7A" kotlin/ArchiveDecoder.kt | grep -q "0x50" && echo "PASS: Key structural checks" || echo "FAIL" Verify ArchiveDecoder.kt contains: (1) magic byte check, (2) HMAC verification before decryption, (3) correct key bytes matching src/key.rs, (4) PKCS5Padding not PKCS7Padding, (5) contentEquals for all byte comparisons kotlin/ArchiveDecoder.kt exists with 200+ lines, contains complete decode pipeline: header parsing, TOC parsing, HMAC-then-decrypt flow, gzip decompression, SHA-256 verification, CLI main(). Uses only javax.crypto + java.util.zip + java.nio -- zero third-party dependencies. Task 2: Create cross-validation test script kotlin/test_decoder.sh Create `kotlin/test_decoder.sh` -- a shell script that: 1. **Checks prerequisites:** - Verify `kotlinc` is available (or print install instructions and exit) - Verify `java` is available - Verify `cargo` is available - Build Rust archiver if needed: `cargo build --release -q` 2. **Compile Kotlin decoder:** - `kotlinc kotlin/ArchiveDecoder.kt -include-runtime -d kotlin/ArchiveDecoder.jar 2>&1` - Check exit code 3. **Test case 1: Single text file** - Create temp dir - Create `hello.txt` with content "Hello, World!" - Pack: `./target/release/encrypted_archive pack hello.txt -o test1.archive` - Decode: `java -jar kotlin/ArchiveDecoder.jar test1.archive output1/` - Verify: `sha256sum` of extracted file matches original - Print PASS/FAIL 4. **Test case 2: Multiple files with mixed content** - Create `text.txt` with multiline UTF-8 text (including Cyrillic characters) - Create `binary.bin` with `dd if=/dev/urandom bs=1024 count=10` - Pack both files: `./target/release/encrypted_archive pack text.txt binary.bin -o test2.archive` - Decode: `java -jar kotlin/ArchiveDecoder.jar test2.archive output2/` - Verify SHA-256 of each extracted file matches original - Print PASS/FAIL for each file 5. **Test case 3: Already-compressed file (APK-like, --no-compress)** - Create `fake.apk` with random bytes - Pack with no-compress: `./target/release/encrypted_archive pack --no-compress fake.apk -o test3.archive` - Decode: `java -jar kotlin/ArchiveDecoder.jar test3.archive output3/` - Verify byte-identical - Print PASS/FAIL 6. **Test case 4: Empty file** - Create empty `empty.txt` (touch) - Pack: `./target/release/encrypted_archive pack empty.txt -o test4.archive` - Decode: `java -jar kotlin/ArchiveDecoder.jar test4.archive output4/` - Verify empty output file - Print PASS/FAIL 7. **Summary:** - Print total PASS/FAIL count - Clean up temp files - Exit with 0 if all pass, 1 if any fail **Script should:** - Use `set -euo pipefail` at the top - Use absolute paths relative to `$SCRIPT_DIR` (dirname of script) - Create all temp files in a temp directory cleaned up on exit (trap) - Use colored output (green PASS, red FAIL) if terminal supports it - Be executable (chmod +x header with #!/usr/bin/env bash) **Note:** The script requires `kotlinc` and `java` to be installed. It should detect their absence and print helpful installation instructions (e.g., `sdk install kotlin` or `apt install default-jdk`) rather than failing cryptically. bash -n kotlin/test_decoder.sh && test -x kotlin/test_decoder.sh && grep -q "sha256sum" kotlin/test_decoder.sh && grep -q "kotlinc" kotlin/test_decoder.sh && echo "PASS: Script is valid bash and contains key commands" || echo "FAIL" Review test_decoder.sh covers: text file, binary file, Cyrillic names, no-compress mode, empty file. Check it verifies SHA-256 of each extracted file against original. kotlin/test_decoder.sh exists, is executable, contains 5+ test cases covering text/binary/Cyrillic/no-compress/empty files, creates archives with Rust CLI, decodes with Kotlin JAR, and verifies byte-identical output via SHA-256 comparison. 1. `kotlin/ArchiveDecoder.kt` exists with complete decode pipeline (header + TOC + HMAC + AES + gzip + SHA-256) 2. `kotlin/ArchiveDecoder.kt` uses only javax.crypto, java.util.zip, java.nio, java.security, java.io (no third-party) 3. `kotlin/ArchiveDecoder.kt` hardcoded key matches `src/key.rs` exactly 4. `kotlin/ArchiveDecoder.kt` verifies HMAC before decryption (grep confirms `verifyHmac` call precedes `decryptAesCbc` call) 5. `kotlin/test_decoder.sh` is valid bash (bash -n passes), executable, covers 4+ test scenarios 6. If kotlinc/java are available: `bash kotlin/test_decoder.sh` passes all tests - ArchiveDecoder.kt is a complete, standalone Kotlin decoder that can extract files from archives produced by the Rust archiver - Decoder uses only Android SDK built-in libraries (KOT-01, KOT-02) - HMAC verification happens before decryption for every file (KOT-03) - SHA-256 checksum verification happens after decompression (KOT-04) - Cross-validation test script is ready to run when kotlinc/java are installed After completion, create `.planning/phases/04-kotlin-decoder/04-01-SUMMARY.md`