android-encrypted-archiver/.planning/phases/06-obfuscation-hardening/06-02-PLAN.md at 0cd76d7a32a068c3272ee5f5fcbba3fcdace51ef

NikitolProject/android-encrypted-archiver

NikitolProject 0cd76d7a32 docs(06-obfuscation-hardening): create phase plan

2026-02-25 02:12:16 +03:00

phase: 06-obfuscation-hardening
plan:
type: execute
wave:
depends_on: 06-01
files_modified: kotlin/ArchiveDecoder.kt, shell/decode.sh, kotlin/test_decoder.sh, shell/test_decoder.sh
autonomous: true
requirements: FMT-06, FMT-07, FMT-08
must_haves: truths, artifacts, key_links

Kotlin decoder extracts files from obfuscated archives (XOR header + encrypted TOC + decoy padding) producing byte-identical output

Shell decoder extracts files from obfuscated archives producing byte-identical output

All 6 Kotlin cross-validation tests pass (Rust pack with obfuscation -> Kotlin decode -> SHA-256 match)

All 6 Shell cross-validation tests pass (Rust pack with obfuscation -> Shell decode -> SHA-256 match)

Both decoders handle XOR bootstrapping (check magic, if mismatch XOR 40 bytes and re-check)

Both decoders decrypt encrypted TOC before parsing entries when flags bit 1 is set

| path | provides | contains |
|---|---|---|
| kotlin/ArchiveDecoder.kt | XOR_KEY constant, xorHeader() function, TOC decryption, updated decode() with obfuscation support | XOR_KEY |
| shell/decode.sh | XOR de-obfuscation loop, TOC decryption via openssl, updated TOC parsing from decrypted temp file | XOR_KEY_HEX |
| kotlin/test_decoder.sh | Cross-validation tests using obfuscated archives | |
| shell/test_decoder.sh | Cross-validation tests using obfuscated archives | |

| from | to | via | pattern |
|---|---|---|---|
| kotlin/ArchiveDecoder.kt decode() | xorHeader() | XOR bootstrapping on header bytes before parseHeader | xorHeader |
| kotlin/ArchiveDecoder.kt decode() | decryptAesCbc() | Encrypted TOC bytes decrypted with toc_iv before parseToc | decryptAesCbc.*toc |
| shell/decode.sh | openssl enc -d | Encrypted TOC extracted to temp file, decrypted, then parsed from decrypted file | openssl enc.*toc |

Update Kotlin and Shell decoders to handle obfuscated archives (XOR header + encrypted TOC + decoy padding) and verify all three decoders produce byte-identical output via cross-validation tests.

Purpose: Complete the obfuscation hardening by ensuring all decoder implementations correctly handle the new format. This is the final piece -- the Rust archiver (Plan 01) produces obfuscated archives, and now all decoders must read them.

Output: Updated ArchiveDecoder.kt and decode.sh with obfuscation support. All cross-validation tests pass.

<execution_context> @/home/nick/.claude/get-shit-done/workflows/execute-plan.md @/home/nick/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/06-obfuscation-hardening/06-RESEARCH.md
@.planning/phases/06-obfuscation-hardening/06-01-SUMMARY.md
@docs/FORMAT.md (Sections 9.1-9.3 and Section 10)
@kotlin/ArchiveDecoder.kt
@shell/decode.sh
@kotlin/test_decoder.sh
@shell/test_decoder.sh

Task 1: Update Kotlin decoder with XOR header + encrypted TOC support

Files: kotlin/ArchiveDecoder.kt, kotlin/test_decoder.sh

Update ArchiveDecoder.kt to handle obfuscated archives. Follow the decoder order from FORMAT.md Section 10 and 06-RESEARCH.md patterns.

Add XOR_KEY constant and xorHeader() function:

val XOR_KEY = byteArrayOf(
    0xA5.toByte(), 0x3C, 0x96.toByte(), 0x0F,
    0xE1.toByte(), 0x7B, 0x4D, 0xC8.toByte()
)

fun xorHeader(buf: ByteArray) {
    for (i in 0 until minOf(buf.size, 40)) {
        buf[i] = ((buf[i].toInt() and 0xFF) xor (XOR_KEY[i % 8].toInt() and 0xFF)).toByte()
    }
}

Note: MUST use and 0xFF on BOTH operands to avoid Kotlin signed byte issues (06-RESEARCH.md Pitfall 4).
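As a quick cross-check of the key and the loop above, the same cyclic XOR can be sketched in POSIX shell. The function name `xor_hex` is hypothetical; the key and the magic value `00ea7263` are the ones used elsewhere in this plan. Because XOR is its own inverse, one routine serves both to obfuscate and to de-obfuscate.

```shell
#!/bin/sh
# Sketch only: xor_hex is a hypothetical helper, not part of decode.sh.
XOR_KEY_HEX="a53c960fe17b4dc8"   # 8-byte key from this plan

# xor_hex HEX: XOR each byte of a hex string with the cyclic 8-byte key.
xor_hex() {
    in_hex=$1
    out_hex=""
    idx=0
    n=$(( ${#in_hex} / 2 ))
    while [ "$idx" -lt "$n" ]; do
        pos=$(( idx * 2 + 1 ))
        kpos=$(( (idx % 8) * 2 + 1 ))
        b=$(printf '%s' "$in_hex" | cut -c"$pos"-"$((pos + 1))")
        k=$(printf '%s' "$XOR_KEY_HEX" | cut -c"$kpos"-"$((kpos + 1))")
        out_hex="${out_hex}$(printf '%02x' "$(( 0x$b ^ 0x$k ))")"
        idx=$(( idx + 1 ))
    done
    printf '%s' "$out_hex"
}

obfuscated=$(xor_hex "00ea7263")      # how the magic looks after obfuscation
restored=$(xor_hex "$obfuscated")     # a second pass undoes the first
printf 'obfuscated=%s restored=%s\n' "$obfuscated" "$restored"
```

With this key, the magic `00ea7263` maps to `a5d6e46c`, so an obfuscated archive should begin with the bytes a5 d6 e4 6c. That gives a simple test vector for both decoders.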

Update decode() function:

XOR bootstrapping (after reading 40-byte headerBytes):
- Check if first 4 bytes match MAGIC.
- If NO match: call xorHeader(headerBytes).
- Then call parseHeader(headerBytes) (which validates magic).

TOC decryption (before parsing TOC entries):
- After parsing header, check (header.flags and 0x02) != 0 (bit 1 = TOC encrypted).
- If set: seek to header.tocOffset, read header.tocSize.toInt() bytes, decrypt with decryptAesCbc(encryptedToc, header.tocIv, KEY).
- Parse TOC from decrypted bytes: parseToc(decryptedToc, header.fileCount).
- If NOT set (backward compat): read raw TOC bytes as before and parse directly.

parseToc() adjustment for encrypted TOC:
- Currently parseToc() asserts pos == data.size. After decryption the PKCS7 padding bytes are stripped, so the buffer size should match the sum of the entry sizes. Keep the assertion: it validates that the decrypted plaintext is correct.

Decoy padding requires NO decoder changes: decoders already use the absolute data_offset from TOC entries to seek to each file's ciphertext, so padding is naturally skipped.

Re-run cross-validation tests (kotlin/test_decoder.sh). The test script already:

- Builds the Rust archiver (cargo build --release)
- Creates test files, packs with Rust, decodes with Kotlin, compares SHA-256

Now the Rust archiver produces obfuscated archives, so the Kotlin decoder must handle them.

No changes needed to test_decoder.sh unless the test script has hardcoded assumptions about archive format. Read it first and verify.

Verify: cd /home/nick/Projects/Rust/encrypted_archive && bash kotlin/test_decoder.sh 2>&1 | tail -10

Check that kotlin/ArchiveDecoder.kt contains the xorHeader function and TOC decryption logic.

Done: Kotlin decoder handles XOR-obfuscated headers, encrypted TOC, and archives with decoy padding. All 6 cross-validation tests pass (Rust pack -> Kotlin decode -> SHA-256 match).

Task 2: Update Shell decoder with XOR header + encrypted TOC support

Files: shell/decode.sh, shell/test_decoder.sh

Update decode.sh to handle obfuscated archives. This is the most complex change because shell cannot XOR binary data directly (the header has to be processed as hex text, byte by byte) and TOC parsing must switch from reading the archive file to reading a decrypted temp file.

1. Add XOR de-obfuscation (after reading magic, before parsing header fields):

Replace the current magic check block (lines ~108-113) with XOR bootstrapping:

XOR_KEY_HEX="a53c960fe17b4dc8"

# Read 40-byte header as hex string (80 hex chars)
raw_header_hex=$(read_hex "$ARCHIVE" 0 40)
magic_hex=$(printf '%.8s' "$raw_header_hex")

if [ "$magic_hex" != "00ea7263" ]; then
    # Attempt XOR de-obfuscation
    header_hex=""
    byte_idx=0
    while [ "$byte_idx" -lt 40 ]; do
        hex_pos=$((byte_idx * 2))
        # Extract this byte from raw header (2 hex chars)
        raw_byte=$(printf '%s' "$raw_header_hex" | cut -c$((hex_pos + 1))-$((hex_pos + 2)))
        # Extract key byte (cyclic)
        key_pos=$(( (byte_idx % 8) * 2 ))
        key_byte=$(printf '%s' "$XOR_KEY_HEX" | cut -c$((key_pos + 1))-$((key_pos + 2)))
        # XOR
        xored=$(printf '%02x' "$(( 0x$raw_byte ^ 0x$key_byte ))")
        header_hex="${header_hex}${xored}"
        byte_idx=$((byte_idx + 1))
    done

    # Verify magic after XOR
    magic_hex=$(printf '%.8s' "$header_hex")
    if [ "$magic_hex" != "00ea7263" ]; then
        printf 'Invalid archive: bad magic bytes\n' >&2
        exit 1
    fi
else
    header_hex="$raw_header_hex"
fi

# Write de-XORed header to temp file for field parsing
printf '%s' "$header_hex" | xxd -r -p > "$TMPDIR/header.bin"

The existing code already checks for xxd (HAS_XXD) and falls back to od for reading, but od can only dump bytes as hex; it cannot turn hex back into binary. So when xxd -r -p is not available (HAS_XXD=0), write the binary header with printf octal escapes (a python one-liner is another option):

# Fallback: write binary from hex using printf with octal
i=0
: > "$TMPDIR/header.bin"
while [ $i -lt 80 ]; do
    byte_hex=$(printf '%s' "$header_hex" | cut -c$((i + 1))-$((i + 2)))
    printf "\\$(printf '%03o' "0x$byte_hex")" >> "$TMPDIR/header.bin"
    i=$((i + 2))
done
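To gain confidence in the printf fallback, the conversion can be wrapped in a function and checked by reading the bytes back with od. This is only a sketch: `hex_to_bin` is a hypothetical helper name, and the sample bytes are illustrative, chosen to include awkward values (NUL, `%`, backslash, 0xff).

```shell
#!/bin/sh
# Sketch only: hex_to_bin is a hypothetical helper name.
# Writes the bytes described by a hex string to a file without xxd.
hex_to_bin() {
    hex=$1
    out=$2
    : > "$out"
    i=1
    while [ "$i" -lt "${#hex}" ]; do
        byte_hex=$(printf '%s' "$hex" | cut -c"$i"-"$((i + 1))")
        # An octal escape in the format string emits exactly one raw byte.
        printf "\\$(printf '%03o' "0x$byte_hex")" >> "$out"
        i=$(( i + 2 ))
    done
}

tmp=$(mktemp)
hex_to_bin "00ea7263255cff" "$tmp"
# Read the file back as hex; od is POSIX, so this works without xxd too.
readback=$(od -An -v -tx1 "$tmp" | tr -d ' \n')
printf 'readback=%s\n' "$readback"
rm -f "$tmp"
```

If the read-back string equals the input hex, the fallback is byte-exact for those values; running the same check over all 256 byte values would be a more thorough test.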

2. Parse header fields from temp file instead of archive:

Change all header field reads to use $TMPDIR/header.bin:

version_hex=$(read_hex "$TMPDIR/header.bin" 4 1)
version=$(printf '%d' "0x${version_hex}")
flags_hex=$(read_hex "$TMPDIR/header.bin" 5 1)
flags=$(printf '%d' "0x${flags_hex}")
file_count=$(read_le_u16 "$TMPDIR/header.bin" 6)
toc_offset=$(read_le_u32 "$TMPDIR/header.bin" 8)
toc_size=$(read_le_u32 "$TMPDIR/header.bin" 12)
toc_iv_hex=$(read_hex "$TMPDIR/header.bin" 16 16)
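The same offsets can also be exercised directly on a hex string, which is a convenient way to unit-test the field layout without an archive or temp file. This is a sketch, not the plan's method: `hex_slice` and `le_to_int` are hypothetical helpers, and the sample header is made up (version 1, flags 0x02, 3 files, TOC at offset 40, TOC size 160, IV bytes 00..0f). The last 8 header bytes are not described in this plan and are zero-filled here.

```shell
#!/bin/sh
# Sketch only: helper names and the sample header below are illustrative.

# hex_slice HEX OFFSET LEN: hex chars for LEN bytes starting at byte OFFSET.
hex_slice() {
    start=$(( $2 * 2 + 1 ))
    end=$(( ($2 + $3) * 2 ))
    printf '%s' "$1" | cut -c"$start"-"$end"
}

# le_to_int HEX: interpret a little-endian hex string as an unsigned integer.
le_to_int() {
    rest=$1
    be=""
    while [ -n "$rest" ]; do
        pair=$(printf '%.2s' "$rest")
        rest=${rest#??}
        be="${pair}${be}"     # prepend each byte to reverse the order
    done
    printf '%d' "0x${be}"
}

# Made-up 40-byte header: magic, version, flags, file_count, toc_offset, toc_size
header_hex="00ea7263""01""02""0300""28000000""a0000000"
# ...followed by the 16-byte TOC IV and 8 zero bytes.
header_hex="${header_hex}000102030405060708090a0b0c0d0e0f0000000000000000"

version=$(le_to_int "$(hex_slice "$header_hex" 4 1)")
flags=$(le_to_int "$(hex_slice "$header_hex" 5 1)")
file_count=$(le_to_int "$(hex_slice "$header_hex" 6 2)")
toc_offset=$(le_to_int "$(hex_slice "$header_hex" 8 4)")
toc_size=$(le_to_int "$(hex_slice "$header_hex" 12 4)")
toc_iv_hex=$(hex_slice "$header_hex" 16 16)
printf 'v=%s flags=%s files=%s toc=%s+%s iv=%s\n' \
    "$version" "$flags" "$file_count" "$toc_offset" "$toc_size" "$toc_iv_hex"
```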

3. TOC decryption (when flags bit 1 is set):

After reading header fields, check TOC encryption flag:

toc_encrypted=$(( flags & 2 ))

if [ "$toc_encrypted" -ne 0 ]; then
    # Extract encrypted TOC to temp file
    dd if="$ARCHIVE" bs=1 skip="$toc_offset" count="$toc_size" of="$TMPDIR/toc_enc.bin" 2>/dev/null

    # Decrypt TOC
    openssl enc -d -aes-256-cbc -nosalt \
        -K "$KEY_HEX" -iv "$toc_iv_hex" \
        -in "$TMPDIR/toc_enc.bin" -out "$TMPDIR/toc_dec.bin"

    TOC_FILE="$TMPDIR/toc_dec.bin"
    TOC_BASE_OFFSET=0
else
    TOC_FILE="$ARCHIVE"
    TOC_BASE_OFFSET=$toc_offset
fi
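AES-CBC ciphertext is always a whole number of 16-byte blocks, and PKCS7 padding always adds at least one byte, so an encrypted TOC can never be empty or have a size that is not a multiple of 16. A cheap check on `toc_size` before calling openssl can therefore turn a corrupted header into a clear error message. The check below is an optional sketch that this plan does not require; `check_toc_size` is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: optional pre-check, not part of the planned decode.sh changes.
# Returns 0 if SIZE is a plausible AES-CBC ciphertext length, 1 otherwise.
check_toc_size() {
    size=$1
    case $size in
        ''|*[!0-9]*) return 1 ;;      # not a non-negative decimal integer
    esac
    [ "$size" -ge 16 ] || return 1    # at least one full block
    [ $(( size % 16 )) -eq 0 ]        # whole number of 16-byte blocks
}

for candidate in 160 0 150; do
    if check_toc_size "$candidate"; then
        printf '%s: ok\n' "$candidate"
    else
        printf '%s: not a valid encrypted TOC size\n' "$candidate"
    fi
done
```

In decode.sh such a check would sit inside the `toc_encrypted` branch, before the dd extraction.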

4. Update TOC parsing loop to use TOC_FILE and TOC_BASE_OFFSET:

Change pos=$toc_offset to pos=$TOC_BASE_OFFSET.

Change ALL references to "$ARCHIVE" in the TOC field reads to "$TOC_FILE":

read_le_u16 "$TOC_FILE" "$pos" instead of read_le_u16 "$ARCHIVE" "$pos"
dd if="$TOC_FILE" ... for filename read
read_le_u32 "$TOC_FILE" "$pos" for all u32 fields
read_hex "$TOC_FILE" "$pos" N for IV, HMAC, SHA-256, compression_flag

This is the biggest refactor (06-RESEARCH.md Pitfall 1). Every field read in the TOC loop (lines ~141-183) must change from $ARCHIVE to $TOC_FILE.

IMPORTANT HMAC exception: The HMAC verification currently reads the IV bytes from $ARCHIVE at $iv_toc_pos (the absolute archive position of the IV inside the TOC). Once the TOC is encrypted, the bytes at that position are ciphertext; the real IV is only available from the decrypted TOC entry. The HMAC input is still IV || ciphertext from the archive data block. So for HMAC computation:

IV comes from the TOC entry (already parsed as $iv_hex).
Ciphertext comes from $ARCHIVE at $data_offset.
The HMAC input must be constructed from the parsed iv_hex and the raw ciphertext from the archive.

Change the HMAC verification to construct IV from the parsed hex variable instead of reading from the archive at the TOC position:

computed_hmac=$( {
    printf '%s' "$iv_hex" | xxd -r -p
    cat "$TMPDIR/ct.bin"
} | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${KEY_HEX}" -hex 2>/dev/null | awk '{print $NF}' )

If xxd is unavailable, replace xxd -r -p here with the same printf octal-escape fallback used for the header.
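The HMAC construction described above can be exercised in isolation with a sketch like the following. It assumes openssl is on PATH; `hex_to_bin_stdout` and `hmac_iv_ct` are hypothetical helper names, and the printf-based conversion stands in for `xxd -r -p` so the sketch runs without xxd. The demo key, IV, and ciphertext are placeholders, not values from the archive format.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; requires openssl on PATH.

# Emit the raw bytes described by a hex string on stdout (no xxd needed).
hex_to_bin_stdout() {
    h=$1
    while [ -n "$h" ]; do
        printf "\\$(printf '%03o' "0x$(printf '%.2s' "$h")")"
        h=${h#??}
    done
}

# hmac_iv_ct KEY_HEX IV_HEX CT_FILE: HMAC-SHA-256 over IV || ciphertext.
hmac_iv_ct() {
    {
        hex_to_bin_stdout "$2"    # IV taken from the parsed TOC entry
        cat "$3"                  # ciphertext taken from the archive data block
    } | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$1" -hex 2>/dev/null \
      | awk '{print $NF}'
}

# Placeholder inputs for demonstration only.
ct_file=$(mktemp)
printf 'for nothing?' > "$ct_file"
demo_key_hex="4a656665"
demo_iv_hex="7768617420646f2079612077616e7420"
computed=$(hmac_iv_ct "$demo_key_hex" "$demo_iv_hex" "$ct_file")
printf 'computed_hmac=%s\n' "$computed"
rm -f "$ct_file"
```

In decode.sh the computed value would then be compared against the HMAC hex parsed from the same TOC entry, using whatever variable the existing script already holds it in.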

5. No changes needed for decoy padding: The decoder uses data_offset from TOC entries (absolute offsets), so padding between blocks is naturally skipped.

Re-run cross-validation tests (shell/test_decoder.sh). No changes should be needed to the test script since it already tests Rust pack -> Shell decode -> SHA-256 comparison.

Verify: cd /home/nick/Projects/Rust/encrypted_archive && sh shell/test_decoder.sh 2>&1 | tail -10

Check that decode.sh has the XOR_KEY_HEX variable, the XOR loop, and the TOC decryption section.

Done: Shell decoder handles XOR-obfuscated headers, encrypted TOC, and archives with decoy padding. All 6 cross-validation tests pass (Rust pack -> Shell decode -> SHA-256 match). HMAC verification works with IV from parsed TOC entry.

1. `bash kotlin/test_decoder.sh` -- all 6 Kotlin cross-validation tests pass
2. `sh shell/test_decoder.sh` -- all 6 Shell cross-validation tests pass
3. Kotlin decoder correctly applies XOR bootstrapping + TOC decryption
4. Shell decoder correctly applies XOR bootstrapping + TOC decryption from temp file
5. Both decoders produce byte-identical output to Rust unpack on the same obfuscated archive
6. `strings obfuscated_archive.bin | grep -i "hello\|test\|file"` returns nothing (no plaintext metadata leaks)
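Check 6 can be made less dependent on guessed substrings by searching the archive for the exact names of the files that were packed. The sketch below assumes that `grep -F -q` reports matches on binary input through its exit status, which should hold for common grep implementations; `archive_leaks_name` is a hypothetical helper and the file names are placeholders.

```shell
#!/bin/sh
# Sketch only: archive_leaks_name is a hypothetical helper.
# Returns 0 (leak found) if the literal NAME occurs anywhere in ARCHIVE.
archive_leaks_name() {
    LC_ALL=C grep -F -q -e "$2" "$1"
}

# Placeholder demo: one file with a plaintext name in it, one without.
leaky=$(mktemp)
clean=$(mktemp)
printf '\001\002hello.txt\003' > "$leaky"
printf '\001\002\003\004\005' > "$clean"

for f in "$leaky" "$clean"; do
    if archive_leaks_name "$f" "hello.txt"; then
        printf 'LEAK: name visible in archive\n'
    else
        printf 'ok: name not found\n'
    fi
done
rm -f "$leaky" "$clean"
```

A test script could loop over every packed file name and fail as soon as one is found in the archive.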

<success_criteria>

- All three decoders (Rust, Kotlin, Shell) produce byte-identical output from obfuscated archives
- 12 cross-validation tests pass (6 Kotlin + 6 Shell)
- Phase 6 success criteria from ROADMAP.md are fully met:
  1. File table encrypted with its own IV -- hex dump reveals no plaintext metadata
  2. Headers XOR-obfuscated -- no recognizable structure in first 256 bytes
  3. Random decoy padding between blocks -- file boundaries not detectable
  4. All three decoders still produce byte-identical output

</success_criteria>

After completion, create `.planning/phases/06-obfuscation-hardening/06-02-SUMMARY.md`
