docs: add project research
124
.planning/research/FEATURES.md
Normal file
@@ -0,0 +1,124 @@
# Feature Landscape

**Domain:** Custom encrypted archiver with proprietary binary format
**Researched:** 2026-02-24
**Confidence:** MEDIUM (based on domain knowledge of archive formats, encryption patterns, and Android constraints)

## Table Stakes

Features that are mandatory for this product to function correctly. Missing any of these means the archive is either non-functional, insecure against casual inspection, or unreliable.

| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| Multi-file packing/unpacking | Core purpose: bundle texts + APKs into one archive | Medium | Needs a file table/index structure; must handle sizes from a few KB to tens of MB |
| AES-256-CBC encryption | Without real encryption, any hex editor reveals content | Medium | `openssl enc -aes-256-cbc` in the busybox environment; Android `javax.crypto` supports AES natively |
| HMAC-SHA256 integrity (encrypt-then-MAC) | Detect corruption and tampering | Medium | Shell: `openssl dgst -sha256 -hmac`; Kotlin: `Mac.getInstance("HmacSHA256")` |
| Compression before encryption | Reduce archive size; compression after encryption is ineffective (ciphertext has near-maximal entropy and does not compress) | Low | Use deflate/gzip; must compress BEFORE encrypting |
| Hardcoded key embedding | Project requirement: no user-entered passwords, key baked into dearchiver code | Low | Key lives in Kotlin code and shell script; rotating the key requires a new build |
| Custom magic bytes | Standard magic bytes (PK, 7z, etc.) let file/binwalk identify the format; custom bytes prevent this | Low | Use random-looking bytes, not human-readable strings; avoid patterns that match known formats |
| Round-trip fidelity | Unpacked files must be byte-identical to originals | Low | Verified via checksums; critical for APKs (the signature breaks if a single byte changes) |
| CLI interface for packing | Archiver runs on Linux/macOS developer machine | Low | Standard CLI: `encrypted_archive pack -o output.bin file1.txt file2.apk` |
| Kotlin unpacker (Android 13) | Primary dearchiver path on target device | High | Pure JVM, no native libs; uses javax.crypto for AES |
| Busybox shell unpacker (fallback) | Backup when Kotlin app unavailable | High | Only dd, xxd, openssl, sh; format must be simple enough for positional extraction |
| File metadata preservation (name, size) | Unpacker must know which bytes belong to which file and what to name them | Low | Stored in file table; at minimum: filename, original size, compressed size, offset |

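
The file-metadata row lists the minimum fields each file-table entry needs. One possible record encoding might look like the sketch below. The `FileEntry` name, the `u16` name-length prefix, the `u32` field widths, and the little-endian byte order are all assumptions made for illustration; the actual layout is for the format specification to decide.

```rust
/// One file-table record. Field widths and byte order are illustrative
/// assumptions; the source only lists which fields are required.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileEntry {
    pub name: String,
    pub original_size: u32,
    pub compressed_size: u32,
    pub offset: u32,
}

/// Reads a little-endian u32 at `at`, or None if the buffer is too short.
fn read_u32(buf: &[u8], at: usize) -> Option<u32> {
    let b = buf.get(at..at + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

impl FileEntry {
    /// Serializes as: name_len (u16 LE) | name bytes | three u32 LE fields.
    /// Names longer than u16::MAX bytes are not handled in this sketch.
    pub fn to_bytes(&self) -> Vec<u8> {
        let name = self.name.as_bytes();
        let mut out = Vec::with_capacity(2 + name.len() + 12);
        out.extend_from_slice(&(name.len() as u16).to_le_bytes());
        out.extend_from_slice(name);
        out.extend_from_slice(&self.original_size.to_le_bytes());
        out.extend_from_slice(&self.compressed_size.to_le_bytes());
        out.extend_from_slice(&self.offset.to_le_bytes());
        out
    }

    /// Parses one record from the front of `buf`.
    /// Returns the entry and the number of bytes consumed.
    pub fn from_bytes(buf: &[u8]) -> Option<(FileEntry, usize)> {
        let len_bytes = buf.get(0..2)?;
        let name_len = u16::from_le_bytes([len_bytes[0], len_bytes[1]]) as usize;
        let name_end = 2 + name_len;
        let name = String::from_utf8(buf.get(2..name_end)?.to_vec()).ok()?;
        let entry = FileEntry {
            name,
            original_size: read_u32(buf, name_end)?,
            compressed_size: read_u32(buf, name_end + 4)?,
            offset: read_u32(buf, name_end + 8)?,
        };
        Some((entry, name_end + 12))
    }
}
```

Keeping the numeric fields fixed-width is a deliberate choice in this sketch: a shell unpacker can then locate them with simple offset arithmetic instead of parsing variable-length integers.
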
## Differentiators

Features that exceed baseline expectations and provide meaningful protection or usability improvements. Not all are needed for MVP, but they strengthen the product.

| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| Format obfuscation: fake headers | Misleads automated analysis tools (binwalk, foremost) that scan for patterns | Medium | Insert decoy headers resembling JPEG, PNG, or random formats at predictable offsets; a casual user sees "corrupted image", not "encrypted archive" |
| Format obfuscation: shuffled blocks | File data blocks stored out of order with a scramble map | Medium | Prevents sequential extraction even if encryption is somehow bypassed; adds complexity to the busybox unpacker |
| Format obfuscation: randomized padding | Variable-length random padding between blocks | Low | Makes block boundaries unpredictable to static analysis; minimal implementation cost |
| Version field in header | Forward-compatible format evolution | Low | Single-byte version; unpackers check the version and reject incompatible archives gracefully |
| Per-file encryption with derived keys | Each file encrypted with a unique key derived from master key + file index/salt | Medium | Limits damage if one file's plaintext is known (known-plaintext attack on a specific block) |
| Progress reporting during pack/unpack | UX for large archives (tens of MB of APKs) | Low | CLI progress bar; Kotlin callback for UI integration |
| Dry-run / validation mode | Check archive integrity without full extraction | Low | Verify checksums and structure without writing files to disk; useful for debugging on device |
| Configurable compression level | Trade speed vs size for different content types (APKs are already compressed, texts compress well) | Low | APKs benefit little from compression; allow a per-file or global setting |
| Salt / IV per archive | Each archive uses a random IV/nonce even with the same key, so identical plaintext does not produce identical ciphertext | Low | Standard crypto practice; 16-byte IV for AES-CBC; must be stored in the archive header (unencrypted) |
| Error messages that do not leak format info | Unpacker errors say "invalid archive", not "checksum mismatch at block 3 offset 0x4A2" | Low | Defense in depth: even error messages should not help reverse engineering |

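
Three rows of this table (version field, per-archive IV in the header, and non-leaking error messages) meet in the header parser. The sketch below shows one way they might fit together. The `MAGIC` value, the 4 + 1 + 16 byte layout, `SUPPORTED_VERSION`, and all type and function names are hypothetical; only the 16-byte IV size and the "invalid archive" message come from the table.

```rust
use std::fmt;

// Hypothetical values: the real magic bytes and version belong to the format spec.
const MAGIC: [u8; 4] = [0x9E, 0x3B, 0xC4, 0x71];
const SUPPORTED_VERSION: u8 = 1;
const IV_LEN: usize = 16; // AES block size, as required for CBC
const HEADER_LEN: usize = 4 + 1 + IV_LEN; // magic | version | IV

/// Precise failure reasons, intended for internal use (tests, debug builds).
#[derive(Debug, PartialEq, Eq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u8),
}

impl fmt::Display for HeaderError {
    /// Every variant renders the same user-facing text, so the message
    /// reveals nothing about which check failed or where.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.write_str("invalid archive")
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct Header {
    pub version: u8,
    pub iv: [u8; IV_LEN],
}

/// Parses the unencrypted header at the start of `buf`.
pub fn parse_header(buf: &[u8]) -> Result<Header, HeaderError> {
    if buf.len() < HEADER_LEN {
        return Err(HeaderError::TooShort);
    }
    if buf[..4] != MAGIC {
        return Err(HeaderError::BadMagic);
    }
    let version = buf[4];
    if version != SUPPORTED_VERSION {
        return Err(HeaderError::UnsupportedVersion(version));
    }
    let mut iv = [0u8; IV_LEN];
    iv.copy_from_slice(&buf[5..HEADER_LEN]);
    Ok(Header { version, iv })
}
```

The enum keeps the detailed reason available to the developer through `Debug`, while anything printed to the user through `Display` stays generic.
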
## Anti-Features

Features to explicitly NOT build. Each would add complexity without matching the project's threat model or constraints.

| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| Password-based key derivation (PBKDF2/Argon2) | Project explicitly uses a hardcoded key; password entry UX is unwanted on a car head unit | Embed key directly in Kotlin/shell code; accept that key extraction from the APK is possible for determined attackers |
| GUI for archiver | Scope creep; CLI is sufficient for developer workflow (pack on laptop, deploy to device) | Well-designed CLI with clear flags and help text |
| Windows archiver support | Out of scope per project constraints; Rust cross-compiles easily IF needed later | Linux/macOS only; document that WSL works if a Windows user needs it |
| Streaming/pipe support | Files are small enough (KB to tens of MB) to fit in memory; streaming adds format complexity that breaks busybox compatibility | Load entire file into memory for pack/unpack; document the max file size assumption |
| Nested/recursive archives | No use case: archive contains a flat list of texts and APKs | Single-level file list only |
| File permissions / ownership metadata | Android target manages its own permissions; Unix permissions from the build machine are irrelevant | Store only filename and size; ignore mode/owner/timestamps |
| Compression algorithm selection at runtime | Over-engineering; one good default is sufficient | Use deflate/gzip -- available everywhere: Rust, Kotlin, busybox; hardcode the choice |
| Public-key / asymmetric encryption | Massive complexity increase for no benefit given the hardcoded key model | Symmetric encryption only (AES-256) |
| Self-extracting archives | Target is Android, not desktop; the shell script IS the extractor | Separate archive file + separate unpacker (Kotlin app or shell script) |
| DRM or license enforcement | Not the purpose; this is content bundling protection, not DRM | Simple encryption is sufficient for the threat model |
| File deduplication within archive | Archive contains distinct files (texts and different APKs); dedup adds complexity with near-zero benefit | Pack files as-is |
| Separate encryption of filenames in the file table | Nice in theory, but the busybox shell unpacker needs the filenames to extract; a separately encrypted file table massively complicates the shell path | Store filenames inside the encrypted payload (the entire payload is encrypted, so filenames are already protected by archive-level encryption) |

## Feature Dependencies

```
Compression --> Encryption --> Format Assembly (compression MUST happen before encryption)
                    |
                    v
        Integrity Checks (HMAC over encrypted blocks)

Custom Magic Bytes --> Format Header Design
Version Field --> Format Header Design
Salt/IV Storage --> Format Header Design

File Metadata (name, size) --> File Table Structure --> Format Assembly

Format Assembly --> CLI Packer (Rust)

Format Specification --> Kotlin Unpacker
Format Specification --> Busybox Shell Unpacker

Per-file Key Derivation --> Requires Format Specification to include file index/salt
Fake Headers --> Requires Format Assembly to insert decoys at correct positions
Shuffled Blocks --> Requires File Table to store block ordering map
```

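
The last dependency says shuffled blocks need a block ordering map in the file table. A minimal sketch of what such a map could mean is given below, assuming the map is a plain permutation where `order[i]` names the original block stored at archive position `i`. The function names and this interpretation of the map are assumptions; how the packer picks the permutation is left open.

```rust
/// Packer side: returns the blocks in stored (shuffled) order.
/// `order[i]` is the index of the original block written at position `i`.
/// Panics if an index in `order` is out of range.
pub fn shuffle_blocks<T: Clone>(blocks: &[T], order: &[usize]) -> Vec<T> {
    order.iter().map(|&src| blocks[src].clone()).collect()
}

/// Unpacker side: restores the original block order from the stored order.
/// Returns None if `order` is not a valid permutation of the stored blocks.
pub fn unshuffle_blocks<T: Clone>(stored: &[T], order: &[usize]) -> Option<Vec<T>> {
    if stored.len() != order.len() {
        return None;
    }
    let mut out: Vec<Option<T>> = vec![None; stored.len()];
    for (pos, &src) in order.iter().enumerate() {
        // Reject out-of-range or duplicate indices: the map must be a permutation.
        let slot = out.get_mut(src)?;
        if slot.is_some() {
            return None;
        }
        *slot = Some(stored[pos].clone());
    }
    // Every slot is filled at this point, so this collects into Some(vec).
    out.into_iter().collect()
}
```

Validating the map on the unpacker side matters because a corrupted or tampered map would otherwise silently reorder file contents.
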
**Critical dependency chain:**
```
Format Spec (on paper)
  --> Rust Packer (implements spec)
  --> Kotlin Unpacker (reads spec)
  --> Shell Unpacker (reads spec)
      --> Round-trip tests (validates all three agree)
```

The format specification must be finalized BEFORE any implementation begins, because three independent implementations (Rust, Kotlin, shell) must agree on every byte: the Rust packer writes the format, and both unpackers must read it and produce identical output.

## MVP Recommendation

**Prioritize (Phase 1 -- must ship):**

1. **Format specification document** -- Define header, file table, block layout, magic bytes, version field, IV/salt placement
2. **Compression + Encryption pipeline** -- Compress with gzip, encrypt with AES-256-CBC, authenticate with HMAC-SHA256
3. **Rust CLI packer** -- Pack multiple files into the custom format
4. **Integrity verification via HMAC-SHA256** -- Encrypt-then-MAC for both integrity and authenticity; unpackers verify the MAC before decrypting
5. **Kotlin unpacker** -- Primary extraction path on Android 13. Pure JVM using javax.crypto
6. **Busybox shell unpacker** -- Fallback extraction. This constrains the format to be simple
7. **Round-trip tests** -- Verify that an archive packed by Rust unpacks to byte-identical output through both the Kotlin and the shell unpacker

**Defer (Phase 2 -- after MVP works):**

- **Fake headers / decoy data** -- Obfuscation layer; adds no functional value, purely anti-analysis
- **Shuffled blocks** -- Significant complexity, especially for busybox
- **Progress reporting** -- Nice UX but not blocking
- **Configurable compression** -- Start with one setting that works; optimize later
- **Dry-run / validation mode** -- Useful for debugging but not for initial delivery
- **Per-file derived keys** -- Defense-in-depth for later

**Key MVP constraint:** The busybox shell unpacker is the most constraining component. Every format decision must be validated against "can busybox dd/xxd/openssl do this?" If the answer is no, the feature must be deferred or redesigned.

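
One way to apply the "can busybox dd do this?" check on the packer side is to have the Rust tool print the `dd` invocation that would cut a given entry out of the archive. The helper below is a hypothetical sketch: the function name and parameters are invented for illustration, and it assumes paths without spaces or shell metacharacters. It relies only on the standard `dd` operands `if=`, `of=`, `bs=`, `skip=`, and `count=`.

```rust
/// Builds the `dd` command a shell unpacker could run to copy `len` bytes
/// starting at byte `offset` of `archive` into `out`.
/// `bs=1` makes `skip` and `count` byte-exact (slow, but simple to reason about).
/// The extracted bytes are still compressed and encrypted at this stage.
pub fn dd_command(archive: &str, offset: u64, len: u64, out: &str) -> String {
    format!(
        "dd if={} of={} bs=1 skip={} count={}",
        archive, out, offset, len
    )
}
```

If an entry cannot be described by a single offset and length like this, that is a signal the layout may be too complex for the shell fallback.
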
## Sources

- Domain knowledge of archive format design (ZIP, tar, 7z format specifications)
- Domain knowledge of cryptographic best practices (NIST, libsodium documentation patterns)
- Domain knowledge of Android crypto APIs (javax.crypto) and the OpenSSL CLI
- Domain knowledge of busybox utility capabilities