Feature Landscape

Domain: Custom encrypted archiver with proprietary binary format
Researched: 2026-02-24
Confidence: MEDIUM (based on domain knowledge of archive formats, encryption patterns, and Android constraints)

Table Stakes

Features that are mandatory for this product to function correctly. Missing any of these means the archive is either non-functional, insecure against casual inspection, or unreliable.

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Multi-file packing/unpacking | Core purpose: bundle texts + APKs into one archive | Medium | Needs a file table/index structure; must handle sizes from a few KB to tens of MB |
| AES-256-CBC encryption | Without real encryption, any hex editor reveals content | Medium | busybox openssl supports aes-256-cbc; Android javax.crypto supports AES natively |
| HMAC-SHA256 integrity (encrypt-then-MAC) | Detect corruption and tampering | Medium | busybox openssl dgst -sha256 -hmac; Kotlin Mac.getInstance("HmacSHA256") |
| Compression before encryption | Reduce archive size; compression after encryption is ineffective (encrypted data has max entropy) | Low | Use deflate/gzip; must compress BEFORE encrypt |
| Hardcoded key embedding | Project requirement: no user-entered passwords, key baked into dearchiver code | Low | Key in Kotlin code and shell script; rotating the key requires a new build |
| Custom magic bytes | Standard magic bytes (PK, 7z, etc.) let file/binwalk identify the format; custom bytes prevent this | Low | Use random-looking bytes, not human-readable strings; avoid patterns that match known formats |
| Round-trip fidelity | Unpacked files must be byte-identical to originals | Low | Verified via checksums; critical for APKs (the signature breaks if a single byte changes) |
| CLI interface for packing | Archiver runs on Linux/macOS developer machine | Low | Standard CLI: encrypted_archive pack -o output.bin file1.txt file2.apk |
| Kotlin unpacker (Android 13) | Primary dearchiver path on target device | High | Pure JVM, no native libs; uses javax.crypto for AES |
| Busybox shell unpacker (fallback) | Backup when Kotlin app unavailable | High | Only dd, xxd, openssl, sh; format must be simple enough for positional extraction |
| File metadata preservation (name, size) | Unpacker must know which bytes belong to which file and what to name them | Low | Stored in file table; at minimum: filename, original size, compressed size, offset |
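The "compress BEFORE encrypt" rule in the table can be checked with a short experiment. The sketch below uses Python only for illustration (the project's packer is Rust), and it uses random bytes as a stand-in for ciphertext, on the assumption that the output of a sound cipher is statistically indistinguishable from random data. The helper name `compressed_size` is hypothetical.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size of data after zlib (deflate) compression at the default level."""
    return len(zlib.compress(data))


# Repetitive text stands in for the text files the archive would carry.
text = b"The quick brown fox jumps over the lazy dog.\n" * 200

# Random bytes stand in for ciphertext (assumption: good ciphertext looks random).
ciphertext_like = os.urandom(len(text))

print(len(text), compressed_size(text))                        # text shrinks substantially
print(len(ciphertext_like), compressed_size(ciphertext_like))  # no gain, slight overhead
```

Compressing the text first and encrypting the result keeps the size benefit; reversing the order leaves deflate nothing to work with.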

Differentiators

Features that exceed baseline expectations and provide meaningful protection or usability improvements. Not all are needed for MVP, but they strengthen the product.

| Feature | Value Proposition | Complexity | Notes |
|---|---|---|---|
| Format obfuscation: fake headers | Misleads automated analysis tools (binwalk, foremost) that scan for patterns | Medium | Insert decoy headers resembling JPEG, PNG, or random formats at predictable offsets; casual user sees "corrupted image" not "encrypted archive" |
| Format obfuscation: shuffled blocks | File data blocks stored out-of-order with a scramble map | Medium | Prevents sequential extraction even if encryption is somehow bypassed; adds complexity to busybox unpacker |
| Format obfuscation: randomized padding | Variable-length random padding between blocks | Low | Makes block boundaries unpredictable to static analysis; minimal implementation cost |
| Version field in header | Forward-compatible format evolution | Low | Single byte version; unpackers check version and reject incompatible archives gracefully |
| Per-file encryption with derived keys | Each file encrypted with unique key derived from master key + file index/salt | Medium | Limits damage if one file's plaintext is known (known-plaintext attack on specific block) |
| Progress reporting during pack/unpack | UX for large archives (tens of MB of APKs) | Low | CLI progress bar; Kotlin callback for UI integration |
| Dry-run / validation mode | Check archive integrity without full extraction | Low | Verify checksums and structure without writing files to disk; useful for debugging on device |
| Configurable compression level | Trade speed vs size for different content types (APKs are already compressed, texts compress well) | Low | APKs benefit little from compression; allow per-file or global setting |
| Salt / IV per archive | Each archive uses random IV/nonce even with same key; prevents identical plaintext producing identical ciphertext | Low | Standard crypto practice; 16-byte IV for AES-CBC; must be stored in archive header (unencrypted) |
| Error messages that do not leak format info | Unpacker errors say "invalid archive" not "checksum mismatch at block 3 offset 0x4A2" | Low | Defense in depth: even error messages should not help reverse engineering |
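The "per-file encryption with derived keys" row could be realized in several ways. The following is a minimal sketch of one option, a single HMAC-SHA256 call keyed with the master key over a label, the file index, and a salt. This is a simple HMAC-based derivation and not the full HKDF construction. The function name `derive_file_key`, the `b"file-key"` label, and the 4-byte big-endian index encoding are all assumptions made for illustration, and Python is used only as a sketch language.

```python
import hashlib
import hmac
import struct


def derive_file_key(master_key: bytes, file_index: int, salt: bytes) -> bytes:
    """Derive a 32-byte per-file key as HMAC-SHA256(master_key, label || index || salt)."""
    # The label and the index encoding are hypothetical choices for this sketch.
    message = b"file-key" + struct.pack(">I", file_index) + salt
    return hmac.new(master_key, message, hashlib.sha256).digest()


master_key = bytes(range(32))  # placeholder value, not a real key
salt = bytes(16)               # placeholder; a real archive would store a random salt

key0 = derive_file_key(master_key, 0, salt)
key1 = derive_file_key(master_key, 1, salt)
print(key0.hex())
print(key1.hex())
```

Whatever derivation is chosen, all three implementations would have to reproduce it exactly, so the shell unpacker's ability to compute it should be checked before committing to a scheme.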

Anti-Features

Features to explicitly NOT build. Each would add complexity without matching the project's threat model or constraints.

| Anti-Feature | Why Avoid | What to Do Instead |
|---|---|---|
| Password-based key derivation (PBKDF2/Argon2) | Project explicitly uses hardcoded key; password entry UX is unwanted on car head unit | Embed key directly in Kotlin/shell code; accept that key extraction from APK is possible for determined attackers |
| GUI for archiver | Scope creep; CLI is sufficient for developer workflow (pack on laptop, deploy to device) | Well-designed CLI with clear flags and help text |
| Windows archiver support | Out of scope per project constraints; Rust cross-compiles easily if needed later | Linux/macOS only; document that WSL works if a Windows user needs it |
| Streaming/pipe support | Files are small enough (KB to tens of MB) to fit in memory; streaming adds format complexity that breaks busybox compatibility | Load entire file into memory for pack/unpack; document max file size assumption |
| Nested/recursive archives | No use case: archive contains flat list of texts and APKs | Single-level file list only |
| File permissions / ownership metadata | Android target manages its own permissions; Unix permissions from build machine are irrelevant | Store only filename and size; ignore mode/owner/timestamps |
| Compression algorithm selection at runtime | Over-engineering; one good default is sufficient | Use deflate/gzip, which is available everywhere (Rust, Kotlin, busybox); hardcode the choice |
| Public-key / asymmetric encryption | Massive complexity increase for no benefit given hardcoded key model | Symmetric encryption only (AES-256) |
| Self-extracting archives | Target is Android, not desktop; shell script IS the extractor | Separate archive file + separate unpacker (Kotlin app or shell script) |
| DRM or license enforcement | Not the purpose; this is content bundling protection, not DRM | Simple encryption is sufficient for the threat model |
| File deduplication within archive | Archive contains distinct files (texts and different APKs); dedup adds complexity with near-zero benefit | Pack files as-is |
| Separate encryption of filenames in file table | Nice in theory, but the busybox shell unpacker needs to know filenames to extract; a dedicated filename-encryption layer massively complicates the shell path | Store filenames inside the encrypted payload (the entire payload is encrypted, so filenames are already protected by archive-level encryption) |

Feature Dependencies

Compression --> Encryption --> Format Assembly (compression MUST happen before encryption)
                                     |
                                     v
                              Integrity Checks (HMAC over encrypted blocks)

Custom Magic Bytes --> Format Header Design
Version Field --> Format Header Design
Salt/IV Storage --> Format Header Design

File Metadata (name, size) --> File Table Structure --> Format Assembly

Format Assembly --> CLI Packer (Rust)

Format Specification --> Kotlin Unpacker
Format Specification --> Busybox Shell Unpacker

Per-file Key Derivation --> Requires Format Specification to include file index/salt
Fake Headers --> Requires Format Assembly to insert decoys at correct positions
Shuffled Blocks --> Requires File Table to store block ordering map
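The "HMAC over encrypted blocks" step above implies an order of operations on the unpacking side: verify the tag first, decrypt only if it matches. Below is a minimal sketch of that check, with Python as the illustration language and the ciphertext treated as opaque bytes. The function names `seal` and `open_sealed`, the choice to cover the IV with the MAC, the `iv || ciphertext || tag` layout, and the use of a MAC key separate from the encryption key are assumptions, not decisions taken from the format.

```python
import hashlib
import hmac

IV_LEN = 16   # AES-CBC IV size
TAG_LEN = 32  # HMAC-SHA256 output size


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Return iv || ciphertext || tag, where tag = HMAC-SHA256(mac_key, iv || ciphertext)."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes) -> tuple[bytes, bytes]:
    """Verify the tag and return (iv, ciphertext); raise ValueError on any mismatch."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("invalid archive")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("invalid archive")
    return body[:IV_LEN], body[IV_LEN:]
```

Only the `(iv, ciphertext)` pair returned by a successful check would be handed to AES-CBC decryption, so tampered data never reaches the cipher.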

Critical dependency chain:

Format Spec (on paper)
  --> Rust Packer (implements spec)
    --> Kotlin Unpacker (reads spec)
    --> Shell Unpacker (reads spec)
      --> Round-trip tests (validates all three agree)

The format specification must be finalized BEFORE any implementation begins, because three independent implementations (Rust, Kotlin, shell) must produce identical results.
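To make "on paper" concrete, a header definition can be written down as a fixed byte layout that every implementation reads the same way. The sketch below is illustrative only: the magic value, the field order, and the 2-byte file count are invented here, while the single-byte version and the 16-byte IV follow the feature tables. Python serves purely as notation for the layout.

```python
import struct

# Hypothetical layout (big-endian, no padding):
#   4 bytes   magic, arbitrary random-looking value
#   1 byte    format version
#   16 bytes  IV
#   2 bytes   file count
HEADER = struct.Struct(">4sB16sH")
MAGIC = bytes.fromhex("c39a1fe4")  # invented for this sketch
SUPPORTED_VERSION = 1


def pack_header(iv: bytes, file_count: int) -> bytes:
    """Serialize the header; the IV must be exactly 16 bytes."""
    if len(iv) != 16:
        raise ValueError("IV must be 16 bytes")
    return HEADER.pack(MAGIC, SUPPORTED_VERSION, iv, file_count)


def parse_header(data: bytes) -> tuple[bytes, int]:
    """Return (iv, file_count); fail with one generic message for any problem."""
    if len(data) < HEADER.size:
        raise ValueError("invalid archive")
    magic, version, iv, file_count = HEADER.unpack_from(data)
    if magic != MAGIC or version != SUPPORTED_VERSION:
        raise ValueError("invalid archive")
    return iv, file_count
```

Because every field has a fixed size and position, the same table of offsets can drive the Rust packer, the Kotlin unpacker, and positional reads in the shell unpacker.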

MVP Recommendation

Prioritize (Phase 1 -- must ship):

  1. Format specification document -- Define header, file table, block layout, magic bytes, version field, IV/salt placement
  2. Compression + Encryption pipeline -- Compress with gzip, encrypt with AES-256-CBC, authenticate with HMAC-SHA256
  3. Rust CLI packer -- Pack multiple files into the custom format
  4. Integrity verification via HMAC-SHA256 -- Encrypt-then-MAC for both integrity and authenticity
  5. Kotlin unpacker -- Primary extraction path on Android 13. Pure JVM using javax.crypto
  6. Busybox shell unpacker -- Fallback extraction. This constrains the format to be simple
  7. Round-trip tests -- Verify Rust-pack, Kotlin-unpack, shell-unpack all produce identical output
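The round-trip check in step 7 reduces to comparing checksums of the original files against each unpacker's output. A minimal sketch follows, assuming the test harness has already collected both sets of files as name-to-bytes mappings; the names `sha256_hex` and `find_mismatches` are hypothetical, and Python stands in for whatever language the real harness uses.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_mismatches(originals: dict[str, bytes], unpacked: dict[str, bytes]) -> list[str]:
    """Return the sorted names that are missing, unexpected, or differ in content."""
    bad = []
    for name in sorted(set(originals) | set(unpacked)):
        if name not in originals or name not in unpacked:
            bad.append(name)
        elif sha256_hex(originals[name]) != sha256_hex(unpacked[name]):
            bad.append(name)
    return bad


originals = {"notes.txt": b"hello\n", "app.apk": b"PK\x03\x04 placeholder bytes"}
kotlin_output = dict(originals)
shell_output = {"notes.txt": b"hello\n", "app.apk": b"PK\x03\x04 placeholder byteS"}

print(find_mismatches(originals, kotlin_output))  # nothing to report
print(find_mismatches(originals, shell_output))   # flags the file with one changed byte
```

Running the same comparison against both unpackers' outputs confirms that all three implementations agree on the format.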

Defer (Phase 2 -- after MVP works):

  • Fake headers / decoy data -- Obfuscation layer; adds no functional value, purely anti-analysis
  • Shuffled blocks -- Significant complexity, especially for busybox
  • Progress reporting -- Nice UX but not blocking
  • Configurable compression -- Start with one setting that works; optimize later
  • Dry-run / validation mode -- Useful for debugging but not for initial delivery
  • Per-file derived keys -- Defense-in-depth for later

Key MVP constraint: The busybox shell unpacker is the most constraining component. Every format decision must be validated against "can busybox dd/xxd/openssl do this?" If the answer is no, the feature must be deferred or redesigned.
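One way to apply the "can busybox dd do this?" test during format design is to check that every extraction step can be written as a plain `dd` call with numeric offsets. The sketch below generates such calls from a list of entries; the entry values, file names, and the helper `dd_command` are hypothetical, and Python is used only to illustrate the arithmetic. It relies only on the `if`, `of`, `bs`, `skip`, and `count` operands of `dd`.

```python
def dd_command(archive: str, offset: int, length: int, output: str) -> str:
    """Build a dd invocation that copies `length` bytes starting at byte `offset`."""
    # bs=1 makes skip and count byte-granular; slow, but it needs no extra tooling.
    return f"dd if={archive} of={output} bs=1 skip={offset} count={length}"


# Hypothetical file table entries: (output name, byte offset, byte length).
entries = [
    ("notes.txt.enc", 64, 1024),
    ("app.apk.enc", 1088, 2048),
]

commands = [dd_command("archive.bin", off, size, name) for name, off, size in entries]
for command in commands:
    print(command)
```

If a proposed feature makes the offsets impossible to compute with shell arithmetic, that is a signal to defer or redesign it.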

Sources

  • Domain knowledge of archive format design (ZIP, tar, 7z format specifications)
  • Domain knowledge of cryptographic best practices (NIST, libsodium documentation patterns)
  • Domain knowledge of Android crypto APIs (javax.crypto) and the OpenSSL CLI
  • Domain knowledge of busybox utility capabilities