docs: add project research

NikitolProject
2026-02-24 22:51:05 +03:00
parent 914d88458a
commit 40dcfd4ac0
5 changed files with 841 additions and 0 deletions


@@ -0,0 +1,124 @@
# Feature Landscape
**Domain:** Custom encrypted archiver with proprietary binary format
**Researched:** 2026-02-24
**Confidence:** MEDIUM (based on domain knowledge of archive formats, encryption patterns, and Android constraints)
## Table Stakes
Features that are mandatory for this product to function correctly. Missing any of these means the archive is either non-functional, insecure against casual inspection, or unreliable.
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| Multi-file packing/unpacking | Core purpose: bundle texts + APKs into one archive | Medium | Needs a file table/index structure; must handle sizes ranging from a few KB to tens of MB |
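| AES-256-CBC encryption | Without real encryption, any hex editor reveals content | Medium | The `openssl` CLI supports `aes-256-cbc` (it is a separate binary rather than a busybox applet, so its presence on the device must be confirmed); Android javax.crypto supports AES natively |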
| HMAC-SHA256 integrity (encrypt-then-MAC) | Detect corruption and tampering | Medium | Shell: `openssl dgst -sha256 -hmac <key>`, or `-mac HMAC -macopt hexkey:<hex>` for a binary key; Kotlin: `Mac.getInstance("HmacSHA256")` |
| Compression before encryption | Reduce archive size; compression after encryption is ineffective (encrypted data has max entropy) | Low | Use deflate/gzip; must compress BEFORE encrypt |
| Hardcoded key embedding | Project requirement: no user-entered passwords, key baked into dearchiver code | Low | Key lives in the Kotlin code and the shell script; rotating the key requires a new build |
| Custom magic bytes | Standard magic bytes (PK, 7z, etc.) let file/binwalk identify format; custom bytes prevent this | Low | Use random-looking bytes, not human-readable strings; avoid patterns that match known formats |
| Round-trip fidelity | Unpacked files must be byte-identical to originals | Low | Verified via checksums; critical for APKs (the signature breaks if a single byte changes) |
| CLI interface for packing | Archiver runs on Linux/macOS developer machine | Low | Standard CLI: `encrypted_archive pack -o output.bin file1.txt file2.apk` |
| Kotlin unpacker (Android 13) | Primary dearchiver path on target device | High | Pure JVM, no native libs; must handle javax.crypto for AES |
| Busybox shell unpacker (fallback) | Backup when the Kotlin app is unavailable | High | Only dd, xxd, openssl, sh; format must be simple enough for positional extraction |
| File metadata preservation (name, size) | Unpacker must know which bytes belong to which file and what to name them | Low | Stored in file table; at minimum: filename, original size, compressed size, offset |
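
The file-table row above lists the minimum fields (filename, original size, compressed size, offset). A minimal sketch of one possible record encoding follows. The `FileEntry` type, the field widths, and the little-endian byte order are assumptions made for illustration, not a decided format.

```rust
use std::convert::TryInto;

/// One file-table record. The field set follows the table above; the
/// fixed-width little-endian layout is an assumption made for this sketch.
#[derive(Debug, PartialEq, Clone)]
pub struct FileEntry {
    pub name: String,
    pub original_size: u32,
    pub compressed_size: u32,
    pub offset: u32,
}

impl FileEntry {
    /// Hypothetical layout:
    /// name_len u16 | name bytes | original u32 | compressed u32 | offset u32
    /// Assumes names shorter than 65536 bytes.
    pub fn encode(&self) -> Vec<u8> {
        let name = self.name.as_bytes();
        let mut out = Vec::with_capacity(2 + name.len() + 12);
        out.extend_from_slice(&(name.len() as u16).to_le_bytes());
        out.extend_from_slice(name);
        out.extend_from_slice(&self.original_size.to_le_bytes());
        out.extend_from_slice(&self.compressed_size.to_le_bytes());
        out.extend_from_slice(&self.offset.to_le_bytes());
        out
    }

    /// Decodes one record and returns it with the number of bytes consumed.
    /// Returns None on truncated or non-UTF-8 input.
    pub fn decode(buf: &[u8]) -> Option<(FileEntry, usize)> {
        let name_len = u16::from_le_bytes(buf.get(0..2)?.try_into().ok()?) as usize;
        let name_end = 2 + name_len;
        let name = String::from_utf8(buf.get(2..name_end)?.to_vec()).ok()?;
        let read_u32 = |at: usize| -> Option<u32> {
            Some(u32::from_le_bytes(buf.get(at..at + 4)?.try_into().ok()?))
        };
        let entry = FileEntry {
            name,
            original_size: read_u32(name_end)?,
            compressed_size: read_u32(name_end + 4)?,
            offset: read_u32(name_end + 8)?,
        };
        Some((entry, name_end + 12))
    }
}
```

Calling `FileEntry::decode` on the output of `encode` returns the original entry plus the record length, which lets a reader walk the table record by record. The length-prefixed name keeps the numeric fields at a computable position, which matters for the shell path.
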
## Differentiators
Features that exceed baseline expectations and provide meaningful protection or usability improvements. Not all are needed for MVP, but they strengthen the product.
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| Format obfuscation: fake headers | Misleads automated analysis tools (binwalk, foremost) that scan for patterns | Medium | Insert decoy headers resembling JPEG, PNG, or random formats at predictable offsets; casual user sees "corrupted image" not "encrypted archive" |
| Format obfuscation: shuffled blocks | File data blocks stored out-of-order with a scramble map | Medium | Prevents sequential extraction even if encryption is somehow bypassed; adds complexity to busybox unpacker |
| Format obfuscation: randomized padding | Variable-length random padding between blocks | Low | Makes block boundaries unpredictable to static analysis; minimal implementation cost |
| Version field in header | Forward-compatible format evolution | Low | Single byte version; unpackers check version and reject incompatible archives gracefully |
| Per-file encryption with derived keys | Each file encrypted with unique key derived from master key + file index/salt | Medium | Confines the damage to one file if a derived key is recovered; AES itself already resists known-plaintext attacks, so this is defense in depth |
| Progress reporting during pack/unpack | UX for large archives (tens of MB of APKs) | Low | CLI progress bar; Kotlin callback for UI integration |
| Dry-run / validation mode | Check archive integrity without full extraction | Low | Verify checksums and structure without writing files to disk; useful for debugging on device |
| Configurable compression level | Trade speed vs size for different content types (APKs are already compressed, texts compress well) | Low | APKs benefit little from compression; allow per-file or global setting |
| Salt / IV per archive | Each archive uses a fresh random IV even with the same key; prevents identical plaintext from producing identical ciphertext | Low | Standard crypto practice and effectively required for CBC with a fixed key; 16-byte IV for AES-CBC; must be stored in the archive header (unencrypted) |
| Error messages that do not leak format info | Unpacker errors say "invalid archive" not "checksum mismatch at block 3 offset 0x4A2" | Low | Defense in depth: even error messages should not help reverse engineering |
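
Two rows in the table above, the version field and the non-leaking error messages, can be sketched together. The magic bytes, the `SUPPORTED_VERSION` value, and the `UnpackError` type below are hypothetical names chosen for this example.

```rust
use std::fmt;

/// Hypothetical 4-byte magic; a real build would pick its own random-looking bytes.
const MAGIC: [u8; 4] = [0x9E, 0x3A, 0xC7, 0x51];
/// Highest format version this unpacker understands (assumed value).
const SUPPORTED_VERSION: u8 = 1;

/// Internal failure reasons; usable in debug builds, never shown to the user.
#[derive(Debug, PartialEq)]
pub enum UnpackError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u8),
}

impl fmt::Display for UnpackError {
    /// Every variant renders the same text, so user-facing messages
    /// reveal nothing about the archive layout.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.write_str("invalid archive")
    }
}

/// Checks magic and version in the first five bytes and returns the version.
pub fn check_header(buf: &[u8]) -> Result<u8, UnpackError> {
    if buf.len() < 5 {
        return Err(UnpackError::TooShort);
    }
    if buf[..4] != MAGIC {
        return Err(UnpackError::BadMagic);
    }
    let version = buf[4];
    if version == 0 || version > SUPPORTED_VERSION {
        return Err(UnpackError::UnsupportedVersion(version));
    }
    Ok(version)
}
```

The enum keeps distinct reasons available to the developer, while `to_string()` on any variant yields the same generic message.
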
## Anti-Features
Features to explicitly NOT build. Each would add complexity without matching the project's threat model or constraints.
| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| Password-based key derivation (PBKDF2/Argon2) | Project explicitly uses a hardcoded key; password entry UX is unwanted on a car head unit | Embed key directly in Kotlin/shell code; accept that key extraction from the APK is possible for determined attackers |
| GUI for archiver | Scope creep; CLI is sufficient for developer workflow (pack on laptop, deploy to device) | Well-designed CLI with clear flags and help text |
| Windows archiver support | Out of scope per project constraints; Rust cross-compiles easily IF needed later | Linux/macOS only; document that WSL works if Windows user needs it |
| Streaming/pipe support | Files are small enough (KB to tens of MB) to fit in memory; streaming adds format complexity that breaks busybox compatibility | Load entire file into memory for pack/unpack; document max file size assumption |
| Nested/recursive archives | No use case: archive contains flat list of texts and APKs | Single-level file list only |
| File permissions / ownership metadata | Android target manages its own permissions; Unix permissions from build machine are irrelevant | Store only filename and size; ignore mode/owner/timestamps |
| Compression algorithm selection at runtime | Over-engineering; one good default is sufficient | Use deflate/gzip -- available everywhere: Rust, Kotlin, busybox; hardcode the choice |
| Public-key / asymmetric encryption | Massive complexity increase for no benefit given hardcoded key model | Symmetric encryption only (AES-256) |
| Self-extracting archives | Target is Android, not desktop; shell script IS the extractor | Separate archive file + separate unpacker (Kotlin app or shell script) |
| DRM or license enforcement | Not the purpose; this is content bundling protection, not DRM | Simple encryption is sufficient for the threat model |
| File deduplication within archive | Archive contains distinct files (texts and different APKs); dedup adds complexity with near-zero benefit | Pack files as-is |
| Separate encryption of filenames in file table | Nice in theory, but the busybox shell unpacker must read filenames to extract, and a second encryption layer for the file table massively complicates the shell path | Keep the file table inside the encrypted payload; the entire payload is encrypted, so filenames are already protected by archive-level encryption |
## Feature Dependencies
```
Compression --> Encryption --> Format Assembly (compression MUST happen before encryption)
                    |
                    v
          Integrity Checks (HMAC over encrypted blocks)
Custom Magic Bytes --> Format Header Design
Version Field --> Format Header Design
Salt/IV Storage --> Format Header Design
File Metadata (name, size) --> File Table Structure --> Format Assembly
Format Assembly --> CLI Packer (Rust)
Format Specification --> Kotlin Unpacker
Format Specification --> Busybox Shell Unpacker
Per-file Key Derivation --> Requires Format Specification to include file index/salt
Fake Headers --> Requires Format Assembly to insert decoys at correct positions
Shuffled Blocks --> Requires File Table to store block ordering map
```
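
The last dependency above says shuffled blocks need an ordering map in the file table. A minimal sketch of that idea follows, assuming the map is a plain permutation of block indices. The function names and the in-memory representation are hypothetical.

```rust
/// Reorders `blocks` so that stored[i] = blocks[order[i]].
/// `order` is the map the file table would have to store (hypothetical encoding).
/// Returns None unless `order` is a permutation of 0..blocks.len().
pub fn shuffle_blocks(blocks: &[Vec<u8>], order: &[usize]) -> Option<Vec<Vec<u8>>> {
    if !is_permutation(order, blocks.len()) {
        return None;
    }
    Some(order.iter().map(|&i| blocks[i].clone()).collect())
}

/// Inverse operation: puts stored[i] back at logical position order[i].
pub fn restore_blocks(stored: &[Vec<u8>], order: &[usize]) -> Option<Vec<Vec<u8>>> {
    if !is_permutation(order, stored.len()) {
        return None;
    }
    let mut out = vec![Vec::new(); stored.len()];
    for (pos, &logical) in order.iter().enumerate() {
        out[logical] = stored[pos].clone();
    }
    Some(out)
}

/// True when `order` contains each index in 0..n exactly once.
fn is_permutation(order: &[usize], n: usize) -> bool {
    if order.len() != n {
        return false;
    }
    let mut seen = vec![false; n];
    for &i in order {
        if i >= n || seen[i] {
            return false;
        }
        seen[i] = true;
    }
    true
}
```

With `order = [2, 0, 1]`, the third logical block is stored first, and `restore_blocks` with the same map recovers the original sequence. The shell unpacker would have to replay the same map with `dd`, which is where the extra complexity comes from.
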
**Critical dependency chain:**
```
Format Spec (on paper)
  --> Rust Packer (implements spec)
  --> Kotlin Unpacker (reads spec)
  --> Shell Unpacker (reads spec)
  --> Round-trip tests (validates all three agree)
```
The format specification must be finalized BEFORE any implementation begins, because three independent implementations (Rust, Kotlin, shell) must produce identical results.
## MVP Recommendation
**Prioritize (Phase 1 -- must ship):**
1. **Format specification document** -- Define header, file table, block layout, magic bytes, version field, IV/salt placement
2. **Compression + Encryption pipeline** -- Compress with gzip, encrypt with AES-256-CBC, authenticate with HMAC-SHA256
3. **Rust CLI packer** -- Pack multiple files into the custom format
4. **Integrity verification via HMAC-SHA256** -- Encrypt-then-MAC for both integrity and authenticity
5. **Kotlin unpacker** -- Primary extraction path on Android 13. Pure JVM using javax.crypto
6. **Busybox shell unpacker** -- Fallback extraction. This constrains the format to be simple
7. **Round-trip tests** -- Verify that archives packed by the Rust CLI unpack to byte-identical output in both the Kotlin and shell unpackers
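
The round-trip check in the list above could be sketched as a directory comparison. The function name `roundtrip_mismatches` is hypothetical, and the sketch assumes a flat directory of files, in line with the single-level file list described earlier.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Compares every regular file in `original` with the same-named file in
/// `unpacked` and returns the sorted names that are missing or differ.
/// Flat directories only; extra files in `unpacked` are not reported.
pub fn roundtrip_mismatches(original: &Path, unpacked: &Path) -> io::Result<Vec<String>> {
    let mut bad = Vec::new();
    for entry in fs::read_dir(original)? {
        let entry = entry?;
        if !entry.file_type()?.is_file() {
            continue;
        }
        let name = entry.file_name();
        let expected = fs::read(entry.path())?;
        match fs::read(unpacked.join(&name)) {
            // Byte-identical content: nothing to report.
            Ok(actual) if actual == expected => {}
            // Missing, unreadable, or different content.
            _ => bad.push(name.to_string_lossy().into_owned()),
        }
    }
    bad.sort();
    Ok(bad)
}
```

Running it once against the Kotlin output directory and once against the shell output directory, and expecting an empty list both times, would cover the "all three agree" requirement.
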
**Defer (Phase 2 -- after MVP works):**
- **Fake headers / decoy data** -- Obfuscation layer; adds no functional value, purely anti-analysis
- **Shuffled blocks** -- Significant complexity, especially for busybox
- **Progress reporting** -- Nice UX but not blocking
- **Configurable compression** -- Start with one setting that works; optimize later
- **Dry-run / validation mode** -- Useful for debugging but not for initial delivery
- **Per-file derived keys** -- Defense-in-depth for later
**Key MVP constraint:** The busybox shell unpacker is the most constraining component. Every format decision must be validated against "can busybox dd/xxd/openssl do this?" If the answer is no, the feature must be deferred or redesigned.
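
One way to test a layout against that constraint is to check that every region can be located with integer arithmetic alone, since that is all `dd` offers. The sketch below assumes a hypothetical single-payload layout of magic (4 bytes), version (1), IV (16), ciphertext, and HMAC-SHA256 tag (32). The layout and all names are illustrative.

```rust
/// Byte ranges the shell unpacker would pass to `dd` as skip/count.
/// Each pair is (offset, length).
#[derive(Debug, PartialEq)]
pub struct Regions {
    pub iv: (usize, usize),
    pub ciphertext: (usize, usize),
    pub mac: (usize, usize),
}

// Hypothetical fixed layout: magic(4) | version(1) | IV(16) | ciphertext | HMAC-SHA256(32)
const HEADER_LEN: usize = 5;
const IV_LEN: usize = 16;
const MAC_LEN: usize = 32;
const AES_BLOCK: usize = 16;

/// Length of AES-CBC output with PKCS#7 padding: rounds up to the next
/// full block and adds a whole block when the input is already aligned.
pub fn cbc_padded_len(plain_len: usize) -> usize {
    (plain_len / AES_BLOCK + 1) * AES_BLOCK
}

/// Derives the three regions from the total archive size alone.
/// Returns None if the size cannot correspond to this layout.
pub fn regions(total_len: usize) -> Option<Regions> {
    let fixed = HEADER_LEN + IV_LEN + MAC_LEN;
    let ct_len = total_len.checked_sub(fixed)?;
    if ct_len == 0 || ct_len % AES_BLOCK != 0 {
        return None;
    }
    let ct_off = HEADER_LEN + IV_LEN;
    Some(Regions {
        iv: (HEADER_LEN, IV_LEN),
        ciphertext: (ct_off, ct_len),
        mac: (ct_off + ct_len, MAC_LEN),
    })
}
```

For a 100-byte compressed payload the ciphertext is 112 bytes, so the shell side would read it with `dd` using `bs=1 skip=21 count=112`. A layout whose offsets can only be found by parsing nested or variable structures would fail the same test.
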
## Sources
- Domain knowledge of archive format design (ZIP, tar, 7z format specifications)
- Domain knowledge of cryptographic best practices (NIST, libsodium documentation patterns)
- Domain knowledge of Android crypto APIs (javax.crypto) and the OpenSSL CLI
- Domain knowledge of busybox utility capabilities