scan binary

Detects cryptographic assets in compiled binaries, firmware images, and container archives without requiring source code. Useful for auditing third-party binaries, embedded firmware, shipped artifacts, or any environment where source is unavailable.

Usage

qtz-discovery-cli scan binary <path> [flags]

# Scan a single firmware image
qtz-discovery-cli scan binary ./firmware.bin

# Scan a Java archive and emit JSON findings
qtz-discovery-cli scan binary ./app.jar --format json

# Scan a directory of binaries, skip large debug files
qtz-discovery-cli scan binary /usr/lib --max-file-size 32 --exclude "*.debug"

# Scan with AI-driven deep analysis
qtz-discovery-cli scan binary ./firmware.bin --llm --llm-quality deep

# Dry-run AI analysis: estimate scope without running it or incurring cost
qtz-discovery-cli scan binary ./firmware.bin --llm --dry-run

# Write CycloneDX CBOM to file
qtz-discovery-cli scan binary ./release/ --format cyclonedx --output cbom.json

Supported Formats

The scanner auto-detects format by magic bytes and transparently decompresses nested archives.

| Format | Notes |
| --- | --- |
| ELF (Linux, Android, embedded) | Symbol table, dynamic imports, DT_NEEDED chains |
| PE / COFF (Windows, UEFI) | Import table, wide strings (UTF-16LE), .NET CLR metadata |
| Mach-O / Fat Mach-O (macOS, iOS) | Dylib load commands, universal binary slices |
| JAR / WAR / AAR / APK | ZIP-expanded; each entry scanned recursively |
| TAR, tar.gz, tar.xz, tar.bz2 | Auto-decompressed; entries scanned |
| DEB packages | data.tar.* extracted and scanned |
| RPM packages | CPIO payload extracted and scanned |
| Android Boot / Sparse Images | Kernel and ramdisk extracted |
| CPIO archives | Entries expanded and scanned |
| Broadcom TRX firmware | Kernel and rootfs partitions extracted |
| U-Boot uImage | Payload extracted |
| Apple XAR (.pkg) | Entries expanded and scanned |
| Windows Cabinet (.cab) | MSZIP/uncompressed extraction |
| gzip, bzip2, xz, zstd, LZ4 | Transparently decompressed; stacked compression supported |
| Java .class | Constant pool parsed for class/method names |
| Dalvik DEX (Android) | String pool analyzed |
| WASM | Structured name sections parsed |
| BEAM (Erlang/Elixir) | Atom table extracted |
| Python .pyc | Marshal stream analyzed |
| Lua bytecode | Prototype strings analyzed |
| Intel HEX (.hex, .ihex) | Decoded to raw bytes before analysis |
| Motorola SREC | Decoded to raw bytes before analysis |
| UPX-packed binaries | Detected; dynamic unpacking attempted (depth 0 only) |
| Raw / unknown firmware | Pattern matching against raw bytes |
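
To make the idea of magic-byte detection concrete, here is a minimal Python sketch. It is not the scanner's implementation: the function name `detect_format` and the signature table are illustrative assumptions, and only a few widely published signatures are included.

```python
# Illustrative signature table (an assumption, not the scanner's real list).
# Each entry maps a well-known leading byte sequence to a format label.
MAGIC_SIGNATURES = [
    (b"\x7fELF", "elf"),
    (b"MZ", "pe"),                     # DOS header that precedes PE images
    (b"\xcf\xfa\xed\xfe", "mach-o"),   # 64-bit Mach-O, little-endian
    (b"\xce\xfa\xed\xfe", "mach-o"),   # 32-bit Mach-O, little-endian
    (b"PK\x03\x04", "zip"),            # also JAR / WAR / AAR / APK
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"\x04\x22\x4d\x18", "lz4"),      # LZ4 frame format
    (b"\x00asm", "wasm"),
    (b"dex\n", "dex"),
]


def detect_format(data: bytes) -> str:
    """Return a format label based on the leading bytes, or "raw" if unknown."""
    for magic, name in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return name
    # Unknown input falls through to raw pattern matching.
    return "raw"
```

Some signatures are ambiguous on their own. For example, `CA FE BA BE` begins both Java `.class` files and fat Mach-O binaries, so a real detector has to inspect further fields; this sketch leaves that case out.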

Archives are unpacked recursively up to 8 layers deep (zip-bomb protection). Files exceeding --max-file-size are skipped with an informational advisory.
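
The depth limit can be pictured with a small Python sketch that peels stacked compression layers and refuses to go past a fixed depth. This covers only stacked gzip/bzip2/xz streams, not archive entries, and the helper name `peel_layers` and the choice to raise an error at the limit are assumptions; the section does not say what the scanner does when the limit is reached.

```python
import bz2
import gzip
import lzma

MAX_DEPTH = 8  # mirrors the documented 8-layer limit

# Leading magic bytes paired with the stdlib decompressor for that format.
DECOMPRESSORS = [
    (b"\x1f\x8b", gzip.decompress),
    (b"BZh", bz2.decompress),
    (b"\xfd7zXZ\x00", lzma.decompress),
]


def peel_layers(data: bytes, max_depth: int = MAX_DEPTH):
    """Strip stacked compression layers; return (payload, layers_removed)."""
    depth = 0
    while True:
        for magic, decompress in DECOMPRESSORS:
            if data.startswith(magic):
                break
        else:
            # No known compression magic: this is the innermost payload.
            return data, depth
        if depth == max_depth:
            # Hypothetical behavior: stop rather than expand indefinitely.
            raise ValueError("nesting exceeds %d layers" % max_depth)
        data = decompress(data)
        depth += 1
```

Capping depth bounds the work an attacker can force with a deliberately nested input, which is the purpose of the zip-bomb protection described above.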

Detection Passes

Three static analysis passes run in parallel on each binary:

| Pass | What it finds | Confidence |
| --- | --- | --- |
| STATIC: strings & constants | Printable string extraction matched against 149+ crypto patterns (library version strings, algorithm names, import paths, PEM headers). Byte-constant matching against known AES S-box, SHA IVs, DES S-box, and EC curve parameters. | high |
| STRUCT: symbol tables | Parses ELF symbol/dynamic tables, PE import tables, Mach-O LC_LOAD_DYLIB commands, .NET CLR metadata, and JVM constant pools. Direct function references produce the highest-confidence findings. | confirmed |
| DEPS: library chains | Traces PQC library dependency chains from ELF DT_NEEDED, Mach-O dylib loads, .NET imports, and JVM constant pools. Detects liboqs, ML-KEM, ML-DSA, and other post-quantum libraries. | confirmed |
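
Byte-constant matching, as used by the STATIC pass, can be sketched in a few lines of Python. The constants below are the published first row of the AES S-box and the SHA-256 initial hash values; the function names and the dictionary layout are illustrative assumptions, not the scanner's internals.

```python
# First 16 bytes of the AES S-box and the eight SHA-256 initial hash words.
AES_SBOX_PREFIX = bytes.fromhex("637c777bf26b6fc53001672bfed7ab76")
SHA256_IV_BE = bytes.fromhex(
    "6a09e667bb67ae853c6ef372a54ff53a510e527f9b05688c1f83d9ab5be0cd19"
)


def swap_words(data: bytes, width: int = 4) -> bytes:
    """Reverse the byte order inside each fixed-width word."""
    return b"".join(data[i:i + width][::-1] for i in range(0, len(data), width))


# Word-sized constants are stored byte-swapped on little-endian targets,
# so both layouts are searched.
KNOWN_CONSTANTS = {
    "AES S-box": AES_SBOX_PREFIX,
    "SHA-256 IV (big-endian words)": SHA256_IV_BE,
    "SHA-256 IV (little-endian words)": swap_words(SHA256_IV_BE),
}


def find_constants(blob: bytes) -> dict:
    """Map each constant found in blob to the offset of its first occurrence."""
    hits = {}
    for name, needle in KNOWN_CONSTANTS.items():
        offset = blob.find(needle)
        if offset != -1:
            hits[name] = offset
    return hits
```

A hit of this kind shows that the table is embedded in the binary, which is why it works even when symbols and strings have been stripped.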

Flags

Analysis control

| Flag | Default | Description |
| --- | --- | --- |
| --strings | true | Enable string-extraction pass |
| --symbols | true | Enable symbol-table pass (ELF / PE / Mach-O) |
| --min-string-len | 6 | Minimum printable-string run length for extraction |
| --max-file-size | 128 | Skip files larger than this many MiB |
| --exclude | (none) | Glob patterns to skip (e.g. *.debug,vendor/*) |
| --dynamic-timeout | 10s | Per-binary wall-clock timeout for dynamic unpacking (UPX, encrypted blobs) |
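
The effect of --min-string-len is easiest to see in a small Python sketch of printable-string extraction. The function names and the three regular expressions are hypothetical stand-ins; the scanner's actual pattern set is not listed in this section, and this sketch handles ASCII only (not the UTF-16LE wide strings found in PE files).

```python
import re

# Hypothetical stand-ins for the scanner's crypto patterns.
CRYPTO_PATTERNS = [
    re.compile(r"OpenSSL \d+\.\d+"),
    re.compile(r"-----BEGIN [A-Z ]+-----"),
    re.compile(r"AES[-_]?(?:128|192|256)"),
]


def extract_strings(blob: bytes, min_len: int = 6) -> list:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


def crypto_strings(blob: bytes, min_len: int = 6) -> list:
    """Keep only the extracted strings that match a crypto pattern."""
    return [
        s for s in extract_strings(blob, min_len)
        if any(p.search(s) for p in CRYPTO_PATTERNS)
    ]
```

Raising the minimum length reduces noise from short accidental runs of printable bytes, at the cost of missing short identifiers.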

AI analysis (requires portal connection)

| Flag | Default | Description |
| --- | --- | --- |
| --llm | false | Enable AI-driven semantic analysis (requires --server) |
| --llm-quality | auto | Analysis depth: auto, fast, deep, or chain |
| --scan-budget | (none) | Max USD for AI analysis (e.g. 2.50; 0 = unlimited) |
| --dry-run | false | Estimate AI analysis scope without executing (requires --llm) |
| --llm-max-strings | 150 | Max crypto-relevant strings sent to AI (0 = unlimited) |
| --llm-max-symbols | 500 | Max symbol/import names sent to AI (0 = unlimited) |
| --llm-max-api-calls | 100 | Max emulation API log entries sent to AI (0 = unlimited) |
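
When the scan is launched from a script, the flag dependencies above can be checked before the command runs. The Python helper below is a hypothetical wrapper (`build_scan_command` is not part of the CLI) that uses only flags listed in this section. It omits --server because the argument form of that flag is not shown here.

```python
VALID_QUALITIES = ("auto", "fast", "deep", "chain")


def build_scan_command(path, llm=False, quality="auto", budget=None, dry_run=False):
    """Build an argv list for `scan binary`, enforcing documented flag rules."""
    if dry_run and not llm:
        # Documented constraint: --dry-run requires --llm.
        raise ValueError("--dry-run requires --llm")
    if quality not in VALID_QUALITIES:
        raise ValueError("unknown --llm-quality value: %r" % (quality,))
    cmd = ["qtz-discovery-cli", "scan", "binary", path]
    if llm:
        cmd += ["--llm", "--llm-quality", quality]
        if budget is not None:
            # Budget is expressed in USD, e.g. 2.50.
            cmd += ["--scan-budget", f"{budget:.2f}"]
        if dry_run:
            cmd.append("--dry-run")
    return cmd
```

The resulting list can be passed to `subprocess.run`. Building a list rather than a shell string avoids quoting problems with unusual file names.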

Finding Confidence Levels

| Confidence | Source | What it means |
| --- | --- | --- |
| confirmed | STRUCT or DEPS pass | Direct function reference or dylib load; the binary references this algorithm or library explicitly |
| high | STATIC pass | Strong string match or known byte-constant pattern; very likely present |
| medium | STATIC pass (bytecode) | String match in interpreted bytecode (DEX, .class, BEAM) |
| low | Fallback | Weak signal; review manually |
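
The four levels form an ordering, which makes threshold filtering straightforward. The Python sketch below assumes each finding is a dictionary with a `confidence` key holding one of the level names. That shape and the `algorithm` key used in the example are hypothetical, since the JSON output schema is not described in this section.

```python
# Levels ordered from weakest to strongest, following the table above.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2, "confirmed": 3}


def at_least(findings, threshold):
    """Return the findings whose confidence is at or above threshold."""
    floor = CONFIDENCE_RANK[threshold]
    return [f for f in findings if CONFIDENCE_RANK[f["confidence"]] >= floor]


# Hypothetical findings, used only to show the filter.
example = [
    {"algorithm": "AES", "confidence": "high"},
    {"algorithm": "RSA", "confidence": "low"},
    {"algorithm": "ML-KEM", "confidence": "confirmed"},
]
strong = at_least(example, "high")  # keeps the AES and ML-KEM entries
```

Filtering at `high` is a reasonable default for automated reports, leaving `medium` and `low` findings for manual review.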

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success (findings may or may not be present) |
| 1 | Error (I/O failure, unreadable path, etc.) |

CI/CD Example

- name: Crypto scan — release binary
  run: |
    qtz-discovery-cli scan binary ./dist/my-service-linux-amd64 \
      --format sarif \
      --output binary-results.sarif

- name: Upload SARIF to GitHub Security
  if: always()
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: binary-results.sarif
    category: qtz-binary
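
Because exit code 0 is returned whether or not findings are present, a pipeline that should fail on findings has to inspect the output file. The Python sketch below counts results in a SARIF 2.1.0 document using only the standard `runs[].results[]` structure. The function name and the idea of failing on any result are assumptions, and how the scanner maps confidence to SARIF fields is not covered here.

```python
import json


def count_sarif_results(sarif_text: str) -> int:
    """Count result objects across all runs of a SARIF 2.1.0 document."""
    doc = json.loads(sarif_text)
    total = 0
    for run in doc.get("runs", []):
        # "results" may be absent or null for a run, so fall back to empty.
        total += len(run.get("results") or [])
    return total
```

A later pipeline step could read `binary-results.sarif`, call this function, and exit non-zero when the count is above an agreed threshold.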

Firmware Example

# Scan a compressed firmware image — archives are decompressed automatically
qtz-discovery-cli scan binary router-fw-v2.4.1.tar.gz --format json --output fw-findings.json

# View a summary
qtz-discovery-cli report summary fw-findings.json