This is the original IOCX engine for static IOC extraction and PE analysis. Any other repositories using the name "iocx" are not affiliated with this project.
- PyPI: https://pypi.org/project/iocx/
- GitHub: https://github.com/iocx-dev/iocx
- Website: https://iocx.dev/
Static IOC extraction from a PE file using the IOCX CLI
Fast, safe, deterministic IOC extraction for DFIR, SOC automation, and large-scale threat analysis.
IOCX is a lightweight, extensible engine for extracting Indicators of Compromise (IOCs) and structural metadata using pure static analysis. No execution. No sandboxing. No risk.
Built for:
- DFIR workflows
- SOC automation
- Threat-intel pipelines
- CI/CD security checks
- Large‑scale batch processing
IOCX is a core component of the MalX Labs ecosystem for scalable, modern threat‑analysis tooling.
IOCX is designed for environments where safety, determinism, and automation matter. Unlike extractors that operate only on raw text, IOCX includes:
- Binary‑aware static analysis
- A plugin-friendly rule system
- A stable JSON schema suitable for pipelines and long-term integrations
- Static‑only design — never executes untrusted code
- Binary parsing — PE-aware extraction with section analysis, entropy, and obfuscation hints
- Analysis level — basic, deep, and full for performance-tuned workflows
- Deterministic behaviour — stable output and predictable performance
- Extensible rule engine — custom detectors, parsers, and plugins
- Consistent JSON schema — clean integration with SIEM/SOAR
- Low dependency footprint — safe for enterprise environments
- Pipeline-ready — fast start‑up, fast throughput
To avoid confusion:
- Not a sandbox
- Not a malware emulator
- Not a behavioural analysis tool
- Not an enrichment engine (that lives in the MalX Cloud platform)
IOCX is static extraction only, by design.
- Extract indicators from emails, alerts, or analyst clipboard text
- Parse IOCs from reports into structured JSON
- Safely inspect malware samples without execution
- Normalize indicators from feeds
- Batch‑process unstructured text
- Build enrichment pipelines on top of deterministic output
- Scan binaries for embedded indicators before publishing
- Integrate IOC extraction into automated checks
- Detect accidental inclusion of URLs or addresses in builds
- Pipe logs or artifacts through IOCX
- Use the Python API for ETL or batch workflows
- Extend with custom detectors for internal patterns
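As a sketch of the CI/CD use case in the list above, the snippet below fails a build when a result contains any indicator outside an allowlist. It relies only on the documented JSON shape (an `iocs` object whose values are arrays). The function name `find_unexpected_iocs`, the allowlist, and the sample result are hypothetical and not part of IOCX.

```python
import json

# Hypothetical allowlist: indicators that are expected to ship in the build.
ALLOWED = {"https://example.com/docs"}


def find_unexpected_iocs(result: dict, allowed: set[str]) -> dict[str, list[str]]:
    """Return every non-allowlisted indicator, grouped by IOC category."""
    unexpected = {}
    for category, values in result.get("iocs", {}).items():
        leftover = [v for v in values if v not in allowed]
        if leftover:
            unexpected[category] = leftover
    return unexpected


# Stand-in for the JSON printed by the CLI for a scanned artifact.
raw = json.dumps({
    "file": "build/app.exe",
    "type": "PE",
    "iocs": {
        "urls": ["https://example.com/docs", "http://10.0.0.1/debug"],
        "domains": [],
        "ips": ["10.0.0.1"],
    },
    "metadata": {},
})

findings = find_unexpected_iocs(json.loads(raw), ALLOWED)
for category, values in findings.items():
    print(f"{category}: {values}")

# A CI job would exit non-zero when anything unexpected is found.
exit_code = 1 if findings else 0
```

In a real pipeline the `raw` string would come from the CLI's output rather than a literal, and the job would call `sys.exit(exit_code)`.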
- Deterministic heuristics: anti‑debug APIs, TLS anomalies, packer‑like behaviour, RWX sections, import anomalies.
- Adversarial testing: added three initial Layer 3 samples to validate rich heuristics, entropy analysis and string‑based IOC extraction.
- Contract testing: deterministic snapshots for sections, imports, heuristics, and IOCs.
- Bug fix: resolved a crash caused by non‑UTF8 Rich Header bytes by introducing deep hex‑encoding sanitisation.
- Docs: new deterministic‑output section and appendices for adversarial samples.
- Introduced a fully stable JSON schema across all analysis levels
- Added strict structural guarantees for `iocs`, `metadata`, and `analysis` blocks
- Normalised PE metadata fields for deterministic output (headers, TLS, optional header, signatures)
- Ensured all IOC categories always exist (empty arrays when no matches)
- Formalised analysis‑level behaviour:
  - core behaviour → no analysis block
  - basic → section layout + entropy
  - deep → adds obfuscation heuristics
  - full → adds extended metadata summaries
- Added snapshot‑contract tests to prevent schema drift across releases
- Improved PE parser consistency for imports, resources, and section metadata
- Strengthened safety guarantees for CI/CD and large‑scale automation pipelines
This release establishes the long‑term schema contract that downstream tools can rely on.
- New analysis‑level system: basic, deep (default), and full (future‑ready)
- PE structural analysis: section layout, raw/virtual sizes, entropy
- Obfuscation heuristics: abnormal section patterns, virtual‑only sections, entropy anomalies
- Extended analysis stub for future packer/TLS/anti‑debug modules
- Clean, stable JSON schema with optional analysis block
- No‑flag mode remains fast and minimal for pipeline use
- Introduced the plugin‑ready rule engine, enabling custom IOC detectors and parsers
- Unified internal detection flow under a consistent, extensible interface
- Added support for user‑defined regex detectors and lightweight parsing modules
- Improved separation between core engine, detectors, and output formatting
- Reduced coupling across modules to support long‑term extensibility
- Maintained the same fast, deterministic performance profile
- Ethereum & Bitcoin wallet detection
- Improved architecture for long-term extensibility
- Same blazing performance on multi-MB inputs
Significant improvements to IPv4/IPv6 extraction in noisy, malformed, mixed-content environments.
```console
$ iocx chaos_corpus.json
{
  "file": "examples/samples/structured/chaos_corpus.json",
  "type": "text",
  "iocs": {
    "urls": [
      "http://[2001:db8::1]:443"
    ],
    "domains": [],
    "ips": [
      "2001:db8::1",
      "2001:db8::1:443",
      "10.0.0.1",
      "192.168.1.10",
      "fe80::dead:beef%eth0",
      "1.2.3.4",
      "fe80::1%eth0",
      "192.168.1.110",
      "fe80::1%eth0fe80",
      "::2%eth1",
      "2001:db8::"
    ],
    "hashes": [],
    "emails": [],
    "filepaths": [],
    "base64": []
  },
  "metadata": {}
}
```
Chaos Corpus: Input → Extracted Output → Explanation
| Input | Extracted Output | Explanation |
|---|---|---|
| fe80::dead:beef%eth0/garbage | fe80::dead:beef%eth0 | Salvaged valid IPv6, junk ignored. |
| xxx192.168.1.10yyy | 192.168.1.10 | IPv4 inside junk text. |
| DROP:client=10.0.0.1;;;ERR | 10.0.0.1 | IPv4 from noisy log field. |
| [2001:db8::1]::::443 | 2001:db8::1, 2001:db8::1:443 | IPv6 and IPv6+port extracted. |
| GET http://[2001:db8::1]:443/index | http://[2001:db8::1]:443 | URL with IPv6 parsed correctly. |
| udp://[fe80::1%eth0]::::53 | fe80::1%eth0 | Scoped IPv6 extracted; scheme and trailing port junk ignored. |
| 192.168.1.110.0.0.1 | 192.168.1.110 | Combined IP segment salvaged. |
| fe80::1%eth0fe80::2%eth1 | fe80::1%eth0fe80, ::2%eth1 | Concatenated IPv6 split up. |
| 2001:db8::12001:db8::2 | 2001:db8:: | Longest valid IPv6 prefix found. |
| 256.256.256:256 | — | Invalid indicator ignored. |
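The IPv4 rows of the table can be approximated with a short standard-library sketch: find dotted-quad candidates with a regular expression, then keep only those that `ipaddress` accepts. This is an illustration of the salvage idea, not IOCX's actual detector, and it does not attempt the IPv6 cases. The names `CANDIDATE` and `salvage_ipv4` are hypothetical.

```python
import ipaddress
import re

# Four dot-separated runs of 1-3 digits, not preceded by another digit.
CANDIDATE = re.compile(r"(?<!\d)(?:\d{1,3}\.){3}\d{1,3}")


def salvage_ipv4(text: str) -> list[str]:
    """Return syntactically valid IPv4 addresses embedded in noisy text."""
    found = []
    for match in CANDIDATE.finditer(text):
        try:
            # ipaddress rejects out-of-range octets such as 256.
            ipaddress.IPv4Address(match.group())
        except ValueError:
            continue
        found.append(match.group())
    return found


for sample in (
    "xxx192.168.1.10yyy",
    "DROP:client=10.0.0.1;;;ERR",
    "192.168.1.110.0.0.1",
    "256.256.256:256",
):
    print(sample, "->", salvage_ipv4(sample))
```

For these four inputs the sketch happens to agree with the table: the greedy match on `192.168.1.110.0.0.1` consumes `192.168.1.110` and leaves too few octets for a second address.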
Performance Benchmarks (v0.2.0)
All measurements from the latest performance suite:
| Sample Type | Time |
|---|---|
| 1 MB mixed‑content sample | 0.0053s |
| Pathological IPv6 blob | 0.0055s |
| 100 KB sample | 0.0006s |
| 300 KB sample | 0.0017s |
| 600 KB sample | 0.0031s |
| 1 MB sample | 0.0055s |
- Throughput: ~200 MB/s
- Worst‑case IPv6 blob: ~5.5 ms
- Linear scaling: almost perfect from 100 KB → 1 MB
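The throughput and scaling figures follow from the v0.2.0 table by simple arithmetic, reproduced below. The sketch treats 1 MB as 1000 KB, which is an assumption; the benchmark suite's own convention is not stated here.

```python
# Timings copied from the v0.2.0 table above: sample size in KB -> seconds.
timings = {100: 0.0006, 300: 0.0017, 600: 0.0031, 1000: 0.0055}

# Throughput for the 1 MB mixed-content sample (1 MB / 0.0053 s).
throughput_mb_s = 1.0 / 0.0053
print(f"throughput: {throughput_mb_s:.0f} MB/s")

# Under linear scaling, the cost per KB stays roughly constant.
cost_us_per_kb = {size: seconds / size * 1e6 for size, seconds in timings.items()}
for size_kb, cost in cost_us_per_kb.items():
    print(f"{size_kb:>5} KB: {cost:.2f} us/KB")
```

The 1 MB sample works out to roughly 189 MB/s, and the per-KB cost stays between about 5 and 6 microseconds across the four sizes, which is what near-linear scaling looks like.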
Performance Benchmarks (v0.3.0)
All measurements from the latest performance suite:
| Sample Type | Time |
|---|---|
| IP | |
| 1 MB mixed‑content sample | 0.0070s |
| Pathological IPv6 blob | 0.0004s |
| 100 KB sample | 0.0008s |
| 300 KB sample | 0.0021s |
| 600 KB sample | 0.0038s |
| 1 MB sample | 0.0068s |
| Filepath | |
| 1 MB mixed‑content sample | 0.0040s |
| Pathological deep unix path | 0.0237s |
| 300 KB sample | 0.0011s |
| 600 KB sample | 0.0022s |
| 1000 KB sample | 0.0038s |
| 1500 KB sample | 0.0055s |
| Crypto | |
| 1 MB mixed‑content sample | 0.0021s |
| Pathological ETH-like blob | 0.0012s |
| 300 KB sample | 0.0006s |
| 600 KB sample | 0.0012s |
| 1000 KB sample | 0.0020s |
| 1500 KB sample | 0.0031s |
- Throughput: ~200 MB/s
- Worst‑case IPv6 blob: ~0.5 ms
- Worst‑case filepath blob: ~23 ms
- Worst‑case crypto blob: ~1 ms
- Linear scaling: almost perfect from 100 KB → 1 MB
IOCX is the name of the official static IOC extraction engine published on PyPI (https://pypi.org/project/iocx/) and GitHub (https://github.com/iocx-dev/iocx).
The IOCX name, branding, and project identity refer exclusively to this project and its associated packages, documentation, and releases.
To protect users from confusion and maintain a healthy ecosystem, third‑party projects should not:
- Use `iocx` as the name of their repository
- Publish tools named “iocx” that are not this project
- Present themselves as the creators or maintainers of IOCX
- Use the PyPI badge for the official `iocx` package
- Imply official affiliation or endorsement without permission
These actions mislead users and violate the identity of the project.
Third‑party tools, plugins, and integrations are welcome. To avoid confusion, they should follow this naming pattern:
- `iocx-<plugin-name>`
- `iocx-plugin-<feature>`
- `iocx-extension-<name>`
Examples:
- `iocx-osint-enricher`
- `iocx-detector-custom`
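A project could check a proposed name against this pattern mechanically. The sketch below is one possible check; the allowed character set (lowercase letters, digits, single hyphens) is an assumption, since the naming guidance only specifies the `iocx-` prefix.

```python
import re

# Assumed character set: lowercase letters, digits, and single hyphens.
THIRD_PARTY_NAME = re.compile(r"iocx-[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_third_party_name(name: str) -> bool:
    """True when the name carries the 'iocx-' prefix and is not bare 'iocx'."""
    return THIRD_PARTY_NAME.fullmatch(name) is not None


for name in ("iocx-osint-enricher", "iocx-detector-custom", "iocx", "my-iocx"):
    print(name, is_valid_third_party_name(name))
```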
IOCX is used in DFIR, SOC automation, CI/CD pipelines, and threat‑intel workflows. Clear naming ensures:
- Users know which tool is the official IOCX engine
- Third‑party tools are discoverable without causing confusion
- The ecosystem grows in a structured, healthy way
If you are building something that integrates with IOCX and want guidance on naming or attribution, feel free to open an issue.
- Core Engine: https://github.com/iocx-dev/iocx
- Plugins Meta‑Repo: https://github.com/iocx-dev/iocx-plugins
- Documentation: https://github.com/iocx-dev/iocx/tree/main/docs/specs
- PyPI Package: https://pypi.org/project/iocx/
- URLs
- Domains
- IPv4 / IPv6 addresses
- File paths (Windows, Linux, UNC, env-vars)
- Hashes (MD5 / SHA1 / SHA256 / SHA512 / Generic Hex)
- Email addresses
- Base64 strings
- Crypto wallets (Ethereum / Bitcoin)
- Windows PE files (`.exe`, `.dll`)
- Extracted strings from binaries
- Imports, sections, resources, metadata
- Analysis levels:
  - `basic`: section layout + entropy
  - `deep`: adds obfuscation heuristics
  - `full`: extended analysis stub (future-ready)
- Fast startup and throughput
- Optional caching for repeated scans
- Suitable for CI/CD and large batch workflows
- Clean, stable JSON output
- CLI + Python API
- Modular, extensible rule system
- Minimal dependency footprint
- Zero malware execution
- Safe for untrusted input
- Deterministic behaviour for pipelines
Static analysis ensures safety, determinism, and CI‑friendly operation. No sandboxing, no execution, and no risk of triggering malware behaviour.
IOCX v0.6.0 defines a stable, deterministic JSON schema designed for DFIR, SOC automation, and threat‑intel pipelines. The schema is intentionally simple, predictable, and safe for long‑term integrations.
The top‑level structure contains three blocks:
- `iocs`: extracted indicators
- `metadata`: structural information about the artifact
- `analysis`: optional deeper inspection depending on analysis level
This structure is identical across all input types, with PE‑specific fields populated only when applicable.
The iocs block always contains the same keys, regardless of analysis level:
- `urls`
- `domains`
- `ips`
- `hashes`
- `emails`
- `filepaths`
- `base64`
- `crypto.btc`
- `crypto.eth`
Each category is always an array. Empty categories are returned as empty arrays to ensure predictable downstream parsing.
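A downstream consumer can assert this contract before ingesting a result. The sketch below checks that all nine categories are present and are arrays. The helper name `check_iocs_block` is hypothetical, and the category keys are treated as flat strings (including `crypto.btc` and `crypto.eth`), as they appear in the example output.

```python
# The nine IOC categories named in the schema description.
IOC_CATEGORIES = (
    "urls", "domains", "ips", "hashes", "emails",
    "filepaths", "base64", "crypto.btc", "crypto.eth",
)


def check_iocs_block(result: dict) -> list[str]:
    """Return a list of contract violations (empty when the block conforms)."""
    iocs = result.get("iocs")
    if not isinstance(iocs, dict):
        return ["missing or non-object 'iocs' block"]
    problems = []
    for category in IOC_CATEGORIES:
        if category not in iocs:
            problems.append(f"missing category: {category}")
        elif not isinstance(iocs[category], list):
            problems.append(f"category is not an array: {category}")
    return problems


conforming = {"iocs": {category: [] for category in IOC_CATEGORIES}}
print(check_iocs_block(conforming))
```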
The metadata block contains structural information about the file. For PE files, this includes:
- Imports and import details
- Sections
- Resources and resource strings
- TLS directory
- Header and optional header
- Rich header
- Signatures
These fields are always present, even when empty. Metadata is independent of analysis level and is always returned in full.
The analysis block is the only part of the schema that changes based on the selected analysis level.
- basic — section layout + entropy
- deep — adds obfuscation heuristics
- full — adds extended metadata summaries
This tiered design allows users to trade off performance vs. depth without changing their downstream parsing logic.
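The exact entropy calculation IOCX uses is not described here. Section entropy in PE tooling is conventionally Shannon entropy over byte frequencies, which ranges from 0.0 to 8.0 bits per byte, so the sketch below assumes that definition. It is an illustration, not the engine's implementation.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; 0.0 for empty input."""
    if not data:
        return 0.0
    total = len(data)
    probabilities = (count / total for count in Counter(data).values())
    return sum(-p * math.log2(p) for p in probabilities)


print(shannon_entropy(b"\x00" * 512))      # constant padding
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

Under this definition constant padding scores 0.0 and uniformly distributed bytes score 8.0, which is why high values are commonly read as a hint of packed or encrypted content.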
IOCX v0.6.0 guarantees:
- Stable keys
- Stable types
- No volatile values in minimal modes
- Deterministic behaviour across runs and platforms
This makes IOCX safe for SIEM/SOAR ingestion, CI/CD pipelines, and large‑scale batch processing.
IOCX guarantees a stable JSON schema, not a guaranteed ordering of keys within objects. JSON objects are defined as unordered maps, so consumers should rely on field presence and structure rather than positional ordering. All fields, types, and structural relationships remain consistent across versions, even if internal key order changes.
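In practice this means comparing parsed structures rather than raw strings. The sketch below shows two ways to do that with the standard library; the two sample documents are made up for illustration. Note that array order is still significant in both comparisons.

```python
import json

# Two serialisations of the same result with different key order.
run_a = '{"file": "a.txt", "type": "text", "iocs": {"urls": [], "ips": ["1.2.3.4"]}}'
run_b = '{"iocs": {"ips": ["1.2.3.4"], "urls": []}, "type": "text", "file": "a.txt"}'

# Parsed objects compare equal regardless of key order.
same_structure = json.loads(run_a) == json.loads(run_b)


def canonical(text: str) -> str:
    """Re-serialise with sorted keys so equal documents yield equal strings."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


# Canonical strings are useful for hashing or snapshot files.
same_canonical = canonical(run_a) == canonical(run_b)
print(same_structure, same_canonical)
```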
Install:

```bash
pip install iocx
```

Extract IOCs from a file, from standard input, or from a log:

```bash
iocx suspicious.exe
echo "Visit http://bad.example.com" | iocx -
iocx alerts.log
```

Enable analysis:

```bash
iocx suspicious.exe -a
```

Or choose a specific level:

```bash
iocx suspicious.exe -a basic
iocx suspicious.exe -a deep
iocx suspicious.exe -a full
```

Python API:

```python
from iocx.engine import Engine

engine = Engine()
results = engine.extract("suspicious.exe")
print(results)
```

Example JSON output:
{
"file": "suspicious.exe",
"type": "PE",
"iocs": {
"urls": [],
"domains": [],
"ips": [],
"hashes": [],
"emails": [],
"filepaths": [
"C:\\Windows\\System32\\cmd.exe",
"D:\\Temp\\payload.bin",
"E:/Users/Bob/AppData/Roaming/evil.dll",
"F:\\Program Files\\SomeApp\\bin\\run.exe",
"C:\\Users\\Alice\\Desktop\\notes.txt",
"Z:\\Archive\\2024\\logs\\system.log",
"\\\\SERVER01\\share\\dropper.exe",
"\\\\192.168.1.44\\c$\\Windows\\Temp\\run.ps1",
"\\\\FILESRV\\public\\docs\\report.pdf",
"\\\\NAS01\\data\\backups\\2024\\config.json",
"/usr/bin/python3.11",
"/etc/passwd",
"/var/lib/docker/overlay2/abc123/config.v2.json",
"/tmp/x1/x2/x3/x4/x5/script.sh",
"/opt/tools/bin/runner",
"/home/alice/.config/evil.sh",
".\\payload.exe",
"..\\lib\\config.json",
"./run.sh",
"../bin/loader.so",
".\\scripts\\install.ps1",
"%APPDATA%\\Microsoft\\Windows\\Start Menu\\Programs\\Startup\\evil.lnk",
"%TEMP%\\payload.exe",
"%USERPROFILE%\\Downloads\\file.txt",
"$HOME/.config/evil.sh",
"$HOME/bin/run.sh",
"$TMPDIR/cache/tmp123.bin",
"C:\\Windows\\Temp\\payload.bin",
"/home/alice/.config/evil"
],
"base64": [],
"crypto.btc": [],
"crypto.eth": []
},
"metadata": {
"file_type": "PE",
"imports": [
"KERNEL32.dll",
"msvcrt.dll"
],
"sections": [
".text",
".data",
".rdata",
".pdata",
".xdata",
".bss",
".idata",
".CRT",
".tls",
".rsrc",
".reloc"
],
"resource_strings": [
"C:\\Windows\\System32\\cmd.exe",
"\\\\SERVER01\\share\\dropper.exe",
"/home/alice/.config/evil.sh@%APPDATA%\\Microsoft\\Windows\\Start Menu\\Programs\\Startup\\evil.lnk"
]
},
"analysis": {
"sections": [
{
"name": ".text",
"raw_size": 7168,
"virtual_size": 6712,
"characteristics": 1610612832,
"entropy": 5.790750971742716
},
{
"name": ".data",
"raw_size": 512,
"virtual_size": 464,
"characteristics": 3221225536,
"entropy": 2.094202310841767
},
{
"name": ".rdata",
"raw_size": 3584,
"virtual_size": 3408,
"characteristics": 1073741888,
"entropy": 4.545752258688727
},
{
"name": ".pdata",
"raw_size": 1024,
"virtual_size": 540,
"characteristics": 1073741888,
"entropy": 2.327719716055491
},
{
"name": ".xdata",
"raw_size": 512,
"virtual_size": 488,
"characteristics": 1073741888,
"entropy": 4.1370410751038245
},
{
"name": ".bss",
"raw_size": 0,
"virtual_size": 384,
"characteristics": 3221225600,
"entropy": 0.0
},
{
"name": ".idata",
"raw_size": 1536,
"virtual_size": 1472,
"characteristics": 3221225536,
"entropy": 3.7542599473501452
},
{
"name": ".CRT",
"raw_size": 512,
"virtual_size": 96,
"characteristics": 3221225536,
"entropy": 0.2718922950073886
},
{
"name": ".tls",
"raw_size": 512,
"virtual_size": 16,
"characteristics": 3221225536,
"entropy": 0.0
},
{
"name": ".rsrc",
"raw_size": 512,
"virtual_size": 416,
"characteristics": 1073741888,
"entropy": 2.6481096709923975
},
{
"name": ".reloc",
"raw_size": 512,
"virtual_size": 188,
"characteristics": 1107296320,
"entropy": 2.2248162937403557
}
],
"obfuscation": [
{
"value": "abnormal_section_layout_virtual_only",
"start": 0,
"end": 0,
"category": "obfuscation_hint",
"metadata": {
"section": ".bss",
"raw_size": 0,
"virtual_size": 384
}
}
]
}
}
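The `characteristics` integers in the output above are the raw PE section flags. The sketch below decodes the memory-permission bits (the `IMAGE_SCN_MEM_*` values from the PE format) and repeats the virtual-only check that the `obfuscation` entry reports for `.bss`. Whether IOCX's own heuristics use exactly these tests is an assumption, and the helper names are hypothetical.

```python
# Memory-permission bits from the PE section header (IMAGE_SCN_MEM_*).
MEM_EXECUTE = 0x20000000
MEM_READ = 0x40000000
MEM_WRITE = 0x80000000


def permissions(characteristics: int) -> str:
    """Render section permissions as an 'RWX'-style string."""
    return "".join(
        letter if characteristics & bit else "-"
        for letter, bit in (("R", MEM_READ), ("W", MEM_WRITE), ("X", MEM_EXECUTE))
    )


def is_virtual_only(section: dict) -> bool:
    """A section with no bytes on disk that still occupies memory."""
    return section["raw_size"] == 0 and section["virtual_size"] > 0


# Three entries copied from the example output above.
sections = [
    {"name": ".text", "raw_size": 7168, "virtual_size": 6712, "characteristics": 1610612832},
    {"name": ".data", "raw_size": 512, "virtual_size": 464, "characteristics": 3221225536},
    {"name": ".bss", "raw_size": 0, "virtual_size": 384, "characteristics": 3221225600},
]

for section in sections:
    flags = permissions(section["characteristics"])
    note = "virtual-only" if is_virtual_only(section) else ""
    print(section["name"], flags, note)
```

Here `.text` decodes to `R-X` and `.data` to `RW-`; a section that decoded to `RWX` would be the kind of anomaly an RWX heuristic looks for.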
```
iocx/
│
├── examples/        # Sample files + generators
├── docs/            # Detector contracts, overlap suppression rules, and plugin authoring guidelines
├── tests/           # Unit, integration, fuzz, robustness, and performance tests
└── iocx/
    ├── detectors/   # Regex-based IOC detectors
    ├── parsers/     # PE parsing, string extraction
    ├── plugins/     # Plugin API and registry
    ├── cli/         # Command-line interface
    └── analysis/    # PE static-analysis modules
```
The engine is intentionally modular so components can be extended or replaced easily.
See docs/specs/ for:
- Detector contracts
- Overlap suppression rules
- Plugin authoring guidelines
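The authoritative detector contract lives in those specs. Purely to illustrate the idea of a regex-based detector for an internal pattern, the standalone sketch below returns matches in the same shape as the `obfuscation` entries in the example output (`value`, `start`, `end`, `category`, `metadata`). It does not use the IOCX plugin API; the ticket pattern, the category name, and the function name are all hypothetical.

```python
import re

# Hypothetical internal pattern: incident ticket IDs such as INC-204518.
TICKET = re.compile(r"\bINC-\d{6}\b")


def detect_tickets(text: str) -> list[dict]:
    """Return one detection per match, with character offsets into the text."""
    return [
        {
            "value": match.group(),
            "start": match.start(),
            "end": match.end(),
            "category": "internal.ticket",  # hypothetical category name
            "metadata": {},
        }
        for match in TICKET.finditer(text)
    ]


for hit in detect_tickets("see INC-204518 and INC-12"):
    print(hit)
```

Registering such a detector with the engine would follow whatever interface the plugin authoring guidelines define.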
All test samples are:
- Synthetic
- Benign
- Publicly safe (EICAR, GTUBE)
- Designed to avoid accidental malware handling
We welcome:
- New IOC detectors
- Parser improvements
- Bug reports
- Documentation updates
- Synthetic test samples
See CONTRIBUTING.md for full guidelines.
If you discover a security issue, do not open a GitHub issue. Please follow the instructions in SECURITY.md.
Licensed under the MIT License. See LICENSE for details.
The IOCX name and project identity refer exclusively to the IOCX engine maintained under the iocx-dev organisation.
