MWDB ClamYara is a local MWDB plugin that automatically scans newly uploaded and re-uploaded files using ClamAV and YARA.
The plugin is designed to be simple, portable, and safe, relying exclusively on CLI tools and the official MWDB API.
- Automatic scanning on:
  - file creation
  - file reupload
- ClamAV scanning via `clamdscan` (recommended, fast; requires the `clamd` daemon)
- YARA scanning via the `yara` CLI
- Results published directly to MWDB:
  - human-readable comments
  - structured tags
- Safe temporary file handling (no disk leaks)
- Path traversal protection for temporary files
- Configurable via environment variables
- Robust input validation for all configuration values
- No Python bindings for ClamAV or YARA required
Note: Using `clamdscan` instead of `clamscan` avoids expensive per-scan signature DB loads.
In high-throughput environments (many uploads/reuploads), `clamdscan` is strongly preferred.
This plugin intentionally focuses only on ClamAV and YARA.
Reasons:
- Commercial antivirus engines usually prohibit this usage without expensive licenses
- ClamAV and YARA are widely accepted, open-source, and suitable for automation
- CLI usage ensures maximum portability across environments
The architecture strictly separates responsibilities:
| Component | Responsibility |
|---|---|
| `scanner.py` | Execute CLI scans and return results |
| `hook.py` | MWDB integration (comments, tags, lifecycle hooks) |
| `utils.py` | Temporary file handling and path validation |
| `config.py` | Environment-based configuration with validation |
| `__init__.py` | MWDB plugin bootstrap |
- Linux environment
- Python 3.10+
- MWDB server with plugin support
The following tools must be installed and available in PATH:
- ClamAV: `clamd` + `clamdscan`
- YARA: `yara`
No Python bindings (yara-python, clamd Python module) are required.
- Clone the repository into the MWDB plugins directory: `git clone https://github.com/dyussekeyev/mwdb-plugin-clamyara.git`
- Check out the desired version: `git checkout v0.0.3`
- Install Python dependencies: `pip install mwdblib`
- Ensure `clamdscan` and `yara` are installed and accessible: `clamdscan --version` and `yara --version`
All configuration is done via environment variables.
| Variable | Description |
|---|---|
| `CLAMYARA_MWDB_API_URL` | MWDB API base URL |
| `CLAMYARA_MWDB_API_KEY` | MWDB API key for authentication |
⚠️ If either required variable is missing, the plugin will refuse to start and log an error.
| Variable | Default | Description |
|---|---|---|
| `CLAMYARA_CLAMAV_ENABLED` | `true` | Enable ClamAV scanning |
| `CLAMYARA_CLAMD_SOCKET` | (empty) | Optional unix socket path for `clamdscan` (e.g. `/run/clamav/clamd.ctl`) |
| `CLAMYARA_CLAMAV_TIMEOUT` | `60` | ClamAV scan timeout in seconds |
| `CLAMYARA_YARA_ENABLED` | `true` | Enable YARA scanning |
| `CLAMYARA_YARA_RULES_PATH` | `/opt/yara/rules.yar` | Path to YARA rules file |
| `CLAMYARA_MAX_FILE_SIZE` | `52428800` | Maximum file size in bytes (50 MB). Must be a positive integer. |
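
The validation rules described in the tables above can be sketched roughly as follows. This is an illustrative sketch, not the plugin's actual `config.py`: the names `ClamYaraConfig`, `load_config`, `_positive_int`, and `_flag` are hypothetical, the accepted boolean spellings are an assumption, and treating the ClamAV timeout as a positive integer is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ClamYaraConfig:
    """Hypothetical container for the documented settings."""
    api_url: str
    api_key: str
    clamav_enabled: bool = True
    clamd_socket: Optional[str] = None
    clamav_timeout: int = 60
    yara_enabled: bool = True
    yara_rules_path: str = "/opt/yara/rules.yar"
    max_file_size: int = 52428800


def _positive_int(env: Mapping[str, str], name: str, default: int) -> int:
    """Read an integer variable and reject non-numeric or non-positive values."""
    raw = env.get(name, str(default))
    try:
        value = int(raw)
    except ValueError:
        raise RuntimeError(f"{name} must be a positive integer, got {raw!r}") from None
    if value <= 0:
        raise RuntimeError(f"{name} must be a positive integer, got {raw!r}")
    return value


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    # Assumption: "1", "true", and "yes" (case-insensitive) enable a feature.
    return env.get(name, str(default)).strip().lower() in {"1", "true", "yes"}


def load_config(env: Mapping[str, str] = os.environ) -> ClamYaraConfig:
    """Build the configuration, failing early when required values are missing."""
    required = ("CLAMYARA_MWDB_API_URL", "CLAMYARA_MWDB_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return ClamYaraConfig(
        api_url=env["CLAMYARA_MWDB_API_URL"],
        api_key=env["CLAMYARA_MWDB_API_KEY"],
        clamav_enabled=_flag(env, "CLAMYARA_CLAMAV_ENABLED", True),
        clamd_socket=env.get("CLAMYARA_CLAMD_SOCKET") or None,
        clamav_timeout=_positive_int(env, "CLAMYARA_CLAMAV_TIMEOUT", 60),
        yara_enabled=_flag(env, "CLAMYARA_YARA_ENABLED", True),
        yara_rules_path=env.get("CLAMYARA_YARA_RULES_PATH", "/opt/yara/rules.yar"),
        max_file_size=_positive_int(env, "CLAMYARA_MAX_FILE_SIZE", 52428800),
    )
```

Passing the environment as a mapping keeps the loader easy to test; in normal use it would simply be called as `load_config()`.
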
export CLAMYARA_MWDB_API_URL="https://mwdb.local/api"
export CLAMYARA_MWDB_API_KEY="your_api_key"
export CLAMYARA_YARA_RULES_PATH="/opt/yara/rules.yar"
export CLAMYARA_MAX_FILE_SIZE="52428800"
# Optional: force clamdscan to use a specific unix socket
export CLAMYARA_CLAMD_SOCKET="/run/clamav/clamd.ctl"

- MWDB triggers a plugin hook on a file event:
  - `on_created_file`
  - `on_reuploaded_file`
- The plugin resolves a shared MWDB API client (created once per process)
- File metadata is queried; size is checked before downloading content
- If the file exceeds `CLAMYARA_MAX_FILE_SIZE`, it is skipped with a warning
- File content is downloaded and written to a secure temporary file
- Scanning is performed:
  - ClamAV via `clamdscan` (timeout: `CLAMYARA_CLAMAV_TIMEOUT`, default 60 s)
  - YARA via `yara` (timeout: 30 s)
- Results are published to MWDB:
  - A comment with the scan summary
  - Tags for each detection
- The temporary file is always removed in a `finally` block
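
The size check, temporary file handling, and guaranteed cleanup in the steps above might look roughly like this. It is a minimal sketch under stated assumptions, not the plugin's real code: `process_file`, `scan_bytes`, and the `scanners` callables are hypothetical names, and the MWDB download is represented by a plain callable so the example stays self-contained.

```python
import os
import tempfile
from typing import Callable, List, Optional

# Hypothetical scanner signature: takes a file path, returns detection names.
Scanner = Callable[[str], List[str]]


def scan_bytes(content: bytes, scanners: List[Scanner]) -> List[str]:
    """Write content to a private temp file, run every scanner, always clean up."""
    fd, path = tempfile.mkstemp(prefix="clamyara_")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(content)
        detections: List[str] = []
        for scan in scanners:
            detections.extend(scan(path))
        return detections
    finally:
        # Runs on success and on any exception, so no temp file is left behind.
        if os.path.exists(path):
            os.unlink(path)


def process_file(
    size: int,
    download: Callable[[], bytes],
    scanners: List[Scanner],
    max_file_size: int = 52428800,
) -> Optional[List[str]]:
    """Skip oversized files before downloading; otherwise download and scan."""
    if size > max_file_size:
        return None  # skipped: content is never downloaded
    return scan_bytes(download(), scanners)
```

Checking `size` before calling `download` is what keeps an oversized sample from ever being loaded into memory.
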
Example comment:

ClamAV: Win.Trojan.Generic (Version: ClamAV 1.3.0)
YARA: APT_Loader, Packed_PE

Example tags:

clamav:win.trojan.generic
yara:apt_loader
yara:packed_pe

Duplicate tags are automatically avoided.
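
The mapping from detections to tags shown above can be illustrated with a short sketch. The function name `build_tags` and its parameters are hypothetical; the lowercase `clamav:` / `yara:` format follows the example tags, and the duplicate check is only an assumption about how duplicates might be avoided.

```python
from typing import Iterable, List, Optional


def build_tags(
    clamav_signature: Optional[str],
    yara_rules: Iterable[str],
    existing_tags: Iterable[str] = (),
) -> List[str]:
    """Return lowercase tags for new detections, skipping ones already present."""
    candidates: List[str] = []
    if clamav_signature:
        candidates.append(f"clamav:{clamav_signature.lower()}")
    candidates.extend(f"yara:{rule.lower()}" for rule in yara_rules)

    seen = {tag.lower() for tag in existing_tags}
    new_tags: List[str] = []
    for tag in candidates:
        if tag not in seen:
            seen.add(tag)
            new_tags.append(tag)
    return new_tags


print(build_tags("Win.Trojan.Generic", ["APT_Loader", "Packed_PE"]))
```

For the example detections above, this yields `clamav:win.trojan.generic`, `yara:apt_loader`, and `yara:packed_pe`, in that order.
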
This section is intentionally detailed, because `clamdscan` requires both:
- signature DB present/updated (`freshclam`), and
- `clamd` daemon running and reachable.
On Debian/Ubuntu:

sudo apt update
sudo apt install -y clamav clamav-daemon

On Fedora/RHEL, package names vary between distros, but typically:

sudo dnf install -y clamav clamav-update clamav-server clamav-server-systemd

Run a first update manually:

sudo systemctl stop clamav-freshclam 2>/dev/null || true
sudo freshclam

Enable periodic updates (if your distro provides a service unit):

sudo systemctl enable --now clamav-freshclam
sudo systemctl status clamav-freshclam --no-pager

Troubleshooting tips:
- If `freshclam` fails due to proxy/SSL/network, fix network access first.
- If you are rate-limited by upstream mirrors, retry later or configure a local mirror.
Start the daemon (unit name depends on distro):
sudo systemctl enable --now clamav-daemon 2>/dev/null || true
sudo systemctl enable --now clamd 2>/dev/null || true

Check status:

sudo systemctl status clamav-daemon --no-pager 2>/dev/null || true
sudo systemctl status clamd --no-pager 2>/dev/null || true

Verify the client can talk to the daemon:

clamdscan --version

If `clamdscan` cannot connect, the error is usually self-explanatory (socket missing / connection refused).
Depending on distro, typical socket paths include:
- `/run/clamav/clamd.ctl`
- `/var/run/clamav/clamd.ctl`

You can search for it:

sudo find /run /var/run -maxdepth 3 -type s -name "*clamd*" 2>/dev/null || true

Or inspect listening unix sockets:

sudo ss -xlpn | grep -i clamd || true

If you found the socket, you can pass it to the plugin:

export CLAMYARA_CLAMD_SOCKET="/run/clamav/clamd.ctl"

EICAR is a harmless standardized test string.
Create a file:
cat > /tmp/eicar.com <<'EOF'
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*
EOF

Scan:

clamdscan --no-summary /tmp/eicar.com
echo $?

Expected:
- output contains `FOUND`
- exit code is `1`
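
On the Python side, the `FOUND` line and exit code from the test above could be interpreted along these lines. This is a hedged sketch rather than the plugin's `scanner.py`: `parse_clamdscan` and the regex are hypothetical, and the sample output lines in the usage note are illustrative. It relies on the documented `clamdscan` exit codes (0 = clean, 1 = virus found, 2 = error).

```python
import re
from typing import Optional

# Greedy path match: the signature is the last whitespace-free token before
# " FOUND", so colons inside the path or the signature do not break parsing.
_FOUND_RE = re.compile(r"^(?P<path>.*): (?P<signature>\S+) FOUND$")


def parse_clamdscan(returncode: int, stdout: str) -> Optional[str]:
    """Return the detected signature name, or None when the file is clean."""
    if returncode == 0:
        return None
    if returncode == 1:
        for line in stdout.splitlines():
            match = _FOUND_RE.match(line.strip())
            if match:
                return match.group("signature")
        raise RuntimeError("clamdscan reported a detection but no FOUND line was parsed")
    # Any other exit code (for example 2) indicates a scan error.
    raise RuntimeError(f"clamdscan failed with exit code {returncode}")
```

For example, `parse_clamdscan(1, "/tmp/eicar.com: Win.Test.EICAR_HDB-1 FOUND")` returns the signature name, while exit code `0` returns `None` regardless of output.
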
Below is a list of widely used public rule sources (curate/pin to commits/tags for production use):
- Elastic protection artifacts (includes YARA): https://github.com/elastic/protections-artifacts
- YARA-Rules community repo: https://github.com/Yara-Rules/rules
- Florian Roth / Neo23x0 signature base (includes YARA rules): https://github.com/Neo23x0/signature-base
- Intezer public YARA rules: https://github.com/intezer/yara-rules
- CAPE Sandbox YARA rules: https://github.com/kevoreilly/CAPEv2/tree/master/data/yara
- Quick sanity test against a benign file:

yara /opt/yara/rules.yar /bin/ls >/dev/null
echo $?

- If you have a known sample that should match certain rules:

yara /opt/yara/rules.yar /path/to/sample

If the output prints rule names, those rule names will be used as MWDB tags:
`yara:<rule_name>`
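
A rough sketch of running the `yara` CLI and extracting rule names from its default output (one `rule_name file_path` pair per line) is shown below. The function names `run_yara` and `parse_yara_output` are hypothetical; the 30-second timeout and the rule that any non-zero return code is an error come from this document.

```python
import subprocess
from typing import List


def parse_yara_output(returncode: int, stdout: str, stderr: str = "") -> List[str]:
    """Extract unique rule names from yara CLI output, preserving order."""
    if returncode != 0:
        # A non-zero return code is always treated as an error.
        raise RuntimeError(f"yara failed with exit code {returncode}: {stderr.strip()}")
    rules: List[str] = []
    for line in stdout.splitlines():
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # skip empty lines instead of indexing into them
        if parts[0] not in rules:
            rules.append(parts[0])
    return rules


def run_yara(rules_path: str, sample_path: str, timeout: int = 30) -> List[str]:
    """Invoke yara with list arguments (no shell) and a strict timeout."""
    completed = subprocess.run(
        ["yara", rules_path, sample_path],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return parse_yara_output(completed.returncode, completed.stdout, completed.stderr)
```

Keeping the parser separate from the subprocess call makes the parsing logic testable on hosts where `yara` is not installed.
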
- Memory safety: file size is validated via metadata before content is downloaded, preventing OOM on oversized files
- Path traversal protection: all temporary file paths are validated to reside inside the system temp directory before being passed to CLI tools
- No shell injection: all subprocess calls use list arguments (never `shell=True`)
- CLI timeouts: ClamAV and YARA are executed with strict timeouts to prevent hangs
- Startup validation: missing or invalid configuration values raise errors at startup, not silently at runtime
- Guaranteed cleanup: temporary files are always removed via `try/finally`, preventing disk leaks
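
As an illustration of the path traversal check, one possible way to confirm that a path resolves inside the system temp directory is sketched below. The helper name `ensure_in_tempdir` is hypothetical, and the plugin's own `utils.py` may implement the check differently.

```python
import os
import tempfile


def ensure_in_tempdir(path: str) -> str:
    """Return the resolved path, or raise if it escapes the system temp directory."""
    temp_root = os.path.realpath(tempfile.gettempdir())
    # realpath collapses ".." segments and follows symlinks before comparing.
    resolved = os.path.realpath(path)
    if os.path.commonpath([temp_root, resolved]) != temp_root:
        raise ValueError(f"Path escapes the temp directory: {path!r}")
    return resolved
```

Comparing resolved paths with `os.path.commonpath` avoids the false positives of a plain string prefix check, where a directory such as `/tmpfoo` would wrongly appear to be inside `/tmp`.
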
- Only ClamAV and YARA are supported (by design)
- CLI tools must be installed on the MWDB host
- No parallel scanning (by design, to avoid resource exhaustion)
- YARA rules must be combined into a single rules file at `CLAMYARA_YARA_RULES_PATH`
- Performance: switched ClamAV scanning to `clamdscan` (requires `clamd` daemon) to avoid repeated signature DB loads on each scan
- Config: added `CLAMYARA_CLAMD_SOCKET` to optionally specify a `clamd` unix socket
- Config: added `CLAMYARA_CLAMAV_TIMEOUT` to configure ClamAV scan timeout
- Docs: added detailed ClamAV installation/initial DB download (`freshclam`) and EICAR detection test instructions
- Docs: added YARA rules sources and basic testing commands
- Security: file size is now checked via metadata before downloading content (prevents OOM)
- Security: added path traversal validation for all temporary file paths
- Fix: corrected `AttributeError` when comparing MWDB tag objects (`.tag` attribute used correctly)
- Fix: corrected YARA return code interpretation (`returncode != 0` is always an error)
- Fix: `MAX_FILE_SIZE` env var is now validated as a positive integer with a descriptive error on misconfiguration
- Fix: ClamAV signature parsing hardened with a regex to handle signatures containing colons
- Fix: empty lines in YARA output no longer cause `IndexError`
- Improvement: MWDB API client is now created once per process (lazy singleton), not on every file event
- Improvement: missing `CLAMYARA_MWDB_API_URL` / `CLAMYARA_MWDB_API_KEY` now produce an explicit `RuntimeError` at startup
- Improvement: network errors from `mwdb.query_file()` are now caught and logged gracefully
- Refactored architecture with strict separation of concerns
- Switched to CLI-only scanning (no Python bindings)
- Added guaranteed temporary file cleanup
- Added file size limit protection
- Improved error handling and logging
- Environment-based configuration
- Fully MWDB-native integration
- Initial implementation
- Basic ClamAV and YARA scanning
- Manual temporary directory handling
MIT License
Askar Dyussekeyev