Rspamd Features
Rspamd is a high-performance spam filtering system that combines traditional spam detection techniques with modern machine learning approaches. Each message is analyzed by multiple independent checks (called symbols) that contribute to a cumulative spam score. Based on this score and configurable thresholds, Rspamd recommends an action (reject, greylist, add header, or accept) to your mail server.
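The score-to-action step described above can be sketched in a few lines of Python. This is a minimal illustration, not Rspamd's implementation: the function names are hypothetical, and the `greylist` and `add header` thresholds are assumed values picked for the example (only `reject = 15` appears elsewhere on this page).

```python
from typing import Dict

# Assumed thresholds for illustration; real values come from your configuration.
THRESHOLDS: Dict[str, float] = {
    "reject": 15.0,
    "add header": 6.0,
    "greylist": 4.0,
}


def total_score(symbols: Dict[str, float]) -> float:
    """Sum the scores contributed by every symbol that fired."""
    return sum(symbols.values())


def recommend_action(score: float, thresholds: Dict[str, float] = THRESHOLDS) -> str:
    """Return the most severe action whose threshold the score reaches."""
    # Walk the actions from the highest threshold down to the lowest.
    ordered = sorted(thresholds.items(), key=lambda item: item[1], reverse=True)
    for action, limit in ordered:
        if score >= limit:
            return action
    return "accept"


if __name__ == "__main__":
    fired = {"R_SPF_FAIL": 1.0, "BAYES_SPAM": 3.5, "SUSPICIOUS_URL": 2.0}
    score = total_score(fired)
    print(score, recommend_action(score))
```

Checking thresholds from the highest down means a message that crosses several thresholds gets the most severe matching action.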
Core Architecture Features
Event-Driven Non-Blocking Architecture
Rspamd uses an event loop (libev since version 2.0, libevent in earlier releases) for asynchronous I/O operations, allowing a single worker process to handle thousands of concurrent connections without blocking.
Technical details:
- Non-blocking DNS lookups (hundreds of RBL/DKIM/SPF queries per message)
- Asynchronous Redis operations for statistics and learning
- Parallel HTTP requests to external services (antivirus, URL checkers)
- Async session tracking ensures no message is finalized until all checks complete
Performance impact:
- Single worker can process 100+ messages simultaneously
- Typical message scanning: 50-200ms (including all network operations)
- Memory footprint: ~50-100MB per worker process
See Architecture documentation for internal details.
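The idea of running many network-bound checks concurrently, and finalizing a message only once all of them have completed, can be illustrated with Python's `asyncio`. This is a conceptual sketch only: Rspamd is written in C, and the check names, delays, and scores below are made up for the example.

```python
import asyncio
from typing import Dict


async def fake_check(symbol: str, delay: float, score: float) -> Dict[str, float]:
    """Stand-in for a network-bound check such as a DNS or Redis lookup."""
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking it
    return {symbol: score}


async def scan_message() -> Dict[str, float]:
    """Run all checks concurrently and finalize only when every one is done."""
    checks = [
        fake_check("RBL_LOOKUP", 0.03, 2.0),
        fake_check("SPF_LOOKUP", 0.02, 1.0),
        fake_check("BAYES_LOOKUP", 0.05, 3.5),
    ]
    results = await asyncio.gather(*checks)  # completes when the slowest check does
    symbols: Dict[str, float] = {}
    for result in results:
        symbols.update(result)
    return symbols


if __name__ == "__main__":
    found = asyncio.run(scan_message())
    print(found, "total score:", sum(found.values()))
```

Because the checks overlap, the total wait is close to the slowest single check rather than the sum of all delays.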
Multi-Worker Process Model
Rspamd uses a master-worker architecture inspired by nginx:
- Main process: Configuration management, worker lifecycle, no message processing
- Normal workers: Message analysis and scoring
- Proxy workers: Protocol translation (milter, HTTP) and load balancing
- Controller worker: Web UI and management API
- Fuzzy storage worker: Distributed fuzzy hash storage
Benefits:
- Isolated processes improve stability (one worker crash doesn't affect others)
- Easy horizontal scaling (add more workers or servers)
- Zero-downtime configuration reloads (graceful worker restart)
- Per-worker resource limits prevent runaway processes
Modular Plugin System
Over 60 built-in modules can be enabled/disabled/configured independently:
# Example: Configure SPF module
# /etc/rspamd/local.d/spf.conf
external_relay = ["192.168.1.0/24"]; # Trusted relays: SPF is evaluated against the IP the relay received the message from
whitelist = ["203.0.113.0/24"]; # IP addresses/networks excluded from SPF checks
Module categories:
- Authentication: SPF, DKIM, DMARC, ARC
- Content analysis: Regex rules, MIME checks, language detection
- External services: Antivirus, URL redirector, GPT integration
- Statistics: Bayes classifier, neural networks, fuzzy hashing
- Policies: Rate limiting, greylisting, whitelisting, force actions
- Exporting: ClickHouse, Elastic, metadata exporter
See Modules documentation for complete list.
Flexible Configuration System
Rspamd uses UCL (Universal Configuration Language) - a JSON-compatible format with includes and macros:
# Base configuration
reject = 15;
# Include external file
.include(try=true) "/etc/rspamd/custom-thresholds.conf"
# Macros
.define MY_NETWORK "192.168.1.0/24"
whitelist_ip = "$MY_NETWORK";
Configuration layers:
- Default config: /etc/rspamd/rspamd.conf
- Module defaults: /etc/rspamd/modules.d/*.conf
- Local overrides: /etc/rspamd/local.d/*.conf (recommended)
- Force overrides: /etc/rspamd/override.d/*.conf (complete replacement)
See Configuration fundamentals for best practices.
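The difference between the last two layers can be modeled with plain dictionaries. The sketch below is a simplified reading of the layering described above, assuming that `local.d` files are merged key by key into the defaults while `override.d` files replace a section wholesale; the helper names are hypothetical, and real UCL merging has more rules than shown here.

```python
from typing import Any, Dict

Config = Dict[str, Any]


def merge_local(defaults: Config, local: Config) -> Config:
    """Merge local settings into defaults, recursing into nested sections."""
    merged = dict(defaults)
    for key, value in local.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_local(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_override(config: Config, override: Config) -> Config:
    """Replace each overridden top-level section completely."""
    result = dict(config)
    result.update(override)
    return result


if __name__ == "__main__":
    defaults = {"greylist": {"expire": 86400, "timeout": 300}}
    print(merge_local(defaults, {"greylist": {"timeout": 600}}))
    print(apply_override(defaults, {"greylist": {"timeout": 60}}))
```

In this model a local override changes only `timeout` and keeps `expire`, whereas a forced override discards every key it does not restate.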
Advanced Analysis Features
Email Authentication Standards
SPF (Sender Policy Framework):
- Validates sender IP against DNS records
- Supports includes, redirects, and complex policies
- Symbols: R_SPF_ALLOW, R_SPF_FAIL, R_SPF_SOFTFAIL, R_SPF_NEUTRAL, R_SPF_PERMFAIL
DKIM (DomainKeys Identified Mail):
- Verifies cryptographic signatures in email headers
- Supports multiple signatures per message
- Caches public keys in Redis for performance
- Symbols: R_DKIM_ALLOW, R_DKIM_REJECT, R_DKIM_TEMPFAIL, R_DKIM_PERMFAIL
DMARC (Domain-based Message Authentication):
- Combines SPF and DKIM results with domain policy
- Supports aggregate and forensic reporting
- Policy enforcement: none, quarantine, reject
- Symbols: DMARC_POLICY_ALLOW, DMARC_POLICY_REJECT, DMARC_POLICY_QUARANTINE, DMARC_POLICY_SOFTFAIL
ARC (Authenticated Received Chain):
- Preserves authentication results across forwarding
- Validates authentication chain integrity
- Essential for mailing lists and forwarders
- Symbols: ARC_ALLOW, ARC_REJECT, ARC_INVALID
See SPF module, DKIM module, DMARC module, ARC module.
Statistical Learning
Bayesian Classification:
- Token-based statistical analysis (words, patterns, metadata)
- Redis backend with automatic token expiration
- Per-user and per-language training support
- Autolearn mode: automatically train on high-confidence spam/ham
# /etc/rspamd/local.d/classifier-bayes.conf
backend = "redis";
new_schema = true;
expire = 8640000; # 100 days
autolearn {
  spam_threshold = 12.0;
  ham_threshold = -2.0;
  check_balance = true;
}
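The autolearn decision implied by these settings can be sketched as follows. This is an illustration under stated assumptions rather than Rspamd's code: the function name is hypothetical, and the balance check is modeled with an assumed `max_ratio` parameter that skips learning when one class already outnumbers the other by more than that factor.

```python
from typing import Optional


def autolearn_verdict(
    score: float,
    spam_learns: int,
    ham_learns: int,
    spam_threshold: float = 12.0,
    ham_threshold: float = -2.0,
    max_ratio: float = 2.0,  # assumed balance limit, not a documented default
) -> Optional[str]:
    """Return "spam", "ham", or None when the message should not be learned."""
    if score >= spam_threshold:
        verdict = "spam"
    elif score <= ham_threshold:
        verdict = "ham"
    else:
        return None  # not confident enough either way

    # Balance check: avoid skewing the classifier toward one class.
    if verdict == "spam":
        mine, other = spam_learns, ham_learns
    else:
        mine, other = ham_learns, spam_learns
    if other > 0 and mine / other > max_ratio:
        return None
    return verdict


if __name__ == "__main__":
    print(autolearn_verdict(13.0, spam_learns=100, ham_learns=100))
    print(autolearn_verdict(5.0, spam_learns=100, ham_learns=100))
```

Only messages with scores at or beyond either threshold are learned, which keeps borderline mail out of the training data.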
Neural Networks:
- Multi-layer perceptron with rule outputs as inputs
- Automatically learns optimal symbol weight combinations
- Separate networks for short/medium/long messages
- Requires Redis for weight storage
# /etc/rspamd/local.d/neural.conf
rules {
  "NEURAL_SPAM" {
    train {
      max_trains = 10000; # Number of training samples collected per training run
      max_usages = 100; # Number of learn iterations while the network data stays valid
      spam_score = 8.0; # Learn as spam if score >= 8
      ham_score = -2.0; # Learn as ham if score <= -2
    }
  }
}
Fuzzy Hashing:
- Perceptual hashing identifies similar messages
- Resistant to minor content modifications
- Distributed storage with replication
- Encrypted communication between nodes
Use cases:
- Newsletter detection (shared fuzzy hash across servers)
- Spam campaign identification (similar messages)
- Hash sharing with trusted partners
See Statistic configuration, Neural module, Fuzzy check module.
Content Analysis
Regular Expression Rules:
- PCRE-based regex engine with JIT compilation (Hyperscan acceleration on x86_64)
- Multi-expression matching in single pass
- Header, body, URL, and raw content matching
Example custom rule:
-- /etc/rspamd/local.d/custom_rules.lua
rspamd_config.SUSPICIOUS_ATTACHMENT = {
  callback = function(task)
    local parts = task:get_parts()
    for _, part in ipairs(parts) do
      local ext = part:get_extension()
      if ext and (ext == "exe" or ext == "scr" or ext == "bat") then
        return true, 1.0, ext -- Return true, weight 1.0, attachment extension
      end
    end
    return false
  end,
  score = 5.0,
  group = "malware",
  description = "Suspicious executable attachment"
}
MIME Structure Analysis:
- Malformed MIME detection
- Charset validation and conversion
- Attachment type checking
- HTML/text ratio analysis
- Embedded image analysis
Language and Charset Detection:
- Automatic language identification (60+ languages)
- Mixed charset detection (common in spam)
- UTF-8 validation
- CJK (Chinese, Japanese, Korean) support
URL Processing:
- Extracts URLs from HTML, text, and headers
- SURBL/URIBL lookups (real-time URL blacklists)
- URL redirector resolution (follows shortened URLs)
- Phishing detection (lookalike domains)
- TLD validation
See Regexp module, SURBL module, Phishing module.
Reputation and Blacklists
RBL (Real-time Blackhole Lists):
- Parallel DNS queries to multiple RBLs (50+ preconfigured)
- IP reputation: sender IP, email server IPs from headers
- Automatic retry logic and caching
- Configurable weights per RBL
Commonly used RBLs:
- Spamhaus (ZEN, DBL, PBL)
- SORBS
- SpamCop
- Barracuda
- URIBL (URL-based)
ASN and Country Detection:
- GeoIP2/MaxMind database integration
- ASN-based reputation scoring
- Country-specific rules
IP Score Module:
- Tracks IP reputation based on historical behavior
- Learns from user actions (spam/ham classification)
- Exponential decay for old data
- Whitelist trusted IPs automatically
See RBL module, ASN module, IP Score module.
Anti-Abuse Mechanisms
Greylisting:
- Temporary rejection of unknown sender/recipient pairs
- Legitimate MTAs retry within minutes; spambots don't
- Redis-backed triplet storage (IP, sender, recipient)
- Configurable delay and expiration
# /etc/rspamd/local.d/greylist.conf
expire = 86400; # 24 hours
timeout = 300; # 5 minutes delay
whitelist_ip = []; # IPs to skip greylisting
whitelist_rcpt = []; # Recipients to skip greylisting
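The triplet logic behind these two settings can be sketched in Python. This is a minimal in-memory model, assuming a plain dictionary in place of Redis; the class and method names are hypothetical, and `timeout` and `expire` mirror the values in the configuration above.

```python
import time
from typing import Dict, Optional, Tuple

Triplet = Tuple[str, str, str]  # (sender IP, envelope sender, recipient)


class Greylist:
    def __init__(self, timeout: int = 300, expire: int = 86400) -> None:
        self.timeout = timeout  # minimum delay before a retry is accepted
        self.expire = expire    # how long a triplet record stays valid
        self._first_seen: Dict[Triplet, float] = {}

    def check(self, ip: str, sender: str, recipient: str,
              now: Optional[float] = None) -> str:
        """Return "defer" for a temporary rejection or "accept" for a valid retry."""
        now = time.time() if now is None else now
        key = (ip, sender.lower(), recipient.lower())
        first = self._first_seen.get(key)
        if first is None or now - first > self.expire:
            self._first_seen[key] = now  # unknown or expired: start the clock
            return "defer"
        if now - first < self.timeout:
            return "defer"  # retried too quickly
        return "accept"


if __name__ == "__main__":
    greylist = Greylist()
    args = ("192.0.2.1", "alice@example.org", "bob@example.com")
    print(greylist.check(*args, now=0))
    print(greylist.check(*args, now=600))
```

Passing `now` explicitly keeps the example deterministic; a real deployment would rely on the store's own key expiry instead of comparing timestamps by hand.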
Rate Limiting:
- Limits messages per time period by IP, sender, recipient, or custom selector
- Bucket-based rate limiting (token bucket algorithm)
- Multiple limit tiers (soft limits, hard limits)
- Redis-backed counters
# /etc/rspamd/local.d/ratelimit.conf
rates {
  # Per-recipient limit: 100 messages per hour to any one recipient
  to = {
    bucket = {
      burst = 120;
      rate = "100 / 1h";
    }
  }
  # Per (recipient, sender IP, envelope sender) triplet: 1000 messages per hour
  to_ip_from = {
    bucket = {
      burst = 1100;
      rate = "1000 / 1h";
    }
  }
}
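How `burst` and `rate` interact can be seen in a small token bucket model. This sketch is a generic token bucket, assuming one token per message and an in-memory counter instead of Redis; the class name and its parameters are hypothetical and do not reproduce the module's exact accounting.

```python
class TokenBucket:
    def __init__(self, burst: float, rate_per_hour: float) -> None:
        self.capacity = burst                     # maximum tokens (the burst size)
        self.refill_per_sec = rate_per_hour / 3600.0
        self.tokens = burst                       # start with a full bucket
        self.updated = 0.0                        # timestamp of the last update

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(burst=120, rate_per_hour=100)
    accepted = sum(bucket.allow(now=0.0) for _ in range(130))
    print("accepted in the initial burst:", accepted)
    print("allowed one second later:", bucket.allow(now=1.0))
```

The burst size caps how many messages can pass back to back, while the rate determines how quickly capacity returns afterwards.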
Spamtrap Detection:
- Mark certain addresses as spamtraps
- Auto-learn as spam any message to spamtraps
- Feed spamtraps to Bayesian classifier
- Block sender IPs sending to spamtraps
See Greylisting module, Ratelimit module, Spamtrap module.
Integration and Management
Protocol Support
HTTP/JSON API:
- Native protocol for message scanning
- RESTful endpoints for management
- WebSocket support for real-time updates
- HTTPCrypt encryption for inter-server communication
Example API request:
curl -X POST http://localhost:11333/checkv2 \
  -H "Content-Type: message/rfc822" \
  --data-binary @message.eml
Response:
{
  "action": "add header",
  "score": 8.5,
  "required_score": 15.0,
  "symbols": {
    "R_SPF_FAIL": {"score": 1.0},
    "BAYES_SPAM": {"score": 3.5, "options": ["0.95"]},
    "SUSPICIOUS_URL": {"score": 2.0}
  },
  "messages": [],
  "message-id": "msg-12345"
}
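A client usually needs only a few fields from this reply. The sketch below parses the sample response shown above with Python's standard `json` module; the `summarize` helper and its output layout are hypothetical, and the embedded JSON is copied from the example rather than fetched from a live server.

```python
import json
from typing import Any, Dict

SAMPLE_RESPONSE = """
{
  "action": "add header",
  "score": 8.5,
  "required_score": 15.0,
  "symbols": {
    "R_SPF_FAIL": {"score": 1.0},
    "BAYES_SPAM": {"score": 3.5, "options": ["0.95"]},
    "SUSPICIOUS_URL": {"score": 2.0}
  },
  "messages": [],
  "message-id": "msg-12345"
}
"""


def summarize(raw: str) -> Dict[str, Any]:
    """Extract the action, scores, and symbols ordered by contribution."""
    reply = json.loads(raw)
    symbols = reply.get("symbols", {})
    ranked = sorted(
        ((name, info.get("score", 0.0)) for name, info in symbols.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return {
        "action": reply["action"],
        "score": reply["score"],
        "over_reject_threshold": reply["score"] >= reply["required_score"],
        "symbols_by_score": ranked,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_RESPONSE))
```

Sorting symbols by score makes it easy to log or display which checks contributed most to the verdict.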
Milter Protocol:
- Compatible with Postfix, Sendmail, and other milter-capable MTAs
- Protocol translation via Proxy worker
- Support for all milter actions (reject, tempfail, add/remove headers, modify body)
- Multiplexing multiple messages over single connection
Exim Protocol:
- Native integration via Exim's spam scanner interface
- Support for Exim ACLs
- Per-recipient scanning
See Protocol documentation, Integration guide.
Web Interface
Modern single-page application for monitoring and management:
Features:
- Real-time message history with detailed symbol breakdown
- Live statistics and graphs (messages/sec, actions distribution)
- Bayesian training (learn spam/ham from web UI)
- Fuzzy hash management (add/delete hashes)
- Configuration validation
- Symbol and rule management
- Server cluster monitoring (multiple Rspamd instances)
Access control:
- Password-protected (bcrypt hashing)
- Separate read-only and enable passwords
- IP-based access restrictions
- Optional HTTPS with client certificates
API endpoints:
- /stat - Server statistics
- /graph - Historical data (requires ClickHouse or Redis)
- /history - Recent messages
- /errors - Error log
- /learnspam, /learnham - Training endpoints
- /saveactions - Modify action thresholds
See Controller worker documentation.
Monitoring and Observability
Built-in Metrics:
- Message processing statistics (total, per action, per symbol)
- Performance metrics (scan time, DNS time, cache hit rate)
- Bayesian learning statistics (spam/ham ratio, token count)
- Connection statistics (active connections, total processed)
Prometheus Integration:
- /metrics endpoint in Prometheus format
- Metric exporter module for custom metrics
- Pre-built Grafana dashboards available
Logging:
- Structured JSON logging
- Syslog support
- Per-module log levels
- Request ID tracking for debugging
Health Checks:
- /ping - Liveness check (is Rspamd responding?)
- /stat - Readiness check (is Rspamd ready to process?)
- Systemd watchdog support
Example Prometheus query:
# Message processing rate
rate(rspamd_scanned_total[5m])
# Spam detection rate
rate(rspamd_spam_total[5m]) / rate(rspamd_scanned_total[5m])
Deployment and Scalability
High Availability
Load Balancing:
- Proxy worker can forward to multiple Normal workers
- Round-robin, hash-based, or least-connection algorithms
- Automatic failover on worker failure
- Health checks for backend workers
Redis High Availability:
- Redis Sentinel support for automatic failover
- Redis Cluster support for sharding
- Consistent hashing for multi-Redis setups
- Connection pooling and retry logic
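Consistent hashing keeps most keys on the same server when the server list changes. The sketch below is a generic hash ring, not Rspamd's upstream selection code: the `HashRing` class, the SHA-256 based hash, and the server names are assumptions made for illustration.

```python
import bisect
import hashlib
from typing import List, Tuple


class HashRing:
    def __init__(self, nodes: List[str], replicas: int = 100) -> None:
        # Each node is placed on the ring many times to even out the distribution.
        ring: List[Tuple[int, str]] = []
        for node in nodes:
            for index in range(replicas):
                ring.append((self._hash(f"{node}#{index}"), node))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        """Pick the first ring point clockwise from the key's hash."""
        index = bisect.bisect_right(self._points, self._hash(key)) % len(self._ring)
        return self._ring[index][1]


if __name__ == "__main__":
    three = HashRing(["redis1", "redis2", "redis3"])
    two = HashRing(["redis1", "redis2"])
    keys = [f"bayes:{i}" for i in range(1000)]
    moved = sum(three.node_for(k) != two.node_for(k) for k in keys)
    print("keys remapped after removing redis3:", moved)
```

Removing a server remaps only the keys that lived on it; keys on the remaining servers stay where they were, which preserves cached and learned data.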
Fuzzy Storage Replication:
- Master-slave replication for fuzzy hashes
- Mirroring mode: write to multiple storage nodes
- Encrypted replication channels
Configuration Synchronization:
- Centralized configuration management (version control)
- Configuration templating (Ansible, Puppet, Chef)
- Dynamic configuration updates via controller API
Horizontal Scaling
Typical deployment patterns:
- Single server (< 100K messages/day):
  MTA → Rspamd (proxy + normal + controller) → Redis
- Load-balanced (100K - 1M messages/day):
                     ┌→ Rspamd Worker 1 ┐
  MTA → Rspamd Proxy →→ Rspamd Worker 2 → Redis
                     └→ Rspamd Worker 3 ┘
- Distributed (> 1M messages/day):
  MTA Cluster → Hardware LB → Rspamd Proxy Cluster → Rspamd Worker Cluster → Redis Cluster
Performance expectations:
- Single worker: 5-10 messages/sec (roughly 430K-860K messages/day at sustained load)
- 4-worker server: 20-40 messages/sec (roughly 1.7M-3.5M messages/day at sustained load)
- DNS resolver speed is often the bottleneck (use local recursive resolver)
See Architecture deployment patterns.
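Converting a sustained per-second rate into a daily volume is a single multiplication by 86,400 seconds, which is handy for sanity-checking capacity figures like the ones above. The helper name below is hypothetical, and the calculation assumes the rate holds around the clock, which real mail traffic rarely does.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def messages_per_day(messages_per_second: float) -> int:
    """Daily volume for a rate sustained over a full day."""
    return round(messages_per_second * SECONDS_PER_DAY)


if __name__ == "__main__":
    for rate in (5, 10, 20, 40):
        print(f"{rate} msg/sec -> {messages_per_day(rate):,} msg/day")
```

Since traffic usually peaks during business hours, it is safer to size for the peak rate than for the daily average.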
Security
HTTPCrypt Protocol:
- Lightweight encryption for inter-server communication
- X25519 key exchange + XChaCha20-Poly1305 encryption
- Forward secrecy with ephemeral keys
- No certificate management (public key cryptography)
# Enable encryption between proxy and workers
# /etc/rspamd/local.d/worker-proxy.inc
upstream "backend" {
  hosts = "backend1.example.com:11333";
  encryption = {
    type = "httpcrypt";
    pubkey = "your-public-key-here";
  };
}
Secure Defaults:
- Web interface bound to localhost by default
- No open ports on public interfaces
- Minimal attack surface (no direct MTA protocol handling)
- Regular security updates
See Encryption documentation for cryptographic details.
Resource Efficiency
Memory Management:
- Custom memory pools for per-message allocations
- Bulk deallocation when message processing completes
- Typical memory usage: 50-100MB per worker
- No memory leaks in core (valgrind-tested)
CPU Optimization:
- Zero-copy message handling where possible
- Regex engine optimization (Hyperscan on x86_64)
- LuaJIT for fast rule execution
- SIMD operations for fuzzy hashing
Disk I/O:
- Minimal disk writes (only logs)
- All working data in Redis (in-memory)
- Optional persistent history (SQLite or ClickHouse)
Network Optimization:
- Connection pooling for Redis, HTTP, and DNS
- Parallel DNS queries (configurable socket count)
- Keep-alive connections where supported
- Request pipelining for batch operations
Extending Rspamd
Lua Plugin Development
Write custom plugins in Lua with full access to Rspamd internals:
-- /etc/rspamd/plugins.d/sender_reputation.lua
local lua_redis = require "lua_redis"
local rspamd_logger = require "rspamd_logger"

-- Callback for checking sender reputation
local function check_sender_reputation(task)
  local from = task:get_from('smtp')
  if not from or not from[1] then
    return false
  end
  local sender = from[1].addr:lower()

  -- Async Redis callback
  local function redis_cb(err, data)
    if err then
      rspamd_logger.warnx(task, 'Redis error: %s', err)
      return
    end
    if data then
      local score = tonumber(data)
      if score and score > 10 then
        -- Insert result symbol with score
        task:insert_result('SENDER_BAD_REPUTATION', 1.0, string.format('score=%s', score))
      end
    end
  end

  -- Make async Redis request
  local redis_params = lua_redis.parse_redis_server('reputation')
  if redis_params then
    local ret = lua_redis.redis_make_request(task,
      redis_params,
      sender,
      false, -- is write
      redis_cb,
      'GET',
      {'sender_rep:' .. sender}
    )
    if not ret then
      rspamd_logger.warnx(task, 'Cannot make redis request')
    end
  end
  return false -- Do not insert symbol here; will be inserted in callback
end

-- Register callback symbol (type 'normal': runs the check, carries no score)
rspamd_config:register_symbol({
  name = 'SENDER_REPUTATION_CHECK',
  type = 'normal',
  callback = check_sender_reputation,
  flags = 'fine', -- Check even if the message already exceeds the reject threshold
  priority = 5
})

-- Register result symbol (type 'virtual': this gets the score)
rspamd_config:register_symbol({
  name = 'SENDER_BAD_REPUTATION',
  type = 'virtual',
  parent = 'SENDER_REPUTATION_CHECK',
  score = 5.0,
  group = 'reputation',
  description = 'Sender has bad reputation in our database'
})
Key concepts for async operations:
- The check symbol (type 'normal') carries the callback and has no score of its own
- Async operations (Redis, DNS, HTTP) use callbacks to insert results
- Result symbols (type 'virtual') use parent to link to the check symbol
- Main callback returns false (the result is inserted asynchronously via task:insert_result())
Lua API features:
- Full message access (headers, body, attachments, MIME structure)
- Async operations (Redis, HTTP, DNS)
- Task manipulation (insert symbols, add/modify headers)
- Configuration access
- Logging and debugging
See Writing rules and Lua API documentation.
External Service Integration
Antivirus Scanning:
- ClamAV, Sophos, F-Prot, Kaspersky, ESET support
- Parallel scanning with multiple engines
- Result caching to avoid re-scanning
- Timeout and retry handling
URL Filtering:
- Google Safe Browsing API
- OpenPhish
- Custom URL checkers via HTTP
AI/ML Services:
- GPT integration for content analysis
- Custom HTTP-based classifiers
- Verdict aggregation with existing rules
Data Export:
- ClickHouse for long-term analytics
- Elasticsearch for log aggregation
- Custom webhooks for message events
See External services module, Antivirus module, ClickHouse module.
Migration from Other Systems
SpamAssassin Compatibility
Migration approach:
- Rspamd can work alongside SpamAssassin during transition
- Compatible scoring system and similar rule concepts
- SpamAssassin module can import scores from SA configuration
Differences from SpamAssassin:
- Much faster (10-100x depending on ruleset)
- Event-driven architecture vs process-per-message
- Better handling of modern spam techniques (DMARC, ARC, neural networks)
- Statistical learning requires retraining (Bayes databases are not compatible)
- Different plugin architecture (Lua vs Perl)
Migration strategy:
- Install Rspamd alongside SpamAssassin
- Configure both to add headers (not reject) for testing
- Compare results over several days
- Retrain Bayesian classifier with your mail corpus
- Gradually transition to Rspamd once confident
See SpamAssassin migration guide for detailed migration steps.
Comparison with Other Solutions
| Feature | Rspamd | SpamAssassin | Amavis | Rspamd Advantage |
|---|---|---|---|---|
| Performance | 20-40 msg/sec/core | 2-4 msg/sec/core | 5-10 msg/sec/core | 10-100x faster |
| Architecture | Event-driven, async | Process-per-message | Process pool | Non-blocking I/O |
| Memory usage | 50-100MB/worker | 30-50MB/process | 100-200MB/process | Efficient memory pools |
| Learning | Bayes, Neural, Fuzzy | Bayes only | Via SpamAssassin | Multiple ML methods |
| Configuration | UCL (structured) | Plain text rules | Perl code | Type-safe, validated |
| Web UI | Modern SPA | None (third-party) | None | Built-in monitoring |
| Protocol | HTTP, Milter, native | Spamc/spamd | SMTP proxy | Flexible integration |
| Real-time updates | Dynamic rules via maps | Restart required | Restart required | No downtime |
| Clustering | Native support | Limited | No | Built-in load balancing |
See Comparison page for detailed analysis.
Community and Support
- Documentation: Comprehensive guides, API reference, examples
- Community support: Support channels (Matrix, mailing list, GitHub)
- Professional support: Available from Rspamd developers and partners
- Active development: Regular releases, security updates, new features
- Open source: Apache 2.0 license, source code on GitHub
Ready to start? See Installation Guide → First Setup → Configuration Fundamentals