💥 Breaking Changes
- Bayes per-user resharding: Jump Hash has been replaced with Ring Hash (Ketama) for consistent upstream hashing, so per-user Bayes data on sharded Redis deployments will be on the wrong shards after the upgrade. Run `rspamadm statistics_dump migrate` before upgrading. Single-server deployments are unaffected. (#5914, 4ea7504)
- Content URLs included by default: `include_content_urls` now defaults to `true`; URLs extracted from PDF and computed parts are returned by `task:get_urls()` by default, which may trigger new symbol hits on messages with PDF attachments. Restore the old behavior with `include_content_urls = false` in `local.d/options.inc`. (#5853)
- SSL worker option removed: The `ssl = true` worker option has been removed; SSL is now auto-detected from bind socket flags. Remove `ssl = true` from worker configs and use the `ssl` suffix on bind lines instead. (#5884)
- Proxy load balancing default changed: Token bucket load balancing is now enabled by default for proxy upstreams, replacing simple round-robin. Remove the `token_bucket` key from proxy upstream config to restore round-robin behavior. (#5874)
- SenderScore RBLs disabled by default: `senderscore_reputation` is disabled by default because it requires a MyValidity account and was returning blocked results for all unregistered IPs. Users with registered accounts must explicitly re-enable the rule. (#5907)
- DKIM unknown key handling per RFC: Unknown and broken DKIM keys are now handled strictly per RFC, which may change DKIM results for messages with malformed keys. (e9e6bac)
- Suspicious TLDs now map-based: The hardcoded suspicious TLD list has been replaced with `conf/maps.d/suspicious_tlds.inc`. Customize by creating `local.d/maps.d/suspicious_tlds.inc` (override) or `local.d/maps.d/suspicious_tlds.inc.local` (extend). (614e68c)
- Neural module autolearn option renames: Autolearn options in the neural module have been renamed to match RBL module naming conventions. Review custom neural configurations for use of old option names. (#5835)
- libfasttext external dependency removed: The external libfasttext C++ library has been replaced with a built-in mmap-based shim. The `ENABLE_FASTTEXT` cmake option is removed (always enabled). Packagers must remove the libfasttext build dependency. (#5897)
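
The mmap-based approach named in the last item can be illustrated with a small, generic sketch. This is not Rspamd's shim and does not parse the real Fasttext file format: the toy file layout, `DIM`, `ROWS`, `open_model`, and `row` are all hypothetical names chosen for illustration. It only shows the general technique of decoding values directly from a read-only file mapping, which the operating system can back with the same physical pages for every process that maps the file.

```python
import mmap
import os
import struct
import tempfile

DIM = 4   # hypothetical vector dimension
ROWS = 3  # hypothetical number of vectors in the toy file

# Build a toy "model" file: ROWS x DIM little-endian float32 values.
values = [float(i) for i in range(ROWS * DIM)]
path = os.path.join(tempfile.mkdtemp(), "toy_model.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"<{ROWS * DIM}f", *values))


def open_model(model_path):
    """Map the whole file read-only instead of reading it into the heap."""
    with open(model_path, "rb") as f:
        # mmap duplicates the descriptor, so the file can be closed here.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def row(model, index):
    """Decode one vector straight from the mapping at its byte offset."""
    return struct.unpack_from(f"<{DIM}f", model, index * DIM * 4)


model = open_model(path)
print(row(model, 1))
```

Because nothing is copied up front, several processes calling `open_model` on the same path would each hold a mapping rather than a private heap copy of the data.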
✅ Added
- Jinja2 configuration templates: Configuration files are now preprocessed by the Lupa Jinja2-compatible template engine before UCL parsing. Environment variables prefixed with `RSPAMD_` are exposed as the `env` table in templates; modified delimiters (`{= =}` for expressions, `{% %}` for control structures) avoid conflicts with UCL syntax. New validation filters (`mandatory`, `require_int`, `require_number`, `require_bool`, `require_duration`, `require_json`, `fromjson`) abort startup with a clear error on invalid input, enabling container-ready configuration validation without shell entrypoint scripts. (#5938, #5941)
- checkv3 multipart protocol: New `/checkv3` endpoint using `multipart/form-data` requests and `multipart/mixed` responses; metadata sent as structured JSON/msgpack instead of HTTP headers, per-part zstd compression, optional body part for rewritten messages, and zero-copy piecewise `writev` for responses. Use `rspamc --protocol-v3` or `rspamc --msgpack` to activate. (#5880)
- Pluggable Hyperscan cache backend: Hyperscan compilation and caching moved to an async Lua backend with Redis-based shared database support across workers and hosts. Async compilation prevents blocking the main event loop; self-healing cache auto-detects stale blobs and triggers recompile; small databases compiled in-memory without file caching. (#5813, #5952)
- Multi-flag fuzzy hashes: A single fuzzy hash can now carry up to 8 flags simultaneously, allowing multiple rules to match the same digest with independent flag/value pairs. Redis update path rewritten in Lua with EVALSHA and NOSCRIPT recovery. Backward-compatible epoch 12 wire protocol with highest-value flag promoted to the primary slot. Fuzzy hashes now stored in Redis history. (#5894, #5860)
- HTML fuzzy phishing detection: Dual-mode fuzzy matching: template matching and domain-sensitive matching. New `FUZZY_HTML_PHISHING` symbol fires when an HTML template matches but domains differ, detecting reused phishing templates with swapped links. (173058061)
- Built-in Fasttext shim: External C++ libfasttext replaced with a zero-dependency mmap-based reader providing shared memory across workers via `MAP_SHARED`, eliminating per-worker heap copies and saving approximately 500MB–7GB RAM. No more C++ exception ABI issues. Existing `.bin`/`.ftz` models continue to work unchanged. Fasttext wired through maps infrastructure for hot-reloading. (#5897, #5909)
- Neural network and LLM embedding improvements: External pretrained neural model support; LLM embedding providers with multi-model support, mean+max pooling, and SIF word weighting; multi-layer funnel architecture; language-based model and URL selection; expression-based autolearn for neural LLM providers; GPT module with configurable consensus thresholds, `context_augment` hook, and mempool variable storage. (#5924, #5903, #5897, #5835)
- HTTPS server support: Workers can now serve HTTPS natively with SSL auto-detected from bind socket configuration, enabling secure WebUI and API without a reverse proxy. (#5884, d04b367)
- Ring Hash (Ketama) consistent hashing: Proper consistent hashing with virtual nodes ensures only ~1/n keys redistribute when an upstream fails, and keys return to their original upstream on recovery. (4ea7504)
- Token bucket proxy load balancing: New load balancing algorithm for proxy upstreams with configurable `max_tokens`, `scale`, and `base_cost` parameters for better burst traffic handling. (#5874)
- Multiclass Bayes support: Classifiers now support arbitrary classes beyond binary spam/ham. WebUI learning interface updated for multi-class workflows. `/stat` and `/bayes/classifiers` endpoints extended with classifier metadata. `rspamadm statistics_dump` supports multi-class dump and restore. (#5900, #5893, #5914)
- Structured metadata exporter: New structured formatter for the metadata exporter module with zstd compression option and detected MIME types for attachments. (#5890)
- UUID v7 per task: Native UUID v7 generation per scanning task synced with the `Log-Tag` header and ClickHouse UUID v7 column support. (#5890)
- ARC trusted_authserv_id: Reuse upstream authentication results via trusted `Authentication-Results` headers from known authentication servers. (506ef44)
- Legacy protocol milter headers: Milter `add_headers` and `remove_headers` exposed in the RSPAMC/SPAMC text protocol with extended symbol info including descriptions and options, enabling Exim to access milter headers via `$spam_report`. (#5948)
- rspamadm new subcommands: `rspamadm autolearnstats` for autolearn statistics analysis; `rspamadm logstats` and `rspamadm mapstats` as rewrites of legacy Perl scripts; `rspamadm statistics_dump migrate` for Bayes shard migration. (#5946, #5885, #5914)
- HTTP content negotiation: Framework for content negotiation on API endpoints; `/stat` endpoint supports zstd-compressed responses. (#5832)
- PDF improvements: ASCII85 decode support, ligature substitution fix, object padding evasion defeat, and small objects no longer counted toward processing limits. (73a37be, eb1acde, 2b91e5e, 1f02010)
- Reply-To validity checks: New header checks for `Reply-To` address validity. (e95533f)
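
For readers unfamiliar with the UUID v7 format mentioned in the list above, here is a minimal sketch of the bit layout defined in RFC 9562: a 48-bit Unix millisecond timestamp, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. The helper name `uuid7` is hypothetical. The sketch says nothing about how Rspamd generates its identifiers or ties them to `Log-Tag`; it only shows why such IDs sort by creation time.

```python
import os
import time
import uuid


def uuid7(ts_ms=None):
    """Build a UUID v7 from a millisecond timestamp and 74 random bits."""
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                # top 12 bits
    rand_b = rand & ((1 << 62) - 1)    # low 62 bits

    value = (ts_ms & ((1 << 48) - 1)) << 80  # bits 127..80: unix_ts_ms
    value |= 0x7 << 76                       # bits 79..76: version 7
    value |= rand_a << 64                    # bits 75..64: rand_a
    value |= 0b10 << 62                      # bits 63..62: RFC variant
    value |= rand_b                          # bits 61..0:  rand_b
    return uuid.UUID(int=value)


task_id = uuid7()
print(task_id, task_id.version)
```

Because the timestamp occupies the most significant bits, IDs created in later milliseconds compare greater as strings and as integers, which suits time-ordered storage columns.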
🔧 Fixed
- Fuzzy UDP use-after-free (critical): Fixed use-after-free on ev_io watcher in fuzzy UDP sessions. (4557166)
- Fuzzy TCP CPU busy-loop: Fixed CPU spin in fuzzy TCP client under certain error conditions. (06dba44)
- SPF address family flag inheritance: Correct propagation of address family flags in SPF resolution. (2a8643e)
- DKIM RSA signing memory leak: Fixed memory leak in RSA path of DKIM signing. (9608160)
- RHEL/CentOS 10 SHA-1 DKIM policy bypass: Fixed crypto-policy bypass for SHA-1 DKIM signatures on RHEL/CentOS 10. (7a38a8e)
- Ratelimit compatibility with old records: Fixed backward compatibility with legacy ratelimit bucket records. (#5842)
- Weighted round-robin not respecting weights: Fixed upstream selection ignoring configured weights. (f563e25)
- SVG misdetection: Fixed incorrect HTML detection for messages with embedded SVG content. (170c4c5)
- Hyperscan use-after-free on config reload: Multiple use-after-free issues in Hyperscan cache handling during live configuration reload resolved. (#5813)
- Jemalloc tuning: Jemalloc tuned for Rspamd's single-threaded multi-process architecture, reducing memory overhead. (#5949)
🔄 Improved
- Consistent hash distribution: Ring Hash with virtual nodes provides true minimal disruption on upstream failure and guarantees key return to original upstream on recovery, replacing the previous Jump Hash algorithm.
- Hyperscan async compilation: Compilation no longer blocks the main event loop; self-healing blob detection ensures cache correctness after Hyperscan version changes.
- Fasttext memory efficiency: Built-in shim shares model data across all worker processes via shared memory, eliminating 500MB–7GB of duplicate heap allocations typical in multi-worker deployments.
- Fuzzy hash expressiveness: Multi-flag support allows a single stored digest to satisfy multiple independent rule checks simultaneously without duplication in storage.
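
The minimal-disruption property claimed for Ring Hash in the list above can be demonstrated with a generic consistent-hashing sketch. This is not Rspamd's implementation: the `HashRing` class, the use of MD5 for ring positions, the 160 virtual nodes per upstream, and the `redis-*` names are illustrative assumptions. It shows that removing one upstream moves only the keys that lived on it, and that re-adding it restores the original placement.

```python
import bisect
import hashlib


def _point(label):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=160):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns many points so load spreads evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_point(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key):
        # First point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (_point(key),))
        return self._ring[idx % len(self._ring)][1]


users = [f"user{i}@example.com" for i in range(10000)]
ring = HashRing(["redis-a", "redis-b", "redis-c", "redis-d"])

before = {u: ring.lookup(u) for u in users}
ring.remove("redis-c")                       # upstream fails
during = {u: ring.lookup(u) for u in users}
ring.add("redis-c")                          # upstream recovers
after = {u: ring.lookup(u) for u in users}

moved = [u for u in users if before[u] != during[u]]
print(f"moved while redis-c was down: {len(moved) / len(users):.1%}")
```

With four upstreams, the printed fraction should be roughly a quarter of the keys, in line with the ~1/n figure, though the exact value depends on the hash and the number of virtual nodes.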
Rspamd 4.0 is a landmark release delivering foundational infrastructure improvements alongside major new capabilities. The new `/checkv3` multipart protocol modernizes the scanning API with structured metadata, per-part compression, and zero-copy response paths. The built-in Fasttext shim eliminates a heavyweight C++ dependency while dramatically reducing per-worker memory usage. Multi-flag fuzzy hashes unlock more expressive detection rules, and HTML fuzzy phishing detection brings template-aware link-swap detection to the fuzzy engine. The move to Ring Hash consistent hashing corrects shard distribution behavior for Redis-backed deployments. This release is recommended for all users; those running sharded Redis Bayes backends **must** run the migration tool before upgrading.