
How to Build a Lightweight Rule Engine for Automated Compliance Checks

Build a lightweight rule engine for automated compliance checks using simple Python patterns — no heavy frameworks needed.

Alan West
Authon Team

California just announced it'll start ticketing driverless cars that break traffic laws. That got me thinking — not about self-driving cars specifically, but about a problem I've hit on three different projects: how do you make an automated system respect a set of rules that change over time?

Whether you're building a CI/CD pipeline that enforces deployment policies, an API gateway with rate-limiting rules, or a workflow engine that needs to comply with business regulations, you eventually need a rule engine. And if you reach for a massive enterprise framework on day one, you'll regret it.

Here's how I build lightweight rule engines that actually hold up in production.

The Problem: Hardcoded Rules Rot Fast

Every project starts the same way. Someone says "just add an if-statement." So you do.

python
# This is fine... for now
def check_deployment(deploy_request):
    if deploy_request.target == "production" and not deploy_request.has_approval:
        return Denied("Production deploys require approval")
    if deploy_request.time.hour < 9 or deploy_request.time.hour > 17:
        return Denied("No deploys outside business hours")
    return Approved()

Then the rules multiply. Then someone wants to change them without a code deploy. Then different environments need different rules. Then someone asks for an audit log of which rules fired and why.

Now your neat little function is 200 lines of nested conditionals, and every change is a production risk.

The Core Pattern: Separate Rules From Execution

The fix isn't a framework — it's a pattern. You need three things:

  • A rule definition format (data, not code)
  • An evaluation engine (small, testable, deterministic)
  • A result collector (for audit trails and debugging)

Here's the minimal version I keep coming back to:

python
from dataclasses import dataclass
from typing import Any
import operator

# Map string operators to actual functions
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    "in": lambda val, collection: val in collection,
    "not_in": lambda val, collection: val not in collection,
    "contains": lambda collection, val: val in collection,
}

@dataclass
class Rule:
    name: str
    field: str           # dot-notation path into the context
    op: str              # operator key from OPERATORS
    value: Any           # what we're comparing against
    message: str = ""    # human-readable explanation
    severity: str = "error"  # error, warning, info

@dataclass
class RuleResult:
    rule: Rule
    passed: bool
    actual_value: Any = None

def resolve_field(obj: dict, path: str) -> Any:
    """Navigate nested dicts with dot notation: 'deploy.target.env'"""
    current = obj
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        else:
            return None
    return current

def evaluate(rules: list[Rule], context: dict) -> list[RuleResult]:
    results = []
    for rule in rules:
        actual = resolve_field(context, rule.field)
        op_func = OPERATORS.get(rule.op)
        if op_func is None:
            raise ValueError(f"Unknown operator: {rule.op}")
        try:
            passed = op_func(actual, rule.value)
        except TypeError:
            passed = False  # type mismatch = rule not satisfied
        results.append(RuleResult(rule=rule, passed=passed, actual_value=actual))
    return results

Nothing fancy. No DSL parser, no YAML templating language, no dependency injection. Just data in, results out.
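
To see the whole thing end to end, here's a quick smoke test. The rules and context are invented for illustration:

python
rules = [
    Rule(name="prod_requires_approval", field="deploy.approved",
         op="eq", value=True,
         message="Production deploys require approval"),
    Rule(name="max_batch_size", field="payload.items_count",
         op="lte", value=1000,
         message="Batch size exceeds safe limit", severity="warning"),
]

context = {
    "deploy": {"approved": False},
    "payload": {"items_count": 250},
}

for result in evaluate(rules, context):
    status = "PASS" if result.passed else "FAIL"
    print(f"{status} {result.rule.name} (actual={result.actual_value!r})")

# FAIL prod_requires_approval (actual=False)
# PASS max_batch_size (actual=250)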

Loading Rules From Config

The real power comes when rules live outside your code. I typically use JSON or YAML, loaded at startup or fetched from a config service.

python
import json

def load_rules(path: str) -> list[Rule]:
    with open(path) as f:
        raw = json.load(f)
    return [Rule(**r) for r in raw["rules"]]

# rules.json
# {
#   "rules": [
#     {
#       "name": "business_hours_only",
#       "field": "request.hour",
#       "op": "gte",
#       "value": 9,
#       "message": "Action not permitted outside business hours",
#       "severity": "error"
#     },
#     {
#       "name": "max_batch_size",
#       "field": "payload.items_count",
#       "op": "lte",
#       "value": 1000,
#       "message": "Batch size exceeds safe limit",
#       "severity": "warning"
#     }
#   ]
# }

Now your ops team can tweak compliance rules without touching application code. You can version the rule files in git, diff them in PRs, and roll them back independently.
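
The loader above is JSON-only; if you'd rather keep rules in YAML, the change is one line. A sketch, assuming PyYAML is installed:

python
import yaml  # pip install pyyaml

def load_rules_yaml(path: str) -> list[Rule]:
    with open(path) as f:
        raw = yaml.safe_load(f)
    return [Rule(**r) for r in raw["rules"]]

# rules.yaml
# rules:
#   - name: business_hours_only
#     field: request.hour
#     op: gte
#     value: 9
#     message: Action not permitted outside business hours
#     severity: error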

Adding Rule Groups and Short-Circuit Logic

In practice, you'll want to group rules. Some groups should short-circuit (stop on first failure), others should collect all violations.

python
@dataclass
class RuleGroup:
    name: str
    rules: list[Rule]
    mode: str = "all"  # "all" = collect everything, "first_fail" = stop early

def evaluate_group(group: RuleGroup, context: dict) -> list[RuleResult]:
    results = []
    for rule in group.rules:
        actual = resolve_field(context, rule.field)
        op_func = OPERATORS.get(rule.op)
        if op_func is None:
            raise ValueError(f"Unknown operator: {rule.op}")
        try:
            passed = op_func(actual, rule.value)
        except TypeError:
            passed = False
        result = RuleResult(rule=rule, passed=passed, actual_value=actual)
        results.append(result)
        # bail early if this group uses short-circuit mode
        if not passed and group.mode == "first_fail":
            break
    return results

def evaluate_all_groups(groups: list[RuleGroup], context: dict) -> dict:
    return {
        group.name: evaluate_group(group, context)
        for group in groups
    }

This is the 80/20 point. You've got configurable rules, grouped evaluation, short-circuit logic, and a full audit trail of what passed and what didn't. For most projects, this is enough.
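
To make the grouping concrete, here's a hypothetical deployment gate: the security group short-circuits, while the policy group collects everything. Rule names and context are invented for illustration:

python
security = RuleGroup(name="security", mode="first_fail", rules=[
    Rule(name="no_secrets", field="scan.secrets_found", op="eq", value=0,
         message="Secrets detected in artifact"),
    Rule(name="signed_image", field="image.signed", op="eq", value=True,
         message="Image must be signed"),
])
policy = RuleGroup(name="policy", mode="all", rules=[
    Rule(name="business_hours", field="request.hour", op="gte", value=9,
         message="Outside business hours", severity="warning"),
])

context = {
    "scan": {"secrets_found": 2},
    "image": {"signed": True},
    "request": {"hour": 7},
}

results = evaluate_all_groups([security, policy], context)
# "security" stops at no_secrets (2 != 0), so signed_image never runs;
# "policy" still records the business_hours warning.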

When You Actually Need More

I've only outgrown this pattern twice in eight years. The signs you need something heavier:

  • Rules reference other rules ("if rule A passed, skip rule B") — now you need a dependency graph
  • Rules need temporal logic ("this value was X five minutes ago") — now you need state
  • Non-technical users need to author rules — now you need a UI and probably a real DSL

If you hit those cases, look at existing open-source rule engines for your ecosystem before building one. Python has projects like business-rules. JavaScript has json-rules-engine. Go has grule-rule-engine. They handle the graph traversal and conflict resolution that you don't want to write yourself.

But don't start there. Start with the 50-line evaluator above and see how far it takes you.

Practical Tips From Production

A few things I learned the hard way:

  • Always log the full context alongside results. When someone asks "why was this request denied at 2 AM last Tuesday," you want the exact input that was evaluated, not just the rule name that fired.
  • Version your rule sets. Every time rules change, tag the version. Store the version alongside any decision the engine made. You'll need this for audits.
  • Test rules like code. Write unit tests for your rule definitions. Feed them known contexts, assert expected outcomes. This catches typos in field names and logic inversions before production does (a minimal example follows the dry-run sketch below).
  • Set up dry-run mode from day one. Before enforcing a new rule, run it in shadow mode — evaluate but don't block. This has saved me from deploying overly aggressive rules more times than I want to admit. Here's the shape:
python
def make_decision(groups: list[RuleGroup], context: dict, dry_run: bool = False):
    all_results = evaluate_all_groups(groups, context)
    failures = [
        r for results in all_results.values()
        for r in results
        if not r.passed and r.rule.severity == "error"
    ]
    decision = "deny" if failures and not dry_run else "allow"
    # always log regardless of mode; log_decision is a stand-in for
    # whatever structured-logging hook your stack provides
    log_decision(context, all_results, decision, dry_run)
    return decision, all_results
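
To make the testing tip concrete, here's a minimal pytest-style check against the rules.json example from earlier:

python
def test_rules_against_known_context():
    rules = load_rules("rules.json")
    context = {"request": {"hour": 7}, "payload": {"items_count": 10}}
    by_name = {r.rule.name: r for r in evaluate(rules, context)}
    assert not by_name["business_hours_only"].passed  # 7 < 9
    assert by_name["max_batch_size"].passed           # 10 <= 1000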

Wrapping Up

The pattern here isn't specific to any domain. I've used it for deployment gates, invoice validation, content moderation filters, and API request policies. The shape is always the same: define rules as data, evaluate them against a context, collect the results.

Start with the simplest evaluator that does the job. Keep rules in version-controlled config files. Log everything. Add complexity only when the current system genuinely can't express what you need.

Fifty lines of code and a JSON file will get you surprisingly far.
