The Trust Problem: "How Do I Know Your Numbers Are Right?"

If you've ever evaluated a payroll platform, you've asked some version of this question. Maybe during a vendor demo. Maybe during due diligence. Maybe after the first production payrun produced a number that didn't match what your payroll consultant expected. The question is always the same: how do I know your calculations are correct?

The traditional answer from payroll vendors is a variation of "trust us." We've been doing this for twenty years. We have payroll consultants on staff. We're certified. Our clients haven't complained. These are all reasonable signals of competence, but they're not evidence. They're social proof — and social proof doesn't survive an audit.

When a tax authority questions a specific calculation (why is this employee's Solidaritätszuschlag EUR 0 when their income tax clearly exceeds the Freigrenze?), the correct answer isn't "our system is certified." The correct answer is: here is the test that verifies this exact scenario, here is the input data, here is the expected result, and here is the calculation trace that proves the engine produces it.

That's the premise behind PayrollEx's approach to compliance. Every country regulation ships with a test suite. Not documentation tests. Not sample data. Integration tests — full payrun executions against a running Payroll Engine backend, with explicit expected values for every wage type in the result.

The number: 694 integration tests across 11 countries. Each test defines an employee scenario, runs a payrun against the live calculation engine, and asserts specific expected values for income tax, social security contributions, employer costs, and net pay. Every regulation update must pass all tests before deployment.

This isn't test-driven development in the software engineering sense — though it borrows from that methodology. It's test-driven compliance: the proposition that every statutory calculation should be independently verifiable, and that the proof should ship alongside the product.

For providers, this changes the conversation. Instead of "trust us," the answer becomes "inspect the tests yourself." Instead of a black box that produces payslips, it's a transparent system where every calculation can be traced, reproduced, and challenged.

What a Compliance Test Looks Like

A compliance test in PayrollEx is not a unit test that checks a single function in isolation. It's an end-to-end integration test that exercises the entire calculation pipeline: case data in, payrun execution, results out, assertions verified.

Each test lives in its own folder with three mandatory files:

  • Exchange JSON (.et.json) — The complete test scenario: employee case data, payrun job invocation, and the expected payroll results with specific wage type values
  • Run command (.pecmd) — A single command that executes the test against a running backend
  • README (README.md) — A human-readable calculation walkthrough showing the step-by-step arithmetic for every expected value

The exchange JSON is the machine-readable contract. It declares: "Given this employee with these characteristics, running payroll for this period, the system must produce these exact numbers." The README is the human-readable proof: it shows why those numbers are correct, tracing through statutory formulas, thresholds, and rates.
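
The three-file rule is easy to check mechanically. The following is a minimal sketch, not part of PayrollEx: the file patterns come from the list above, while the function name and the one-folder-per-test layout are assumptions made for illustration.

```python
from pathlib import Path

# Patterns taken from the article; the labels are only for reporting.
REQUIRED_PATTERNS = {
    "exchange JSON": "*.et.json",
    "run command": "*.pecmd",
    "README": "README.md",
}


def missing_test_files(test_dir: Path) -> list[str]:
    """Return the labels of mandatory files that are absent from test_dir."""
    return [
        label
        for label, pattern in REQUIRED_PATTERNS.items()
        if not any(test_dir.glob(pattern))
    ]
```

A check like this could run before the tests themselves, so that a scenario without its README walkthrough never reaches review.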

A Concrete Example: German Income Tax

Consider a test for German Lohnsteuer (income tax). The scenario: an employee in Steuerklasse I (tax class 1, single, no children), earning EUR 3,000 per month, working in a state without Kirchensteuer (church tax). No special circumstances — no company car, no bAV, no Einmalzahlung. A straightforward monthly payrun.

The exchange JSON defines the employee's case values: tax class, monthly salary, social insurance category, health insurance type (gesetzlich), state of employment. It specifies the payrun period (January 2026) and the expected results: Lohnsteuer, Solidaritätszuschlag, KV/PV/RV/AV contributions (employee and employer shares), Umlage costs, and net pay.

The README traces the arithmetic. For Lohnsteuer, it follows the Programmablaufplan (PAP) published by the German Federal Ministry of Finance — the official algorithm that every German payroll system must implement. The PAP takes annual gross income, subtracts Werbungskostenpauschale, Sonderausgabenpauschale, and the Vorsorgepauschale (itself a multi-step calculation involving KV, PV, RV, and AV contributions), applies the tax tariff formula, and produces annual income tax. Divide by twelve for the monthly value.

For social security, the README shows each contribution: the applicable rate, the Beitragsbemessungsgrenze (contribution ceiling), whether the employee's salary is above or below the ceiling, and the resulting contribution for both employee and employer. Each number traces back to a statutory parameter stored in a versioned data regulation.
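
The ceiling logic described above can be sketched in a few lines. This is an illustration only: the rate and ceiling in the usage note are placeholders, not verified statutory parameters, and the equal employee/employer split and half-up rounding are assumptions rather than the engine's documented behavior.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def contribution(
    gross: Decimal,
    ceiling: Decimal,
    total_rate: Decimal,
    employee_share: Decimal = Decimal("0.5"),  # assumed equal split
) -> tuple[Decimal, Decimal]:
    """Return (employee, employer) contributions on the capped base."""
    base = min(gross, ceiling)  # salary above the ceiling is not assessed
    employee = (base * total_rate * employee_share).quantize(CENT, ROUND_HALF_UP)
    employer = (base * total_rate * (1 - employee_share)).quantize(CENT, ROUND_HALF_UP)
    return employee, employer
```

With a placeholder ceiling of 8,000 and a placeholder total rate of 18.6%, a salary of 3,000 yields 279.00 for each side, while a salary of 9,000 is capped at the ceiling and yields 744.00 for each side.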

The test doesn't just check that "income tax is approximately right." It asserts a specific value — EUR 301.33 (or whatever the PAP produces for this scenario in 2026) — to the cent. If the engine produces EUR 301.34, the test fails. That precision is the point: payroll is exact, and the tests enforce exactness.
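
To make the difference between "approximately right" and "exact" concrete, here is a small sketch comparing the two kinds of assertion. The function names are hypothetical; the amounts are the ones quoted above.

```python
from decimal import Decimal


def matches_exactly(expected: str, actual: str) -> bool:
    """Cent-exact comparison: any difference fails."""
    return Decimal(expected) == Decimal(actual)


def matches_within(expected: str, actual: str, tolerance: str = "0.01") -> bool:
    """Tolerance-based comparison, shown only for contrast."""
    return abs(Decimal(expected) - Decimal(actual)) <= Decimal(tolerance)
```

A one-cent deviation (301.33 versus 301.34) passes the tolerance check but fails the exact one, which is why the exact comparison is the one that matters for payroll.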

Why Three Files, Not One

The three-file structure serves different audiences. The exchange JSON is for machines: it's what the test runner consumes. The pecmd file is for operators: it's a single command you can run to reproduce the test. The README is for auditors, consultants, and anyone who needs to understand why a particular result is correct without reading JSON.

This matters in practice. When a provider's payroll consultant reviews a regulation update, they don't need to parse JSON. They open the README, see the scenario parameters, see the expected results, and see the arithmetic. If the arithmetic matches their understanding of the statute, the test is credible. If it doesn't, they've found either a bug or a misunderstanding — both of which are valuable to surface before production.

Coverage by Country

Test coverage varies by country, reflecting the complexity of each jurisdiction's payroll regulations and the breadth of scenarios that need verification. Here's the current state:

| Country | Tests | Key Areas Covered |
|---|---|---|
| Germany (DE) | 119 | LSt PAP, SolZ, KiSt (Bayern/NRW), KV/PV/RV/AV, Gleitzone/Midijob, UV, Umlagen, KUG, Retro correction, bAV, Firmenwagen, Einmalzahlung, Märzklausel, PV Sachsen, Aktivrente |
| Austria (AT) | 52 | Lohnsteuer tariff, SV (ASVG), DB/DZ, Kommunalsteuer, Sonderzahlungen, Pendlerpauschale |
| Switzerland (CH) | 68 | Quellensteuer (cantonal tariffs), AHV/IV/EO, ALV, BVG, KTG/UVG, Familienausgleich |
| Netherlands (NL) | 74 | Loonbelasting (white/green table), ZVW, WW/WIA/WAO, arbeidskorting, algemene heffingskorting, ABP/PfV |
| Belgium (BE) | 58 | Bedrijfsvoorheffing, RSZ, Bijzondere Bijdrage Sociale Zekerheid, werkbonus, fiscale aftrekken |
| Luxembourg (LU) | 45 | Retenue d'impôt (barème mensuel), CNS, CCSS cotisations, crédit d'impôt salarié |
| Spain (ES) | 61 | IRPF (Art. 20 LIRPF), Seguridad Social (Régimen General), Convenios colectivos, pagas extra, contratos parciales |
| France (FR) | 48 | PAS (prélèvement à la source), cotisations patronales/salariales, CSG/CRDS, retraite complémentaire |
| Portugal (PT) | 39 | IRS (tabelas de retenção), Segurança Social (TSU), subsistence allowances, meal allowance |
| United Kingdom (UK) | 64 | PAYE (cumulative/Week 1), National Insurance (multiple categories), Pension auto-enrolment, Student Loan (Plan 1–5), NMW/NLW |
| United States (US) | 66 | Federal income tax (W-4 2020+), FICA (Social Security + Medicare), state income tax (multi-state), SUI/SDI, pre-tax deductions |

Total: 694 integration tests, running on every regulation change.

The distribution isn't random. Germany has the highest count because German payroll has the most regulatory surface area: a federal tax algorithm that changes annually (the PAP), separate church tax regimes by state, a Gleitzone/Midijob system with its own interpolation formula, and social security rules with regional differences (contribution ceilings that differed between East and West through 2024, and Saxony's separate Pflegeversicherung split, listed above as PV Sachsen). Each of these areas requires dedicated test scenarios.

The UK's 64 tests reflect the complexity of cumulative PAYE (where tax calculations depend on year-to-date history), multiple NI categories, and five different Student Loan repayment plans. The US's 66 tests cover the combinatorial challenge of federal tax plus state-specific withholding rules.

Switzerland's 68 tests include cantonal Quellensteuer variations — each canton applies different tariff tables, and an employee who moves cantons mid-year needs both sets of calculations verified.

Every test is a complete payrun execution. Not a formula evaluation. Not a mock. The employee data is loaded into the backend, the payrun job runs through the full calculation pipeline — wage type resolution, collector accumulation, time-segmented calculation, result storage — and the output is compared against expected values.

The Annual Compliance Cycle

Payroll regulations change on a predictable calendar. Understanding this cycle explains why test-driven compliance isn't just a quality measure — it's an operational necessity.

September — Draft Parameters

Most jurisdictions publish draft regulatory parameters in the September-to-November window. Germany's SV-Rechengrößenverordnung (social security calculation parameters) typically arrives in October. The UK Autumn Statement sets NI thresholds and tax bands. Spain's Presupuestos Generales del Estado includes IRPF table revisions. The Netherlands' Nieuwsbrief Loonheffingen from the Belastingdienst arrives in late autumn.

At this stage, the parameters are draft but actionable. The regulatory team creates preliminary data regulation satellites — versioned JSON files with the new rates, thresholds, and ceilings — with validFrom dates set to January 1 of the upcoming year.

October through December — Build, Test, Verify

This is where test-driven compliance earns its name. For each country, the process is:

  1. Update data regulations — New statutory values are entered into data regulation satellites. Each value is verified against the official gazette or government publication. The source (URL and legal reference) is recorded.
  2. Run existing tests against new parameters — All existing integration tests execute against the updated data. Tests that were correct for 2025 parameters will fail if 2026 parameters produce different results. Each failure is expected — it confirms that the parameter change propagated correctly.
  3. Update expected values — Each failing test is recalculated against the new parameters. The README walkthrough is updated with the new arithmetic. The exchange JSON is updated with the new expected values.
  4. Add new tests for new rules — When a regulatory change introduces new logic (not just new values), new tests are created. Germany's PAP 2026, for example, changed the Vorsorgepauschale algorithm — new tests were added to verify the AV component integration and the removed 12% floor.
  5. Full regression — The complete test suite runs. All 694 tests must pass. Zero tolerance for failures.
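
Step 2 of this process can be illustrated with a toy example. The sketch below is not the PayrollEx test runner: it uses a single hypothetical contribution formula and placeholder rates to show how re-running stored expectations against new parameters reveals which tests need recalculation.

```python
from decimal import Decimal, ROUND_HALF_UP


def employee_contribution(gross: Decimal, rate: Decimal) -> Decimal:
    """Toy calculation standing in for a full payrun."""
    return (gross * rate).quantize(Decimal("0.01"), ROUND_HALF_UP)


def stale_tests(
    tests: dict[str, tuple[Decimal, Decimal]], rate: Decimal
) -> list[str]:
    """Names of tests whose stored expected value no longer matches.

    Each test maps a name to (gross, expected contribution).
    """
    return [
        name
        for name, (gross, expected) in tests.items()
        if employee_contribution(gross, rate) != expected
    ]
```

If the expectations were computed with a placeholder rate of 7.30% and the rate moves to 7.50%, a test with gross 3,000 and expected 219.00 is reported as stale, while a zero-salary test is not. The stale list is the worklist for step 3.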

January 1 — Go-Live

On January 1, the updated data regulations become active. The validFrom dates ensure that payruns for December 2025 still use 2025 parameters, and payruns for January 2026 automatically resolve to 2026 parameters. No code deployment needed — the versioned data model handles the transition.
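
The resolution rule described here (use the latest version whose validFrom is on or before the payrun period) can be sketched as follows. The data layout, function name, and values are assumptions for illustration; only the validFrom concept comes from the text above.

```python
from datetime import date
from decimal import Decimal

# Hypothetical versions of one parameter: (validFrom, value).
VERSIONS = [
    (date(2025, 1, 1), Decimal("0.0100")),
    (date(2026, 1, 1), Decimal("0.0125")),
]


def resolve(versions: list[tuple[date, Decimal]], period_start: date) -> Decimal:
    """Return the value of the latest version valid at period_start."""
    applicable = [(valid_from, value) for valid_from, value in versions
                  if valid_from <= period_start]
    if not applicable:
        raise LookupError(f"no parameter version valid on {period_start}")
    # The most recent validFrom wins, regardless of list order.
    return max(applicable, key=lambda item: item[0])[1]
```

Under these assumptions a December 2025 payrun resolves to the 2025 value and a January 2026 payrun to the 2026 value, with no code change in between. A version with a mid-year validFrom would be picked up by the same rule.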

The test suite that passed in December is the same one running in production. There's no "staging environment with test data" that differs from production. The tests define the contract, and the contract is enforced continuously.

Mid-Year Changes

Not all changes happen on January 1. The UK tax year starts on 6 April, so National Insurance and PAYE parameters typically change in April rather than January. Germany occasionally revises social security parameters mid-year. Spain has been known to publish IRPF adjustments via Real Decreto-Ley that take effect on specific dates.

The process is the same regardless of timing: new data regulation with the appropriate validFrom date, updated tests, full regression, deployment. The versioned parameter model handles arbitrary effective dates, not just annual boundaries.

Tests Verify Logic, Not Statutory Values

There's a critical distinction that test suites alone cannot address, and it's important to be explicit about it.

An integration test verifies that given these input parameters, the calculation engine produces these output values. It confirms that the logic — the PAP algorithm, the social security formula, the progressive tax bracket calculation — is implemented correctly. If you put the right rates in, you get the right numbers out.

But what if the rates themselves are wrong?

Consider a scenario: the KV-Beitragssatz (health insurance contribution rate) for 2026 is 14.6%. If someone enters 14.8% in the data regulation, the integration tests will still pass — they'll be updated with expected values based on the 14.8% rate, the arithmetic will be internally consistent, and every test will show green. The system will produce wrong payslips for every employee, but all tests pass.

The distinction: Integration tests verify calculation logic — that the engine applies rates correctly, handles edge cases, and produces internally consistent results. They do not verify that the statutory values themselves are correct. That requires a separate compliance audit against official sources.
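
The blind spot is easy to reproduce. The sketch below uses the rates from the scenario above and a deliberately simplified formula (half of the general rate on gross salary, ignoring the Zusatzbeitrag and the contribution ceiling); the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def employee_kv(gross: Decimal, total_rate: Decimal) -> Decimal:
    """Simplified employee share: half of the total rate on gross salary."""
    return (gross * total_rate / 2).quantize(Decimal("0.01"), ROUND_HALF_UP)


gross = Decimal("3000")
entered_rate = Decimal("0.148")   # mistyped value in the data regulation
official_rate = Decimal("0.146")  # rate quoted in the scenario above

# Both the expected value and the engine result derive from the same wrong data.
expected_in_test = employee_kv(gross, entered_rate)
engine_result = employee_kv(gross, entered_rate)

test_passes = engine_result == expected_in_test
correct_value = employee_kv(gross, official_rate)
```

Here the test passes with 222.00 on both sides, while the value based on the official rate would be 219.00. Nothing inside the test can detect the three-euro error, because the error sits in the input, not in the logic.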

This is why PayrollEx maintains a separate data regulation compliance process alongside the test suite. Every numerical value in a data regulation — every tax rate, contribution ceiling, threshold, allowance, and minimum wage — is verified against the official source before it enters the system. The source is recorded: the specific paragraph of the Bundesgesetzblatt, the HMRC rates and thresholds page, the BOE Real Decreto.

The compliance audit follows a strict sequence: identify the values to be set, verify each value against an official online source, produce a verification table (value, source, legal reference), and only then write the data file. Estimates, interpolations, and "common knowledge" values are not permitted. If it can't be traced to a government publication, it doesn't go in.
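
One way to enforce "no source, no value" is to make the source part of the data structure. The sketch below is an assumption about how such a verification table could be modeled, not PayrollEx's actual format; all field and function names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class VerifiedValue:
    """One row of a verification table: value, source, legal reference."""
    name: str
    value: Decimal
    source_url: str
    legal_reference: str


def unverified(entries: list[VerifiedValue]) -> list[str]:
    """Names of entries that lack a source or a legal reference."""
    return [
        entry.name
        for entry in entries
        if not entry.source_url.strip() or not entry.legal_reference.strip()
    ]


def write_allowed(entries: list[VerifiedValue]) -> bool:
    """The data file may only be written when every entry is traceable."""
    return not unverified(entries)
```

A check of this kind only proves that a source was recorded. Whether the recorded value matches the source still needs a human reviewer.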

In practice, this dual approach has caught real errors. During the 2026 annual audit, the process identified carried-forward values from prior years (Dutch pension rates that hadn't been updated), swapped contribution rates (German Minijob KV and RV pauschale values transposed), and structural changes not yet reflected in the data (Spanish IRPF moving from two to three tariff tiers under RD-Ley 4/2024).

Tests are necessary but not sufficient. Data compliance audits are necessary but not sufficient. Together — logic verification plus value verification — they constitute compliance. Neither works without the other.

What This Means for Providers

If you're a payroll provider, EOR platform, or HCM vendor evaluating PayrollEx, the test-driven compliance model changes several operational dynamics.

No "Trust Me" — Inspect the Tests Yourself

Every test is part of the regulation package. You can read the exchange JSON, follow the README arithmetic, and run the test against your own backend instance. You don't need to take PayrollEx's word for it that German Kirchensteuer in Bavaria uses an 8% rate on Lohnsteuer: you can see the test that verifies it, see the data regulation that contains the rate, and see the README that traces the calculation.

For providers operating under regulatory scrutiny — payroll bureaus, licensed EOR platforms, financial services companies — this transparency isn't a nice-to-have. It's the difference between "we use a vendor" and "we can demonstrate compliance."

Annual Updates as Verifiable Artifacts

When the 2027 regulation update ships (or 2028, or any future year), it arrives as a package with updated data regulations, updated tests, and updated READMEs. Before deploying, you can run the test suite against your staging environment and verify that all 694+ tests pass. You can compare the 2026 and 2027 expected values side by side to see exactly what changed and by how much.
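
A side-by-side comparison of expected values needs very little tooling. The following sketch assumes the expected results of one scenario have already been extracted into plain dictionaries keyed by wage type; that extraction step, and the function name, are hypothetical.

```python
from decimal import Decimal
from typing import Optional

Delta = tuple[Optional[Decimal], Optional[Decimal], Optional[Decimal]]


def expected_value_deltas(
    old: dict[str, Decimal], new: dict[str, Decimal]
) -> dict[str, Delta]:
    """Map wage type to (old, new, difference) for every value that changed."""
    deltas: dict[str, Delta] = {}
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before != after:
            # A wage type present in only one year has no numeric difference.
            diff = after - before if before is not None and after is not None else None
            deltas[key] = (before, after, diff)
    return deltas
```

Unchanged wage types drop out of the result, so a reviewer sees only what the update actually touched.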

This turns the annual compliance update from a trust exercise into a verification exercise. Your payroll consultants review the test deltas, confirm the arithmetic against published parameters, and approve the deployment. No black-box upgrades. No "the vendor says it's fine."

Custom Regulations Get the Same Treatment

The test infrastructure isn't limited to statutory regulations. When a provider adds a company regulation — custom benefits, industry-specific allowances, client-specific wage types — they can write tests using the same framework. Exchange JSON with case data and expected results. README with calculation walkthrough. Pecmd command for execution.

More practically, providers can use preview payrun jobs as CI gates. Before a regulation change reaches production, a preview job runs the change against test data and verifies the output. If the expected values don't match, the deployment is blocked. This turns payroll regulation changes into a standard software release process with automated quality gates.
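
A gate of this kind reduces to comparing preview results with expected values and returning a non-zero exit code on any mismatch. The sketch below assumes the preview results have already been fetched into a dictionary; how they are obtained from the backend is outside its scope, and the function names are hypothetical.

```python
import sys
from decimal import Decimal


def gate(expected: dict[str, Decimal], preview: dict[str, Decimal]) -> list[str]:
    """Return mismatch messages; an empty list means the gate is open."""
    problems = []
    for wage_type, want in expected.items():
        got = preview.get(wage_type)  # None when the wage type is missing
        if got != want:
            problems.append(f"{wage_type}: expected {want}, got {got}")
    return problems


def main(expected: dict[str, Decimal], preview: dict[str, Decimal]) -> int:
    """Exit code for a CI step: 0 lets the deployment proceed, 1 blocks it."""
    problems = gate(expected, preview)
    for line in problems:
        print(line, file=sys.stderr)
    return 1 if problems else 0
```

In a pipeline, the step would end with sys.exit(main(expected, preview)), so that any mismatch fails the build.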

Regression Protection

Payroll regulations interact. Changing a social security contribution rate affects gross-to-net calculation. Changing a tax-free allowance affects taxable income, which affects income tax, which affects Solidaritätszuschlag, which affects net pay. In a system without comprehensive tests, any change carries the risk of unintended side effects — and those side effects only surface when an employee notices their payslip is wrong.
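
The chain of dependencies can be shown with a toy model. Everything in the sketch below is hypothetical: the flat tax rate and the surcharge rate are placeholders, and real income tax and Solidaritätszuschlag calculations are far more involved. It only demonstrates how one changed input moves every downstream value.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def toy_payrun(
    gross: Decimal,
    allowance: Decimal,
    tax_rate: Decimal = Decimal("0.20"),         # placeholder, not a real tariff
    surcharge_rate: Decimal = Decimal("0.055"),  # placeholder surcharge on tax
) -> dict[str, Decimal]:
    """Allowance feeds taxable income, then tax, then surcharge, then net."""
    taxable = max(gross - allowance, Decimal("0"))
    income_tax = (taxable * tax_rate).quantize(CENT, ROUND_HALF_UP)
    surcharge = (income_tax * surcharge_rate).quantize(CENT, ROUND_HALF_UP)
    net = gross - income_tax - surcharge
    return {"taxable": taxable, "income_tax": income_tax,
            "surcharge": surcharge, "net": net}


def changed_keys(before: dict[str, Decimal], after: dict[str, Decimal]) -> list[str]:
    """Result fields that differ between two runs."""
    return [key for key in before if before[key] != after[key]]
```

Raising the allowance from 100 to 200 on a gross of 3,000 changes all four results in this model, which is the kind of ripple a regression suite is there to make visible.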

With 694 tests covering a wide range of scenarios, a change that silently breaks another calculation is caught immediately. The German regulation alone has 119 tests: if a Vorsorgepauschale adjustment inadvertently changes a Gleitzone calculation, the Gleitzone test fails. If a Kirchensteuer fix affects the SolZ Freigrenze calculation, the SolZ test catches it. Complex scenarios — including retroactive corrections spanning regulatory changes — are covered by dedicated test cases.

This regression protection compounds over time. Each new test scenario added to the suite increases the safety net for future changes. After several annual cycles, the test suite becomes a comprehensive specification of the regulation's behavior — a living document that grows more valuable with each iteration.

Audit Readiness

When an auditor asks "how do you verify that your German payroll calculations comply with the 2026 PAP?", you have a concrete answer: 119 integration tests, each with a documented calculation trace, each verified against published statutory parameters, each running on every change. The test suite is the compliance evidence.

This doesn't replace the need for professional payroll consultation. Complex scenarios — international assignments, retro corrections spanning regulatory changes, unusual benefit structures — still require expert judgment. But for the statutory core — income tax, social security contributions, employer costs — the test suite provides a level of verifiable assurance that manual review alone cannot match.

See the compliance infrastructure in action

Explore the country coverage, review the test architecture, or schedule a walkthrough of a specific country regulation and its test suite.
