DAVISA AI STUDIO · USE CASE

AI-driven accounts payable anomaly detection

For CFOs, financial controllers and administration directors of mid-market companies with significant accounts payable volumes who know that, among thousands of monthly lines, near-duplicates, overpricing, out-of-pattern amounts and unauthorised spend slip through: an AI engine that reviews the supplier ledger in real time and alerts before payment goes out.

The pain we solve

In accounts payable at a mid-market company with several hundred or several thousand monthly invoices, four types of anomalies consistently slip through and cost real money. Near-duplicates: the same invoice posted twice with a digit changed in the number or a different date, usually because the supplier sent it through more than one channel (email, EDI, portal) and it was posted once per channel. Overpricing against contract: the supplier invoices above the agreed price, intentionally or accidentally, and nobody cross-checks the invoice line against the contractual price list.

Outlier amounts: an extra zero in a field during manual capture, a misplaced decimal, an amount well above the typical magnitude of that spend category that goes to payment without a filter because there is no automatic rule to detect it. And unauthorised spend: charges to categories the company has not budgeted for, or that required prior approval that was not obtained, or that show activity outside the corporate perimeter.

The aggregate cost is serious. Sector studies we cross-reference with our own Business Central (BC) customer base place the cost of undetected AP anomalies between 0.3% and 1.5% of total annual AP spend at a mid-market company. For a company with 20 million euros in annual AP spend, that is between 60,000 and 300,000 euros a year paid out without anyone cross-checking the data. Most of it is not fraud but administrative errors, dirty capture, lack of price control and channel duplicates. The bottom line is the same.
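The range above is simple arithmetic and can be reproduced for any AP volume. The sketch below is only an illustration: the function name is hypothetical, and the two rates are the 0.3% and 1.5% bounds quoted in the text, not figures measured for your company.

```python
from decimal import Decimal

LOW_RATE = Decimal("0.003")   # 0.3% of annual AP spend (lower bound quoted above)
HIGH_RATE = Decimal("0.015")  # 1.5% of annual AP spend (upper bound quoted above)


def anomaly_cost_range(annual_ap_spend: int) -> tuple[Decimal, Decimal]:
    """Return the (low, high) estimated yearly cost of undetected AP anomalies."""
    spend = Decimal(annual_ap_spend)
    return spend * LOW_RATE, spend * HIGH_RATE


low, high = anomaly_cost_range(20_000_000)
print(f"{low:,.0f} to {high:,.0f} euros per year")
```

Decimal is used instead of float so that the bounds come out as exact amounts.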

The technical problem is that detecting them by hand does not scale. A financial controller does not have time to review thousands of lines a month one by one, and the spot checks at month-end close or annual audit only catch a fraction. What does get caught is usually caught late, with the supplier already paid and the awkward conversation about recovering the overpayment still ahead. This is exactly the kind of problem where well-applied AI delivers value: it reviews everything, all the time, without getting tired.

What the AI does here

The case combines four independent detectors running in parallel over each new AP movement (invoice posting) and each payment proposal. The first is a near-duplicate detector that goes beyond exact match: it uses header similarity (number, date, tax ID), line similarity (description, taxable base, quantity) and semantic similarity of the concept (with Azure OpenAI embeddings). It catches the classic case of the same invoice issued with a different number or a date that differs by one day.
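To make the idea of going beyond exact match concrete, here is a minimal sketch of a near-duplicate score, not the production engine. It assumes a hypothetical `InvoiceHeader` record, illustrative weights and a 30-day date window, and it replaces the Azure OpenAI embedding similarity with the standard library's character-level `SequenceMatcher` so that the example runs on its own.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher


@dataclass(frozen=True)
class InvoiceHeader:
    """Hypothetical, simplified view of a posted supplier invoice."""
    number: str
    tax_id: str
    posting_date: date
    taxable_base: float
    description: str


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for embedding similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def near_duplicate_score(a: InvoiceHeader, b: InvoiceHeader) -> float:
    """Blend number, date, amount and description similarity into a score in [0, 1]."""
    if a.tax_id != b.tax_id:
        return 0.0  # this sketch only compares invoices from the same supplier
    number_sim = text_similarity(a.number, b.number)
    days_apart = abs((a.posting_date - b.posting_date).days)
    date_sim = max(0.0, 1.0 - days_apart / 30)  # assumed 30-day window
    largest = max(abs(a.taxable_base), abs(b.taxable_base), 1e-9)
    amount_sim = 1.0 - min(1.0, abs(a.taxable_base - b.taxable_base) / largest)
    desc_sim = text_similarity(a.description, b.description)
    # Assumed weights for illustration; real weights would come from calibration.
    return 0.25 * number_sim + 0.15 * date_sim + 0.35 * amount_sim + 0.25 * desc_sim


original = InvoiceHeader("F-2024-00187", "ESB12345678", date(2024, 3, 4), 1250.0,
                         "Maintenance services March")
resent = InvoiceHeader("F-2024-00181", "ESB12345678", date(2024, 3, 5), 1250.0,
                       "Maintenance services March")
print(round(near_duplicate_score(original, resent), 3))
```

An exact-match rule would treat these two invoices as different because the numbers differ by one digit; a blended score keeps them close to 1 and lets a threshold decide when to alert.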

The second detector compares each invoiced line price against two references: the supplier contractual price list in BC (if it exists) and the historical price distribution for that item with that supplier over the last twelve months. If the invoiced price deviates beyond a configurable threshold (typically 5% upwards without justification), it raises an alert. Sensitivity per product family is calibrated during the pilot.
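The two-reference check can be sketched in a few lines. This is an illustration under stated assumptions: the function and alert names are hypothetical, the historical distribution is reduced to its median, and the 5% threshold is read as "more than 5% above the reference".

```python
from statistics import median
from typing import Optional, Sequence


def price_alerts(
    invoiced_price: float,
    contract_price: Optional[float],
    price_history: Sequence[float],
    threshold: float = 0.05,
) -> list[str]:
    """Return the references that the invoiced unit price exceeds by more than `threshold`."""
    alerts = []
    # Reference 1: contractual price list, if one exists for this item and supplier.
    if contract_price is not None and contract_price > 0:
        if invoiced_price > contract_price * (1 + threshold):
            alerts.append("above_contract")
    # Reference 2: prices invoiced for the same item and supplier over the last 12 months.
    if price_history:
        typical = median(price_history)
        if typical > 0 and invoiced_price > typical * (1 + threshold):
            alerts.append("above_history")
    return alerts


print(price_alerts(10.80, 10.00, [9.90, 10.00, 10.10]))
print(price_alerts(10.40, 10.00, [9.90, 10.00, 10.10]))
```

Keeping the two references separate means an alert can say which one was exceeded, which matters when no contract price exists and only the history is available.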

The third is a statistical outlier detector by spend category. On the historical distribution of amounts for each GL account and cost centre, it calculates the reasonable range (quartiles, standard deviation) and flags any new amount falling outside the range. It is particularly useful for capturing typographical errors at posting: an extra zero, a moved decimal, one unit invoiced when it should be one hundred.
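One common way to turn quartiles into a reasonable range is the interquartile-range rule. The sketch below assumes that rule, a multiplier of 1.5 and a minimum of eight historical amounts; these are illustrative choices, not the engine's documented configuration.

```python
from statistics import quantiles
from typing import Sequence


def amount_bounds(history: Sequence[float], k: float = 1.5) -> tuple[float, float]:
    """Return (low, high) bounds as Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(history, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(amount: float, history: Sequence[float], k: float = 1.5,
               min_history: int = 8) -> bool:
    """Flag an amount outside the bounds; stay silent when history is too short."""
    if len(history) < min_history:
        return False
    low, high = amount_bounds(history, k)
    return amount < low or amount > high


# Hypothetical monthly amounts for one GL account and cost centre.
history = [1180, 1210, 1195, 1250, 1230, 1205, 1190, 1240, 1220, 1215]
print(is_outlier(12200, history))  # an extra zero at capture
print(is_outlier(1225, history))   # an ordinary month
```

Quartile-based bounds are less sensitive than mean and standard deviation to past errors already sitting in the history.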

The fourth is a supervised classifier trained on the history of anomalies flagged by your team. It learns patterns that correlate with real problems: supplier-category-approver combinations, recently-created tax IDs with headers similar to an existing supplier, amounts just below the segregated approval threshold, spend on accounts closed in the budget. The more feedback it receives, the sharper it gets.
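The patterns listed above would reach a classifier as numeric features. The sketch below shows only that feature-building step, with hypothetical feature names, a 5% "just below" margin and a 90-day definition of "recently created". It uses the supplier name as the header being compared, and it says nothing about the model actually trained in Azure Machine Learning.

```python
from difflib import SequenceMatcher
from typing import Sequence


def is_just_below(amount: float, approval_threshold: float, margin: float = 0.05) -> bool:
    """True when the amount sits within `margin` below the approval threshold."""
    return approval_threshold * (1 - margin) <= amount < approval_threshold


def closest_name_similarity(name: str, existing: Sequence[str]) -> float:
    """Highest character-level similarity between a name and the known supplier names."""
    return max(
        (SequenceMatcher(None, name.lower(), other.lower()).ratio() for other in existing),
        default=0.0,
    )


def build_features(
    amount: float,
    approval_threshold: float,
    supplier_name: str,
    supplier_age_days: int,
    existing_suppliers: Sequence[str],
    account_closed_in_budget: bool,
) -> dict[str, float]:
    """Turn one AP movement into the numeric features a classifier could consume."""
    return {
        "just_below_approval_threshold": float(is_just_below(amount, approval_threshold)),
        "recently_created_supplier": float(supplier_age_days <= 90),  # assumed cut-off
        "name_similarity_to_existing": closest_name_similarity(supplier_name, existing_suppliers),
        "account_closed_in_budget": float(account_closed_in_budget),
    }


features = build_features(
    amount=4950.0,
    approval_threshold=5000.0,
    supplier_name="Acme Industrial S.L.",
    supplier_age_days=12,
    existing_suppliers=["ACME Industrial SL", "Borealis Logistics"],
    account_closed_in_budget=False,
)
print(features)
```

None of these features proves a problem on its own; the classifier's job is to learn which combinations your team has historically confirmed as real anomalies.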

The common engine is built on Azure Machine Learning with Azure OpenAI Service for the parts that require semantic understanding. Data does not leave your tenant. The trace of every alert is complete: which movement, which detector triggered, with what score, what the finance team decided (dismissed as false positive, adjustment, return, investigation). That feedback feeds the next retraining of the supervised classifier.
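The trace described above maps naturally onto a small record per alert. The following is a sketch with hypothetical field names, not the engine's actual schema; it also shows one plausible way the finance team's decision could become a label for retraining.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    """Outcomes the finance team can record for an alert."""
    FALSE_POSITIVE = "false_positive"
    ADJUSTMENT = "adjustment"
    RETURN = "return"
    INVESTIGATION = "investigation"


@dataclass
class AlertTrace:
    movement_id: str                     # which movement
    detector: str                        # which detector triggered
    score: float                         # with what score
    decision: Optional[Decision] = None  # what the finance team decided, once known

    def to_json(self) -> str:
        """Serialise the trace for storage or audit export."""
        return json.dumps({
            "movement_id": self.movement_id,
            "detector": self.detector,
            "score": self.score,
            "decision": self.decision.value if self.decision else None,
        })


def training_label(trace: AlertTrace) -> Optional[int]:
    """Assumed labelling rule: 0 for dismissed alerts, 1 for acted-on alerts, None if open."""
    if trace.decision is None:
        return None
    return 0 if trace.decision is Decision.FALSE_POSITIVE else 1


trace = AlertTrace("VLE-000123", "near_duplicate", 0.97)
trace.decision = Decision.FALSE_POSITIVE
print(trace.to_json(), training_label(trace))
```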

Before and after

Control aspect	Before (manual)	After (with AI)
Duplicate detection	Only those sharing exact number and tax ID get caught. Near-duplicates slip through.	Header, taxable base, date and line description similarity detector. Catches near-duplicates.
Overpricing controls	Only checked against contract if the buyer remembers. Usually not.	Automatic comparison of each invoiced price against supplier history and active contract.
Out-of-budget spend	Discovered at month-end close against budget. Too late to react.	Alert at posting time: this spend exceeds the budget for cost centre X by YY euros.
Suspicious new tax IDs	If supplier onboarding is semi-automated, made-up or close-to-existing tax IDs slip in.	Tax ID proximity alert against catalogue and mandatory review before first payment.
Outlier amounts	No magnitude control. An extra zero in a field goes to payment without a filter.	Each amount is compared against the historical distribution of its category. Outliers are flagged.
Audit	Manual sampling at annual audit. Only what the sample catches gets seen.	Complete trace of anomalies detected, handled and dismissed, with justification.
Response time	When an anomaly appears, it is discovered weeks after posting.	Alert within minutes of posting or payment attempt. Optional block based on severity.

How we deliver

Discovery

5 days

Audit of AP history in BC, approval matrix analysis, identification of the most frequent anomaly types in your context, baseline calculation (estimated cost of undetected anomalies in the last 12 months) and selection of priority detectors for the pilot.

Deliverable: roadmap with active detectors and target KPI.

Pilot

8 weeks · fixed scope

Unsupervised detectors deployed from week 1, integration with dvinvoice-hub and dvfinance, threshold configuration by category, alert management panel, finance team training. Supervised classifier training with feedback from the first few weeks.

Deliverable: live engine, alerts in flow, measured savings.

Scale-up

ongoing

Threshold tuning with operational data, extension to more group companies, integration with advanced bank reconciliation, integration with the adjacent executive close summary case, quarterly CFO reporting with consolidated savings.

Deliverable: periodic retraining, monthly KPI, ongoing support.

Tech stack

Azure OpenAI Service: embeddings and semantic similarity to detect near-duplicates by line description and reasoning over complex patterns.
Azure Machine Learning: supervised classifier training, model version management, drift monitoring, periodic retraining.
dvinvoice-hub: Davisa extension that captures the incoming invoice posting event in BC and triggers the anomaly engine before posting.
dvfinance: Davisa extension that orchestrates the financial cycle (due dates, payment proposals, reconciliation) and provides the second control point before sending to the bank.
Power Automate: alert workflow to the responsible controller by severity, with Teams notification and optional payment block until resolution.
Bank integration: additional control point before SEPA or factoring submission, depending on your active bank configuration.
BC tables involved: Vendor, Vendor Ledger Entry, Purchase Invoice Header, Purchase Line, Payment Journal, Bank Account Ledger Entry, G/L Account, Dimension Value, Approval Entry.

When this case is NOT a fit

Some scenarios do not pay back or are not viable. We say it directly.

If you process fewer than 50 supplier invoices a month. At that volume the aggregate risk is low and is controlled with good manual discipline. The pilot investment is not justified.
If you do not have BC with dvfinance or dvinvoice-hub live. The case integrates into the capture and payment cycle. Without that prior infrastructure, the implementation cost spikes and the value proposition drops.
If your real problem is governance, not detection. If the buyer approves any price without cross-checking it or if the authority matrix is fuzzy, AI will detect the problem but will not be able to prevent it from repeating. Governance needs to be fixed first.
If you expect AI to make blocking decisions instead of alerting. The system is designed to alert humans with financial responsibility. Automatic blocking is optional and limited to extreme cases. If you are looking for an autonomous gateway, this is not it.

Keep exploring

BC extension

dvinvoice-hub

The Davisa extension for the full incoming invoice cycle in BC.

BC extension

dvfinance

Davisa financial layer for BC: due dates, payments, reconciliation, cash control.

AI case

Supplier invoice automation

Extraction, coding and automatic matching of invoices. Natural predecessor to this case.

AI case

Executive monthly close summary

The month's alerts feed the executive CFO summary with no extra work.

Hub

Davisa AI Studio

The full catalogue of cases, sectors, method and discovery of the AI Studio (in Spanish).

Frequently asked questions

How long does the system need to learn your patterns?

The unsupervised detectors (near-duplicates, outlier amounts by category, brand-new tax IDs) work from day one with no training. The detectors that depend on a historical pattern (price deviation, out-of-budget spend, atypical supplier behaviour) ideally need between 12 and 24 months of AP history in BC to consolidate a baseline. We measure this during discovery with your real data before committing to scope.

How many false positives should we expect?

Quite a few at the start, and that is a good thing. The system starts out calibrated to flag more rather than less, and it learns from finance team feedback: each time you mark an alert as a false positive, it adjusts thresholds for that supplier family or spend category. The typical curve is noisy at the start and stabilises by the third or fourth month with a false positive rate below 15%. The design rule is always to err on the side of flagging: we prefer marginal noise to missing a real anomaly.
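A feedback loop of this kind can be pictured as a per-category threshold that moves a little each time an alert is dismissed. The class below is a sketch under assumptions: the name, the starting threshold, the step size and the ceiling are all hypothetical, and the real adjustment logic may differ.

```python
from collections import defaultdict


class CategoryThresholds:
    """Per-category alert thresholds that rise slowly with false-positive feedback."""

    def __init__(self, start: float = 0.50, step: float = 0.01, ceiling: float = 0.90):
        self.step = step
        self.ceiling = ceiling
        # Every category starts at the same deliberately low threshold.
        self._thresholds = defaultdict(lambda: start)

    def should_alert(self, category: str, score: float) -> bool:
        return score >= self._thresholds[category]

    def record_false_positive(self, category: str) -> None:
        """Raise the threshold for this category only, never above the ceiling."""
        raised = self._thresholds[category] + self.step
        self._thresholds[category] = min(self.ceiling, raised)


thresholds = CategoryThresholds()
for _ in range(3):
    thresholds.record_false_positive("office_supplies")

print(thresholds.should_alert("office_supplies", 0.52))  # this category is now stricter
print(thresholds.should_alert("maintenance", 0.52))      # untouched category still alerts
```

The ceiling reflects the design rule stated above: feedback can reduce noise, but it should not be able to silence a category entirely.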

Does it integrate with automated bank reconciliation?

Yes, and it is one of the flows where it adds the most value. The detector flags anomalies before the accounting entry (step 1: invoice enters, AI warns of near-duplicate or overpricing), before approving payment (step 2: cross-check against the supplier ledger) and at bank reconciliation (step 3: when a bank movement does not find a reasonable match). If you have dvfinance live, all three levels run chained on the same engine.

Can it detect internal fraud?

It can detect patterns that correlate with internal fraud (same approver repeatedly on the same supplier with price deviation, tax IDs created with very similar headers to an existing supplier, amounts just below the segregated approval threshold), but it does not accuse: it alerts for investigation. The decision on whether an anomaly is an administrative error or fraud is always taken by your audit or compliance team. The AI provides the thread to pull on, not the verdict.

How does the system scale as invoice volume grows?

The engine runs as an Azure ML service with autoscaling by load. Volumes of up to several hundred thousand AP movements per month are handled without reconfiguration. The bottleneck is usually human: if the finance team cannot keep up with the alert flow, it is better to raise the threshold than to saturate the team. During scale-up we tune that balance with you, month by month.
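Raising the threshold to match team capacity can be done mechanically from recent alert scores. The helper below is a hypothetical sketch: it assumes one combined score per alert and a capacity expressed as the number of alerts the team can review in the period.

```python
from typing import Sequence


def threshold_for_capacity(scores: Sequence[float], capacity: int) -> float:
    """Return the lowest threshold that keeps the highest-scoring `capacity` alerts.

    Alerts are kept when score >= threshold. If several alerts tie at the cut-off
    score, slightly more than `capacity` alerts can pass.
    """
    if capacity <= 0:
        raise ValueError("capacity must be a positive number of alerts")
    ranked = sorted(scores, reverse=True)
    if len(ranked) <= capacity:
        return 0.0  # the team can review everything; no need to filter
    return ranked[capacity - 1]


recent_scores = [0.91, 0.40, 0.75, 0.66, 0.98, 0.52]
cutoff = threshold_for_capacity(recent_scores, capacity=3)
kept = [s for s in recent_scores if s >= cutoff]
print(cutoff, kept)
```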

Next step

Already a Davisa customer?

We frame the case within your current BC, dvinvoice-hub and dvfinance relationship. Your usual advisor coordinates the AI Studio entry.

Talk to the team →

New to Davisa?

We start with the 5-day discovery. We audit your AP, estimate the real cost of undetected anomalies and size the pilot against that figure.

Request AI discovery →