DAVISA AI STUDIO · USE CASE
AI-driven accounts payable anomaly detection
For CFOs, financial controllers and administration directors of mid-market companies running
significant accounts payable volumes who know that, among thousands of monthly lines, near-
duplicates, overpricing, out-of-pattern amounts and unauthorised spend slip through. An AI
engine that reviews the supplier ledger in real time and alerts before payment goes out.
The pain we solve
In accounts payable at a mid-market company with several hundred or thousand monthly invoices,
four types of anomalies consistently slip through and cost real money. Near-duplicates: the same
invoice posted twice with a digit changed in the number or a different date, usually because
the supplier sent the same invoice through more than one channel (email, EDI, portal) and it was
posted once per channel. Overpricing against contract: the supplier invoices above the agreed price,
intentionally or accidentally, and nobody cross-checks the invoice line against the contractual
price list.
Outlier amounts: an extra zero in a field during manual capture, a misplaced decimal, an amount
well above the typical magnitude of that spend category that goes to payment without a filter
because there is no automatic rule to detect it. And unauthorised spend: charges to categories
the company has not budgeted for, or that required prior approval that was not obtained, or
that show activity outside the corporate perimeter.
The aggregate cost is serious. Sector studies we cross-reference with our own Business Central (BC)
customer base place the cost of undetected AP anomalies between 0.3% and 1.5% of total annual AP
spend at a mid-market company. For a company with 20 million euros in AP spend, that is between
60,000 and 300,000 euros a year paid out without anyone cross-checking the data. Most of it is not
fraud but administrative errors, careless data capture, lack of price control and channel
duplicates. The effect on the bottom line is the same.
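As a quick check of the arithmetic above, the range can be reproduced in a few lines. This is only a sketch: the function name is illustrative, the rates are the 0.3% and 1.5% quoted in this section, and the 20 million figure is assumed to be in euros.

```python
def undetected_anomaly_cost(annual_ap_spend_eur, low_bp=30, high_bp=150):
    """Estimated yearly cost range of undetected AP anomalies, in euros.

    Rates are expressed in basis points (30 bp = 0.3%, 150 bp = 1.5%)
    so the arithmetic stays in integers.
    """
    low = annual_ap_spend_eur * low_bp // 10_000
    high = annual_ap_spend_eur * high_bp // 10_000
    return low, high


low, high = undetected_anomaly_cost(20_000_000)
print(f"{low:,} to {high:,} euros a year")
```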
The technical problem is that detecting them by hand does not scale. A financial controller
does not have time to review thousands of lines a month one by one, and the spot checks
at month-end close or annual audit only catch a fraction. What does get caught is usually
caught late, with the supplier already paid and the awkward conversation about recovering the
overpayment still to come. This is exactly the kind of problem where well-applied AI delivers
value: it reviews everything, all the time, without getting tired.
What the AI does here
The case combines four independent detectors running in parallel over each new AP movement
(invoice posting) and each payment proposal. The first is a near-duplicate detector that goes
beyond exact match: it uses header similarity (number, date, tax ID), line similarity
(description, taxable base, quantity) and semantic similarity of the concept (with Azure
OpenAI embeddings). It catches the classic case of the same invoice issued with a different
number or a date that differs by one day.
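To make the idea of similarity beyond exact match concrete, here is a minimal sketch of a near-duplicate score. It is not the production detector: the `Invoice` fields, the weights and the 7-day window are assumptions chosen for illustration, and plain character similarity from the standard library stands in for the embedding-based semantic similarity.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher


@dataclass
class Invoice:
    number: str
    tax_id: str
    invoice_date: date
    taxable_base: float
    description: str


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for embedding similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def near_duplicate_score(a: Invoice, b: Invoice, max_day_gap: int = 7) -> float:
    """Weighted similarity in [0, 1]; 0.0 when the supplier tax IDs differ."""
    if a.tax_id != b.tax_id:
        return 0.0
    number_sim = text_similarity(a.number, b.number)
    day_gap = abs((a.invoice_date - b.invoice_date).days)
    date_sim = max(0.0, 1.0 - day_gap / max_day_gap)
    largest = max(abs(a.taxable_base), abs(b.taxable_base))
    if largest == 0:
        amount_sim = 1.0
    else:
        amount_sim = max(0.0, 1.0 - abs(a.taxable_base - b.taxable_base) / largest)
    desc_sim = text_similarity(a.description, b.description)
    # Illustrative weights, not calibrated values.
    return 0.3 * number_sim + 0.2 * date_sim + 0.3 * amount_sim + 0.2 * desc_sim


original = Invoice("FV-2024-0187", "B12345678", date(2024, 3, 14), 1250.0,
                   "Pallet wrap film 500 mm")
resent = Invoice("FV-2024-0178", "B12345678", date(2024, 3, 15), 1250.0,
                 "Pallet wrap film 500 mm")
print(round(near_duplicate_score(original, resent), 3))
```

An exact-match rule on number and tax ID would miss this pair because two digits are swapped and the date is one day off; a combined score still places it close to 1.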
The second detector compares each invoiced line price against two references: the supplier
contractual price list in BC (if it exists) and the historical price distribution for that
item with that supplier over the last twelve months. If the invoiced price deviates beyond a
configurable threshold (typically 5% upwards without justification), it raises an alert.
Sensitivity per product family is calibrated during the pilot.
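The two-reference comparison can be sketched as follows. This is a simplified illustration: the function name is hypothetical, and the median of the last twelve months is used as a stand-in for the historical price distribution, which is an assumption rather than a description of the actual engine.

```python
from statistics import median


def check_price(invoiced_price, contract_price, price_history, threshold=0.05):
    """Return (reference_name, upward_deviation) for each reference exceeded.

    contract_price may be None when no contractual price list exists.
    price_history is the list of unit prices for this item and supplier.
    """
    references = {}
    if contract_price is not None:
        references["contract"] = contract_price
    if price_history:
        references["12-month median"] = median(price_history)

    alerts = []
    for name, reference in references.items():
        if reference <= 0:
            continue  # cannot compute a relative deviation
        deviation = (invoiced_price - reference) / reference
        if deviation > threshold:
            alerts.append((name, round(deviation, 4)))
    return alerts


history = [9.90, 10.00, 10.10, 10.00]
print(check_price(10.80, 10.00, history))  # 8% above both references
print(check_price(10.30, 10.00, history))  # 3% above, within threshold
```

Only upward deviations raise an alert here, matching the "5% upwards" rule described above; the threshold is a parameter so it can be set per product family.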
The third is a statistical outlier detector by spend category. On the historical distribution
of amounts for each GL account and cost centre, it calculates the reasonable range (quartiles,
standard deviation) and flags any new amount falling outside the range. It is particularly
useful for capturing typographical errors at posting: an extra zero, a moved decimal, one unit
invoiced when it should be one hundred.
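One common way to turn quartiles into a reasonable range is the interquartile-range (Tukey) rule, sketched below. The choice of this rule, the multiplier `k` and the minimum history size are assumptions for illustration; the range actually used per GL account and cost centre is calibrated in the project.

```python
from statistics import quantiles


def reasonable_range(history, k=1.5):
    """Lower and upper fences from the quartiles of historical amounts."""
    q1, _, q3 = quantiles(history, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(amount, history, k=1.5, min_history=8):
    """Flag an amount outside the fences; stay silent on thin history."""
    if len(history) < min_history:
        return False
    low, high = reasonable_range(history, k)
    return amount < low or amount > high


# Hypothetical monthly amounts for one GL account and cost centre.
history = [1180, 1220, 1250, 1190, 1300, 1210, 1275, 1240, 1230, 1260]
print(is_outlier(12500, history))  # an extra zero at capture
print(is_outlier(1245, history))   # a typical amount
```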
The fourth is a supervised classifier trained on the history of anomalies flagged by your
team. It learns patterns that correlate with real problems: supplier-category-approver
combinations, recently-created tax IDs with headers similar to an existing supplier, amounts
just below the segregated approval threshold, spend on accounts closed in the budget. The
more feedback it receives, the sharper it gets.
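The patterns listed above have to be expressed as features before any classifier can learn from them. The sketch below shows one possible encoding for three of them; the field names, the 5% margin and the use of character similarity for tax ID proximity are all hypothetical, and the model that consumes these features is out of scope here.

```python
from difflib import SequenceMatcher


def build_features(movement, existing_tax_ids, approval_threshold,
                   closed_accounts, margin=0.05):
    """Turn one AP movement (a dict) into features for a classifier."""
    amount = movement["amount"]
    tax_id = movement["tax_id"]
    # Highest similarity to any *other* known supplier tax ID.
    closest = max(
        (SequenceMatcher(None, tax_id, other).ratio()
         for other in existing_tax_ids if other != tax_id),
        default=0.0,
    )
    lower_bound = approval_threshold * (1 - margin)
    return {
        "just_below_approval_threshold": lower_bound <= amount < approval_threshold,
        "closest_tax_id_similarity": closest,
        "account_closed_in_budget": movement["gl_account"] in closed_accounts,
    }


features = build_features(
    {"amount": 4950.0, "tax_id": "B12345679", "gl_account": "629000"},
    existing_tax_ids=["B12345678", "A87654321"],
    approval_threshold=5000.0,
    closed_accounts={"629000"},
)
print(features)
```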
The common engine is built on Azure Machine Learning with Azure OpenAI Service for the parts
that require semantic understanding. Data does not leave your tenant. The trace of every
alert is complete: which movement, which detector triggered, with what score, what the
finance team decided (dismissed as false positive, adjustment, return, investigation). That
feedback feeds the next retraining of the supervised classifier.
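The alert trace described above maps naturally onto a small record type. The following is a sketch of what such a record might look like and how reviewed alerts could become training labels; the class and field names are hypothetical, and treating every decision other than a dismissal as a real anomaly is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Decision(Enum):
    FALSE_POSITIVE = "dismissed as false positive"
    ADJUSTMENT = "adjustment"
    RETURN = "return"
    INVESTIGATION = "investigation"


@dataclass
class AlertRecord:
    movement_id: str                      # which movement
    detector: str                         # which detector triggered
    score: float                          # with what score
    decision: Optional[Decision] = None   # None while the alert is pending
    justification: str = ""


def training_labels(records: List[AlertRecord]) -> List[Tuple[AlertRecord, bool]]:
    """Pair each reviewed alert with a label: True means a real anomaly.

    Pending alerts (no decision yet) are skipped.
    """
    return [
        (record, record.decision is not Decision.FALSE_POSITIVE)
        for record in records
        if record.decision is not None
    ]


trace = [
    AlertRecord("VLE-1001", "near_duplicate", 0.94, Decision.RETURN, "posted twice"),
    AlertRecord("VLE-1002", "outlier_amount", 0.71, Decision.FALSE_POSITIVE, "annual fee"),
    AlertRecord("VLE-1003", "price_deviation", 0.66),
]
for record, is_anomaly in training_labels(trace):
    print(record.movement_id, record.detector, is_anomaly)
```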
Before and after
| Control aspect | Before (manual) | After (with AI) |
| Duplicate detection | Only those sharing exact number and tax ID get caught. Near-duplicates slip through. | Header, taxable base, date and line description similarity detector. Catches near-duplicates. |
| Overpricing controls | Only checked against contract if the buyer remembers. Usually not. | Automatic comparison of each invoiced price against supplier history and active contract. |
| Out-of-budget spend | Discovered at month-end close against budget. Too late to react. | Alert at posting time: this spend exceeds the budget for cost centre X by YY euros. |
| Suspicious new tax IDs | If supplier onboarding is semi-automated, made-up or close-to-existing tax IDs slip in. | Tax ID proximity alert against catalogue and mandatory review before first payment. |
| Outlier amounts | No magnitude control. An extra zero in a field goes to payment without a filter. | Each amount is compared against the historical distribution of its category. Outliers are flagged. |
| Audit | Manual sampling at annual audit. Only what the sample catches gets seen. | Complete trace of anomalies detected, handled and dismissed, with justification. |
| Response time | When an anomaly appears, it is discovered weeks after posting. | Alert within minutes of posting or payment attempt. Optional block based on severity. |
How we deliver
1 Discovery
5 days
Audit of AP history in BC, approval matrix analysis, identification of the most frequent
anomaly types in your context, baseline calculation (estimated cost of undetected anomalies
in the last 12 months) and selection of priority detectors for the pilot.
Deliverable: roadmap with active detectors and target KPI.
2 Pilot
8 weeks · fixed scope
Unsupervised detectors deployed from week 1, integration with dvinvoice-hub and dvfinance,
threshold configuration by category, alert management panel, finance team training.
Supervised classifier training with feedback from the first few weeks.
Deliverable: live engine, alerts in flow, measured savings.
3 Scale-up
ongoing
Threshold tuning with operational data, extension to more group companies, integration
with advanced bank reconciliation, integration with the adjacent executive close summary
case, quarterly CFO reporting with consolidated savings.
Deliverable: periodic retraining, monthly KPI, ongoing support.
Tech stack
- Azure OpenAI Service: embeddings and semantic similarity to detect
near-duplicates by line description and reasoning over complex patterns.
- Azure Machine Learning: supervised classifier training, model version
management, drift monitoring, periodic retraining.
- dvinvoice-hub: Davisa extension that captures the incoming invoice posting
event in BC and triggers the anomaly engine before posting.
- dvfinance: Davisa extension that orchestrates the financial cycle (due
dates, payment proposals, reconciliation) and provides the second control point before
sending to the bank.
- Power Automate: alert workflow to the responsible controller by severity,
with Teams notification and optional payment block until resolution.
- Bank integration: additional control point before SEPA or factoring
submission, depending on your active bank configuration.
- BC tables involved: Vendor, Vendor Ledger Entry, Purchase Invoice Header,
Purchase Line, Payment Journal, Bank Account Ledger Entry, G/L Account, Dimension Value,
Approval Entry.
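The severity-based routing in the alert workflow can be pictured with a small decision function. This is an illustration only: the real workflow lives in Power Automate, and the function name, the score cut-offs and the three severity levels below are assumptions, not the configured values.

```python
def route_alert(score, block_enabled=False, notify_at=0.5, block_at=0.95):
    """Map a detector score in [0, 1] to a severity and the actions to take.

    Payment is blocked only for the highest severity, and only when
    blocking has been switched on.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < notify_at:
        severity = "low"
    elif score < block_at:
        severity = "medium"
    else:
        severity = "high"
    return {
        "severity": severity,
        "notify_controller": severity != "low",
        "block_payment": severity == "high" and block_enabled,
    }


print(route_alert(0.97, block_enabled=True))
print(route_alert(0.97))
print(route_alert(0.60))
```

Keeping the block behind an explicit switch reflects the design stance that the system alerts humans first and blocks only in extreme cases.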
When this case is NOT a fit
Some scenarios do not pay back or are not viable. We say it directly.
- If you process fewer than 50 supplier invoices a month. At that volume the
aggregate risk is low and is controlled with good manual discipline. The pilot investment
is not justified.
- If you do not have BC with dvfinance or dvinvoice-hub live. The case
integrates into the capture and payment cycle. Without that prior infrastructure, the
implementation cost spikes and the value proposition drops.
- If your real problem is governance, not detection. If the buyer approves
any price without cross-checking it or if the authority matrix is fuzzy, AI will detect but will not
be able to prevent the problem from repeating. Governance needs to be fixed first.
- If you expect AI to make blocking decisions instead of alerting. The
system is designed to alert humans with financial responsibility. Automatic blocking is
optional and limited to extreme cases. If you are looking for an autonomous gateway, this
is not it.
Frequently asked questions
How long does the system need to learn your patterns?
The unsupervised detectors (near-duplicates, outlier amounts by category, brand-new tax IDs) work from day one with no labelled training data. The detectors that depend on a historical pattern (price deviation, out-of-budget spend, atypical supplier behaviour) ideally need between 12 and 24 months of AP history in BC to consolidate a baseline. We measure this during discovery with your real data before committing to scope.
How many false positives should we expect?
Quite a few at the start, and that is by design. The system starts with strict thresholds and learns from finance team feedback: each time you mark an alert as a false positive, it adjusts thresholds for that supplier family or spend category. The typical curve is noisy at the start and stabilises by the third or fourth month with a false positive rate below 15%. The design rule stays conservative throughout: we prefer some residual noise to missing a real anomaly.
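The feedback loop can be pictured as a per-category threshold that moves a little with each reviewed alert. The sketch below is a toy version under stated assumptions: the function names, the step size and the floor and ceiling are illustrative, and the real adjustment logic may differ.

```python
def false_positive_rate(reviewed):
    """Share of reviewed alerts dismissed as false positives.

    reviewed is a list of booleans, True meaning "false positive".
    """
    if not reviewed:
        return 0.0
    return sum(reviewed) / len(reviewed)


def adjust_threshold(current, was_false_positive, step=0.01,
                     floor=0.50, ceiling=0.95):
    """Nudge the alert threshold for one supplier family or spend category.

    A false positive raises the threshold slightly (less noise); a confirmed
    anomaly lowers it (more sensitivity). The ceiling keeps the detector from
    drifting so far that real anomalies would be missed.
    """
    new = current + step if was_false_positive else current - step
    return round(min(ceiling, max(floor, new)), 4)


threshold = 0.80
for was_false_positive in [True, True, False, True]:
    threshold = adjust_threshold(threshold, was_false_positive)
print(threshold)
print(false_positive_rate([True, False, False, False]))
```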
Does it integrate with automated bank reconciliation?
Yes, and it is one of the flows where it adds the most value. The detector flags anomalies at three control points: before the accounting entry (step 1: invoice enters, AI warns of near-duplicate or overpricing), before approving payment (step 2: cross-check against the supplier ledger) and at bank reconciliation (step 3: when a bank movement has no reasonable match). If you have dvfinance live, all three steps run chained on the same engine.
Can it detect internal fraud?
It can detect patterns that correlate with internal fraud (same approver repeatedly on the same supplier with price deviation, tax IDs created with very similar headers to an existing supplier, amounts just below the segregated approval threshold), but it does not accuse: it alerts for investigation. The decision on whether an anomaly is an administrative error or fraud is always taken by your audit or compliance team. The AI provides the thread to pull on, not the verdict.
How does the system scale as invoice volume grows?
The engine runs as an Azure ML service with autoscaling by load. Volumes of up to several hundred thousand AP movements per month are handled without reconfiguration. The bottleneck is usually human: if the finance team cannot keep up with the alert flow, it is better to raise the threshold than to saturate the team. During scale-up we tune that balance with you, month by month.
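Raising the threshold to match team capacity can be done from last month's scores. The following sketch assumes an alert fires when a score is strictly greater than the threshold; the function name and the sample scores are hypothetical.

```python
def threshold_for_capacity(scores, monthly_capacity, minimum=0.0):
    """Lowest threshold at which at most monthly_capacity scores would alert.

    Assumes an alert fires when score > threshold. Ties at the cut-off are
    excluded, so the alert count never exceeds the capacity.
    """
    if monthly_capacity < 0:
        raise ValueError("monthly_capacity must not be negative")
    ranked = sorted(scores, reverse=True)
    if len(ranked) <= monthly_capacity:
        return minimum
    # The first score that does not fit becomes the cut-off.
    return max(minimum, ranked[monthly_capacity])


last_month = [0.95, 0.90, 0.90, 0.70, 0.60, 0.40]
threshold = threshold_for_capacity(last_month, monthly_capacity=3)
alerts = [s for s in last_month if s > threshold]
print(threshold, len(alerts))
```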
Next step
Already a Davisa customer?
We frame the case within your current BC, dvinvoice-hub and dvfinance relationship.
Your usual advisor coordinates the AI Studio entry.
Talk to the team →
New to Davisa?
We start with the 5-day discovery. We audit your AP, estimate the real cost of undetected
anomalies and size the pilot against that figure.
Request AI discovery →