How Banks and FinTechs Integrate with Payment APIs
The full integration lifecycle — partnership onboarding, IP whitelisting, payer-type classification, routing, reconciliation, and OTC agent networks — from inside the payments aisle of a tier-1 bank.
The README of a payment API is the easy half. The hard half is the integration lifecycle — the months between “we have a partner” and “we have a settled, reconciled transaction every two seconds”. This post walks that lifecycle in the order it actually happens.
1. Partnership onboarding
Before any code runs, three documents are signed:
- Master Services Agreement — liability split, SLA, audit rights.
- Technical Schedule — endpoints, message formats, retry policy, error matrix.
- Operational Schedule — settlement windows, dispute SLAs, reconciliation cadence, support escalation.
The Technical Schedule is the one engineering owns. Two clauses we’ve learned to make non-negotiable:
- Either side may rotate credentials at any time with 24-hour notice.
- Idempotency is mandatory on all state-changing endpoints.
Without the rotation clause, credential leaks become legal incidents. Without the idempotency clause, retries under load cause double-debits, and double-debits cause regulatory letters.
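To make the idempotency clause concrete, here is a minimal sketch of how a state-changing endpoint can replay a stored response when a partner retries with the same key. The class name PaymentService, the pay method, and the in-memory key store are all hypothetical; a production service would persist keys durably and expire them.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    """Toy in-memory service; a real one would persist keys durably."""
    balance: int
    # Maps idempotency key -> the response already returned for that key.
    _seen: dict = field(default_factory=dict)

    def pay(self, idempotency_key: str, amount: int) -> dict:
        # A retry with a known key replays the stored response
        # instead of debiting a second time.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        if amount > self.balance:
            response = {"status": "FAIL", "balance": self.balance}
        else:
            self.balance -= amount
            response = {"status": "OK", "balance": self.balance}
        self._seen[idempotency_key] = response
        return response


svc = PaymentService(balance=1000)
first = svc.pay("key-123", 400)
retry = svc.pay("key-123", 400)  # same key: no second debit
```

The retry returns the same response as the first call and the balance is debited only once, which is the behaviour the clause is meant to guarantee.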
2. Whitelisting & connectivity
We don’t accept payment traffic directly from the open internet. Partners come in through one of three lanes:
- MPLS for top-tier banks and the central bank. Predictable latency, expensive, slow to provision (4-8 weeks).
- Site-to-site IPSec VPN for mid-tier banks and large fintechs. Fast to provision, reasonable latency.
- Public mTLS over a hardened gateway for fintech startups. Requires a client certificate, an IP whitelist entry, and a JWT: three layers, so that the compromise of any one is survivable.
The cost per lane matters more than people realise: an MPLS port is ~200/month, and mTLS is free. We let the lane match the partner’s traffic, not their prestige.
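The three-layer check on the public mTLS lane can be sketched roughly as follows. This is an illustration under assumptions, not the gateway’s real code: the Partner record, the admit function, and the idea that the TLS terminator hands over a certificate fingerprint and an already-verified JWT partner claim are all hypothetical. Only the standard-library ipaddress module is used.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Partner:
    partner_id: str
    cert_fingerprint: str      # fingerprint of the registered client cert
    allowed_networks: tuple    # whitelisted CIDR blocks, as strings


def admit(partner: Partner, presented_fingerprint: str,
          source_ip: str, token_partner_id: Optional[str]) -> list:
    """Return the layers that failed; an empty list means admitted.

    token_partner_id is assumed to be the partner claim from a JWT whose
    signature was verified upstream, or None if verification failed.
    """
    failures = []
    if presented_fingerprint != partner.cert_fingerprint:
        failures.append("client_certificate")
    ip = ipaddress.ip_address(source_ip)
    if not any(ip in ipaddress.ip_network(net)
               for net in partner.allowed_networks):
        failures.append("ip_whitelist")
    if token_partner_id != partner.partner_id:
        failures.append("jwt")
    return failures
```

Because every layer must pass, a stolen certificate or a leaked token on its own is not enough to get a request through.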
3. Payer-type classification
Not every payer is the same. We classify them at the edge so downstream systems can specialise:
| Payer type | Source | Lane |
|---|---|---|
| Bank customer | Internet banking, mobile app | Internal mTLS |
| MFS user | bKash, Nagad, Rocket | MPLS or IPSec |
| OTC agent | Bank branch / agent terminal | Internal MPLS |
| Card-on-file | Visa / Mastercard direct debit | Card schemes |
| Wallet aggregator | SSL Commerz, AamarPay, ShurjoPay | IPSec |
Classification matters because of routing. A bank customer paying a utility bill goes through the core banking system’s posting engine: strict, slow, auditable. An MFS user goes through the wallet’s pre-funded float account: faster, but with weaker dispute rights. Both look like POST /bill/pay from the partner’s side, and that’s the point. We absorb the complexity.
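Edge classification of this kind can be as simple as a lookup from the originating channel to a payer type. The sketch below mirrors the table above; the channel identifiers (for example "internet_banking" or "agent_terminal") and the classify function are hypothetical names chosen for illustration.

```python
from enum import Enum


class PayerType(Enum):
    BANK_CUSTOMER = "Bank customer"
    MFS_USER = "MFS user"
    OTC_AGENT = "OTC agent"
    CARD_ON_FILE = "Card-on-file"
    WALLET_AGGREGATOR = "Wallet aggregator"


# Hypothetical channel identifiers, one per source in the table.
SOURCE_TO_PAYER = {
    "internet_banking": PayerType.BANK_CUSTOMER,
    "mobile_app": PayerType.BANK_CUSTOMER,
    "bkash": PayerType.MFS_USER,
    "nagad": PayerType.MFS_USER,
    "rocket": PayerType.MFS_USER,
    "bank_branch": PayerType.OTC_AGENT,
    "agent_terminal": PayerType.OTC_AGENT,
    "visa": PayerType.CARD_ON_FILE,
    "mastercard": PayerType.CARD_ON_FILE,
    "sslcommerz": PayerType.WALLET_AGGREGATOR,
    "aamarpay": PayerType.WALLET_AGGREGATOR,
    "shurjopay": PayerType.WALLET_AGGREGATOR,
}


def classify(source: str) -> PayerType:
    """Map a source channel to its payer type; unknown sources are rejected."""
    try:
        return SOURCE_TO_PAYER[source.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown payment source: {source!r}") from None
```

Rejecting unknown sources outright, rather than defaulting to some payer type, keeps a misconfigured partner from silently landing on the wrong posting path.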
4. Routing
Routing rules are owned by ops, not engineering. The actual rule table is a few hundred rows long; the structure is small:
(bill_type, payer_type, partner_id) → biller_endpoint, settlement_account
When a partner calls POST /bill/pay, we look up the (bill_type, payer_type, partner_id) tuple, pick the matching biller endpoint, attach the matching settlement account, and only then forward the call. The lookup hits a Redis cache first; a cache miss falls back to a SQL Server table, and every ops update writes through to the cache.
A small generalisation: every cached routing decision has a TTL. We expire the cache aggressively (60 seconds) because ops will flip a routing rule mid-day during a partner-side outage, and the system must pick up the change within a minute even if a write-through is missed.
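The cache-then-table lookup with a TTL can be sketched as below. To keep the example self-contained, a plain dictionary stands in for Redis and a caller-supplied function stands in for the SQL Server query; RouteCache, Route, and the injectable clock are hypothetical names, not the production design.

```python
import time
from typing import Callable, NamedTuple


class Route(NamedTuple):
    biller_endpoint: str
    settlement_account: str


class RouteCache:
    """TTL cache in front of a slower routing table."""

    def __init__(self, load_from_table: Callable[[tuple], Route],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._load = load_from_table
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, Route)

    def lookup(self, key: tuple) -> Route:
        # key is (bill_type, payer_type, partner_id)
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        # Miss or expired entry: fall back to the table and re-cache.
        route = self._load(key)
        self._entries[key] = (self._clock() + self._ttl, route)
        return route

    def write_through(self, key: tuple, route: Route) -> None:
        """Call on every ops update, after the table itself is written."""
        self._entries[key] = (self._clock() + self._ttl, route)
```

The TTL acts as a safety net: even if an update reaches the table without a matching write_through, the stale entry is served for at most ttl_seconds.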
5. Reconciliation
Reconciliation is where engineers and accountants meet. The cycle for a single day looks like this:
- T+0 23:55 — partner posts their settlement file (CSV or XML).
- T+1 00:30 — our system fetches the biller’s posting file.
- T+1 01:00 — match script runs three-way: partner transactions vs. our ledger vs. biller postings.
- T+1 02:00 — break list published to ops dashboard.
- T+1 09:00 — ops works the breaks; majority resolve within 4 hours.
A break is any transaction on which the three sources do not all agree. The taxonomy we use:
| Break code | Meaning | Owner |
|---|---|---|
| B1 | In partner, missing in our ledger | Ops |
| B2 | In our ledger, missing at biller | Eng |
| B3 | Amount mismatch | Partner ops |
| B4 | Status mismatch (we have OK, biller has FAIL) | Eng |
| B5 | Currency or date mismatch | Ops |
Engineering owns B2 and B4; both are usually our own timeout/retry mistakes. B1, B3, and B5 go to ops: ours for B1 and B5, the partner’s for B3.
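A three-way match that emits these break codes might look like the sketch below. The Txn record, the function names, and the order in which codes are checked are assumptions made for illustration; the taxonomy has no code for a transaction that is missing from the partner file, so this sketch simply compares whichever sources are present in that case.

```python
from typing import NamedTuple, Optional


class Txn(NamedTuple):
    amount: int      # minor units
    status: str      # "OK" or "FAIL"
    currency: str
    date: str        # ISO date


def classify_break(partner: Optional[Txn], ledger: Optional[Txn],
                   biller: Optional[Txn]) -> Optional[str]:
    """Return the break code for one transaction id, or None if all agree."""
    if partner is not None and ledger is None:
        return "B1"  # in partner, missing in our ledger
    if ledger is not None and biller is None:
        return "B2"  # in our ledger, missing at biller
    present = [t for t in (partner, ledger, biller) if t is not None]
    if len({t.amount for t in present}) > 1:
        return "B3"  # amount mismatch
    if len({t.status for t in present}) > 1:
        return "B4"  # status mismatch
    if len({(t.currency, t.date) for t in present}) > 1:
        return "B5"  # currency or date mismatch
    return None


def reconcile(partner: dict, ledger: dict, biller: dict) -> dict:
    """Match three {txn_id: Txn} maps; return {txn_id: break_code}."""
    breaks = {}
    for txn_id in sorted(partner.keys() | ledger.keys() | biller.keys()):
        code = classify_break(partner.get(txn_id), ledger.get(txn_id),
                              biller.get(txn_id))
        if code is not None:
            breaks[txn_id] = code
    return breaks
```

Each transaction gets at most one code here, with missing-record breaks taking priority over field mismatches; a real match script may well report several codes per transaction.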
6. OTC agent networks
Over-the-counter is the messy bit. Agents are small shops, kiosks, and bank branches scattered across regions; they accept cash and post a payment on the customer’s behalf. The complications:
- Cash float reconciliation is daily and physical — an actual person counts cash and submits a sheet.
- Connectivity is unreliable — many agents are on 3G; we cache the agent’s last-known bill lookups locally on their terminal and sync on reconnect.
- Trust is bounded — every agent has a daily cash limit; the system refuses postings beyond that limit until the agent settles.
The OTC flow looks almost identical to a normal bank-customer flow at the biller end, but at the bank end it goes through a separate agent banking core. We hide all of that behind the same POST /bill/pay.
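The bounded-trust rule for agents, where postings are refused past a daily cash limit until the agent settles, can be sketched as a small guard object. AgentFloat, LimitExceeded, and the method names are hypothetical; the real agent banking core will track far more state.

```python
from dataclasses import dataclass


class LimitExceeded(Exception):
    """Raised when a posting would take an agent past the daily cash limit."""


@dataclass
class AgentFloat:
    daily_limit: int        # maximum cash the agent may hold unsettled
    posted_today: int = 0   # cash accepted since the last settlement

    def post(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.posted_today + amount > self.daily_limit:
            raise LimitExceeded(
                f"posting {amount} would exceed limit {self.daily_limit}")
        self.posted_today += amount

    def settle(self) -> None:
        # Called once the agent's cash has been counted and handed over.
        self.posted_today = 0
```

A refused posting leaves the running total untouched, so the agent can resume as soon as settlement resets it.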
What goes wrong (and what to invest in)
Three categories soak up most engineering time:
- Partner-side timeouts. A partner whose code hangs on our slow path retries after 30 s, and now we have two postings to reconcile. The fix is partner-side: shorter timeouts and idempotency keys. We push hard for both during onboarding.
- Biller-side reversals. A biller decides 24 hours later that a payment was invalid and reverses it. Our ledger now has a “reversed” row but the partner doesn’t. We push reversal webhooks; partners that don’t subscribe see breaks the next day.
- Routing changes. Ops flips a routing rule and forgets to write to the audit log. When something breaks, we can’t tell what changed. We made the audit log non-negotiable in code.
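One way to make the audit log non-negotiable in code is to expose a single write path that records the change before applying it. The sketch below assumes a hypothetical RoutingTable with in-memory storage; the field names in the audit entry are illustrative only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RoutingTable:
    rules: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def update_rule(self, key: tuple, new_route: str, *,
                    actor: str, reason: str) -> None:
        """The only write path: no actor and reason, no change."""
        if not actor or not reason:
            raise ValueError("actor and reason are required for routing changes")
        # Record the change first, then apply it, so a rule can never
        # change without a matching audit entry.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "reason": reason,
            "key": key,
            "old": self.rules.get(key),
            "new": new_route,
        })
        self.rules[key] = new_route
```

With the old and new values captured in every entry, answering "what changed?" after an incident becomes a log query rather than guesswork.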
Integration work is a long tail of small contracts, not a heroic architecture. The teams that ship are the ones that treat onboarding as a product, with its own backlog, its own metrics, and its own quarterly OKRs — not as a sales-engineering afterthought.