I work on an embed platform. Partners integrate our APIs to build features into their own products — payments, identity verification, onboarding flows.
We spent six months trying to answer one question: how do partners test their integration before going live?
We tried four approaches. Each one failed in a new and exciting way.
Attempt 1: "Just use our docs"
We pointed partners to our API docs. Endpoints, auth, request/response examples. "You're good to go."
Partners would start building, hit something we didn't document, open a support ticket, wait two days, lose momentum. A few never came back. One told us they went with a competitor because "we couldn't get a working test setup in a week."
The docs described what should happen. Partners had no way to see what actually happens until they wrote code, deployed it, and hoped.
Attempt 2: Internal UAT box
Next idea: give partners access to our internal UAT environment. IP whitelisted, shared credentials. We told them "please don't break anything" without a hint of irony.
Worked for three weeks.
Then QA pushed a bad config on a Tuesday afternoon. Partner's integration broke. They didn't know it was our side — spent two days debugging their own code before opening a ticket. By the time we figured it out, their engineering lead was cc'ing our VP.
Around the same time, another team was running load tests on UAT. Partner's API calls were timing out. Their CTO emailed us asking if our platform was "production-ready."
UAT was built for us to break things. We invited external partners into that mess.
Attempt 3: Staging with IP whitelisting
We carved out a "partner staging" — same codebase, separate deployment, IP whitelisted per partner. This felt like the grown-up solution.
It wasn't.
IP whitelists were a full-time job. Every new partner meant new firewall rules. Partners working from home had different IPs than their office. One partner's CTO was traveling and couldn't hit the sandbox from his hotel. We were debugging VPN configs at 11pm on a Thursday.
Test data went stale. Partners tried to create orders for products that existed in production but not in staging. "Your API returns 404 for product_id XYZ." Yes, because nobody seeded staging in three weeks.
Deploys kept colliding. Engineers would ship to staging without checking if partners were actively testing. A deploy during a partner's live demo call is the kind of thing that gets brought up in quarterly business reviews.
Then there was cost. We were running a full mirror of production just for partner testing: database, compute, monitoring, on-call rotation. All for an environment that got used maybe 10 hours a week.
Attempt 4: Production access
I wish I were kidding.
The reasoning: "staging is flaky, UAT is a disaster, let's just whitelist them on production with read-only test accounts."
The API was stable! Partners were happy. For about a month.
Then one partner's integration bug created 400 orphaned records in production. Our data team spent a weekend cleaning it up.
Compliance wanted to know why test payloads with fake PII were hitting the production database.
When we needed to do maintenance, we had to coordinate downtime with partners who were "just running a few test calls."
We'd built the world's most expensive and dangerous sandbox: production with IP whitelisting.
What we actually needed
After burning six months on this, the answer felt obvious in hindsight. Partners didn't need access to our infrastructure at all.
They needed an API that looked and acted like ours — same endpoints, same schemas, same auth patterns — but was completely separate. Something where POST actually creates a resource, GET retrieves it, state transitions work, and nobody else's deploy can break it.
Not a mock server (those are stateless — step 2 doesn't know about step 1). Not a shared staging box (those are everybody's problem). A dedicated, isolated sandbox generated from the same OpenAPI spec our real API uses.
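To make "stateful" concrete, here is roughly what that flow looks like from the partner's side. The URL, paths, header, and IDs below are illustrative, not real endpoints; the point is that the POST writes state a later GET can read back, which a stateless mock can't do:

```shell
# Hypothetical sandbox calls; host, paths, key, and IDs are illustrative.

# Step 1: create an order. A stateful sandbox persists it.
curl -X POST https://your-api.sandbox.example.com/v1/orders \
  -H "api-key: sandbox_abc123" \
  -H "Content-Type: application/json" \
  -d '{"product_id": "prod_123", "quantity": 2}'

# Step 2: fetch the order created in step 1 (using the id it returned).
# A stateful sandbox returns that exact resource; a stateless mock
# would only ever hand back the same canned example payload.
curl https://your-api.sandbox.example.com/v1/orders/ord_456 \
  -H "api-key: sandbox_abc123"
```

That read-your-own-writes behavior is what lets a partner exercise a multi-step workflow end to end instead of testing each endpoint in isolation.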
What I'm building now
This experience is half the reason I started working on FetchSandbox. You give it an OpenAPI spec and it spins up a stateful sandbox — real CRUD, state machines, webhook events, seed data. No infrastructure to manage.
# Partner gets a sandbox from your spec
npx fetchsandbox generate ./your-api-openapi.yaml
# They can call endpoints immediately
curl https://your-api.fetchsandbox.com/v1/orders \
  -H "api-key: sandbox_abc123"
# → 200 OK, returns realistic seed data
# They can run the full integration workflow
fetchsandbox run your-api create-and-fulfill-order
# → ✓ Create order — 201
# → ✓ Add line items — 200
# → ✓ Submit for fulfillment — 200
# → ✓ Webhook: order.fulfilled fired
# → All steps passed
No staging environment to maintain. No IP whitelists. No risk to production. Partners get their own URL, their own data, their own credentials. Your team doesn't touch it.
When a partner asks "does this workflow work?" — they prove it themselves. No support ticket needed.
The time sink nobody tracks
If you run a partner API, add up how many hours your team spends per month on:
- Provisioning and maintaining test environments
- Debugging "is your sandbox down?" tickets
- Managing IP whitelists and VPN access
- Re-seeding stale test data
- Coordinating deploys around partner testing schedules
At my company it was easily 20-30 hours a month across engineering and DevOps. For a problem that shouldn't exist.
Try it
FetchSandbox works with any OpenAPI 3.x spec. 19 APIs are live — Stripe, GitHub, Twilio, WorkOS, and more. No signup.
If you run an API platform and want to try this with your own spec, reach out — @fetchsandbox on X or hello@fetchsandbox.com.
Free during early access.
I know I'm not the only one who's been through this. UAT → staging → "just give them prod." If you've found a better way to handle partner sandbox environments, I'd genuinely like to know. Still figuring this out.