Building an Engine That Converts Any API Into MCP: Architecture Deep-Dive

Maxime Champoux | February 23, 2026 | 10 min read

Every integration platform eventually hits the same wall. The number of APIs grows linearly, the number of possible connections between them grows quadratically, and your engineering team stays the same size. At Qonto, we learned this the hard way over 2.5 years and 15,000 integrations. At Well, we built the engine that makes the math work.

This is the architecture behind the Dynamic Connector Registry (DCR), the system that auto-generates MCP server wrappers from any OpenAPI specification. It powers 120+ connectors today, and adding a new one takes minutes, not months.

The N×M problem nobody wants to talk about

If you have N AI agents and M external APIs, you need N×M custom integrations. Every new agent adds M integrations. Every new API adds N more. This is the integration tax that kills most platforms before they reach scale.

The standard playbook is to build connectors one at a time. Hire engineers, read API docs, write code, handle auth, test edge cases, ship, maintain. Repeat. At Qonto, that was exactly our approach to connecting banking infrastructure with third-party services. It worked until it didn't.

Fifteen thousand integrations over two and a half years sounds impressive on a slide deck. In practice, it meant a team perpetually underwater, chasing OAuth token refresh bugs at 2 AM while the backlog of requested integrations grew faster than we could ship. Each connector was a snowflake. Each had its own failure modes. Each needed ongoing maintenance as upstream APIs changed.

We were solving N×M with brute force, and the math was winning.

Three approaches that failed

When we started building Well, I was determined not to repeat the Qonto pattern. We tried three different approaches before finding one that worked. Each failure taught us something specific about where the real complexity lives.

Attempt 1: Hand-coded connectors

Our first instinct was the familiar one. Build each connector by hand, but this time with better abstractions. We created a connector SDK with base classes, shared authentication modules, and standardized error handling. The plan was that a strong framework would make each new connector fast to build.

It didn't. The framework helped with the 20% of work that was common across connectors. The remaining 80% was API-specific: parameter naming conventions, pagination styles, rate limit behaviors, error response formats. We were writing essentially the same boilerplate over and over, with just enough variation that we couldn't fully abstract it away. After building 12 connectors this way, we projected the timeline out to 100 connectors and quietly shelved the approach.

Attempt 2: Template-based generation

Next, we tried code generation from templates. We wrote Mustache templates for common API patterns (REST CRUD, paginated lists, webhook receivers) and fed them API metadata. The generator would spit out connector code that we could review and ship.

This worked for simple APIs. A basic CRUD service with consistent endpoint patterns would generate clean, functional connectors. But real-world APIs are messy. Stripe has nested objects five levels deep. Salesforce has polymorphic fields. HubSpot returns different response shapes for the same endpoint depending on query parameters. The templates grew more complex than the code they were replacing. We ended up maintaining a template engine that was harder to debug than hand-written connectors.

Attempt 3: LLM-generated connectors

In early 2024, with LLMs improving rapidly, we tried having models generate connector code from API documentation. We fed GPT-4 the docs for a service and asked it to produce a working connector.

The results looked plausible. The code compiled. The function signatures matched the API docs. But when we ran integration tests, roughly 30% of generated endpoints called URLs that didn't exist. The model had inferred likely endpoint paths from the API's naming patterns and confidently produced routes that the actual API had never implemented. For a system that needs to be deterministic and reliable, a 30% hallucination rate on endpoint paths is a non-starter.

We also discovered a subtler problem. Even when the LLM got the endpoints right, it would sometimes invent request parameters or misinterpret response schemas. A field it described as required might be optional. A nested object might actually be a flat key-value pair. These errors were harder to catch than missing endpoints because they only surfaced under specific conditions.

What actually works: deterministic parsing with LLM-assisted edges

The breakthrough came from an observation: almost every API we cared about already had a machine-readable specification. OpenAPI (formerly Swagger) specs describe endpoints, parameters, request bodies, response schemas, and authentication methods in a structured format. The information we needed wasn't hidden in prose documentation. It was sitting in a JSON or YAML file.

The DCR pipeline works in five stages:

**1. Ingest the OpenAPI spec.** We accept OpenAPI 2.0, 3.0, and 3.1 specifications. The ingestion layer normalizes everything to a canonical internal format, resolving $ref pointers, flattening allOf/oneOf compositions, and standardizing parameter locations (path, query, header, body).

**2. Parse endpoints into tool definitions.** Each API endpoint becomes one MCP tool. A POST /users endpoint becomes a create_user tool. The mapping is deterministic: the HTTP method and path define the tool name, the request schema defines the input parameters, the response schema defines the output type. No guessing, no inference.

**3. Generate type-safe schemas.** Every tool gets a JSON Schema definition for its inputs and outputs. These schemas are derived directly from the OpenAPI spec, preserving required/optional distinctions, enum constraints, format validations, and nested object structures. The AI model calling the tool sees exactly what parameters it can send and what it will get back.

**4. Wire authentication.** Each connector is configured with its authentication method: OAuth 2.0 (authorization code, with or without PKCE, or client credentials), API key (header or query), Bearer token, or Basic auth. The DCR generates the auth handling code for each flow. Credentials are managed per user and never shared across tenants.

**5. Deploy as an MCP server.** The generated connector is packaged as a standalone MCP server that speaks the Model Context Protocol. It registers its tools, handles incoming requests, executes the corresponding API calls, and returns structured responses.
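To make stage 1 concrete, here is a minimal sketch of `$ref` resolution in Python. It is an illustration under stated assumptions, not the DCR's actual code: the function names `resolve_pointer` and `inline_refs` are hypothetical, only local references (those starting with `#/`) are handled, and recursive schemas are left as pointers rather than expanded.

```python
from typing import Any


def resolve_pointer(document: dict, pointer: str) -> Any:
    """Follow a local JSON Pointer such as '#/components/schemas/User'."""
    if not pointer.startswith("#/"):
        raise ValueError(f"only local references are handled here: {pointer}")
    node: Any = document
    for token in pointer[2:].split("/"):
        # JSON Pointer escapes (RFC 6901): '~1' means '/', '~0' means '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


def inline_refs(node: Any, document: dict, seen: tuple = ()) -> Any:
    """Return a copy of `node` with every local $ref replaced by its target."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            if ref in seen:
                # Recursive schema: keep the pointer instead of looping forever.
                return {"$ref": ref}
            target = resolve_pointer(document, ref)
            return inline_refs(target, document, seen + (ref,))
        return {key: inline_refs(value, document, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, document, seen) for item in node]
    return node


# Made-up spec fragment: a request body that points at a shared, self-referencing schema.
spec = {
    "components": {
        "schemas": {
            "User": {
                "type": "object",
                "required": ["email"],
                "properties": {
                    "email": {"type": "string", "format": "email"},
                    "manager": {"$ref": "#/components/schemas/User"},
                },
            }
        }
    },
    "paths": {
        "/users": {
            "post": {
                "requestBody": {
                    "content": {
                        "application/json": {
                            "schema": {"$ref": "#/components/schemas/User"}
                        }
                    }
                }
            }
        }
    },
}

resolved = inline_refs(spec, spec)
body = resolved["paths"]["/users"]["post"]["requestBody"]["content"]["application/json"]["schema"]
```

After this pass, `body` holds the full `User` schema inline, so the later stages never have to chase pointers. The `seen` tuple is what stops the self-referencing `manager` field from expanding without end.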

The entire pipeline from OpenAPI spec to running MCP server takes under two minutes for a typical API with 50-100 endpoints.
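Stages 2 and 3 can be sketched in a few dozen lines of Python. Several things here are assumptions made for illustration: the naming rule in `tool_name` (including its naive singularization) is hypothetical and only chosen so that POST /users yields create_user, the spec is assumed to be OpenAPI 3.x with references already resolved, and path-level parameters and non-JSON bodies are ignored. The output fields `name`, `description`, and `inputSchema` follow the shape of an MCP tool definition.

```python
HTTP_METHODS = ("get", "post", "put", "patch", "delete")
VERBS = {"post": "create", "put": "replace", "patch": "update", "delete": "delete"}


def tool_name(method: str, path: str) -> str:
    """Derive a tool name from method and path (hypothetical naming rule)."""
    segments = [s for s in path.strip("/").split("/") if s]
    resources = [s.replace("-", "_") for s in segments if not s.startswith("{")]
    targets_item = bool(segments) and segments[-1].startswith("{")
    if method == "get":
        verb = "get" if targets_item else "list"
    else:
        verb = VERBS[method]
    # Collection reads keep the plural; everything else gets a naive singular.
    if resources and verb != "list" and resources[-1].endswith("s"):
        resources[-1] = resources[-1][:-1]
    return "_".join([verb, *resources])


def input_schema(operation: dict) -> dict:
    """Merge parameters and the JSON request body into one JSON Schema object."""
    properties: dict = {}
    required: list = []
    for param in operation.get("parameters", []):
        properties[param["name"]] = param.get("schema", {})
        if param.get("required", False):
            required.append(param["name"])
    request_body = operation.get("requestBody", {})
    body = request_body.get("content", {}).get("application/json", {}).get("schema")
    if body is not None:
        # Assumption: the body is exposed under a single "body" key.
        properties["body"] = body
        if request_body.get("required", False):
            required.append("body")
    schema = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema


def tools_from_spec(spec: dict) -> list:
    """Emit one tool definition per (path, method) pair in the spec."""
    tools = []
    for path, item in spec.get("paths", {}).items():
        for method in HTTP_METHODS:
            operation = item.get(method)
            if operation is None:
                continue
            tools.append({
                "name": tool_name(method, path),
                "description": operation.get("summary", ""),
                "inputSchema": input_schema(operation),
            })
    return tools


# Made-up two-endpoint spec for illustration.
spec = {
    "openapi": "3.0.3",
    "paths": {
        "/users": {
            "post": {
                "summary": "Create a user",
                "requestBody": {
                    "required": True,
                    "content": {"application/json": {"schema": {
                        "type": "object",
                        "required": ["email"],
                        "properties": {"email": {"type": "string", "format": "email"}},
                    }}},
                },
            }
        },
        "/users/{id}": {
            "get": {
                "summary": "Fetch one user",
                "parameters": [
                    {"name": "id", "in": "path", "required": True,
                     "schema": {"type": "string"}}
                ],
            }
        },
    },
}

tools = tools_from_spec(spec)
```

Nothing in this mapping is inferred: a tool exists only if the spec declares the operation, and a parameter is required only if the spec says so.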

Where LLMs do help

We didn't abandon LLMs entirely. We just moved them away from code generation and into the areas where they add genuine value.

API documentation is often incomplete or inconsistent. An OpenAPI spec might define an endpoint's parameters but leave the descriptions empty. It might use cryptic field names like acct_ref_ext_id without explanation. The spec tells you what the API accepts; it doesn't always tell you what the fields mean or when you'd use them.

We use LLMs to generate human-readable descriptions for tools and parameters, drawing on the API's documentation site, changelog, and community resources. These descriptions help AI agents choose the right tool and fill in parameters correctly. But the underlying tool structure, the endpoints, parameters, and schemas, comes entirely from deterministic parsing of the spec.
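One way to keep that boundary sharp is a deterministic pre-pass that finds what needs describing, so the model is only ever asked to write text. The sketch below is an assumption about how such a pass could look, not our production code. `missing_descriptions` is a hypothetical name and the model call itself is left out.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def missing_descriptions(spec: dict) -> list:
    """List operations and parameters whose description is absent or blank."""
    gaps = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters" or "summary"
            label = f"{method.upper()} {path}"
            text = operation.get("description") or operation.get("summary") or ""
            if not text.strip():
                gaps.append(label)
            for param in operation.get("parameters", []):
                if not (param.get("description") or "").strip():
                    gaps.append(f"{label} :: {param['name']}")
    return gaps


# Made-up spec fragment with one undocumented, cryptic parameter.
spec = {
    "paths": {
        "/accounts": {
            "get": {
                "summary": "List accounts",
                "parameters": [
                    {"name": "acct_ref_ext_id", "in": "query",
                     "schema": {"type": "string"}},
                    {"name": "limit", "in": "query",
                     "description": "Maximum number of results",
                     "schema": {"type": "integer"}},
                ],
            }
        }
    }
}

gaps = missing_descriptions(spec)
```

Here `gaps` contains only the `acct_ref_ext_id` parameter. That list is the model's work queue. The parameter's name, type, and location still come from the spec and are never regenerated.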

LLMs also help with edge case detection. Some APIs have undocumented rate limits, unusual pagination cursors, or response formats that don't match their spec. We use models to flag potential discrepancies between the spec and observed API behavior during testing. A human engineer reviews these flags before the connector ships.
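A deterministic check can sit alongside the model-based flagging and catch the plainest mismatches first. The sketch below is such a check, not the LLM-assisted process described above. It compares an observed response against the spec's schema using a small subset of JSON Schema (`type`, `required`, `properties`, `items`). The name `discrepancies` and the sample data are made up for illustration.

```python
from typing import Any

JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def discrepancies(value: Any, schema: dict, where: str = "$") -> list:
    """Report where an observed value departs from the schema in the spec."""
    issues = []
    expected = schema.get("type")
    python_type = JSON_TYPES.get(expected)
    if python_type is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if not isinstance(value, python_type) or bool_as_number:
            return [f"{where}: expected {expected}, got {type(value).__name__}"]
    if isinstance(value, dict):
        properties = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in value:
                issues.append(f"{where}.{name}: required by the spec but missing")
        for name, item in value.items():
            if name in properties:
                issues.extend(discrepancies(item, properties[name], f"{where}.{name}"))
            else:
                issues.append(f"{where}.{name}: undocumented field")
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            issues.extend(discrepancies(item, schema["items"], f"{where}[{index}]"))
    return issues


# Made-up schema and response for illustration.
schema = {
    "type": "object",
    "required": ["id", "email"],
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}
observed = {"id": "42", "email": "a@example.com", "tags": ["vip", 7], "next_cursor": "abc"}

flags = discrepancies(observed, schema)
```

An undocumented field such as `next_cursor` is not a schema violation, but it is exactly the kind of hint (an unusual pagination cursor) worth putting in front of the reviewing engineer.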

The Meta-MCP Proxy

With 120+ connectors running as individual MCP servers, we needed a routing layer. The Meta-MCP Proxy sits between the AI agent and the constellation of connector servers.

When an agent connects to Well, it sees a single MCP interface. Behind that interface, the proxy maintains a registry of all available connectors and their tool inventories. When the agent calls a tool, the proxy routes the request to the correct connector server, waits for the response, and returns it.
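The routing idea can be sketched as a small in-memory registry. This is a simplification under stated assumptions: `ConnectorRegistry` is a hypothetical name, plain Python callables stand in for connector servers that really run as separate processes, and prefixing tool names with `connector__` to avoid collisions is our illustration rather than something the protocol requires.

```python
from typing import Any, Callable, Dict, List

ToolHandler = Callable[[Dict[str, Any]], Any]


class ConnectorRegistry:
    """Maps qualified tool names to the connector that serves them."""

    def __init__(self) -> None:
        self._routes: Dict[str, ToolHandler] = {}

    def register(self, connector: str, tools: Dict[str, ToolHandler]) -> None:
        # Registration can happen at any time; no restart is involved.
        for name, handler in tools.items():
            self._routes[f"{connector}__{name}"] = handler

    def unregister(self, connector: str) -> None:
        prefix = f"{connector}__"
        for name in [n for n in self._routes if n.startswith(prefix)]:
            del self._routes[name]

    def list_tools(self) -> List[str]:
        """What an agent would see on its next tool discovery call."""
        return sorted(self._routes)

    def call(self, qualified_name: str, arguments: Dict[str, Any]) -> Any:
        handler = self._routes.get(qualified_name)
        if handler is None:
            raise KeyError(f"unknown tool: {qualified_name}")
        return handler(arguments)


registry = ConnectorRegistry()
registry.register("stripe", {
    "create_customer": lambda args: {"connector": "stripe", "email": args["email"]},
})
before = registry.list_tools()

# A new connector comes online while the registry is already serving agents.
registry.register("slack", {
    "post_message": lambda args: {"connector": "slack", "ok": True},
})
after = registry.list_tools()

result = registry.call("stripe__create_customer", {"email": "a@example.com"})
```

The agent only ever talks to the registry's two entry points, `list_tools` and `call`. Which connector sits behind a given tool is an internal detail it never sees.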

This architecture has a few properties we care about:

**Isolation.** Each connector runs in its own process. A bug in the Slack connector can't crash the HubSpot connector. A slow response from Salesforce doesn't block requests to Stripe. We can restart, update, or scale individual connectors without touching the rest of the system.

**Independent scaling.** High-traffic connectors (email, calendar, CRM) get more resources. Low-traffic connectors (niche vertical tools) run on minimal infrastructure. The proxy handles load distribution based on real usage patterns.

**Hot deployment.** New connectors go live without restarting the proxy or disconnecting existing agents. The registry is dynamic. When a new connector registers its tools, they become available to all agents on the next tool discovery call.

The proxy also handles cross-connector concerns: request logging, usage metering, error rate tracking, and circuit breaking for unhealthy upstream APIs.
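Circuit breaking is the least self-explanatory item on that list, so here is a minimal sketch of the general pattern. The thresholds, the class name `CircuitBreaker`, and the injectable clock are assumptions for illustration. This is the textbook technique, not a description of our proxy's internals.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the upstream is marked unhealthy."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("upstream marked unhealthy; call skipped")
            # Cool-down elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call re-opens the circuit immediately.
            if self._opened_at is not None or self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


# Usage with a fake clock so the behavior is reproducible.
now = [0.0]
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])


def failing() -> None:
    raise ConnectionError("upstream down")


outcomes = []
for _ in range(3):
    try:
        breaker.call(failing)
    except CircuitOpenError:
        outcomes.append("skipped")
    except ConnectionError:
        outcomes.append("failed")

now[0] = 11.0  # the cool-down has passed
recovered = breaker.call(lambda: "ok")
```

The third call never reaches the upstream API. Failing fast keeps one unhealthy provider from tying up proxy capacity that other connectors need.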

Architecture tradeoffs we chose to live with

Every architecture has costs. Here are the ones we accepted.

**One endpoint, one tool.** Mapping every API endpoint to its own MCP tool means some connectors expose hundreds of tools. The Salesforce connector has 200+ tools. Most agents will never use more than 10-15 of them. We considered curating a smaller set of "important" tools per connector but decided against it. Curation requires human judgment about which endpoints matter, and that judgment varies by use case. Instead, we rely on good tool descriptions and the agent's ability to select relevant tools from a large inventory.

**OpenAPI dependency.** We can only auto-generate connectors for APIs that have OpenAPI specs. Not every API does. Some have GraphQL schemas, some have gRPC protobuf definitions, some have only prose docs. For now, we build non-OpenAPI connectors by hand or by writing minimal OpenAPI specs ourselves. GraphQL and gRPC support is on the roadmap.

**Spec quality variance.** Not all OpenAPI specs are created equal. Some are meticulous, with detailed descriptions, examples, and complete schemas. Others are auto-generated stubs with missing fields and incorrect types. We've built validation and enrichment layers to handle poor-quality specs, but the output quality is still bounded by the input quality. A garbage spec produces a functional but poorly documented connector.

**Auth complexity.** OAuth 2.0 has dozens of implementation variations. Some providers require PKCE. Some use non-standard token endpoints. Some return refresh tokens that expire after a single use. We handle the common patterns automatically, but unusual OAuth flows still require manual configuration. About 15% of our connectors needed at least some custom auth work.
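The single-use refresh token case shows why these variations matter. A minimal sketch of handling it follows. The `TokenSet` type, the `exchange` callable (standing in for the HTTP call to the provider's token endpoint), the 60-second safety margin, and the 3600-second fallback lifetime are all assumptions for illustration. The response fields `access_token`, `refresh_token`, and `expires_in` are the standard OAuth 2.0 ones.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TokenSet:
    access_token: str
    refresh_token: str
    expires_at: float  # absolute time, in seconds


def refresh_if_needed(tokens: TokenSet,
                      exchange: Callable[[str], Dict[str, object]],
                      now: float,
                      skew: float = 60.0) -> TokenSet:
    """Refresh shortly before expiry and keep whatever refresh token comes back."""
    if now < tokens.expires_at - skew:
        return tokens
    response = exchange(tokens.refresh_token)
    return TokenSet(
        access_token=str(response["access_token"]),
        # Rotating providers return a new refresh token and invalidate the old
        # one; others omit the field, in which case the old token stays in use.
        refresh_token=str(response.get("refresh_token", tokens.refresh_token)),
        expires_at=now + float(response.get("expires_in", 3600)),
    )


# A fake provider that rotates the refresh token on every exchange.
used: List[str] = []


def fake_exchange(refresh_token: str) -> Dict[str, object]:
    used.append(refresh_token)
    n = len(used)
    return {"access_token": f"access-{n}", "refresh_token": f"refresh-{n}",
            "expires_in": 3600}


tokens = TokenSet("access-0", "refresh-0", expires_at=1000.0)
tokens = refresh_if_needed(tokens, fake_exchange, now=0.0)     # still valid
tokens = refresh_if_needed(tokens, fake_exchange, now=990.0)   # inside the margin
tokens = refresh_if_needed(tokens, fake_exchange, now=4590.0)  # expired again
```

If the code had kept reusing `refresh-0`, the second exchange would have failed against a rotating provider. Persisting the returned token on every refresh is what makes the flow survive rotation.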

What this means at scale

The DCR changes the economics of integration coverage. The traditional model bills engineering time per connector: weeks of work per API, ongoing maintenance, scaling linearly with the number of integrations. Our model pays a fixed cost for the DCR engine itself, then near-zero marginal cost per additional connector.

Going from 120+ to 200 connectors doesn't require hiring more engineers. It requires finding the next batch of OpenAPI specs. The parsing, schema generation, auth wiring, and deployment are automated. The review and testing cycles are shorter because the generated code follows consistent patterns.

This also changes how quickly we can respond to customer requests. When someone asks for a new integration, the answer isn't "we'll add it to the roadmap." It's "send us the API docs and we'll have it running this week."

The harder question is whether the DCR architecture generalizes beyond our current use case. MCP is still young as a protocol. The ecosystem of MCP clients and servers is growing, but standards for tool discovery, capability negotiation, and cross-server orchestration are still evolving. We've built for the protocol as it exists today, with enough abstraction layers that we can adapt as it matures.

A hundred and twenty connectors in, the engine is working. The math finally makes sense.

Maxime Champoux

CEO & co-founder, Well

Maxime is the CEO and co-founder of Well. He built Well to rebuild finance around AI-native data, not spreadsheets.
