From GraphQL to Zod: Simplifying Arte's API Architecture
Context: When GraphQL Becomes a Burden
When our team first partnered with Arte in 2017, we faced a classic multi-platform challenge. Each Arte platform - mobile apps, connected TVs, web portal, ISP integrations - needed to aggregate data from different APIs. Our solution was elegant: implement GraphQL as a backend-for-frontend (BFF) to centralize data management.
At Marmelab, we’ve found that GraphQL shines when you need flexible data fetching across diverse client needs. Arte’s implementation worked well initially, providing a single entry point that simplified content uniformity across all channels.
However, our architecture had a complexity twist. Not all client platforms could handle GraphQL directly, so we built a REST API layer that internally executed GraphQL calls using GraphQL.js. This hybrid approach supported versioning - older mobile apps continued using v3 endpoints while newer ones moved to v4.
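To make the hybrid concrete, here is a minimal sketch of what a REST handler running a GraphQL query in-process with GraphQL.js can look like. The schema, the `getUserV4` function, and the hard-coded resolver data are hypothetical and far simpler than Arte's actual code.

```typescript
import { buildSchema, graphql } from "graphql";

// Hypothetical schema, for illustration only.
const schema = buildSchema(`
  type User {
    id: String!
    name: String!
    url: String
  }
  type Query {
    user(id: String!): User
  }
`);

// Root resolvers; in a real BFF these would call upstream APIs.
const rootValue = {
  user: ({ id }: { id: string }) => ({ id, name: "Arte", url: undefined }),
};

// What a REST route such as GET /v4/users/:id might do internally:
// execute a fixed query in-process and return its data as plain JSON.
export async function getUserV4(id: string): Promise<unknown> {
  const result = await graphql({
    schema,
    rootValue,
    source: `query ($id: String!) { user(id: $id) { id name url } }`,
    variableValues: { id },
  });
  if (result.errors?.length) {
    throw new Error(result.errors.map((e) => e.message).join("; "));
  }
  return result.data?.user;
}
```

The client only ever sees REST, yet every request still passes through schema validation and query execution, which is the dual paradigm a maintainer has to follow.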
Problem: The Hidden Complexity Tax
After three years at Marmelab, I’ve learned that questioning our own architectural decisions is crucial for long-term sustainability. What initially solved Arte’s data aggregation problem had evolved into something more complex than necessary.
The challenge became clear during code reviews and onboarding sessions: we didn’t really have a dedicated GraphQL API. Instead, we had a REST API that happened to use GraphQL.js internally. This created several pain points:
- Code navigation confusion: Tracing request flows required understanding both REST and GraphQL layers
- Maintenance overhead: Each change demanded expertise in dual paradigms
- Onboarding friction: New team members struggled with the mixed architecture
Our team’s approach to this problem was pragmatic. Rather than defending our original choice, we decided to prototype removing the GraphQL layer entirely.
Solution Journey: Discovering Zod’s Power
We deliberately chose Arte’s most complex endpoint for our proof of concept - the most complex both in its GraphQL schema and in its business logic. The idea was to tackle the hardest case first: if we could successfully migrate the most challenging endpoint, we’d have confidence that the entire migration was feasible.
Removing GraphQL immediately exposed three problems that needed solving:
Runtime Type Validation Challenge
The first hurdle was maintaining type safety beyond TypeScript’s compile-time guarantees. Our BFF aggregates data from external APIs we don’t control, and GraphQL’s schema validation had been protecting us from runtime type mismatches.
# GraphQL schema - explicit nullability
type User {
  id: String!   # Non-null field
  name: String! # Non-null field
  url: String   # Nullable field
}
The problem: if an external API returns null for a field we expected to be non-null, TypeScript can’t help. Its types are erased at compile time, so nothing checks the data at runtime.
This is where Zod became our game-changer:
import { z } from 'zod';

// Zod schema matching GraphQL nullability
const UserSchema = z.object({
  id: z.string(), // Non-null
  name: z.string(), // Non-null
  url: z.string().nullable(), // Nullable
});

// Runtime validation with clear error handling
const validateUser = (data: unknown) => {
  return UserSchema.parse(data); // Throws detailed errors on failure
};
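Throwing is not always what a BFF wants when a single upstream record is malformed. As a minimal sketch, Zod's `safeParse()` returns a result object instead of throwing. The `parseUpstreamUser` helper and the sample payloads below are illustrative assumptions, not Arte's code.

```typescript
import { z } from "zod";

const UserSchema = z.object({
  id: z.string(),
  name: z.string(),
  url: z.string().nullable(),
});

type User = z.infer<typeof UserSchema>;

// Hypothetical helper: returns the validated user, or null after logging why.
function parseUpstreamUser(data: unknown): User | null {
  const result = UserSchema.safeParse(data);
  if (!result.success) {
    // Each issue carries the path of the offending field.
    const fields = result.error.issues.map((issue) => issue.path.join("."));
    console.warn(`Invalid upstream user, bad fields: ${fields.join(", ")}`);
    return null;
  }
  return result.data;
}

// A valid payload passes through; a null name is caught at runtime.
const ok = parseUpstreamUser({ id: "1", name: "Arte", url: null });
const bad = parseUpstreamUser({ id: "1", name: null, url: null });
```

Whether to throw or to degrade gracefully is a per-endpoint decision. The sketch only shows that both options are available from the same schema.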
JSON Serialization Gotchas
The second problem emerged during functional testing. Our REST API responses were missing keys, creating breaking changes for clients. The breakthrough came when we understood Express’s behavior: res.json() uses JSON.stringify() internally, which removes keys with undefined values. GraphQL, by contrast, turns an undefined resolver result into null, so clients had always received those keys.
// The problem
const response = {
  name: "Arte",
  description: undefined, // Disappears in JSON.stringify()
};

// Our Zod solution
const ResponseSchema = z.object({
  name: z.string(),
  // Default to null, as GraphQL did for undefined values
  description: z.string().nullable().default(null),
});
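A small sketch of the effect on the serialized body, using a hypothetical payload in which the upstream description is missing:

```typescript
import { z } from "zod";

const ResponseSchema = z.object({
  name: z.string(),
  description: z.string().nullable().default(null),
});

// Hypothetical payload where an upstream field is undefined.
const raw = { name: "Arte", description: undefined };

// Without the schema, JSON.stringify() drops the key entirely.
const withoutSchema = JSON.stringify(raw); // '{"name":"Arte"}'

// After parsing, the default replaces undefined with an explicit null,
// so the key survives serialization.
const withSchema = JSON.stringify(ResponseSchema.parse(raw));
```

Note that `default(null)` only applies when the input is undefined. An explicit null from upstream is accepted by `nullable()` and passes through unchanged.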
Extra Properties Filtering
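The third issue was the opposite: extra keys appearing in responses. GraphQL naturally eliminates unwanted fields since clients request specific properties. Without it, we needed explicit control over which keys leave the API. A plain z.object() already strips unrecognized keys from its parsed output; strict() goes one step further and rejects the payload, which surfaces upstream changes instead of hiding them.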
// Zod's strict() method solves this elegantly
const StrictUserSchema = z.object({
  id: z.string(),
  name: z.string(),
}).strict(); // Rejects objects with unexpected properties
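A minimal sketch of how a plain `z.object()` and its `strict()` variant treat the same payload with an extra key. The payload and the `internalNote` field are hypothetical.

```typescript
import { z } from "zod";

const UserSchema = z.object({
  id: z.string(),
  name: z.string(),
});

// Hypothetical upstream payload carrying a field we never want to expose.
const upstream = { id: "1", name: "Arte", internalNote: "do not leak" };

// Plain object schema: the unknown key is silently removed from the output.
const stripped = UserSchema.parse(upstream);

// strict(): the same payload fails validation instead of being cleaned.
const strictResult = UserSchema.strict().safeParse(upstream);
```

Stripping keeps responses stable when upstream adds fields, while strict mode turns such additions into visible errors. Which one fits depends on how much the upstream API is trusted.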
Technical Implementation: Advanced Patterns
Working with the most complex endpoint, we encountered sophisticated scenarios that required advanced Zod patterns.
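One challenging case involved polymorphic content types. Initially, we tried Zod’s intersection(), but discovered it doesn’t work with discriminated unions: z.discriminatedUnion() expected each option to be a plain object schema, and an intersection is a different kind of schema.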
// This approach failed
const BaseContent = z.object({
  id: z.string(),
  title: z.string(),
});

const VideoContent = z.object({
  type: z.literal("video"),
  duration: z.number(),
});

// intersection() fails with discriminated unions
const Content = BaseContent.and(VideoContent); // ❌
The key insight was switching to Zod’s merge() method, which returns a regular object schema:
// This works perfectly
const VideoContent = BaseContent.merge(z.object({
  type: z.literal("video"),
  duration: z.number(),
}));

const ArticleContent = BaseContent.merge(z.object({
  type: z.literal("article"),
  wordCount: z.number(),
}));

const ContentSchema = z.discriminatedUnion("type", [
  VideoContent,
  ArticleContent,
]);
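One benefit of the discriminated union is that the inferred TypeScript type narrows on the `type` field. The sketch below repeats the schemas so it stays self-contained; the `describe` function and the sample payloads are hypothetical.

```typescript
import { z } from "zod";

const BaseContent = z.object({ id: z.string(), title: z.string() });

const VideoContent = BaseContent.merge(
  z.object({ type: z.literal("video"), duration: z.number() }),
);
const ArticleContent = BaseContent.merge(
  z.object({ type: z.literal("article"), wordCount: z.number() }),
);

const ContentSchema = z.discriminatedUnion("type", [VideoContent, ArticleContent]);
type Content = z.infer<typeof ContentSchema>;

// Hypothetical formatter: each case only sees the fields of its own variant.
function describe(content: Content): string {
  switch (content.type) {
    case "video":
      return `${content.title} (${content.duration}s)`;
    case "article":
      return `${content.title} (${content.wordCount} words)`;
  }
}

const label = describe(
  ContentSchema.parse({ id: "1", title: "Doc", type: "video", duration: 90 }),
); // "Doc (90s)"

// An unknown discriminator value fails validation.
const invalid = ContentSchema.safeParse({ id: "2", title: "Doc", type: "podcast" });
```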
Team Learnings: Embracing Evolution
This project taught us valuable lessons about architectural pragmatism:
Question established patterns: What worked in 2017 might not be optimal today. At Marmelab, we encourage challenging our own decisions.
Start with complexity: Prototyping the hardest case first gave us confidence the approach would scale.
Tools matter, patterns matter more: Zod proved excellent for validation, but finding the right patterns required experimentation.
Simplicity wins: Removing the GraphQL intermediary reduced cognitive load without sacrificing functionality.
Broader Implications: Choosing the Right Tool
Based on our experience, here’s when each approach makes sense:
Choose GraphQL when:
- You have diverse client types with vastly different data needs
- You’re building a true GraphQL API (not a REST wrapper)
- Query flexibility outweighs architectural simplicity
Choose Zod + REST when:
- You need runtime validation with TypeScript
- Architectural simplicity is a priority
- Your team prefers REST patterns
- You’re building internal APIs with known consumers
Conclusion
This migration aligns with our sustainability goals at Marmelab - sometimes the most environmentally responsible choice is simplification over complexity. Removing the GraphQL intermediary reduced our codebase’s cognitive overhead while maintaining type safety through Zod.
While GraphQL remains excellent for many scenarios, our Arte experience reinforces that architectural decisions should evolve with project needs. The key insight was recognizing we weren’t leveraging GraphQL’s core strengths - we were using it as a validation layer in a REST API.
For teams facing similar decisions, I recommend starting with a focused POC on your most complex endpoint. The confidence gained from proving the approach scales makes broader migration much smoother. Our POC success convinced Arte to proceed with the full migration.
What patterns have you found effective for runtime type validation in TypeScript projects? How do you balance architectural flexibility with maintainability in your APIs?
Authors
Before choosing full-stack development, Cindy was a dentist. No kidding. You can imagine that she's a fast learner.