ByteByteGo · July 15
How Tinder’s API Gateway Handles A Billion Swipes Per Day

 

This article describes how Tinder solved the API management challenges of a large-scale microservices architecture with its in-house API gateway, TAG. Built on Spring Cloud Gateway, TAG provides a flexible configuration-driven architecture, supports independently scalable gateway instances, achieves high-performance routing through pre-configured parsing, and integrates core capabilities such as bot detection, request/response scanning, and session management. The framework resolved the problems of complex third-party gateway integration, limited scalability, and poor developer experience, and has become the core API management framework for Tinder and its sister brands.

💡TAG uses a configuration-driven architecture: routing rules, security policies, and middleware behavior are defined in YAML/JSON files and processed uniformly by the underlying Spring Cloud Gateway engine, separating development from operations and greatly speeding up configuration changes.

🛡️TAG compiles all routing rules and middleware at startup through pre-configured parsing, avoiding per-request dynamic parsing overhead and ensuring low latency and high throughput.

🔄TAG implements full request lifecycle management, with a standardized pipeline of reverse Geo IP lookup, request/response scanning, session management, predicate matching, service discovery, and pre/post filters; custom logic can be inserted flexibly at each stage.

🔗TAG integrates with internal systems through the Envoy service mesh, decoupling service discovery and traffic routing, while global middleware keeps security policies consistent without sacrificing team autonomy.

🚀TAG supports elastic scaling of independent instances: each application team can deploy its own gateway, and shared configuration and plugin reuse satisfy customization needs while avoiding duplicated development effort.

Learn how to apply Cloud SIEM best practices (Sponsored)

This guide outlines a practical approach to monitoring and analyzing security events across cloud platforms—drawing on lessons learned from real-world implementations in AWS, GCP, Azure, Kubernetes, and identity providers.

You’ll learn how to collect and contextualize logs from key sources, detect and investigate suspicious behavior in real time, and apply detection rules based on the MITRE ATT&CK® framework.

Whether you're building out a SIEM strategy or refining an existing one, this guide provides a clear framework for scaling security visibility across modern, distributed systems.

Get the ebook


Disclaimer: The details in this post have been derived from the articles shared online by the Tinder Engineering Team. All credit for the technical details goes to the Tinder Engineering Team.  The links to the original articles and sources are present in the references section at the end of the post. We’ve attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.

API gateways sit at the front line of any large-scale application. They expose services to the outside world, enforce security, and shape how clients interact with the backend. Most teams start with off-the-shelf solutions like AWS API Gateway, Apigee, or Kong. For many use cases, these tools work well, but at a certain point they may no longer be sufficient.

Tinder reached that point sometime around 2020.

Over the years, Tinder scaled to over 500 microservices. These services communicate internally via a service mesh, but external-facing APIs, handling everything from recommendations to matches to payments, needed a unified, secure, and developer-friendly entry point. Off-the-shelf gateways offered power, but not precision. They imposed constraints on configuration, introduced complexity in deployment, and lacked deep integration with Tinder’s cloud stack.

There was also a velocity problem. Product teams push frequent updates to backend services and mobile clients. The gateway needed to keep up. Every delay in exposing a new route or tweaking behavior at the edge slowed down feature delivery.

Then came the bigger concern: security. Tinder operates globally. Real user traffic pours in from over 190 countries. So does bad traffic that includes bots, scrapers, and abuse attempts. The gateway became a critical choke point. It had to enforce strict controls, detect anomalies, and apply protective filters without slowing down legitimate traffic.

The engineering team needed more than an API gateway. It needed a framework that could scale with the organization, integrate deeply with internal tooling, and let teams move fast without compromising safety. 

This is where TAG (Tinder API Gateway) was born. 

Challenges before TAG

Before TAG, gateway logic at Tinder was a patchwork of third-party solutions. Different application teams had adopted different API gateway products, each with its own tech stack, operational model, and limitations. What worked well for one team became a bottleneck for another.

This fragmented setup introduced real friction:

Here’s a glimpse at the complexity of session management across APIs at Tinder before TAG.

At the same time, core features were simply missing or difficult to implement in existing gateways. Some examples were as follows:

These weren’t edge cases. They were daily needs in a fast-moving, global-scale product.

The need was clear: a single, internal framework that let any Tinder team build and operate a secure, high-performance API gateway with minimal friction. 

Limitations of Existing Solutions

Before building TAG, the team evaluated several popular API gateway solutions, including AWS API Gateway, Apigee, Kong, Tyk, KrakenD, and Express Gateway. 

Each of these platforms came with its strengths, but none aligned well with the operational and architectural demands at Tinder.

Several core issues surfaced during evaluation:

What is TAG?

TAG, short for Tinder API Gateway, is a JVM-based framework built on top of Spring Cloud Gateway. 

It isn’t a plug-and-play product or a single shared gateway instance. It’s a gateway-building toolkit. Each application team can use it to spin up its own API gateway instance, tailored to its specific routes, filters, and traffic needs. See the diagram below for reference:

At its core, TAG turns configuration into infrastructure. Teams define their routes, security rules, and middleware behavior using simple YAML or JSON files. TAG handles the rest by wiring everything together behind the scenes using Spring’s reactive engine.
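The exact TAG schema is not public, so the route definition below is illustrative, modeled on Spring Cloud Gateway's standard configuration format (the engine TAG is built on). The service name and filter names are assumptions for the sake of the example:

```yaml
# Illustrative route definition, modeled on Spring Cloud Gateway's
# YAML format; TAG's actual schema may differ.
spring:
  cloud:
    gateway:
      routes:
        - id: recommendations
          uri: http://recs-service          # in practice, resolved via service discovery
          predicates:
            - Path=/v2/recs/**
          filters:
            - name: SessionValidation       # hypothetical custom pre-filter
            - name: RemoveRequestHeader
              args:
                name: X-Internal-Debug
```

A change to routing or middleware behavior then becomes a config change rather than a code change, which is what enables the development/operations separation described above.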

This design unlocks three critical outcomes:

From a developer's perspective, the experience looks like this:

TAG Boot Flow

Most API gateways suffer when the configuration grows large. Routes take time to load. Filters add complexity. Some systems even parse the config on the fly, introducing latency at request time. TAG avoids this entirely by doing the heavy lifting at startup.

Built on Spring Cloud Gateway, TAG extends the default lifecycle with custom components that process all configuration before traffic begins to flow. The result is a fully prepared routing engine that’s ready from the first request.

Here’s how the boot flow works:

This design ensures that routing logic executes with minimal overhead. Every decision has already been made. Every route, predicate, and filter is compiled into the runtime graph.

The trade-off is simple and deliberate: if something is misconfigured, the gateway fails fast at startup instead of failing slowly during production traffic. It enforces correctness early and protects runtime performance.
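The fail-fast idea can be sketched in a few lines of plain Java. This is not TAG's code; it simply shows the principle of compiling every route eagerly at boot so that an invalid entry aborts startup instead of surfacing as request-time latency:

```java
// Illustrative fail-fast startup compilation, not TAG's actual code.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class GatewayBoot {

    record CompiledRoute(String id, Pattern pathPattern, String backend) {}

    // Compile every route eagerly; any invalid pattern fails the whole boot.
    public static List<CompiledRoute> compileRoutes(List<String[]> configs) {
        List<CompiledRoute> routes = new ArrayList<>();
        for (String[] cfg : configs) {   // cfg = {id, pathRegex, backend}
            try {
                routes.add(new CompiledRoute(cfg[0], Pattern.compile(cfg[1]), cfg[2]));
            } catch (PatternSyntaxException e) {
                throw new IllegalStateException(
                    "Gateway refused to start: bad route '" + cfg[0] + "'", e);
            }
        }
        return routes;
    }

    public static void main(String[] args) {
        List<CompiledRoute> routes = compileRoutes(List.of(
            new String[]{"recs", "/v2/recs/.*", "recs-service"},
            new String[]{"matches", "/v2/matches/.*", "match-service"}));
        System.out.println(routes.size() + " routes compiled at startup");
    }
}
```

Once `compileRoutes` returns, request handling never touches the raw configuration again; it only walks the pre-built structures.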

Request Lifecycle in TAG

When a request hits a TAG-powered gateway, it passes through a well-defined pipeline of filters, transformations, and lookups before reaching the backend. This flow is a consistent execution path that gives teams control at each stage.

See the diagram below:

Here’s how TAG handles an incoming request from start to finish:

Reverse Geo IP Lookup (RGIL)

The first step is geolocation. TAG applies a global filter that maps the client’s IP address to a three-letter ISO country code. This lightweight check powers:

The filter runs before any route matching, ensuring even invalid or blocked paths can be stopped early.
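A minimal sketch of the idea, under stated assumptions: the prefix table and blocked-country set below are toy values (a real gateway would query a GeoIP database), and this is not Tinder's implementation. It shows how an IP-to-country lookup can gate traffic before route matching:

```java
// Illustrative reverse-geo-IP filter, not Tinder's implementation.
import java.util.Map;
import java.util.Set;

public class GeoIpFilter {

    // Toy prefix table; a real gateway would query a GeoIP database.
    private static final Map<String, String> PREFIX_TO_COUNTRY = Map.of(
        "203.0.", "AUS",
        "198.51.", "USA",
        "192.0.", "GBR");

    // Hypothetical blocklist of ISO alpha-3 country codes.
    private static final Set<String> BLOCKED = Set.of("XXX");

    public static String lookupCountry(String ip) {
        for (Map.Entry<String, String> e : PREFIX_TO_COUNTRY.entrySet()) {
            if (ip.startsWith(e.getKey())) return e.getValue();
        }
        return "UNK";   // unknown
    }

    // Returns true if the request may proceed to route matching.
    public static boolean allow(String ip) {
        return !BLOCKED.contains(lookupCountry(ip));
    }

    public static void main(String[] args) {
        System.out.println("203.0.113.7 -> " + lookupCountry("203.0.113.7"));
    }
}
```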

Request and Response Scanning

TAG captures request and response schemas, not full payloads. This happens through a global, asynchronous filter that publishes events to Amazon MSK (Kafka). 

The data stream enables:

The filter works off the main thread, avoiding impact on request latency.
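The "schemas, not payloads" distinction can be made concrete with a small sketch. This is an assumption-laden illustration, not TAG's code: it reduces a request body to field names and type names, discarding all values, which is the kind of record that could then be published asynchronously to Kafka (the MSK publishing step is omitted here):

```java
// Illustrative schema capture: only field names and types are recorded,
// never the payload values. Not TAG's actual implementation.
import java.util.Map;
import java.util.TreeMap;

public class SchemaScanner {

    // Reduce a request body to {field -> type name}; values are discarded.
    public static Map<String, String> extractSchema(Map<String, Object> body) {
        Map<String, String> schema = new TreeMap<>();
        for (Map.Entry<String, Object> e : body.entrySet()) {
            Object v = e.getValue();
            schema.put(e.getKey(), v == null ? "null" : v.getClass().getSimpleName());
        }
        return schema;
    }

    public static void main(String[] args) {
        Map<String, String> schema = extractSchema(Map.of(
            "userId", 12345L, "bio", "hello", "verified", true));
        System.out.println(schema);   // field names and types only, no values
    }
}
```

Because only the shape of the traffic is emitted, the stream is useful for API cataloging and anomaly detection without leaking user data.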

Session Management

A centralized global filter handles session validation and updates, ensuring that session logic stays consistent across all gateways and services. 

There is no per-service session drift or duplicated logic.

Predicate Matching

Once preliminary filters are complete, TAG matches the request path to a configured route using Spring Cloud Gateway’s predicate engine. 

If no match is found, the request is rejected early.
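Conceptually, predicate matching is a scan over predicates that were built once at startup. The sketch below is illustrative (route IDs and path rules are invented), not Spring Cloud Gateway's actual predicate engine:

```java
// Illustrative predicate matching: a request path is tested against
// pre-built route predicates; with no match, the request is rejected
// before it ever reaches a backend.
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

public class RouteMatcher {

    record Route(String id, Predicate<String> pathPredicate) {}

    // Predicates are constructed once (at startup in TAG); matching is cheap.
    static final List<Route> ROUTES = List.of(
        new Route("recs",    p -> p.startsWith("/v2/recs/")),
        new Route("matches", p -> p.startsWith("/v2/matches/")));

    public static Optional<String> match(String path) {
        return ROUTES.stream()
                     .filter(r -> r.pathPredicate().test(path))
                     .map(Route::id)
                     .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(match("/v2/recs/core").orElse("rejected"));
        System.out.println(match("/unknown").orElse("rejected"));
    }
}
```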

Service Discovery

With the route identified, TAG uses Envoy's service mesh to resolve the correct backend service. This approach decouples routing from fixed IPs or static service lists.

Pre-Filters

Before forwarding the request, TAG applies any pre-filters defined for that route. These can include:

Pre-filters run in a defined sequence, determined by the configuration.

Post-Filters

After the backend service responds, TAG processes the output through post-filters. These often include:

Again, execution order is configurable.

Final Response

Once all post-filters complete, the final response is sent back to the client. No surprises, no side effects.

Every filter (pre, post, global, or custom) follows a strict execution order. Developers can:

This predictability is what makes TAG maintainable under load. 
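The ordered pre/post pipeline described above can be sketched as a simple chain of transformations. This is a conceptual model only, not Spring Cloud Gateway's `GlobalFilter` API; the filter names in the comments are examples of the kinds of filters the article mentions:

```java
// Illustrative ordered filter chain: pre-filters run in a fixed,
// configured sequence before the backend call, post-filters after it.
import java.util.List;
import java.util.function.UnaryOperator;

public class FilterChainDemo {

    static final List<UnaryOperator<String>> PRE_FILTERS = List.of(
        req -> req + " +auth",          // e.g. session/token validation
        req -> req + " +trimHeaders");  // e.g. strip internal headers

    static final List<UnaryOperator<String>> POST_FILTERS = List.of(
        resp -> resp + " +mapStatus",   // e.g. error-code mapping
        resp -> resp + " +addHeaders"); // e.g. response header injection

    public static String handle(String request, UnaryOperator<String> backend) {
        String r = request;
        for (UnaryOperator<String> f : PRE_FILTERS) r = f.apply(r);   // fixed order
        String resp = backend.apply(r);
        for (UnaryOperator<String> f : POST_FILTERS) resp = f.apply(resp);
        return resp;
    }

    public static void main(String[] args) {
        System.out.println(handle("GET /v2/recs", req -> "200 OK"));
    }
}
```

Because the order is declared in one place, reasoning about "what ran before my filter" never requires reading another team's code.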

Conclusion

TAG has become the standard API gateway framework across Tinder. Instead of relying on one centralized gateway instance, each application team deploys its own TAG instance with application-specific configurations. This model gives teams autonomy while preserving consistency in how routes are defined, traffic is filtered, and security is enforced.

Every TAG instance scales independently, making it easy to adapt to changes in traffic patterns, feature launches, or business priorities. TAG now powers both B2C and B2B traffic, not just for Tinder, but also for other Match Group brands like Hinge, OkCupid, PlentyOfFish, and Ship. 

See the visualization below for this capability.

TAG’s design unlocks several long-term advantages:

Beyond current production use, TAG lays the foundation for future initiatives that require visibility and control at the API layer.

The lesson here isn’t that every company needs to build a custom gateway. The lesson is that at a certain scale, flexibility, consistency, and performance can’t be solved with off-the-shelf tools alone. TAG works because it’s deeply shaped by how Tinder builds, deploys, and defends its software, without compromising developer velocity or operational clarity.

References:


SPONSOR US

Get your product in front of more than 1,000,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing sponsorship@bytebytego.com.
