Zuul - Netflix API Gateway that handles 2M requests per second
How do we effectively route 2M requests per second to hundreds of backend services? How do we protect those backend services from constant DDoS attacks and retry storms? How do we keep Netflix humming along when an Amazon Web Services outage takes an entire AWS region out?
Meet Zuul - Netflix's Swiss Army knife for application networking. Zuul is Netflix's API gateway, load balancer, reverse proxy and more. It fronts all of the API traffic entering the Netflix cloud and routes it to many backend services. It shields these backend services from retry storms, DDoS attacks and other outages, shifting traffic across AWS regions if necessary. It load balances millions of requests per second across thousands of machines, routing them intelligently and automatically around faulty instances.
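To make "routing around faulty instances" concrete, here is a minimal sketch in Python of a round-robin balancer that skips instances marked unhealthy. It illustrates the general idea only and is not Zuul's implementation; the class name, method names and IP addresses are all hypothetical.

```python
class RoundRobinBalancer:
    """Hypothetical balancer: rotates over instances, skipping unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.unhealthy = set()
        self._next = 0  # rotation cursor into self.instances

    def mark_unhealthy(self, instance):
        self.unhealthy.add(instance)

    def mark_healthy(self, instance):
        self.unhealthy.discard(instance)

    def choose(self):
        # Try each instance at most once, starting from the rotation cursor.
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate not in self.unhealthy:
                return candidate
        raise RuntimeError("no healthy instances available")


# Made-up addresses; the second instance is treated as faulty.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
balancer.mark_unhealthy("10.0.0.2")
picks = [balancer.choose() for _ in range(4)]
print(picks)
```

A production balancer would typically derive health from observed errors and timeouts rather than explicit calls, but the skipping logic is the same in spirit.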
In addition to this mission-critical role in live production, Zuul is also invaluable during development and test to debug, canary and load test new features. Zuul lets us define new routing rules dynamically at run time that take effect almost instantly, without any deployments or restarts. This ability to slice, dice and reroute a portion of traffic quickly lets Zuul do all sorts of tricks, such as canary and load testing, surgical traffic debugging for a single customer or device, and black-holing malicious traffic.
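The dynamic routing described above can be pictured as an ordered rule table that is consulted on every request and can be changed while the proxy is running. The Python sketch below is an illustration under that assumption, not Zuul's actual rule or filter API; `Router`, `in_canary_bucket`, the backend names and the request fields are all hypothetical.

```python
import hashlib

BLACKHOLE = None  # hypothetical sentinel meaning "drop this request"


class Router:
    """Hypothetical rule table: the first matching rule wins."""

    def __init__(self, default_backend):
        self.default_backend = default_backend
        self.rules = []  # list of (predicate, backend) pairs

    def add_rule(self, predicate, backend):
        # Rules are read on every request, so a new rule applies immediately.
        self.rules.append((predicate, backend))

    def route(self, request):
        for predicate, backend in self.rules:
            if predicate(request):
                return backend
        return self.default_backend


def in_canary_bucket(request, percent):
    # Hash the device id so a given device always lands in the same bucket.
    digest = hashlib.sha256(request["device_id"].encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100 < percent


router = Router("api-prod")
# Rules added at run time, with no restart:
router.add_rule(lambda r: r["client_ip"] == "203.0.113.9", BLACKHOLE)    # black-hole one source
router.add_rule(lambda r: r["device_id"] == "tv-123", "api-debug")       # debug a single device
router.add_rule(lambda r: in_canary_bucket(r, 5), "api-canary")          # about 5% canary

print(router.route({"device_id": "tv-123", "client_ip": "198.51.100.7"}))
```

Ordering matters in this sketch: the black-hole rule is listed first so that malicious traffic is dropped before any other rule can match it.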
At its heart, Zuul is a high-performance, non-blocking reverse proxy and layer 7 load balancer built on top of Netty. It supports multiple protocols, including HTTP 1.0, HTTP 1.1, HTTP/2, WebSockets and Server-Sent Events. It also offers flexible, configurable transport layer security, including TLS and mTLS, as well as application layer security, such as Netflix-specific security protocols like MSL.
We will cover:
- Zuul concept - where Zuul fits in the overall Netflix architecture.
- Zuul design - the architecture of Zuul and its fundamental components.
- Zuul operation - how we operate 80+ Zuul clusters to proxy traffic to hundreds of backends.
- Zuul future - where we are going next with Zuul.