What is OCI Load Balancing?
- A service that provides automated traffic distribution from one entry point to multiple servers reachable from your VCN.
- The service offers a load balancer with your choice of a public or private IP address, & provisioned bandwidth.
- A load balancer improves resource utilization, facilitates scaling, & helps ensure HA.
- You can configure multiple load balancing policies & app-specific health checks to ensure that the load balancer directs traffic only to healthy instances.
- The LB can reduce your maintenance window by draining traffic from an unhealthy app server before you remove it from service for maintenance.
How Load Balancing Works
- The Load Balancing service enables you to create a public or private load balancer within your VCN.
- A public LB has a public IP that is accessible from the internet.
- A private LB has an IP address from the hosting subnet, which is visible only within your VCN.
- You can configure multiple listeners for an IP to load balance transport Layer 4 and Layer 7 (TCP & HTTP) traffic.
- Both public & private LBs can route data traffic to any backend server that is reachable from the VCN.
Public LB
- To accept traffic from the internet, you create a public LB.
- The service assigns it a public IP that serves as the entry point for incoming traffic.
- You can associate the public IP with a friendly DNS name through any DNS vendor.
- A public LB is regional.
- If your region has multiple ADs, a public LB requires either a regional subnet (recommended) or 2 AD-specific subnets, each in a separate AD.
- With a regional subnet, the Load Balancing service creates a primary LB & a standby LB, each in a different AD, to ensure accessibility even during an AD outage.
- If you create an LB in 2 AD-specific subnets, 1 subnet hosts the primary LB & the other hosts a standby LB.
- If the primary LB fails, the public IP switches to the standby LB.
- The service treats the two LBs as equivalent & you cannot specify which one is primary.
- Whether you use regional or AD-specific subnets, each LB requires one private IP from its host subnet.
- The Load Balancing service supplies a floating public IP to the primary LB.
- The floating public IP does not come from your backend subnets.
- If your region includes only one AD, the service requires just 1 subnet, either regional or AD-specific, to host both the primary & standby LBs.
- The primary & standby LBs each require a private IP from the host subnet, in addition to the assigned floating public IP.
- In a single-AD region, the LB has no failover if there is an AD outage.
- You cannot specify a private subnet for your public LB.
Private LB
- To isolate your LB from the internet & simplify your security posture, you can create a private LB.
- The Load Balancing service assigns it a private IP that serves as the entry point for incoming traffic.
- When you create a private LB, the service requires only 1 subnet to host both the primary & standby load balancers.
- The LB can be regional or AD-specific, depending on the scope of the host subnet.
- The LB is accessible only from within the VCN that contains the host subnet, or as further restricted by your security rules.
- The assigned floating private IP is local to the host subnet.
- The primary & standby LBs each require an extra private IP from the host subnet.
- If there is an AD outage:
- A private LB created in a regional subnet in a multi-AD region provides failover.
- A private LB created in an AD-specific subnet, or in a regional subnet in a single AD region, has no failover.
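The failover rules for a private LB can be written as a small predicate. This is only a sketch of the rules listed above: the function name and the "regional" / "ad-specific" labels are made up for illustration and are not part of any OCI API.

```python
def private_lb_has_failover(subnet_scope: str, ads_in_region: int) -> bool:
    """Return True if a private LB has failover during an AD outage.

    subnet_scope: "regional" or "ad-specific" (labels chosen for this sketch).
    ads_in_region: number of availability domains in the region.
    """
    if subnet_scope not in ("regional", "ad-specific"):
        raise ValueError(f"unknown subnet scope: {subnet_scope!r}")
    if ads_in_region < 1:
        raise ValueError("a region has at least one AD")
    # Per the rules above, only a regional subnet in a multi-AD region
    # provides failover.
    return subnet_scope == "regional" and ads_in_region > 1
```

For example, `private_lb_has_failover("regional", 3)` is true, while an AD-specific subnet or a single-AD region gives false.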
All LBs
- Your LB has a backend set to route incoming traffic to your Compute instances.
- The backend set is a logical entity that includes:
- A list of backend servers.
- A load balancing policy.
- A health check policy.
- Optional SSL handling.
- Optional session persistence configuration.
- The backend servers (Compute instances) associated with a backend set can exist anywhere, as long as the associated NSGs, security lists, & route tables allow the intended traffic flow.
- If your VCN uses NSGs, you can associate your LB with an NSG.
- If you prefer to use security lists, the Load Balancing service can suggest appropriate security list rules.
- Oracle recommends that you create your LB in a regional subnet.
- Oracle recommends that you distribute your backend servers across all ADs within the region.
Load Balancing Concepts
- Backend Server:
- An app server responsible for generating content in reply to the incoming TCP or HTTP traffic.
- You typically identify app servers with a unique combination of private IP & port, for example 10.10.10.1:8080 & 10.10.10.2:8080.
- Backend Set:
- A logical entity defined by a list of backend servers, a load balancing policy, & a health check policy.
- SSL configuration is optional.
- The backend set determines how the LB directs traffic to the collection of backend servers.
- Certificates:
- If you use HTTPS or SSL for your listener, you must associate an SSL server certificate (X.509) with your LB.
- A certificate enables the LB to terminate the connection & decrypt incoming requests before passing them to the backend servers.
- Health Check:
- A health check is a test to confirm the availability of backend servers.
- A health check can be a request or a connection attempt.
- Based on a time interval you specify, the LB applies the health check policy to continuously monitor backend servers.
- If a server fails the health check, the LB takes the server temporarily out of rotation.
- If the server subsequently passes the health check, the LB returns it to the rotation.
- You configure your health check policy when you create a backend set.
- You can configure TCP-level or HTTP-level health checks for your backend servers:
- TCP-level health checks attempt to make a TCP connection with the backend servers & validate the response based on the connection status.
- HTTP-level health checks send requests to the backend servers at a specific URI & validate the response based on the status code or entity data (body) returned.
- The service provides app-specific health check capabilities to help you increase availability & reduce your app maintenance window.
- Health Status:
- An indicator that reports the general health of your LBs & their components.
- Listener:
- A logical entity that checks for incoming traffic on the LB’s IP.
- You configure a listener’s protocol & port, & optional SSL.
- To handle TCP, HTTP, & HTTPS traffic, you must configure multiple listeners.
- Supported protocols — TCP, HTTP/1.0, HTTP/1.1.
- Load Balancing Policy:
- A load balancing policy tells the load balancer how to distribute incoming traffic to the backend servers.
- Common load balancing policies include:
- Round robin.
- Least connections.
- IP hash.
- Path Route Set:
- A set of path route rules to route traffic to the correct backend set without using multiple listeners or LBs.
- Session Persistence:
- A method to direct all requests originating from a single logical client to a single backend server.
- Shape:
- A template that determines the LB’s total pre-provisioned maximum capacity (bandwidth) for ingress + egress traffic.
- Available shapes — 10 Mbps (always free), 100 Mbps, 400 Mbps, & 8 Gbps.
- SSL — You can apply the following 3 SSL configurations to your LB:
- SSL Termination — The LB handles incoming SSL traffic & passes the unencrypted request to a backend server.
- End-to-End SSL — The LB terminates the SSL connection with an incoming traffic client, & initiates another SSL connection to a backend server.
- SSL Tunneling — If you configure the LB’s listener for TCP, the LB tunnels incoming SSL connections to your app servers.
- Load Balancing supports TLS 1.2 with a default setting of strong cipher strength.
- Virtual Hostname:
- A virtual server name applied to a listener to enhance request routing.
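The backend set and health check concepts above can be modeled in a few lines. This is a minimal sketch, not the service's implementation: the class names, the `policy` label, and the rule that a single failed check removes a server from rotation are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Backend:
    ip: str
    port: int
    healthy: bool = True  # assumed healthy until a check says otherwise


@dataclass
class BackendSet:
    policy: str  # illustrative label, e.g. "ROUND_ROBIN"
    backends: List[Backend] = field(default_factory=list)

    def record_checks(self, results: Dict[Tuple[str, int], bool]) -> None:
        """Store the latest pass/fail result for each checked backend."""
        for backend in self.backends:
            key = (backend.ip, backend.port)
            if key in results:
                backend.healthy = results[key]

    def in_rotation(self) -> List[Backend]:
        """Backends currently eligible to receive traffic."""
        return [b for b in self.backends if b.healthy]


def http_check_passes(status: int, body: str, expected_status: int = 200,
                      body_pattern: Optional[str] = None) -> bool:
    """Validate an HTTP-level check by status code and, optionally, body."""
    if status != expected_status:
        return False
    return body_pattern is None or re.search(body_pattern, body) is not None


pool = BackendSet("ROUND_ROBIN",
                  [Backend("10.10.10.1", 8080), Backend("10.10.10.2", 8080)])
# The second server answers 503, so it leaves the rotation.
pool.record_checks({("10.10.10.2", 8080): http_check_passes(503, "")})
```

After the failed check, `pool.in_rotation()` contains only 10.10.10.1:8080. Recording a later passing result for 10.10.10.2:8080 returns it to the rotation.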
Load Balancing Limits
- You cannot dynamically change the LB shape to handle more incoming traffic.
- You cannot convert an AD-specific LB to a regional LB or vice-versa.
- Max concurrent connections are limited when you use stateful security rules for your LB subnets.
- To accommodate high-volume traffic, use stateless security rules for your LB subnets.
- Each LB has the following configuration limits:
- 1 IP
- 16 backend sets
- 512 backend servers per backend set
- 1024 backend servers total
- 16 listeners
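A planned configuration can be checked against the per-LB limits above before you create anything. The sketch below hard-codes the numbers from this list; the function name and its inputs (a list of backend set sizes and a listener count) are assumptions for illustration, and actual limits may differ for your tenancy.

```python
from typing import List

# Limits copied from the list above.
MAX_BACKEND_SETS = 16
MAX_BACKENDS_PER_SET = 512
MAX_BACKENDS_TOTAL = 1024
MAX_LISTENERS = 16


def limit_violations(backend_set_sizes: List[int], listener_count: int) -> List[str]:
    """Return a description of each limit the planned configuration exceeds."""
    problems = []
    if len(backend_set_sizes) > MAX_BACKEND_SETS:
        problems.append(f"{len(backend_set_sizes)} backend sets (max {MAX_BACKEND_SETS})")
    for index, size in enumerate(backend_set_sizes):
        if size > MAX_BACKENDS_PER_SET:
            problems.append(f"backend set {index} has {size} servers (max {MAX_BACKENDS_PER_SET})")
    total = sum(backend_set_sizes)
    if total > MAX_BACKENDS_TOTAL:
        problems.append(f"{total} backend servers in total (max {MAX_BACKENDS_TOTAL})")
    if listener_count > MAX_LISTENERS:
        problems.append(f"{listener_count} listeners (max {MAX_LISTENERS})")
    return problems
```

An empty list means the plan fits. Two backend sets of 512 servers each pass, but adding a third set with one server exceeds the total of 1024.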
How Load Balancing Policies Work
- A TCP LB considers policy & weightage to direct an initial incoming request to a backend server.
- All subsequent packets on this connection go to the same endpoint.
- An HTTP LB configured for cookie-based session persistence forwards requests to the backend server specified by the cookie’s session information.
- For non-sticky HTTP requests, the LB applies policy & weightage to every incoming request & determines an appropriate backend server.
- Multiple requests from the same client could be directed to different servers.
Round Robin
- Round Robin is the default LB policy.
- This policy distributes incoming traffic sequentially to each server in a backend set list.
- After each server has received a connection, the load balancer repeats the list in the same order.
- Round Robin is a simple load balancing algorithm.
- It works best when all the backend servers have similar capacity & the processing load required by each request does not vary significantly.
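The sequential behavior described above can be sketched with a cycling iterator. The weighted variant is an assumption about how weights might be honored (a server with weight 2 simply appears twice in the order); the service's actual weighting scheme is not described here.

```python
from itertools import cycle, islice
from typing import Dict, Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Yield backends in list order, repeating the list forever."""
    return cycle(backends)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Repeat each backend `weight` times per pass (illustrative scheme)."""
    order = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(order)


servers = ["10.10.10.1:8080", "10.10.10.2:8080", "10.10.10.3:8080"]
first_five = list(islice(round_robin(servers), 5))
# first_five wraps around after the third server:
# ['10.10.10.1:8080', '10.10.10.2:8080', '10.10.10.3:8080',
#  '10.10.10.1:8080', '10.10.10.2:8080']
```

Because the order never adapts to load, a slow server still receives its full share of connections, which is why the policy suits pools of similar capacity.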
Least Connections
- The Least Connections policy routes incoming non-sticky request traffic to the backend server with the fewest active connections.
- This policy helps you maintain an equal distribution of active connections with backend servers.
- As with the round robin policy, you can assign a weight to each backend server & further control traffic distribution.
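A minimal sketch of the selection step: pick the server with the fewest active connections. Dividing the count by the server's weight is an assumption made here to show how weights could shift the choice; it is not a documented formula.

```python
from typing import Dict, Optional


def pick_least_connections(active: Dict[str, int],
                           weights: Optional[Dict[str, int]] = None) -> str:
    """Return the backend with the lowest active-connection count.

    With weights, the count is divided by the weight (default 1), so a
    heavier server is allowed proportionally more connections. On a tie,
    the first backend in the mapping wins.
    """
    weights = weights or {}
    return min(active, key=lambda server: active[server] / weights.get(server, 1))


active = {"10.10.10.1:8080": 4, "10.10.10.2:8080": 2, "10.10.10.3:8080": 7}
unweighted_choice = pick_least_connections(active)
weighted_choice = pick_least_connections(active, {"10.10.10.3:8080": 4})
```

Without weights the choice is 10.10.10.2:8080 (2 connections). With a weight of 4 on 10.10.10.3:8080, its adjusted load is 7 / 4 = 1.75, so it is chosen instead.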
IP Hash
- The IP Hash policy uses an incoming request’s source IP as a hashing key to route non-sticky traffic to the same backend server.
- The LB routes requests from the same client to the same backend server as long as that server is available.
- This policy honors server weightage when establishing the initial connection.
- IP Hash ensures that requests from a particular client are always directed to the same backend server, as long as it is available.
- You cannot add a backend server marked as Backup to a backend set that uses the IP Hash policy.
- Multiple clients that connect to an LB through a proxy or NAT router appear to have the same IP. If you apply the IP Hash policy to your backend set, the LB routes traffic based on the incoming IP & sends these proxied client requests to the same backend server. If the proxied client pool is large, the requests could flood a backend server.
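The idea behind IP Hash, including the proxy/NAT caveat, can be illustrated with a generic hash-and-modulo scheme. The hash function used here (SHA-256 over the packed address) is an assumption for this sketch; the service's actual hashing method is not stated in these notes, and weights are ignored.

```python
import hashlib
import ipaddress
from typing import List


def pick_by_ip_hash(source_ip: str, backends: List[str]) -> str:
    """Map a source IP to a backend so the same IP always gets the same server."""
    packed = ipaddress.ip_address(source_ip).packed  # also validates the address
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["10.10.10.1:8080", "10.10.10.2:8080", "10.10.10.3:8080"]

# Three different users behind one NAT gateway all present the same source IP,
# so every one of their requests lands on the same backend.
nat_ip = "203.0.113.7"
nat_targets = {pick_by_ip_hash(nat_ip, backends) for _ in range(3)}
```

`nat_targets` always contains exactly one backend, which shows how a large proxied client pool can concentrate load on a single server.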
Connection Management
- LBs support connection multiplexing.
- The LB can route many incoming requests from multiple clients to the destination backend server through a small number of backend connections (one or more).
- After your LB connects a client to a backend server, the connection can be closed due to inactivity.
- You can configure LB listeners to control the max idle time allowed during each TCP connection or HTTP request & response pair.
- Oracle recommends that you do not allow your backend servers to close connections to the load balancer.
- 3 timeout settings affect your LB’s behavior:
- Keep-alive setting between the LB & backend server.
- The LB closes backend server connections that are idle for more than 300 seconds (= 5 minutes).
- The load balancing service does not honor keep-alive settings from backend servers.
- To prevent possible 502 errors, ensure that your backend servers do not close idle connections in less than 310 seconds.
- Keep-alive setting between the LB & the client.
- The Load Balancing service sets this keep-alive value to maintain the connection for 10,000 transactions or until it has been idle for 65 seconds, whichever occurs first. You cannot change this value.
- Idle timeout.
- You can set the duration of the idle timeout when you create a listener.
- This setting applies to the time allowed between 2 successive receive or send network input/output operations during the HTTP request-response phase.
- If the configured timeout has elapsed with no packets sent or received, the client’s connection is closed.
- For HTTP & WebSocket connections, a send operation does not reset the timer for receive operations & a receive operation does not reset the timer for send operations.
- This timeout setting does not apply to idle time between a completed response & a subsequent HTTP request.
- The default timeout values are:
- 300 seconds for TCP listeners.
- 60 seconds for HTTP listeners.
- Modify this timeout if either the client or the backend server requires more time to transmit data, such as:
- The client sends a database query to the backend server & the database takes over 300 seconds to execute.
- The client uploads data using the HTTP protocol. During the upload, the backend does not transmit any data to the client for more than 60 seconds.
- The client downloads data using the HTTP protocol. After the initial request, it stops transmitting data to the backend server for more than 60 seconds.
- The client starts transmitting data after establishing a WebSocket connection, but the backend server does not transmit data for more than 60 seconds.
- The backend server starts transmitting data after establishing a WebSocket connection, but the client does not transmit data for more than 60 seconds.
- The maximum timeout value is 7200 seconds. Contact My Oracle Support to file a service request if you want to increase this limit for your tenancy.
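The timeout values above can be collected into a small helper that picks a listener idle timeout and checks a backend's keep-alive setting. The function names are hypothetical and the lower bound of 1 second is an assumption; only the 300, 60, 7200, and 310 second figures come from the notes.

```python
from typing import Optional

DEFAULT_IDLE_TIMEOUT = {"TCP": 300, "HTTP": 60}  # seconds, per listener protocol
MAX_IDLE_TIMEOUT = 7200                          # seconds, default tenancy maximum
MIN_BACKEND_IDLE_CLOSE = 310                     # seconds, to avoid 502 errors


def effective_idle_timeout(protocol: str, requested: Optional[int] = None) -> int:
    """Return the listener idle timeout: the request if valid, else the default."""
    protocol = protocol.upper()
    if protocol not in DEFAULT_IDLE_TIMEOUT:
        raise ValueError(f"unsupported listener protocol: {protocol}")
    if requested is None:
        return DEFAULT_IDLE_TIMEOUT[protocol]
    if not 1 <= requested <= MAX_IDLE_TIMEOUT:
        raise ValueError(f"idle timeout must be 1..{MAX_IDLE_TIMEOUT} seconds")
    return requested


def backend_keepalive_ok(backend_idle_close_seconds: int) -> bool:
    """True if the backend keeps idle connections open long enough.

    The LB closes idle backend connections after 300 seconds, so the backend
    should not close them in less than 310 seconds.
    """
    return backend_idle_close_seconds >= MIN_BACKEND_IDLE_CLOSE
```

For a database query that can run for ten minutes behind a TCP listener, `effective_idle_timeout("tcp", 900)` returns 900, while a backend web server that closes idle connections after 75 seconds fails `backend_keepalive_ok(75)`.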
HTTP “X-” Headers
The Load Balancing service adds or modifies the following X- headers when it passes requests to your servers:
- X-Forwarded-For
- Provides a list of connection IP addresses.
- The LB appends the last remote peer address to the X-Forwarded-For field from the incoming request.
- A comma & space precede the appended address.
- If the client request header does not include an X-Forwarded-For field, this value is equal to the X-Real-IP value.
- The original requesting client is the first (left-most) IP address in the list, assuming that the incoming field content is trustworthy.
- The last address is the last (most recent) peer, that is, the machine from which the LB received the request.
X-Forwarded-For: 202.1.112.187, 192.168.0.10
- X-Forwarded-Host
- Identifies the original host & port requested by the client in the Host HTTP request header.
- This header helps you determine the original host, since the hostname or port of the reverse proxy (LB) might differ from the original server handling the request.
X-Forwarded-Host: www.oracle.com:8080
- X-Forwarded-Port
- Identifies the listener port number that the client used to connect to the load balancer.
X-Forwarded-Port: 443
- X-Forwarded-Proto
- Identifies the protocol that the client used to connect to the LB, either http or https.
X-Forwarded-Proto: https
- X-Real-IP
- Identifies the client’s IP.
- For the Load Balancing service, the “client” is the last remote peer.
- Your LB intercepts traffic between the client & your server.
- Your server’s access logs, therefore, include only the LB’s IP.
- The X-Real-IP header provides the client’s IP address.
X-Real-IP: 192.168.0.10
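The X-Forwarded-For rules above (append the last peer after a comma and space; read the left-most entry as the original client) can be sketched as follows. The function names are hypothetical, and the left-most entry is only as trustworthy as the incoming header.

```python
from typing import List, Optional


def append_forwarded_for(incoming: Optional[str], peer_ip: str) -> str:
    """Build the outgoing X-Forwarded-For value the way the LB is described to.

    With no incoming header, the value is just the peer address, which is
    the same as X-Real-IP.
    """
    if incoming:
        return f"{incoming}, {peer_ip}"
    return peer_ip


def forwarded_chain(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its addresses, left to right."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def original_client(header_value: str) -> str:
    """Left-most address: the original client, if the header can be trusted."""
    return forwarded_chain(header_value)[0]


def last_peer(header_value: str) -> str:
    """Right-most address: the machine the LB received the request from."""
    return forwarded_chain(header_value)[-1]


header = append_forwarded_for("202.1.112.187", "192.168.0.10")
# header == "202.1.112.187, 192.168.0.10"
```

On a backend server, `original_client(header)` gives 202.1.112.187 and `last_peer(header)` gives 192.168.0.10, matching the example header values above.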