Load Balancing in Oracle Cloud Infrastructure

What is OCI Load Balancing?

  • A service that provides automated traffic distribution from one entry point to multiple servers reachable from your virtual cloud network (VCN).
  • The service offers a load balancer (LB) with your choice of a public or private IP address & provisioned bandwidth.
  • A load balancer improves resource utilization, facilitates scaling, & helps ensure high availability (HA).
  • You can configure multiple load balancing policies & app-specific health checks to ensure that the load balancer directs traffic only to healthy instances.
  • The LB can reduce your maintenance window by draining traffic from an unhealthy app server before you remove it from service for maintenance.

How Load Balancing Works

  • The Load Balancing service enables you to create a public or private load balancer within your VCN.
  • A public LB has a public IP that is accessible from the internet.
  • A private LB has an IP address from the hosting subnet, which is visible only within your VCN.
  • You can configure multiple listeners for an IP address to load balance Layer 4 (TCP) & Layer 7 (HTTP) traffic.
  • Both public & private LBs can route data traffic to any backend server that is reachable from the VCN.

Public LB

  • To accept traffic from the internet, you create a public LB.
  • The service assigns it a public IP that serves as the entry point for incoming traffic.
  • You can associate the public IP with a friendly DNS name through any DNS vendor.
  • A public LB is regional.
  • If your region has multiple availability domains (ADs), a public LB requires either a regional subnet (recommended) or 2 AD-specific subnets, each in a separate AD.
  • With a regional subnet, the Load Balancing service creates a primary LB & a standby LB, each in a different AD, to ensure accessibility even during an AD outage.
  • If you create an LB in 2 AD-specific subnets, 1 subnet hosts the primary LB & the other hosts a standby LB.
  • If the primary LB fails, the public IP switches to the standby LB.
  • The service treats the two LBs as equivalent & you cannot specify which one is primary.
  • Whether you use regional or AD-specific subnets, each LB requires one private IP from its host subnet.
  • The Load Balancing service supplies a floating public IP to the primary LB.
  • The floating public IP does not come from your backend subnets.
  • If your region includes only one AD, the service requires just 1 subnet, either regional or AD-specific, to host both the primary & standby LBs.
  • The primary & standby LBs each require a private IP from the host subnet, in addition to the assigned floating public IP.
  • In a single-AD region, the LB has no failover if there is an AD outage.
  • You cannot specify a private subnet for your public LB.

Private LB

  • To isolate your LB from the internet & simplify your security posture, you can create a private LB.
  • The Load Balancing service assigns it a private IP that serves as the entry point for incoming traffic.
  • When you create a private LB, the service requires only 1 subnet to host both the primary & standby load balancers.
  • The LB can be regional or AD-specific, depending on the scope of the host subnet.
  • The LB is accessible only from within the VCN that contains the host subnet, or as further restricted by your security rules.
  • The assigned floating private IP is local to the host subnet.
  • The primary & standby LBs each require an extra private IP from the host subnet.
  • If there is an AD outage:
    • A private LB created in a regional subnet in a multi-AD region provides failover.
    • A private LB created in an AD-specific subnet, or in a regional subnet in a single AD region, has no failover.
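
The failover rules for public and private LBs described above can be summarized in a small decision function. This is only a sketch of the rules as stated in these notes; the function name and the string values for its parameters are hypothetical and are not part of any OCI API.

```python
def has_ad_failover(visibility: str, subnet_scope: str, ads_in_region: int) -> bool:
    """Return True if the primary/standby pair can survive an AD outage.

    visibility:    "public" or "private" (hypothetical labels)
    subnet_scope:  "regional" or "ad_specific" (hypothetical labels)
    ads_in_region: number of availability domains in the region
    """
    if ads_in_region < 2:
        # Single-AD region: primary and standby share the only AD.
        return False
    if subnet_scope == "regional":
        # Primary and standby are placed in different ADs.
        return True
    # AD-specific subnets: a public LB uses 2 subnets in separate ADs,
    # while a private LB lives in a single AD-specific subnet.
    return visibility == "public"
```

For example, has_ad_failover("private", "ad_specific", 3) returns False, which matches the last bullet above.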

All LBs

  • Your LB has a backend set to route incoming traffic to your Compute instances.
  • The backend set is a logical entity that includes:
    • A list of backend servers.
    • A load balancing policy.
    • A health check policy.
    • Optional SSL handling.
    • Optional session persistence configuration.
  • The backend servers (Compute instances) associated with a backend set can exist anywhere, as long as the associated network security groups (NSGs), security lists, & route tables allow the intended traffic flow.
  • If your VCN uses NSGs, you can associate your LB with an NSG.
  • If you prefer to use security lists, the Load Balancing service can suggest appropriate security list rules.
  • Oracle recommends that you create your LB in a regional subnet.
  • Oracle recommends that you distribute your backend servers across all ADs within the region.

Load Balancing Concepts

  • Backend Server:
    • An app server responsible for generating content in reply to the incoming TCP or HTTP traffic.
    • You typically identify app servers with a unique combination of private IP & port, for example 10.10.10.1:8080 & 10.10.10.2:8080.
  • Backend Set:
    • A logical entity defined by a list of backend servers, a load balancing policy, & a health check policy.
    • SSL configuration is optional.
    • The backend set determines how the LB directs traffic to the collection of backend servers.
  • Certificates:
    • If you use HTTPS or SSL for your listener, you must associate an SSL server certificate (X.509) with your LB.
    • A certificate enables the LB to terminate the connection & decrypt incoming requests before passing them to the backend servers.
  • Health Check:
    • A health check is a test to confirm the availability of backend servers.
    • A health check can be a request or a connection attempt.
    • Based on a time interval you specify, the LB applies the health check policy to continuously monitor backend servers.
    • If a server fails the health check, the LB takes the server temporarily out of rotation.
    • If the server subsequently passes the health check, the LB returns it to the rotation.
    • You configure your health check policy when you create a backend set.
    • You can configure TCP-level or HTTP-level health checks for your backend servers:
      • TCP-level health checks attempt to make a TCP connection with the backend servers & validate the response based on the connection status.
      • HTTP-level health checks send requests to the backend servers at a specific URI & validate the response based on the status code or entity data (body) returned.
    • The service provides app-specific health check capabilities to help you increase availability & reduce your app maintenance window.
  • Health Status:
    • An indicator that reports the general health of your LBs & their components.
  • Listener:
    • A logical entity that checks for incoming traffic on the LB’s IP.
    • You configure a listener’s protocol & port, & optional SSL.
    • To handle TCP, HTTP, & HTTPS traffic, you must configure multiple listeners.
    • Supported protocols — TCP, HTTP/1.0, HTTP/1.1.
  • Load Balancing Policy:
    • A load balancing policy tells the load balancer how to distribute incoming traffic to the backend servers.
    • Common load balancing policies include:
      • Round robin.
      • Least connections.
      • IP hash.
  • Path Route Set:
    • A set of path route rules to route traffic to the correct backend set without using multiple listeners or LBs.
  • Session Persistence:
    • A method to direct all requests originating from a single logical client to a single backend server.
  • Shape:
    • A template that determines the LB’s total pre-provisioned maximum capacity (bandwidth) for ingress + egress traffic.
    • Available shapes — 10 Mbps (always free), 100 Mbps, 400 Mbps, & 8 Gbps.
  • SSL — You can apply the following 3 SSL configurations to your LB:
    • SSL Termination — The LB handles incoming SSL traffic & passes the unencrypted request to a backend server.
    • End-to-End SSL — The LB terminates the SSL connection with an incoming traffic client, & initiates another SSL connection to a backend server.
    • SSL Tunneling — If you configure the LB’s listener for TCP, the LB tunnels incoming SSL connections to your app servers.
    • Load Balancing supports TLS 1.2 with a default setting of strong cipher strength.
  • Virtual Hostname:
    • A virtual server name applied to a listener to enhance request routing.
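
The Health Check entry above describes a TCP-level check and the way failing servers leave the rotation. The following is a simplified model of that idea, not the service's implementation: it ignores intervals and retries, and the function names and the 3-second timeout are assumptions made for illustration.

```python
import socket


def tcp_health_check(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the backend as unhealthy.
        return False


def servers_in_rotation(backends, check=tcp_health_check):
    """Keep only the (host, port) backends that currently pass the check.

    A server that fails is left out of this round; it returns to the
    rotation as soon as a later call finds it healthy again.
    """
    return [(host, port) for host, port in backends if check(host, port)]
```

Running servers_in_rotation on each check interval gives the set of backends that the load balancing policy is allowed to choose from.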

Load Balancing Limits

  • You cannot dynamically change the LB shape to handle more incoming traffic.
  • You cannot convert an AD-specific LB to a regional LB or vice-versa.
  • Max concurrent connections are limited when you use stateful security rules for your LB subnets.
    • To accommodate high-volume traffic, use stateless security rules for your LB subnets.
  • Each LB has the following configuration limits:
    • 1 IP
    • 16 backend sets
    • 512 backend servers per backend set
    • 1024 backend servers total
    • 16 listeners
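
The per-LB configuration limits listed above can be checked before you plan a deployment. This is a minimal sketch that hard-codes the numbers from these notes; the function and dictionary names are hypothetical, and your tenancy's actual limits may differ.

```python
from typing import Dict, List

# Limits copied from the list above.
LIMITS = {
    "backend_sets": 16,
    "backends_per_set": 512,
    "backends_total": 1024,
    "listeners": 16,
}


def limit_violations(backends_per_set: Dict[str, int], listener_count: int) -> List[str]:
    """Return human-readable messages for every limit the plan exceeds."""
    problems = []
    if len(backends_per_set) > LIMITS["backend_sets"]:
        problems.append(f"{len(backends_per_set)} backend sets exceeds the limit")
    for name, count in backends_per_set.items():
        if count > LIMITS["backends_per_set"]:
            problems.append(f"backend set {name!r} has {count} servers")
    total = sum(backends_per_set.values())
    if total > LIMITS["backends_total"]:
        problems.append(f"{total} backend servers in total exceeds the limit")
    if listener_count > LIMITS["listeners"]:
        problems.append(f"{listener_count} listeners exceeds the limit")
    return problems
```

An empty list means the planned configuration fits within the limits given in these notes.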

How Load Balancing Policies Work

  • A TCP LB considers policy & weight to direct an initial incoming request to a backend server.
    • All subsequent packets on this connection go to the same endpoint.
  • An HTTP LB configured for cookie-based session persistence forwards requests to the backend server specified by the cookie’s session information.
  • For non-sticky HTTP requests, the LB applies policy & weight to every incoming request & determines an appropriate backend server.
    • Multiple requests from the same client could be directed to different servers.

Round Robin

  • Round Robin is the default LB policy.
  • This policy distributes incoming traffic sequentially to each server in a backend set list.
  • After each server has received a connection, the load balancer repeats the list in the same order.
  • Round Robin is a simple load balancing algorithm.
  • It works best when all the backend servers have similar capacity & the processing load required by each request does not vary significantly.
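
The sequential behavior of Round Robin can be illustrated in a few lines. This is a sketch of the general algorithm, not of the service's internals; the third server address is made up for the example, and weights are not modeled.

```python
from itertools import cycle


def round_robin(backends):
    """Return an iterator that yields backends in list order, then repeats."""
    return cycle(backends)


# 10.10.10.3:8080 is a hypothetical third server added for illustration.
servers = ["10.10.10.1:8080", "10.10.10.2:8080", "10.10.10.3:8080"]
picker = round_robin(servers)
first_five = [next(picker) for _ in range(5)]
# first_five is servers 1, 2, 3, then 1 and 2 again, in that order.
```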

Least Connections

  • The Least Connections policy routes incoming non-sticky request traffic to the backend server with the fewest active connections.
  • This policy helps you maintain an equal distribution of active connections with backend servers.
  • As with the round robin policy, you can assign a weight to each backend server & further control traffic distribution.
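
A minimal sketch of the Least Connections idea follows. How the service combines weights with connection counts is not described in these notes, so dividing active connections by weight is an assumption made here for illustration; the function name is hypothetical.

```python
from typing import Dict, Optional


def pick_least_connections(
    active: Dict[str, int],
    weights: Optional[Dict[str, int]] = None,
) -> str:
    """Pick the backend with the fewest active connections.

    With weights (assumed default 1), a server with weight 4 is treated
    as if it carried a quarter of its real connection count.
    """
    weights = weights or {}
    return min(active, key=lambda server: active[server] / weights.get(server, 1))
```

Without weights, pick_least_connections({"a": 4, "b": 2, "c": 3}) returns "b". Giving "a" a weight of 4 makes it the preferred server instead.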

IP Hash

  • The IP Hash policy uses an incoming request’s source IP as a hashing key to route non-sticky traffic to the same backend server.
  • The LB routes requests from the same client to the same backend server as long as that server is available.
  • This policy honors server weight when establishing the initial connection.
  • You cannot add a backend server marked as Backup to a backend set that uses the IP Hash policy.
  • Multiple clients that connect to an LB through a proxy or NAT router appear to have the same IP.
    • If you apply the IP Hash policy to your backend set, the LB routes traffic based on the incoming IP & sends these proxied client requests to the same backend server.
    • If the proxied client pool is large, the requests could flood a backend server.
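
The IP Hash behavior, including the proxy/NAT caveat above, can be sketched as follows. The actual hash function used by the service is not stated in these notes, so SHA-256 is an assumption, as is the function name. Weights and server availability are not modeled.

```python
import hashlib
from typing import List


def pick_by_ip_hash(source_ip: str, backends: List[str]) -> str:
    """Map a source IP to one backend, always the same one for the same IP."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["10.10.10.1:8080", "10.10.10.2:8080"]

# Many clients behind one NAT address share a source IP, so every one of
# their requests lands on a single backend.
nat_targets = {pick_by_ip_hash("202.1.112.187", backends) for _ in range(1000)}
```

Here nat_targets contains exactly one server, which is the flooding risk the last bullet describes.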

Connection Management

  • LBs support connection multiplexing.
  • The LB can route many incoming requests from multiple clients to the destination backend server through a small number of backend connections (one or more).
  • After your LB connects a client to a backend server, the connection can be closed due to inactivity.
  • You can configure LB listeners to control the max idle time allowed during each TCP connection or HTTP request & response pair.
  • Oracle recommends that you do not allow your backend servers to close connections to the load balancer.
  • 3 timeout settings affect your LB’s behavior:
    • Keep-alive setting between the LB & backend server.
      • The LB closes backend server connections that are idle for more than 300 seconds (= 5 minutes).
      • The load balancing service does not honor keep-alive settings from backend servers.
      • To prevent possible 502 errors, ensure that your backend servers do not close idle connections in less than 310 seconds.
    • Keep-alive setting between the LB & the client.
      • The Load Balancing service sets this keep-alive value to maintain the connection for 10,000 transactions or until it has been idle for 65 seconds, whichever occurs first. You cannot change this value.
    • Idle timeout.
      • You can set the duration of the idle timeout when you create a listener.
      • This setting applies to the time allowed between 2 successive receive or send network input/output operations during the HTTP request-response phase.
      • If the configured timeout has elapsed with no packets sent or received, the client’s connection is closed.
      • For HTTP & WebSocket connections, a send operation does not reset the timer for receive operations & a receive operation does not reset the timer for send operations.
      • This timeout setting does not apply to idle time between a completed response & a subsequent HTTP request.
      • The default timeout values are:
        • 300 seconds for TCP listeners.
        • 60 seconds for HTTP listeners.
      • Modify this timeout if either the client or the backend server requires more time to transmit data, such as:
        • The client sends a database query to the backend server & the database takes over 300 seconds to execute.
        • The client uploads data using the HTTP protocol. During the upload, the backend does not transmit any data to the client for more than 60 seconds.
        • The client downloads data using the HTTP protocol. After the initial request, it stops transmitting data to the backend server for more than 60 seconds.
        • The client starts transmitting data after establishing a WebSocket connection, but the backend server does not transmit data for more than 60 seconds.
        • The backend server starts transmitting data after establishing a WebSocket connection, but the client does not transmit data for more than 60 seconds.
      • The maximum timeout value is 7200 seconds. Contact My Oracle Support to file a service request if you want to increase this limit for your tenancy.
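
The timeout values above can be turned into two quick sanity checks. This is a sketch under stated assumptions: the constants come from these notes, while the function names and the 30-second safety margin are hypothetical choices, not Oracle recommendations.

```python
BACKEND_MIN_IDLE_CLOSE_S = 310          # backends should not close sooner than this
DEFAULT_IDLE_TIMEOUT_S = {"TCP": 300, "HTTP": 60}
MAX_IDLE_TIMEOUT_S = 7200


def backend_keepalive_is_safe(backend_idle_close_s: int) -> bool:
    """True if the backend keeps idle connections open long enough to avoid 502s."""
    return backend_idle_close_s >= BACKEND_MIN_IDLE_CLOSE_S


def idle_timeout_for(protocol: str, longest_silence_s: int, margin_s: int = 30) -> int:
    """Suggest a listener idle timeout for the longest expected silent period.

    Returns the default for the protocol unless the expected silence plus
    a margin (an assumed safety buffer) is larger.
    """
    needed = max(DEFAULT_IDLE_TIMEOUT_S[protocol], longest_silence_s + margin_s)
    if needed > MAX_IDLE_TIMEOUT_S:
        raise ValueError("exceeds the 7200-second maximum; a service request is needed")
    return needed
```

For an HTTP upload during which the backend stays silent for 120 seconds, idle_timeout_for("HTTP", 120) suggests 150 seconds rather than the 60-second default.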

HTTP “X-” Headers

The Load Balancing service adds or modifies the following X- headers when it passes requests to your servers:

  • X-Forwarded-For
    • Provides a list of connection IP addresses.
    • The LB appends the last remote peer address to the X-Forwarded-For field from the incoming request.
    • A comma & space precede the appended address.
    • If the client request header does not include an X-Forwarded-For field, this value is equal to the X-Real-IP value.
    • The original requesting client is the first (left-most) IP address in the list, assuming that the incoming field content is trustworthy.
    • The last address is the last (most recent) peer, that is, the machine from which the LB received the request.
    • Example: X-Forwarded-For: 202.1.112.187, 192.168.0.10
  • X-Forwarded-Host
    • Identifies the original host & port requested by the client in the Host HTTP request header.
    • This header helps you determine the original host, since the hostname or port of the reverse proxy (LB) might differ from the original server handling the request.
    • Example: X-Forwarded-Host: www.oracle.com:8080
  • X-Forwarded-Port
    • Identifies the listener port number that the client used to connect to the load balancer.
    • Example: X-Forwarded-Port: 443
  • X-Forwarded-Proto
    • Identifies the protocol that the client used to connect to the LB, either http or https.
    • Example: X-Forwarded-Proto: https
  • X-Real-IP
    • Identifies the client’s IP.
    • For the Load Balancing service, the “client” is the last remote peer.
    • Your LB intercepts traffic between the client & your server.
    • Your server’s access logs, therefore, include only the LB’s IP.
    • The X-Real-IP header provides the client’s IP address.
    • Example: X-Real-IP: 192.168.0.10
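
The X-Forwarded-For and X-Real-IP rules above can be sketched from both sides: how the values are built, and how a backend might read them. The function names are hypothetical, and the sketch assumes the incoming header content is trustworthy, as the notes caution.

```python
from typing import Dict, Optional


def forwarded_headers(incoming_xff: Optional[str], peer_ip: str) -> Dict[str, str]:
    """Build the two headers the way the notes describe.

    The last remote peer is appended to any incoming X-Forwarded-For value,
    preceded by a comma and a space. Without an incoming value, the header
    equals X-Real-IP.
    """
    xff = f"{incoming_xff}, {peer_ip}" if incoming_xff else peer_ip
    return {"X-Forwarded-For": xff, "X-Real-IP": peer_ip}


def original_client_ip(xff: str) -> str:
    """Return the first (left-most) address in an X-Forwarded-For list."""
    return xff.split(",")[0].strip()
```

With the example addresses used above, forwarded_headers("202.1.112.187", "192.168.0.10") yields an X-Forwarded-For value of "202.1.112.187, 192.168.0.10", and original_client_ip on that value returns "202.1.112.187".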