Single-cluster deployments

When to use a single-cluster setup

The Keycloak single-cluster setup is targeted at use cases that:

Deploy to an infrastructure with transparent networking, such as a single Kubernetes cluster.
Desire all healthy Keycloak instances to handle user requests.
Are constrained to a single AWS Region or an equivalent low-latency setup.
Permit planned outages for maintenance.
Fit within a defined user and request count.
Can accept the impact of periodic outages.
Are deployed in data centers with the required network latency and database configuration.

Tested configuration

We regularly test Keycloak with the following configuration:

An OpenShift cluster deployed across three AWS availability zones in the same region.
- Provisioned with Red Hat OpenShift Service on AWS (ROSA), using ROSA HCP.
- At least one worker node in each availability zone.
- OpenShift version 4.17.
Amazon Aurora PostgreSQL database
- High availability with a primary DB instance in one availability zone, and synchronously replicated readers in the other availability zones.
- Version 17.5.
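The "at least one worker node in each availability zone" condition can be checked mechanically. The following is a minimal sketch, not part of the tested setup itself: it assumes you have already exported the node list in a hypothetical shape of `{"name": ..., "labels": {...}}`, and that the nodes carry the well-known Kubernetes zone label `topology.kubernetes.io/zone`. The zone and node names are placeholders.

```python
from collections import Counter

# Well-known Kubernetes node label that carries the availability zone.
ZONE_LABEL = "topology.kubernetes.io/zone"


def nodes_per_zone(nodes):
    """Count worker nodes per zone.

    `nodes` is a hypothetical list of dicts shaped like
    {"name": "...", "labels": {"topology.kubernetes.io/zone": "..."}}.
    Nodes without a zone label are counted under the key None.
    """
    return Counter(node.get("labels", {}).get(ZONE_LABEL) for node in nodes)


def zones_without_nodes(nodes, expected_zones):
    """Return the expected zones that have no worker node, sorted by name."""
    counts = nodes_per_zone(nodes)
    return sorted(zone for zone in expected_zones if counts[zone] == 0)


# Placeholder data: two nodes, three expected zones.
example_nodes = [
    {"name": "worker-1", "labels": {ZONE_LABEL: "zone-a"}},
    {"name": "worker-2", "labels": {ZONE_LABEL: "zone-b"}},
]
print(zones_without_nodes(example_nodes, ["zone-a", "zone-b", "zone-c"]))  # ['zone-c']
```

An empty result means every expected zone has at least one worker node.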

Configuration

Keycloak deployed on a Kubernetes cluster
- For cloud setups, Pods can be scheduled across multiple availability zones within the same region if Keycloak’s latency requirements are met.
- For on-premises setups, Pods can be scheduled across multiple data centers if Keycloak’s latency requirements are met.
Keycloak deployed on virtual machines or bare metal
- Instances can be scheduled across multiple availability zones within the same cloud-provider region or across multiple data centers if Keycloak’s latency requirements are met.
Deployments require a round-trip latency of less than 10 ms between Keycloak instances.
Database
- For a list of supported databases, see Configuring the database.
- Deployments spanning multiple availability zones must utilize a database that can tolerate zone failures and synchronously replicates data between replicas.
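To get a first impression of whether an environment meets the latency requirement above, you can time TCP handshakes between two hosts that run Keycloak instances. The sketch below is an approximation under stated assumptions: a TCP connect takes roughly one network round trip, the host and port are supplied by you, and the check uses the worst observed sample as a conservative choice. The function names are illustrative and not part of Keycloak. It does not replace the failure and load tests mentioned below.

```python
import socket
import time

# Threshold taken from this guide: round-trip latency of less than 10 ms.
MAX_ROUND_TRIP_MS = 10.0


def tcp_connect_time_ms(host, port, timeout=1.0):
    """Time one TCP handshake to host:port in milliseconds.

    This is roughly one network round trip. Pass an IP address to keep
    DNS resolution time out of the measurement.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it immediately
    return (time.perf_counter() - start) * 1000.0


def meets_latency_requirement(samples_ms, threshold_ms=MAX_ROUND_TRIP_MS):
    """Return True if the worst sample is still below the threshold."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    return max(samples_ms) < threshold_ms
```

For example, collect around 20 samples with `tcp_connect_time_ms` against a peer instance on any port that accepts TCP connections, then pass the list to `meets_latency_requirement`.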

While equivalent setups should work, you will need to verify the performance and failure behavior of your environment. We provide functional tests, failure tests and load tests in the Keycloak Benchmark Project.

Read more on each item in the Building blocks single-cluster deployments guide.

Tested load

We regularly test Keycloak with the following load:

100,000 users
300 requests per second

We have successfully scaled to the following load:

100,000 users
1,000 logins, with 20,000 token refreshes per second.

We used the following setup for this:

Single-cluster setup consisting of 6 Pods, spread across 3 availability zones in a single AWS region.
- Each Pod with limits of 40 vCPU and 8 GB memory.
Amazon Aurora PostgreSQL Multi-AZ database deployed in two availability zones
- Instance type: db.r8g.16xlarge for both the reader and the writer instance.

As the load increased, the CPU usage increased linearly.
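To put the Pod limits of this setup in perspective, the following sketch adds them up across the cluster. The numbers come from the setup above; the class and function names are illustrative. Note that these are limits, meaning upper bounds, and not the CPU or memory actually consumed during the test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodLimits:
    vcpu: int       # CPU limit per Pod, in vCPU
    memory_gb: int  # memory limit per Pod, in GB


def cluster_limits(pod_count, limits):
    """Sum per-Pod limits across the cluster: (total vCPU, total memory in GB)."""
    return pod_count * limits.vcpu, pod_count * limits.memory_gb


# 6 Pods, each limited to 40 vCPU and 8 GB memory.
print(cluster_limits(6, PodLimits(vcpu=40, memory_gb=8)))  # (240, 48)
```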

See the Concepts for sizing CPU and memory resources guide for more information.

Limitations

Even with the additional redundancy of three availability zones, downtime can still occur when:

Simultaneous node failures occur
Keycloak upgrades are rolled out
Infrastructure fails, for example the Kubernetes cluster

For more details on limitations, see the Concepts for single-cluster deployments guide.

Next steps

The different guides introduce the necessary concepts and building blocks. For each building block, a blueprint shows how to deploy a fully functional example. Additional performance tuning and security hardening are still recommended when preparing a production setup.