Schema-based multi-tenant architecture using Quarkus & Hibernate-ORM

CoffeeBeans_BrewingInnovations
8 min readJun 30, 2022

--

Architecture design is a must while developing a SaaS application to ensure its scalability and optimising infrastructure costs. In this blog, Lets discuss the implementation of one such architecture with Quarkus java framework and Hibernate ORM

By Vishal Bihani , CoffeeBeans Consulting

Suppose you have a great product idea, you started working on it and developed a cool SaaS product. You get new tenants to use your product and they liked it. In a nutshell, your product is a hit.

Everything seems to be going well until you realize you are bleeding money with every increasing tenant. You want to scale up but you are unable to do so because you have already exhausted all your cash.

You try to figure out what is going wrong and eventually look into your application infrastructure costs.

You understood that every time a new tenant onboards your application you deploy a new server, database, etc. for that tenant. Separate servers and/or databases are only useful for computing or data-intensive applications but yours is neither of that.

In such scenarios, multi-tenant architecture can be very useful in designing applications with shared resources. Multiple combinations are possible some of which are:

1. Separate servers and shared database

2. Shared server and separate databases

3. Shared server and shared database

How should I decide which resource to keep separate and which to be shared?

The decision should be made on each resource after looking at its constraints for scaling up. It’s good if each of your resources scales individually based on the utilization rather than complete application scaling up and remaining underutilized.

There are a few things one should keep in mind while deciding.

For servers:

If your business logic is very customized for each tenant then shared servers are not an optimal solution.

OR

If your application is compute-intensive or each tenant is limited by their respective compute then separate servers are a good solution.

For databases:

Generally, it’s a trade-off between data isolation and resource utilization. Development complexity and scaling constraints are also key decision factors.

A separate database provides tenant-specific customization freedom and ease in monitoring per tenant usage and billing them for backup and disaster recovery.

Multi-tenant database patterns

There are three types of patterns:

1. Database per tenant

2. Schema per tenant

3. Row per tenant

Database per tenant

Separated databases will be provisioned for each tenant. It achieves maximum data isolation and will give you freedom for high customization and tenant-specific operations to the database.

The development complexity and cost of this approach are high.

Schema per tenant

Separate schemas are created for each tenant. It achieves data isolation at schema level.

High customization and tenant-specific operations like backup, monitoring, and encryption are not supported in many databases.

The development complexity is high but the cost is low.

Row per tenant

In this approach, all the tenants’ data are stored in the same table. Each row is identified by the tenant identifier. Some databases like PostgreSQL support RLS (Row-Level Security) where the rows are implicitly filtered based on the tenant and there is no need to explicitly write the WHERE clause with the tenant identifier.

There is no data isolation and more care has to be put into writing database queries at the application level.

The development complexity and cost are low.

In this blog, we will discuss the implementation of the shared server and schema-based shared database architecture.

What we will build?

We will build an app that will store the information of the users of the tenants.

Each tenant will have its separate schema which will store its users’ details. A public schema that will be accessed by every tenant will contain the pricing detail of the product.

Database pattern

schema-based multi-tenancy database pattern

Tables

table details

Implementation

One important factor while approaching this solution is whether or not frameworks or tools are available that support this architecture and pattern.

For the implementation part, we will be using the Quarkus framework to deploy the REST APIs and perform the business logic.

The Quarkus is an open-source java framework developed by Red Hat which empowers the developers to build cloud-native and modern java applications. You can learn more about Quarkus here.

PostgreSQL will be used as the database. It is an open-source SQL database.

Hibernate-ORM supports schema-based multi-tenancy.

Let’s start

Go to https://code.quarkus.io/ and fill in the application configuration.

Application configuration

Add the following extensions

Selected extensions

Open in an IDE and run the following command from the terminal:

mvn compile quarkus:dev

It will download all the dependencies and start the server with the dev profile activated. In dev, profile Quarkus provides the support for live reloading.

After the server has started, perform a GET request to the following URL:

http://localhost:8080/hello

It should return ‘Hello RESTEasy’

Create entities

Package information is important, all the entities of the public (shared) schema should reside in the same package. Similarly for the tenant schema. We will discuss this in detail later while discussing multiple persistence units.

Hibernate does not support database generation (create or update) in the multi-tenant mode so we need to create the databases ourselves.

Create pricing_details table on public schema

CREATE TABLE public.pricing_details
(
id integer,
product_name character varying NOT NULL,
price integer NOT NULL,
created_at date,
updated_at date,
PRIMARY KEY (id)
);

Create schema for Tenant A (GOOGL) and Tenant B (AAPL)

CREATE SCHEMA googl
AUTHORIZATION postgres;

CREATE SCHEMA aapl
AUTHORIZATION postgres;

Create user_details table on both schema

CREATE TABLE googl.user_details
(
id integer,
first_name character varying NOT NULL,
last_name character varying NOT NULL,
created_at date,
updated_at date,
PRIMARY KEY (id)
);
CREATE TABLE aapl.user_details
(
id integer,
first_name character varying NOT NULL,
last_name character varying NOT NULL,
created_at date,
updated_at date,
PRIMARY KEY (id)
);

Insert values in the tables.

Create repositories for entities

Configuring application properties

write the following configuration properties in the application.properties

Only a single data source is configured because we have only one database to connect to.

A persistence unit defines a set of all entity classes that are managed by EntityManager instances in an application

We have separate persistence units for public schema (<default>) and tenant-specific schema (tenant)

The data source is set to <default> so that they connect to the default data source. Multi-tenant mode is set to SCHEMA

But we need to configure which persistence unit to manage which entities. For that, we need to create package-info.java files in the packages containing the entity files of public and tenant-specific files.

Inside the package containing the entities of the ‘public’ schema
Inside the package containing the entities of tenant-specific schema

Okay, we configured which persistence should handle which entities but what about schema information?

Each tenant will have its own schema name then how will the persistence unit know which schema to go for. For this, we will create a Tenant Resolver which will resolve the schema name based on the incoming request.

We need to create a Tenant Resolver for all the persistence units.

We will implement the hibernate’s TenantResolver interface which provides a method to resolve schema names.

The @PersistenceUnitExtension takes the name of the persistence unit for which this TenantResolver implementation is.

@PersistenceUnitExtension is for the default persistence unit.

The @RequestScoped is for defining the lifecycle of this bean same to that of the request the application receives (Bean is created when the application receives the request and is destroyed when the request is completed). Once the request is completed it will be destroyed and will be recreated with every incoming request.

what about the below code?

public String resolveTenantId() {
String tenantName = context.request().getHeader("tenant");
return tenantName;
}

To keep the implementation simpler we will pass the tenant identifier as the header in the HTTP request with the key as “tenant” and the value as the identifier. The above code will retrieve that value and return it as the schema name for that tenant.

Now all the configuration is done but where are the APIs for fetching data?

‘/pricing/all’ endpoint will fetch the list of pricing details from the public schema and return the list.

‘/user/all’ endpoint requires a tenant identifier as a header and will fetch the data from that schema and will return the list of users.

Let’s insert some data into the tables

public schema
googl schema
aapl schema

Let’s hit the APIs and see the results

We can use postman to check the APIs

list of pricing details from public schema
list of users from googl schema
list of users from aapl schema

Great! we finally achieved schema-based multi-tenancy for our application.

Important

For the simplicity of the implementation, we passed the tenant identifier in the header. It should not be passed in the header. A better approach will be to include it in the JWT tokens which can’t be modified and retrieve that by intercepting the HTTP request and storing the identifier in a TenantContext.

Now I hope you will apply this architecture to your SaaS product and save on your infrastructure costs.

Source code

You can find the source code here.

Pros

  1. Efficient use of resources.
  2. A better architecture for multi-tenancy compared to the row per tenant approach in terms of data isolation.
  3. Low infrastructure costs.
  4. Can easily scale.

Cons

  1. Development complexity is high.
  2. High customization per tenant will be difficult.
  3. Data backup for each tenant is not possible.
  4. Monitoring for each tenant may not be supported.
  5. Frameworks supporting this pattern may not be available in all languages.

Thanks for reading.

--

--

CoffeeBeans_BrewingInnovations

CoffeeBeans empowers organizations to transform their business through the use of advanced technologies. Building data-driven solutions that drive innovation.