Skip to main content

Way of multitenancy implementation from database side.

Discriminator Strategy

The discriminator pattern works on a single database service and single schema for all tenants. Constituent tenants are discriminated by a specific strategy such as a tenant_id field embedded in tables containing tenant specific data. Beyond the below pro's/con's this strategy is a non-starter for use case which legally require 'air-space' between tenants.
Pros
  • Single database and schema instance to manage
  • Single schema to backup
  • Single schema to archive, upgrade etc.
  • Simple reporting across tenants (e.g. SELECT .... GROUP BY tenant_id)
  • Single database service account to manage per application.
  • Single database instance to tune and maintain. 
Cons
  • Tenant data is interwoven meaning backup & restore is an all or nothing proposition.
  • Care needs to be taken with every database interaction that the data returned is appropriately scoped.
  • If your database goes down, all your customers go down, therefor necessitating a high availability strategy which is generally a good idea but essential in this strategy.
  • If a table becomes corrupted it becomes corrupted for all users.
  • If a tenant leaves, it can be tricky to extract and archive the information
  • If that tenant comes back it can be trickier to reinsert the data and easier to integrate from scratch.
  • While storage is cheap, performance is not and an inactive tenant in a single schema will take up database buffer pool resources simply by it's existence through indices alone.
  • Because it is likely that a single service account will be used to access the schema and all tenants reside in the schema it can be challenging to trace database load to specific tenant usage.
  • As a single database service is serving all tenants, performance is subject to "noisy neighbors".
Scaling can be problematic depending on the underlying storage technology chosen due to the monolithic nature of the schema. If a traditional RDBMS is chosen replicas can be employed for read scaling and a sharding strategy employed for write scaling. If using a RDBMS this particular strategy lends itself well to use cases where historic data can be archived leaving just hot data in the primary database system. These considerations change if using a NoSQL technology such as AWS Aurora or MongoDB where r/w scaling is handled transparently as part of the storage service layer and not a concern of the application itself. In addition to this schema upgrades can be challenging based on the volume of potential data and all customers being affected simultaneously. Even with a backing technology supporting 'online schema updates' the application may have to consider supporting multiple data schema versions until the schema update is complete.

Schema Strategy

The schema strategy employs a single database server like the DISCRIMINATOR strategy but specifies a schema instance per tenant meaning that each tenant has complete isolation at the data layer from other tenants.
Pros
  • Tenant data is robustly isolated from other tenant data
  • This in turn means for simpler more robust application development. However the application must be tenant aware and capable of switching tenants reliably.
  • Schema & table corruption affects only a single tenant
  • Ad-hoc queries are automatically scoped to a single tenant.
  • Granular backups can be taken and restored with ease & in parallel.
  • Tenants can be migrated to and from different environments easily.
  • Instrumentation is available on a per schema basis allowing the attribution of load and bottlenecks to specific tenant generated load.
  • Single database service account to manage per application.
  • Single database instance to tune and maintain.
Cons
  • As a single database service is serving all tenants, performance is subject to noisy neighbors similar to the DISCRIMINATOR strategy. However it is trivial to move problem customers onto dedicated databases should the need arise.
  • If your database goes down, all your customers go down, again necessitating a good failover strategy.
  • Tooling needs to be built to handle schema updates, backups and restores of the tenant schemas with an environment.
  • Reporting across tenants requires additional tooling.
  • De-normalization of common reference tables may be necessary or a 'common/admin' schema employed and shared by all tenants. This in itself can assist in some of the maintenance tooling mentioned.

Database Strategy

The database strategy takes the SCHEMA strategy one step further whereby each tenant has a separate schema instance on a separate database.
Pros
  • Tenant data is robustly isolated from other tenant data
  • This in turn means for simpler more robust application development.
  • Schema & table corruption affects only a single tenant
  • Granular backups can be taken and restored with ease & in parallel.
  • Tenants can be migrated to and from environments easily.
  • Instrumentation is available on a per schema basis allowing the attribution of load and bottlenecks to specific tenant generated load.
  • "Noisy neighbor" problems are eliminated at the database layer.
Cons
  • Multiple databases instances to tune and maintain.
  • Additional infrastructure cost of the multiple database instances.
  • A connection pool per tenant per application is now required (assuming the application layer is multi tenant) which may require additional tuning when considering the number of application instances you need to scale to and the overhead each connection incurs on your storage service.
  • Multiple database service accounts to manage per application.
  • This assumes that an application will switch between tenants and therefor need connection credentials to all databases making this strategy equal from a security standpoint to a single service account.
  • If a database goes down, only a single tenant is affected.
  • Tooling needs to be built to handle schema updates, backups and restores of the entire environment.
  • Reporting across tenants requires additional tooling.
  • This may be complicated by the multiple service accounts to connect with each database.

References 
1. http://stuartingram.com/2016/10/02/spring-boot-schema-based-multi-tenancy/

Comments

Popular posts from this blog

Who is Peter Lynch and what is his philosophy in equity market investment? 25 Golden Rules of the most successful Fund Manager.

Peter Lynch (born January 19, 1944) is an American investor, mutual fund manager, and philanthropist. As the manager of the Magellan Fund at Fidelity Investments between 1977 and 1990, Lynch averaged a 29.2% annual return, consistently more than doubling the S&P 500 stock market index and making it the best-performing mutual fund in the world. During his 13 year tenure, assets under management increased from $18 million to $14 billion. He also co-authored a number of books and papers on investing and coined a number of well known mantras of modern individual investing strategies, such as Invest in what you know and ten bagger. Lynch is consistently described as a "legend" by the financial media for his performance record. Base on his career I have compiled his investing rules here. 25 GOLDEN RULES by @Peter Lynch 1: Nobody can predict interest rates, the future direction of the economy or the stock market. Dismiss all such forecasts & concentrate on what's actual

What is version checking in Hibernate ?

Version checking used in hibernate when more then one thread trying to access same data. For example : User A edit the row of the TABLE for update ( In the User Interface changing data This is user thinking time) and in the same time User B edit the same record for update and click the update. Then User A click the Update and update done. Change made by user B is gone. In hibernate you can prevent slate object updation using version checking. Check the version of the row when you are updating the row. Get the version of the row when you are fetching the row of the TABLE for update. On the time of updation just fetch the version number and match with your version number (on the time of fetching).

What are JEE Containers? What are their different types?

Containers are the interface between a component and the low-level, platform-specific functionality that supports the component. Before it can be executed, a web, enterprise bean, or application client component must be assembled into a Java EE module and deployed into its container. The assembly process involves specifying container settings for each component in the Java EE application and for the Java EE application itself. Container settings customize the underlying support provided by the Java EE server, including such services as security, transaction management, Java Naming and Directory Interface (JNDI) API lookups, and remote connectivity. Here are some of the highlights. The Java EE security model lets you configure a web component or enterprise bean so that system resources are accessed only by authorized users. The Java EE transaction model lets you specify relationships among methods that make up a single transaction so that all methods in one transaction are treated as a