Use StatefulSet to Deploy Components

Experienced Kubernetes users will be familiar with the StatefulSet resource type. For many stateful services, StatefulSet can be used to deploy and scale groups of Kubernetes pods.

This article describes how to use the StatefulSet resource type to deploy services in Kato.

Component Deployment Type

You can deploy a service with the StatefulSet resource type by changing the component deployment type in the other settings of the service component. Note the following points before you do so:

  • The component must be closed (shut down) first;
  • For service components with persistent storage, switching the component deployment type changes how storage is mounted, so back up your data beforehand.

Kato provides four types of component deployment by default:

  • Stateful single instance: the service is deployed with a StatefulSet; instances cannot be scaled horizontally, and the number of instances is always 1;
  • Stateful multi-instance: the service is deployed with a StatefulSet, and the number of instances can be scaled horizontally;
  • Stateless single instance: the service is deployed with a Deployment; instances cannot be scaled horizontally, and the number of instances is always 1;
  • Stateless multi-instance: the service is deployed with a Deployment, and the number of instances can be scaled horizontally.
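The four types above can be summarized as a small lookup table. The following is only an illustrative sketch: the keys such as stateful_singleton and the helper effective_replicas are hypothetical names chosen for this example, not identifiers taken from Kato.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentType:
    resource_kind: str  # Kubernetes resource type used for the component
    scalable: bool      # whether horizontal scaling is allowed


# Hypothetical keys; the mapping mirrors the four types listed above.
DEPLOYMENT_TYPES = {
    "stateful_singleton": DeploymentType("StatefulSet", False),
    "stateful_multiple": DeploymentType("StatefulSet", True),
    "stateless_singleton": DeploymentType("Deployment", False),
    "stateless_multiple": DeploymentType("Deployment", True),
}


def effective_replicas(deployment_type: str, requested: int) -> int:
    """Return the instance count that the given deployment type permits."""
    if requested < 1:
        raise ValueError("requested instance count must be at least 1")
    spec = DEPLOYMENT_TYPES[deployment_type]
    # Single-instance types always run exactly one instance.
    return requested if spec.scalable else 1
```

For example, effective_replicas("stateful_singleton", 3) yields 1, while effective_replicas("stateful_multiple", 3) yields 3.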

When you specify the component deployment type as StatefulSet in Kato, the service component will reflect the following characteristics:

  • In the multi-instance state, all instances are ordered, and they are named in the form gr6ec114-0, gr6ec114-1. This order applies throughout the whole life cycle: instances are started, updated, restarted, and shut down sequentially.
  • The host names above can be resolved inside the cluster. Within the same team, try running nslookup gr6ec114-0 in any pod. From a different team, you need to specify the namespace. The full resolvable address is: gr6ec114-0.gr6ec114.3be96e95700a480c9b37c6ef5daf3566.svc.cluster.local, where 3be96e95700a480c9b37c6ef5daf3566 is the namespace.
  • In the multi-instance state, each instance mounts its own persistent storage, which means that persistent data is no longer shared between instances.
  • In the single-instance state, an update shuts the instance down completely before a new instance is started, which means that the service is interrupted.
  • To protect the consistency of persistent data, if a Kubernetes node running a stateful service loses contact with the management node and enters the NotReady state, its stateful service instances are not migrated automatically.
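The naming and addressing pattern described above can be reproduced with a few lines of code. This is a minimal sketch assuming the address layout shown in the list (instance, service, namespace, then svc.cluster.local); the function names are hypothetical and the cluster domain may differ in a given cluster.

```python
from typing import List


def instance_names(service: str, replicas: int) -> List[str]:
    """Ordered instance names: <service>-0, <service>-1, ..."""
    return [f"{service}-{ordinal}" for ordinal in range(replicas)]


def instance_fqdn(service: str, ordinal: int, namespace: str,
                  cluster_domain: str = "cluster.local") -> str:
    """Full address of one instance, usable from other namespaces."""
    # cluster_domain defaults to the value used in the example above.
    return f"{service}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"


names = instance_names("gr6ec114", 2)
address = instance_fqdn("gr6ec114", 0, "3be96e95700a480c9b37c6ef5daf3566")
```

Here names holds gr6ec114-0 and gr6ec114-1, and address is the full host name quoted in the list above.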

On the whole, deploying services with the StatefulSet resource type brings new features, but it also makes the service appear somewhat rigid. The following discussion shows that these restrictions are meaningful.

Attentive readers will have noticed that we tie the StatefulSet resource type to the word “stateful”. This raises a new question: what is the “state” of a service?

The “state” of the Service

Stateful service = Stateless application + stateful data

As the name suggests, a stateful service is closely related to the StatefulSet resource type.

A bare definition may make it hard to understand what a stateful service is, so let us look at a few examples:

The most common stateful services are database middleware.

Take the common database MySQL: the same data can only be used by one MySQL process at a time. After MySQL starts, it creates a unique lock file in its own data directory and “locks” this file. Any other MySQL process that tries to use the same data aborts its startup because it finds that the lock file is already “locked”. The advantage is strong data consistency, because the same data can only be read and written by one MySQL process at a time.
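The general idea of guarding a data directory with a lock file can be sketched as follows. This is not MySQL's actual implementation; the file name instance.lock and both function names are assumptions made for illustration, and the sketch relies only on exclusive file creation.

```python
import os
import tempfile


def acquire_data_dir(data_dir: str) -> int:
    """Claim data_dir by creating a lock file; fail if it is already claimed."""
    lock_path = os.path.join(data_dir, "instance.lock")  # hypothetical name
    # O_EXCL makes the call raise FileExistsError if the file already exists.
    return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)


def release_data_dir(data_dir: str, fd: int) -> None:
    """Release the claim so that another process may use the data."""
    os.close(fd)
    os.remove(os.path.join(data_dir, "instance.lock"))


with tempfile.TemporaryDirectory() as data_dir:
    first = acquire_data_dir(data_dir)   # the first process claims the data
    try:
        acquire_data_dir(data_dir)       # a second process tries the same data
    except FileExistsError:
        print("data directory already in use, aborting startup")
    release_data_dir(data_dir, first)
```

The second claim fails while the first holder is active, which is the behavior the paragraph above describes for two processes pointed at the same data.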

Recall that one of the features of the StatefulSet resource type is that each instance mounts independent persistent storage. This allows a MySQL service to be scaled out to multiple instances without any instance aborting its startup because of the file lock. However, because the data is not shared, the instances have essentially nothing to do with one another. Running MySQL as a stateful single instance therefore seems to be the most suitable choice.

Common database middleware in a similar situation includes MongoDB, PostgreSQL, Redis, etcd, and others.

Another common stateful service scenario is the sticky Session provided by Web services.

In some cases this sticky Session is stored in memory to provide session retention, and it is itself a kind of data. Once the service is scaled out to multiple instances, a request that reaches the wrong instance loses its login state because the Session cannot be found there. Using an IP hash algorithm for traffic distribution in the load balancer solves this problem to a certain extent, because traffic from the same IP is always sent to the same instance. However, we would prefer round-robin distribution, which keeps the load of each instance similar and avoids the situation where one instance is overloaded while the others sit idle.
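The difference between the two distribution strategies can be illustrated with a short sketch. It assumes a simple hash-modulo scheme for IP hash, which is one common way to implement it and not necessarily what a given load balancer does; the function names are hypothetical.

```python
import hashlib
import itertools
from typing import Iterator, List


def pick_by_ip_hash(client_ip: str, instances: List[str]) -> str:
    """Always map the same client IP to the same instance."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


def round_robin(instances: List[str]) -> Iterator[str]:
    """Hand out instances in turn, regardless of who is asking."""
    return itertools.cycle(instances)


instances = ["web-0", "web-1", "web-2"]
sticky = pick_by_ip_hash("10.0.0.7", instances)  # same result on every call
rotation = round_robin(instances)
first_four = [next(rotation) for _ in range(4)]  # web-0, web-1, web-2, web-0
```

With IP hash, a client keeps hitting the instance that holds its Session, but load follows the distribution of client addresses. With round robin, load is even, but consecutive requests from one client land on different instances.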

Both of these scenarios show that, for stateful services, the data of different instances is independent. The data is the “state”.

In comparison, stateless services are much more flexible. They either have no persistent data, or their persistent data can be shared. From the client's point of view, the response is the same no matter which instance handles the request. This means that the number of instances of a stateless service can be expanded at will, and traffic can be handled flexibly.

One of the biggest benefits of using cloud services is the elasticity and flexibility they provide. When the business encounters a traffic peak, it can quickly add instances in response. From this perspective, we would like services to be “stateless”. This raises a new question: can we remove the “state” of a service and turn it into a stateless service?

Handling the “state” of the Service

For Web services that use a sticky Session to maintain the login state, the state can be removed.

The principle is relatively simple: separate the Session from the Web application and store it in other middleware, such as MySQL, Redis, or Memcached. Common Web frameworks support this feature, and some even use it as the default option because it works so well.

After this change the Web service becomes a stateless service, and the number of instances can be expanded arbitrarily. No matter which instance a client request is assigned to, the login state is retrieved from the back-end store and the correct login state is returned. When deploying, you can choose stateless multi-instance deployment, that is, the Deployment resource type.
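The effect of moving the Session out of the Web application can be shown with a small model. This is a sketch under stated assumptions: SessionStore is an in-memory stand-in for a shared back end such as Redis or MySQL, and both class names are hypothetical rather than part of any framework.

```python
import uuid
from typing import Dict, Optional


class SessionStore:
    """In-memory stand-in for a shared session back end."""

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def save(self, session_id: str, user: str) -> None:
        self._sessions[session_id] = user

    def load(self, session_id: str) -> Optional[str]:
        return self._sessions.get(session_id)


class WebInstance:
    """A Web instance that keeps no session data of its own."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def login(self, user: str) -> str:
        session_id = uuid.uuid4().hex
        self.store.save(session_id, user)  # state lives in the shared store
        return session_id

    def whoami(self, session_id: str) -> Optional[str]:
        return self.store.load(session_id)


store = SessionStore()
web_0 = WebInstance("web-0", store)
web_1 = WebInstance("web-1", store)
session_id = web_0.login("alice")
user_seen_by_other_instance = web_1.whoami(session_id)  # "alice"
```

Because both instances read from the same store, a login handled by web-0 is visible to web-1, so requests can be distributed round-robin without losing the login state.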

For database middleware, however, the state cannot simply be removed.

The reason is that this type of middleware uses its own mechanism to ensure strong data consistency, such as MySQL’s lock file mechanism. A given instance can only read and write the data that belongs to it. For this type of stateful service, exclusive use of persistent data by each instance can be regarded as a necessary condition, and arbitrarily expanding the number of instances leads to fatal problems such as data inconsistency or program failure. Does this mean that such stateful services can only be deployed as a single point?

The vendors and communities behind these database middleware products also care about high-availability solutions to the problems above. Database middleware launched in recent years is often designed as a distributed architecture from the start. For example, etcd describes itself as a reliable and strongly consistent distributed key-value database. Internally, it uses the Raft protocol to hold elections between instances and agree on a single leader. Older database middleware such as MySQL also has a master-slave cluster solution based on binlog replication.

So for services whose state cannot be removed, our approach is to follow the cluster solution that the service itself supports in order to achieve high availability and scale the number of instances.

When actually deploying these cluster solutions, it can be concluded that most cluster solutions need to meet the following conditions:

  • Each instance mounts separate persistent data;
  • Instances need to obtain each other’s communication addresses, such as resolvable host names or domain names, in order to hold elections, synchronize data, and so on. Always use the host name or domain name rather than the instance IP: when an instance restarts, its host name or domain name does not change, but its IP may;
  • There are requirements on the number of instances. Generally an odd number such as 3, 5, or 7 is chosen to ensure that the cluster does not suffer from split brain.
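The preference for odd instance counts follows from majority voting, as used by protocols such as Raft. The sketch below assumes a simple majority quorum; the function names are hypothetical, and a specific cluster solution may define its quorum differently.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority still remains."""
    return members - quorum(members)


for size in (3, 4, 5):
    print(size, quorum(size), tolerated_failures(size))
```

A cluster of 3 needs 2 votes and tolerates 1 failure, while a cluster of 4 needs 3 votes and still tolerates only 1. The fourth instance adds cost without adding fault tolerance, and a 2-2 partition of a 4-member cluster leaves neither side with a majority.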

Recall the characteristics of the StatefulSet resource type: it meets all of the conditions above and was designed for stateful services. Therefore, for this type of stateful service, the component deployment type must be stateful single instance or stateful multi-instance.