Why I use AWS autoscaling groups to deploy Couchbase clusters
At first this may sound like an odd idea, because one can argue that the autoscaling group concept is meant for stateless applications such as application servers and web servers. That may well have been the original intent, but modern distributed databases are also a good fit for autoscaling groups, and we gain several advantages from using them. Unlike with stateless applications, we do not have to rely on autoscaling policies to kick off autoscaling activities for databases; instead, we can manage the autoscaling group manually and trigger those activities in a much more controlled fashion (although policies remain an option). For example, to add a node to a Couchbase cluster, we increase the desired capacity of the relevant autoscaling group by one, so that an identical EC2 instance is spun up, joins the cluster, and rebalances automatically. (The user data section of the launch configuration holds all the logic to install and configure Couchbase on the new node, join it to the cluster, and rebalance.) We can do this during a low-usage period so that data streaming has minimal impact on application performance.
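To make the "increase the desired capacity by one" step concrete, here is a minimal sketch in Python. It assumes a boto3 Auto Scaling client is passed in; the function name add_node and the group name in the usage comment are hypothetical, and the guard against MaxSize is my own addition rather than something the setup above requires.

```python
def add_node(asg_client, group_name):
    """Raise the desired capacity of an autoscaling group by one.

    asg_client is expected to behave like boto3.client("autoscaling").
    Returns the new desired capacity.
    """
    resp = asg_client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    groups = resp["AutoScalingGroups"]
    if not groups:
        raise ValueError(f"autoscaling group not found: {group_name}")

    group = groups[0]
    target = group["DesiredCapacity"] + 1
    # Assumption: refuse to exceed MaxSize here instead of letting the API reject it.
    if target > group["MaxSize"]:
        raise ValueError(
            f"desired capacity {target} would exceed MaxSize {group['MaxSize']}"
        )

    asg_client.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=target,
        HonorCooldown=False,  # manual, operator-driven change
    )
    return target


# Usage (hypothetical group name), run during a low-usage window:
#   import boto3
#   add_node(boto3.client("autoscaling"), "couchbase-data-nodes")
```

The function only changes the group size. Installing Couchbase, joining the cluster, and rebalancing are still left to the user data script of the launch configuration.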
Another advantage is that the autoscaling group balances new instances across availability zones (AZs) automatically, so no additional logic is required. Balancing nodes across AZs is crucial for distributed databases, for reasons such as rack awareness and disaster recovery in case of an AZ failure.
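If you want to verify the AZ spread yourself, a small sketch like the following can help. It works on one group entry as returned by describe_auto_scaling_groups in boto3 (the "AutoScalingGroups" list items); the helper names and the "differ by at most one" definition of balanced are my own assumptions for illustration.

```python
from collections import Counter


def instances_per_az(group):
    """Count in-service instances per availability zone for one group entry."""
    return Counter(
        inst["AvailabilityZone"]
        for inst in group.get("Instances", [])
        if inst.get("LifecycleState") == "InService"
    )


def is_balanced(group):
    """True if the per-AZ instance counts differ by at most one.

    Zones configured on the group but holding no instances count as zero.
    """
    zones = group.get("AvailabilityZones", [])
    if not zones:
        return True
    counts = instances_per_az(group)
    per_zone = [counts.get(zone, 0) for zone in zones]
    return max(per_zone) - min(per_zone) <= 1
```

Feeding it the group entry for the Couchbase autoscaling group gives a quick check that can run after each scaling activity.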
If a node fails because of a hardware or software issue, a new node spins up to bring the autoscaling group back to its configured number of nodes, so no human intervention is necessary at all. This is auto-healing.
Downsizing a database cluster may not be a good idea and will not happen frequently, but there are situations where we need to shrink the cluster. The autoscaling group still gives us that ability, so we can use it in non-prod environments to reduce AWS cost or to right-size the cluster. In a nutshell, using autoscaling groups is a very good idea for highly distributed databases like Couchbase and Cassandra. It gives us excellent features; otherwise we would have to program all of this logic ourselves, reinventing the wheel.
About the post header picture: It was taken during an after-lunch walk near the Ascend Learning office in Leawood, Kansas City, on Dec 8th, 2019.