Scale Meta Service Nodes
Overview
This guide explains how to scale a Databend Meta Service cluster by adding new nodes (scale up) or removing existing nodes (scale down).
Prerequisites
- Have completed Deploy Meta Service and have a running Meta Service node
- Have completed Prepare Package Environment on the new node (for scale up)
- Have sudo privileges on the nodes
Scale Up: Add New Meta Service Node
To add a new Meta Service node, follow the steps in Deploy Meta Service on the new node. Make sure to:
- Copy the configuration from an existing node:
sudo mkdir -p /etc/databend
sudo scp root@<existing-node-ip>:/etc/databend/databend-meta.toml /etc/databend/
- Modify the configuration file:
sudo vim /etc/databend/databend-meta.toml
Update the following settings:
[raft_config]
id = 1 # Change this to a unique ID for each node (1, 2, 3, etc.)
single = false # Change this to false for multi-node deployment
join = ["127.0.0.1:28004"] # Add all existing Meta Service nodes here
- Follow the deployment steps in Deploy Meta Service to install and start the service on the new node.
- Verify the new node is added to the cluster:
databend-metactl status
- After confirming the new node is in the cluster, update the configuration on all existing nodes:
sudo vim /etc/databend/databend-meta.toml
On each existing node, update the following settings:
[raft_config]
single = false # Change this to false
join = ["127.0.0.1:28004", "127.0.0.2:28004", "127.0.0.3:28004"] # Add all Meta Service nodes including the new one
- Update the Meta Service endpoints in all Query nodes' configuration:
sudo vim /etc/databend/databend-query.toml
Update the following settings in each Query node:
[meta]
endpoints = ["127.0.0.1:9191", "127.0.0.2:9191", "127.0.0.3:9191"] # Add all Meta Service nodes including the new one
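The join and endpoints lists above follow the same pattern, so they can be generated from a single list of node IPs instead of being typed by hand on every node. The following is a minimal sketch, not part of Databend: the helper name toml_array is hypothetical, and the ports 28004 and 9191 are assumed from the examples in this guide, so adjust them if your deployment uses different ones.

```shell
# Hypothetical helper (not a Databend tool): render a TOML string array
# from a port and a list of host IPs.
# Usage: toml_array <port> <ip>...
toml_array() {
  port="$1"
  shift
  out=""
  for ip in "$@"; do
    # Put ", " between entries, but not before the first one.
    out="${out}${out:+, }\"${ip}:${port}\""
  done
  printf '[%s]\n' "$out"
}

# Ports are assumed from the examples in this guide.
echo "join = $(toml_array 28004 127.0.0.1 127.0.0.2 127.0.0.3)"
echo "endpoints = $(toml_array 9191 127.0.0.1 127.0.0.2 127.0.0.3)"
```

The two printed lines can be pasted into the [raft_config] section of databend-meta.toml and the [meta] section of databend-query.toml respectively. The sketch only prints text and does not modify any file.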
Scale Down: Remove Meta Service Node
To remove a Meta Service node from the cluster, follow these steps:
- First, check if the node to be removed is the current leader:
databend-metactl status
Look for the "leader" information in the output. If the node to be removed is the leader, you need to transfer leadership first.
- If the node is the leader, transfer leadership to another node:
databend-metactl transfer-leader
- Verify the leadership transfer:
databend-metactl status
Confirm that the leadership has been transferred to the target node.
- Now, gracefully remove the node from the cluster using the databend-meta command:
databend-meta --leave-id <node_id_to_remove> --leave-via <node_addr_1> <node_addr_2>...
For example, to remove the node with ID 1:
databend-meta --leave-id 1 --leave-via 127.0.0.1:28004
Note:
- --leave-id specifies the ID of the node to remove
- --leave-via specifies the list of node advertise addresses to send the leave request to
- The command can be run from any node with databend-meta installed
- The node will be blocked from cluster interaction until the leave request is completed
- Check if the node has been successfully removed from the cluster:
databend-metactl status
Verify that the node ID is no longer listed in the cluster members.
- After confirming the node is removed from the cluster, stop the service:
sudo systemctl stop databend-meta
- Verify the cluster status from any remaining node:
databend-metactl status
- After confirming the cluster is stable, update the configuration on all remaining Meta nodes:
sudo vim /etc/databend/databend-meta.toml
On each remaining node, update the following settings:
[raft_config]
join = ["127.0.0.2:28004", "127.0.0.3:28004"] # Remove the leaving node from the list
- Update the Meta Service endpoints in all Query nodes' configuration:
sudo vim /etc/databend/databend-query.toml
Update the following settings in each Query node:
[meta]
endpoints = ["127.0.0.2:9191", "127.0.0.3:9191"] # Remove the leaving node from the list
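Because the order of the removal steps matters, it can help to print the command sequence for review before running anything. The following is a minimal dry-run sketch, assuming the node is not the current leader (the leadership transfer step is left out). The helper name print_leave_plan is hypothetical; it uses only the commands shown in this guide and executes none of them.

```shell
# Hypothetical dry-run helper (not a Databend tool): print the scale-down
# commands from this guide in order, without executing any of them.
# Usage: print_leave_plan <node_id_to_remove> <leave_via_addr>...
print_leave_plan() {
  node_id="$1"
  shift
  # 1. Check the current members and leader.
  echo "databend-metactl status"
  # 2. Ask the cluster to remove the node.
  echo "databend-meta --leave-id ${node_id} --leave-via $*"
  # 3. Confirm the node ID is no longer listed.
  echo "databend-metactl status"
  # 4. Stop the service (run this one on the removed node).
  echo "sudo systemctl stop databend-meta"
}

# Example values taken from this guide.
print_leave_plan 1 127.0.0.1:28004
```

Review the printed commands, then run them one at a time and check the output of each status call before continuing.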
Troubleshooting
If you encounter issues:
- Check the service status:
sudo systemctl status databend-meta
- View the logs for detailed error messages:
# View systemd logs
sudo journalctl -u databend-meta -f
# View log files in /var/log/databend
sudo tail -f /var/log/databend/databend-meta-*.log
- Common issues and solutions:
- Permission denied: Ensure the databend user has proper permissions
- Port already in use: Check if another service is using the configured ports
- Configuration errors: Verify the configuration file syntax and paths
- Raft connection issues: Ensure all Meta Service nodes can communicate with each other
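For the Raft connection item, one quick check is whether each peer's address accepts TCP connections from the current node. The following is a minimal sketch under two assumptions: a netcat (nc) that supports the -z and -w options is installed, and the peers listen on the Raft addresses used in this guide. The helper names are hypothetical, and a reachable port shows only that the network path is open, not that the cluster is healthy.

```shell
# Hypothetical helpers (not Databend tools) for a basic connectivity check.
# Assumes a netcat (nc) that supports -z (connect only) and -w (timeout).

# addr_host <host:port> prints the host part.
addr_host() { printf '%s\n' "${1%:*}"; }

# addr_port <host:port> prints the port part.
addr_port() { printf '%s\n' "${1##*:}"; }

# check_peers <host:port>... reports whether each address accepts TCP connections.
check_peers() {
  for addr in "$@"; do
    if nc -z -w 2 "$(addr_host "$addr")" "$(addr_port "$addr")" 2>/dev/null; then
      echo "reachable: $addr"
    else
      echo "unreachable: $addr"
    fi
  done
}

# Example, run on a Meta node (addresses are the ones used in this guide):
# check_peers 127.0.0.2:28004 127.0.0.3:28004
```

An unreachable result usually points to a firewall rule, a wrong address in the configuration, or a stopped service on the peer.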
Next Steps
After successfully scaling your Meta Service cluster, you can:
- Scale Query Service Nodes (if needed)
- Upgrade Meta Service (if needed)