Install a distributed etcd cluster on Debian 12.
This post will cover how to install a distributed etcd cluster on Debian 12.
What is etcd ?
"etcd is a distributed, reliable key-value store for the most critical data of a distributed system."
Installing
On all nodes, run apt-get install etcd-server
.
At the time writing this the Debian bookworm version of etcd-server
is 3.4.23-4+b4
.
On one node you will need etcd-client
to run etcdctl
and check the network status.
Topology
This will be a 3 node setup, the "Tours" and "Paris" nodes have private and public ips (via Internet) to communicate between each other. The "Mini" node will only be able to communicate with the other nodes via the public Internet. The "Mini" node can receive traffic on it's IPv4 IP, but not on it's IPv6 IP.
- private link
- +----------------+ +----------------+
- | +----------------->| |
- | Tours | | Paris |
- | |<-----------------+ |
- | | | |
- +-----+--+-------+ +---+------+-----+
- ^ | | ^ Internet link | ^ | ^
- | | | | | | | |
- | | | +----------------------------+ | | |
- | | +--------------------------------+ | |
- | | | |
- | | | |
- | | Internet link | | Internet link
- | | | |
- | | | |
- | | | |
- | | | |
- | | +----------------+ | |
- | | | | | |
- | +---------->| Mini |<----------+ |
- +-------------+ +-------------+
- | |
- +----------------+
Configuring
The variables
You can find them on etcd.io 3.4 op-guide.
ETCD_NAME
is the human-readable name for the node.ETCD_DATA_DIR
is important as it is the path to the data directory.ETCD_LISTEN_CLIENT_URLS
is the list (comma separated) of URLs to listen (port2379
) on for client traffic. The format isscheme://IP:port
, to bind all IPs use0.0.0.0
.ETCD_LISTEN_PEER_URLS
same as clients but using the port2380
.ETCD_ADVERTISE_CLIENT_URLS
is a list of this member's client URLs to advertise to the rest of the cluster. These URLs can contain domain names. Avoidlocalhost
as it may create infinite loops. Example:http://example.com:2379, http://10.0.0.1:2379
Bootstrap variables for the cluster
ETCD_INITIAL_ADVERTISE_PEER_URLS
is a list of this member's peer URLs to advertise to the rest of the cluster. These addresses are used for communicating etcd data around the cluster. At least one must be routable to all cluster members. These URLs can contain domain names.ETCD_INITIAL_CLUSTER
is the initial cluster configuration for bootstrapping. Example:default=http://localhost:2380
ETCD_INITIAL_CLUSTER_STATE
is a value to set to "new" for all members present during initial static or DNS bootstrapping. If this option is set to "existing", etcd will attempt to join the existing cluster. If the wrong value is set, etcd will attempt to start but fail safely.ETCD_INITIAL_CLUSTER_TOKEN
A value to use for the initial cluster token for the etcd cluster during bootstrap.
The mini node
Edit: /etc/default/etcd
ETCD_NAME="company-name-mini"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.11:2379,http://127.0.0.1:2379"
ETCD_LISTEN_PEER_URLS="http://192.168.1.11:2380"
# Public IPs here or private ones if applicable
ETCD_ADVERTISE_CLIENT_URLS="http://82.11.22.33:2379"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="company-name-etcd-cluster-random-string"
# Similar to ETCD_ADVERTISE_CLIENT_URLS with another port
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://82.11.22.33:2380"
ETCD_INITIAL_CLUSTER="company-name-mini=http://82.11.22.33:2380,company-name-tours=http://195.8.70.6:2380,company-name-paris=http://185.132.200.12:2380"
The "Paris" node
Edit: /etc/default/etcd
ETCD_NAME="company-name-paris"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_CLIENT_URLS="http://10.18.22.5:2379,http://127.0.0.1:2379"
ETCD_LISTEN_PEER_URLS="http://10.18.22.5:2380"
# Public IPs here or private ones if applicable
ETCD_ADVERTISE_CLIENT_URLS="http://185.132.200.12:2379"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="company-name-etcd-cluster-random-string"
# Similar to ETCD_ADVERTISE_CLIENT_URLS with another port
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://185.132.200.12:2380"
ETCD_INITIAL_CLUSTER="company-name-mini=http://82.11.22.33:2380,company-name-tours=http://195.8.70.6:2380,company-name-paris=http://185.132.200.12:2380"
The "Tours" node
Edit: /etc/default/etcd
ETCD_NAME="company-name-tours"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_CLIENT_URLS="http://172.16.22.3:2379,http://127.0.0.1:2379"
ETCD_LISTEN_PEER_URLS="http://172.16.22.3:2380,http://127.0.0.1:2380"
# Public IPs here or private ones if applicable
ETCD_ADVERTISE_CLIENT_URLS="http://195.8.70.6:2379"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="company-name-etcd-cluster-random-string"
# Similar to ETCD_ADVERTISE_CLIENT_URLS with another port
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://195.8.70.6:2380"
ETCD_INITIAL_CLUSTER="company-name-mini=http://82.11.22.33:2380,company-name-tours=http://195.8.70.6:2380,company-name-paris=http://185.132.200.12:2380"
Testing
Firewall tests
You will need to add this script listen.py
on all machines, it only does a port listen:
#!/usr/bin/python
import socket
# Source: https://gist.github.com/echojc/5632656
port = 2380
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('', port))
s.listen(1)
while 1:
(c, a) = s.accept()
print('connection from: ', a)
Then run it:
# Stop etcd if it is running
service etcd stop
# WARNING: delete any previous data
rm -rv /var/lib/etcd/*
# Listen to the port
python3 listen.py &
# Write the pid in a file
echo $! >/tmp/test_server.pid
Also, source of the port testing command: a post on superuser.
On the mini node
# Test itself
</dev/tcp/192.168.2.32/2380 && echo Port is open || echo Port is closed
# Test Tours
</dev/tcp/195.8.70.6/2380 && echo Port is open || echo Port is closed
# Test Paris
</dev/tcp/185.132.200.12/2380 && echo Port is open || echo Port is closed
On the Paris node
# Test itself
</dev/tcp/10.18.22.5/2380 && echo Port is open || echo Port is closed
# Test Mini
</dev/tcp/82.11.22.33/2380 && echo Port is open || echo Port is closed
# Test Tours
</dev/tcp/195.8.70.6/2380 && echo Port is open || echo Port is closed
On the Tours node
# Test itself
</dev/tcp/172.16.22.3/2380 && echo Port is open || echo Port is closed
# Test Mini
</dev/tcp/82.11.22.33/2380 && echo Port is open || echo Port is closed
# Test Paris
</dev/tcp/185.132.200.12/2380 && echo Port is open || echo Port is closed
Start the cluster
Notes: You must start at least 2 etcd nodes whitin 50 seconds or it will return error because of the master election timeout.
Note: One mistake I made while doing the setup was not having the same ETCD_INITIAL_CLUSTER
on all 3 machines.
It did give some weird context deadline exceeded
and exceeded header timeout
.
On all machines, run:
# Kill the PID
kill $(cat /tmp/test_server.pid)
# Remove the PID file
rm -v /tmp/test_server.pid ./listen.py
service etcd start
# Maybe wait one second and run:
curl http://127.0.0.1:2380/version
# On my setup it outputs: {"etcdserver":"3.4.23","etcdcluster":"3.4.0"}
Inspect the cluster
# Run this on the node(s) where etcd-client is installed
etcdctl --endpoints=127.0.0.1:2380 endpoint status -w table
Split in two for small screens ;)
+----------------+-----------------+---------+---------+-----------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER |
+----------------+-----------------+---------+---------+-----------+
| 127.0.0.1:2380 | 934de65da148d6d | 3.4.23 | 20 kB | false |
+----------------+-----------------+---------+---------+-----------+
+------------+-----------+------------+--------------------+--------+
| IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------+-----------+------------+--------------------+--------+
| false | 3 | 14 | 14 | |
+------------+-----------+------------+--------------------+--------+
etcdctl --endpoints=http://127.0.0.1:2380 member list
# 934de65da148d6d, started, company-name-mini, http://82.11.22.33:2380, http://82.11.22.33:2379, false
# 6c34ed0ae8842035, started, company-name-paris, http://185.132.200.12:2380, http://185.132.200.12:2379, false
# d9f8d0469a6bd93d, started, company-name-tours, http://195.8.70.6:2380, http://195.8.70.6:2379, false
# Ask all nodes at once (set one to 127.0.0.1):
etcdctl --endpoints=http://82.11.22.33:2380,http://195.8.70.6:2380,http://127.0.0.1:2380 endpoint status -w table
etcdctl --endpoints=http://82.11.22.33:2380,http://195.8.70.6:2380,http://127.0.0.1:2380 member list -w table
etcdctl --endpoints=http://82.11.22.33:2380,http://195.8.70.6:2380,http://127.0.0.1:2380 alarm list
Test the cluster
# On one node
etcdctl --endpoints=http://127.0.0.1:2380 put testkey okay
# Try on all nodes
etcdctl --endpoints=http://127.0.0.1:2380 get testkey
# On one node
etcdctl --endpoints=http://127.0.0.1:2380 del testkey