This guide describes how to bootstrap a new production CoreOS cluster as a high-availability service in 15 minutes, using etcd2, Fleet, Flannel, Confd, an Nginx balancer, and Docker.
- Introduction
- Basic Configuration
- Usage with Deis v1
- Appendix 1 - Info and Tutorials
- Appendix 2 - Tools and Services
CoreOS is a powerful Linux distribution built to make large, scalable deployments on varied infrastructure simple to manage.
CoreOS is designed for security, consistency, and reliability. Instead of installing packages via yum or apt, CoreOS uses Linux containers to manage your services at a higher level of abstraction. A single service's code and all dependencies are packaged within a container that can be run on one or many CoreOS machines.
The main building blocks of CoreOS are etcd, Docker and systemd.
See: 7 reasons why you should be using CoreOS with Docker.
- Find your Cloud Config file location. The examples below use:
/var/lib/coreos-install/user_data
- Open your config to edit:
sudo vi /var/lib/coreos-install/user_data
- Generate a new token for your cluster at https://discovery.etcd.io/new?size=X, where X is the number of servers.
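The token request can also be scripted. The following is a minimal sketch, assuming `curl` is available and the machine can reach discovery.etcd.io; the function names are illustrative and not part of any CoreOS tool.

```shell
#!/bin/sh
# Build the token request URL for a cluster of $1 servers.
discovery_request_url() {
    printf 'https://discovery.etcd.io/new?size=%s\n' "$1"
}

# Request a new discovery URL from the public discovery service.
# Needs network access; the response body is the URL to put in the config.
new_discovery_url() {
    curl -fsS "$(discovery_request_url "$1")"
}

# Usage (not run here): new_discovery_url 3
```

Request one token per cluster and reuse the same discovery URL on every server of that cluster.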
- Merge the following lines into your Cloud Config:
coreos:
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # discovery: https://discovery.etcd.io/<token>
    discovery: https://discovery.etcd.io/9c19239271bcd6be78d4e8acfb393551
    # multi-region and multi-cloud deployments need to use $public_ipv4
    advertise-client-urls: http://$private_ipv4:2379,http://$private_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    # listen on both the official ports and the legacy ports
    # legacy ports can be omitted if your application doesn't depend on them
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://$private_ipv4:2380
  fleet:
    public-ip: $private_ipv4
    metadata: region=europe,public_ip=$public_ipv4
  units:
    - name: etcd2.service
      command: start
      # See issue: https://github.com/coreos/etcd/issues/3600#issuecomment-165266437
      drop-ins:
        - name: "timeout.conf"
          content: |
            [Service]
            TimeoutStartSec=0
    - name: fleet.service
      command: start
    - name: flanneld.service
      command: enable
      drop-ins:
        - name: 50-network-config.conf
          content: |
            [Service]
            ExecStartPre=/usr/bin/etcdctl set /coreos.com/network/config '{ "Network": "10.1.0.0/16" }'
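The `discovery:` value in the config above must be replaced with your own token URL on every server. A minimal sketch of doing that with `sed` follows; `set_discovery_url` is a hypothetical helper, and it assumes the URL contains neither `|` nor `&`. The commented `# discovery:` example line is left untouched.

```shell
#!/bin/sh
# Replace the active "discovery:" line in a Cloud Config file.
# $1 = path to the config, $2 = new discovery URL.
set_discovery_url() {
    file=$1
    url=$2
    # Keep the original indentation (\1) and rewrite only the value.
    sed "s|^\([[:space:]]*\)discovery:.*|\1discovery: $url|" "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&   # write back in place to keep owner and mode
        rm -f "$file.tmp"
}

# Usage (not run here; needs root for the path used in this guide):
#   set_discovery_url /var/lib/coreos-install/user_data https://discovery.etcd.io/<token>
```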
- For RPN-Online, also add the following lines to get the private network working:
  units:
    # ...
    - name: 00-eno2.network
      runtime: true
      content: "[Match]\nName=eno2\n\n[Network]\nDHCP=yes\n\n[DHCP]\nUseMTU=9000\n"
- Validate your changes:
sudo coreos-cloudinit -validate --from-file /var/lib/coreos-install/user_data
- Reboot the system:
sudo reboot
- Check status for etcd2:
sudo systemctl status -r etcd2
The output should contain the following line:
Active: active (running)
This can take some time. Don't panic; just wait a few minutes.
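Instead of re-running the status command by hand, the wait can be scripted. This is a minimal sketch of a generic retry loop; `wait_until` is a hypothetical helper, and the retry count and delay in the usage line are arbitrary values.

```shell
#!/bin/sh
# Retry a command until it succeeds.
# Usage: wait_until <tries> <delay-seconds> <command> [args...]
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0               # command succeeded, stop waiting
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1                           # gave up after the last attempt
}

# Usage on a cluster node (not run here): poll etcd2 every 10 s, up to 30 times.
#   wait_until 30 10 systemctl is-active --quiet etcd2 && echo "etcd2 is running"
```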
If something goes wrong, use the following commands to debug:
# etcd2
sudo systemctl start etcd2
sudo systemctl status etcd2
sudo journalctl -xe
sudo journalctl -ru etcd2
- Repeat these steps for each server in your cluster.
- Check your cluster health and fleet status:
# should be healthy
sudo etcdctl cluster-health
# should display all servers
sudo fleetctl list-machines
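The "should display all servers" check can be automated by counting the rows that `fleetctl list-machines` prints. The sketch below assumes the output starts with a single header line followed by one line per machine; the function names are illustrative.

```shell
#!/bin/sh
# Count machine rows on stdin, skipping the header line and blank lines.
count_machines() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Succeed only if stdin lists exactly $1 machines.
has_all_machines() {
    [ "$(count_machines)" -eq "$1" ]
}

# Usage on a cluster node (not run here), for a three-server cluster:
#   sudo fleetctl list-machines | has_all_machines 3 && echo "all machines present"
```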
See: Launching Containers with fleet
- Change to your home directory:
cd ~
- Create a new Application Template Unit. For example, run
vi test-app@.service
and add the following lines:
[Unit]
Description=test-app%i
After=docker.service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill test-app%i
ExecStartPre=-/usr/bin/docker rm test-app%i
ExecStartPre=/usr/bin/docker pull willrstern/node-sample
ExecStart=/usr/bin/docker run -e APPNAME=test-app%i --name test-app%i -P willrstern/node-sample
ExecStop=/usr/bin/docker stop test-app%i
- Submit the Application Template Unit to Fleet:
fleetctl submit test-app@.service
- Start new instances from the Application Template Unit:
fleetctl start test-app@1
fleetctl start test-app@2
fleetctl start test-app@3
- Check that all instances have started and are active. This can take a few minutes. Example command and its output:
$ fleetctl list-units
UNIT                MACHINE                 ACTIVE  SUB
test-app@1.service  e1512f34.../10.1.9.17   active  running
test-app@2.service  a78a3229.../10.1.9.18   active  running
test-app@3.service  081c8a1e.../10.1.9.19   active  running
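To spot units that have not reached the active state without reading the listing by eye, the output can be filtered. This is a minimal sketch that assumes the four-column layout shown in the listing above (UNIT, MACHINE, ACTIVE, SUB); `inactive_units` is a hypothetical helper.

```shell
#!/bin/sh
# Print the names of units whose ACTIVE column is not "active".
# Reads `fleetctl list-units` output on stdin; line 1 is the header.
inactive_units() {
    awk 'NR > 1 && NF >= 3 && $3 != "active" { print $1 }'
}

# Usage on a cluster node (not run here):
#   fleetctl list-units | inactive_units
```

An empty result means every listed unit reports active.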
- If fleetctl list-units shows the failed state for any unit, check that unit's journal:
sudo fleetctl journal testapp@1
- If the journal contains an error like this:
Error response from daemon: Conflict. The name "testapp1" is already in use by container c4acbb70c654. You have to delete (or rename) that container to be able to reuse that name.
stop the unit, remove the stale container, and start the unit again:
fleetctl stop testapp@1
docker rm testapp1
fleetctl start testapp@1
- If the fleetctl ssh command does not work:
  - Ensure your public key has been added to user_data on each server.
  - Connect to your server with the SSH agent:
eval `ssh-agent -s`
ssh-add ~/.ssh/id_rsa
ssh -A <your-host>
Run custom-firewall.sh from Deis v1 on your local machine:
curl -O https://raw.githubusercontent.com/deis/deis/master/contrib/util/custom-firewall.sh
# run the following line for each server
ssh core@<host1> 'bash -s' < custom-firewall.sh
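Running that line once per server can be wrapped in a loop. The sketch below is one way to do it; `run_script_on_hosts` and the `SSH_CMD` override are hypothetical names, with `SSH_CMD` existing only so the loop can be dry-run without touching a server.

```shell
#!/bin/sh
# Run a local script on each host over SSH as the core user.
# $1 = script path, remaining arguments = host names.
# SSH_CMD (hypothetical) replaces the ssh binary, e.g. for a dry run.
run_script_on_hosts() {
    script=$1
    shift
    for host in "$@"; do
        # Feed the script to a remote bash; stop at the first failure.
        ${SSH_CMD:-ssh} "core@$host" 'bash -s' < "$script" || return 1
    done
}

# Usage (not run here):
#   run_script_on_hosts custom-firewall.sh <host1> <host2> <host3>
```

Stopping at the first failure keeps a broken firewall script from being applied to the remaining servers.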
- Modify the attached files according to your application config.
- Submit them to Fleet:
fleetctl submit someapp@.service
fleetctl submit someapp-discovery@.service
fleetctl submit someapp-lb@.service
- Start Unit instances from templates:
fleetctl start someapp@{1..6}
fleetctl start someapp-discovery@{1..6}
fleetctl start someapp-lb@{1..2}
Attention! Deis v1 does not seem to work correctly with Online.net and other bare-metal setups, because Ceph, which v1 uses, is unstable and unpredictable. If you would still like to experiment, proceed as follows:
- Create a backup copy of your original config:
sudo cp /var/lib/coreos-install/user_data /var/lib/coreos-install/user_data.without-deis1
- Merge your Cloud Config with the Deis Cloud Config example.
- You can configure the Deis Platform from your workstation by following these instructions. The steps below are adapted for a server environment.
- Download deisctl:
curl -sSL http://deis.io/deisctl/install.sh | sudo sh -s 1.12.3
- Set your configuration:
deisctl config platform set domain=<your-domain>
- Run platform installation:
deisctl install platform
- Boot up Deis:
deisctl start platform
If you run into problems, check the Docker containers:
docker ps -a
You can also use the deisctl journal and status commands to debug.
- Once you see “Deis started.”, your Deis platform is running on the cluster. Verify that all Deis units are loaded by running:
deisctl list
All Deis units should be active. Otherwise you can destroy them all; don't forget to remove unused Docker volumes afterwards.
- Building Microservices with CoreOS & etcd [video talk]
- How To Set Up a CoreOS Cluster on DigitalOcean [article]
- http://jasonwilder.com/blog/2014/03/25/automated-nginx-reverse-proxy-for-docker/
- http://blog.stevenedouard.com/high-availability-apps-via-fleet-coreos-from-start-to-finish-provisioning-coreos-using-azure-resource-manager/
- https://blog.docker.com/2015/04/tips-for-deploying-nginx-official-image-with-docker/
- http://blog.scottlowe.org/2014/08/20/coreos-continued-fleet-and-docker/
- Nginx Load Balancer Service For Core OS
- ServerFault: Nginx proxy to many container running on different CoreOS nodes
- http://infoslack.com/devops/creating-a-cluster-with-coreos-and-docker/
- Gist: Running a High Availability Service on CoreOS using Docker, Fleet, Flannel, Etcd, Confd & Nginx
- https://www.digitalocean.com/community/tutorials/how-to-secure-your-coreos-cluster-with-tls-ssl-and-firewall-rules
- https://github.com/sedouard/fleet-bootstrapper
- https://github.com/coreos/etcd/blob/master/Documentation/runtime-reconf-design.md#do-not-use-public-discovery-service-for-runtime-reconfiguration