Containerized OpenShift Clusters and openshift-ansible
I had a moderately large PR land recently in openshift-ansible, the result of our ongoing efforts to improve the experience for fully containerized OpenShift clusters. This sounds like the desired path forward in upstream Kubernetes, and in my own testing it feels like a much more convenient way to install, run, and upgrade a cluster, so I'm hopeful this post will encourage some others to start thinking about this deployment type.
The pull request addresses a number of issues. First, there's a new inventory variable, openshift_release, that you can use to install your cluster without having to know a fully qualified x.y.z.w version.
NOTE: This will work for Origin as well, as soon as I can find someone who can get me v1.2 tags for the appropriate images.
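As a minimal sketch, seeding the release in an inventory might look like this (the [OSEv3:vars] group name, the v3.2 value, and the containerized flag follow the usual openshift-ansible inventory conventions; adjust for your environment):

```ini
# Inventory fragment: seed the desired release without a full x.y.z.w version.
[OSEv3:vars]
# Seed value; converted to a specific version on the first master at install time.
openshift_release=v3.2
# Run a fully containerized cluster, as discussed in this post.
containerized=true
```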
This is largely a seed value: it's used when setting up the cluster and is immediately converted to a very specific x.y.z.w version for the first master. After this, we use the first master's current version as the seed for all other masters and nodes in the cluster. That version should never change from running the config playbooks, unless you explicitly use an override variable (see below) or run an upgrade playbook.
The override variables openshift_image_tag (for containerized installs) and openshift_pkg_version (for RPM installs) are still valid and can be used on their own, or in conjunction with openshift_release if desired, though that may be somewhat redundant. These are also an exception to the "never touch the running version" rule: if you modify or add these variables in your inventory and run the config playbooks (which we like to encourage for ongoing maintenance of the cluster), those versions will be applied, potentially triggering an upgrade with downtime. So try to avoid that unless you know exactly what you're doing.
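A sketch of explicit version overrides in the inventory (the values here are illustrative, not real releases; openshift_pkg_version is conventionally written with a leading dash since it's appended to the package name, but check the inventory documentation for your release):

```ini
# Inventory fragment: explicit version overrides. Use with care; applying
# these via the config playbooks can trigger an upgrade with downtime.
[OSEv3:vars]
# For containerized installs: pin the exact image tag (illustrative value).
openshift_image_tag=v3.2.1.9
# For RPM installs: pin the exact package version (illustrative value).
openshift_pkg_version=-3.2.1.9
```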
Performing an actual upgrade now happens in one place, be it major (3.1 to 3.2) or minor (from one 3.2.z release to the next).
To do this you'll need to remove openshift_release from your inventory, or update it appropriately. The upgrade will actually override it to 3.2 in this case; as such, that value is really just a seed today and could probably be removed after cluster creation. In the future, however, we'd like to transition to a process where you perform upgrades with the regular config playbook, just by updating something like openshift_release in your inventory. That is still a little ways down the road.
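As a hypothetical before/after sketch of preparing an inventory for a 3.2 upgrade (you could equally just delete the variable, since the upgrade playbook overrides it anyway):

```ini
# Before the upgrade, the inventory might have contained:
#   openshift_release=v3.1
#
# Bump it (or remove it entirely) before running the upgrade playbook:
[OSEv3:vars]
openshift_release=v3.2
```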
We’ve also done a lot of work around controlling the Docker version on your hosts. I’ll try to be polite and say that recent changes in Docker have disrupted things, there was a somewhat important requirement to get everyone running Docker 1.10 very quickly despite 1.9.1 being the norm when Origin 1.2 and OSE 3.2 shipped.
We now ship a Docker upgrade playbook that can be used to safely upgrade your environment serially, with proper node evacuation, hopefully preventing downtime. This Docker upgrade process is integrated into the overall 3.2 upgrade playbook going forward, improving on the previous upgrade, which could involve an unsafe Docker upgrade and restart.
Additionally, you can now control the Docker version to be configured or upgraded to with the inventory variable docker_version (e.g. docker_version=1.10.4). As above, once that version is configured we make sure never to modify it during a config playbook run unless you explicitly say otherwise.
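A sketch of pinning Docker in the inventory (the 1.10.4 value is the example from above; pick whatever version your release supports):

```ini
# Inventory fragment: pin the Docker version to configure or upgrade to.
# Once configured, config playbook runs will not change it unless you
# explicitly override it.
[OSEv3:vars]
docker_version=1.10.4
```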
We are still struggling with how to roll out Docker config changes during normal config playbook maintenance runs, as this can still cause an unsafe restart. More on that should be coming soon.
Additionally, a change merged just today dramatically shortens the time to install or upgrade a containerized cluster. In my simple testing of a single master and node, it cut install time in half, from 18 minutes to 9 (with pre-pulled images in both cases). This was accomplished by replacing the previous wrapper script on containerized hosts, which ran the client tools by starting a temporary Docker container each time (which is very costly). Instead, we now have Ansible tasks that sync the client binaries and symlinks out of that container as part of cluster setup/maintenance, meaning you can run them on the host immediately. This will likely be a pleasant improvement for containerized cluster admins even beyond using the ansible playbooks.
Hopefully this helps spread a little awareness about containerized installs; combined with RHEL/CentOS Atomic, I find it a pretty interesting way to manage a cluster. It sounds like there are some really interesting things coming for Fedora Atomic and OpenShift as well.
If you do try this out and encounter any problems please don’t hesitate to reach out via GitHub issues or #openshift-devel on Freenode.