DEVOXX UK Conference. Choosing a framework: Docker Swarm, Kubernetes, or Mesos. Part 3 / ua-hosting.company blog / Habr
Docker Swarm, Kubernetes, and Mesos are the most popular container orchestration frameworks. In his talk, Arun Gupta compares Docker Swarm and Kubernetes on the following aspects:
Service discovery.
Integration with Maven.
Rolling updates.
Creating a Couchbase database cluster.
By the end, you will have a clear idea of what each orchestration tool has to offer and will know how to use these platforms effectively.
Arun Gupta is a principal open-source technologist at Amazon Web Services who has been building developer communities at Sun, Oracle, Red Hat, and Couchbase for more than 10 years. He has extensive experience leading cross-functional teams that develop and run marketing campaigns and programs. He led engineering teams at Sun, is one of the founders of the Java EE team, and created the US branch of Devoxx4Kids. Arun Gupta has written more than 2,000 posts in IT blogs and has given talks in more than 40 countries.
DEVOXX UK Conference. Choosing a framework: Docker Swarm, Kubernetes, or Mesos. Part 1
DEVOXX UK Conference. Choosing a framework: Docker Swarm, Kubernetes, or Mesos. Part 2
Line 55 contains COUCHBASE_URI, which points to the service of this database; the service is also created with a Kubernetes configuration file. If you look at line ?, you can see kind: Service: this is a service I create under the name couchbase-service, and the same name appears on line 4. Several ports are listed below.
The key lines are 6 and 7. In the service I say: "Hey, these are the labels I am looking for!" These labels are simply key-value pairs, and line 7 points to my couchbase-rs-pod application. Below are the ports that give access to these labels.
On line 1? I create a new ReplicaSet, line 31 contains the name of the image, and lines 24-27 hold the metadata associated with my pod. This is exactly what the service looks for and what it connects to. At the end of the file, lines 55-56 and ? tie it together, saying: "use this service!"
So, I start my service with a replica set, and since every replica set exposes its port with the corresponding label, the service picks it up. From a developer's point of view, you simply call the service, and it then routes you to the replica set you need.
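A minimal sketch of what such a Service plus ReplicaSet pair might look like. This is an illustration, not the exact file from the talk: the names, ports, image, and the modern apps/v1 API version are all assumptions.

```yaml
# Hypothetical Service + ReplicaSet pair; names and ports are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: couchbase-service
spec:
  selector:
    app: couchbase-rs-pod      # the label the service is looking for
  ports:
  - port: 8091
    targetPort: 8091
---
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: couchbase-rs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: couchbase-rs-pod
  template:
    metadata:
      labels:
        app: couchbase-rs-pod  # pod metadata the service matches on
    spec:
      containers:
      - name: couchbase
        image: couchbase
        ports:
        - containerPort: 8091
```

The key point is that the Service's selector and the pod template's labels must match; the Service then routes traffic to whatever pods the ReplicaSet maintains.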
As a result, I have a WildFly pod that talks to the database backend through the Couchbase service. I can scale the frontend to several WildFly pods, each of which also connects to the Couchbase backend through the same service.
Later we will look at how a service located outside the cluster communicates through its IP address with elements that are located inside the cluster and have an internal IP address.
So stateless containers are fine, but how good an idea are stateful containers? Let's look at the storage options for stateful, or persistent, containers. Docker has 4 different approaches to data storage that you should pay attention to. The first is Implicit Per-Container: stateful containers such as Couchbase, MySQL, or MyDB all start with the default sandbox storage. That is, everything the database stores lives in the container itself; if the container disappears, the data disappears with it.
The second is Explicit Per-Container, when you create dedicated storage with the docker volume create command and save the data there. The third approach, Per-Host, maps storage onto the host: everything the container stores is simultaneously written to a host directory, so if the container fails, the data remains on the host. The fourth is Multi-Host, using several hosts, which is what you want in production. Your application containers run on one host, but you want the data stored somewhere on the network, using automatic mapping for distributed systems.
Each of these methods uses a specific storage location. Implicit and Explicit Per-Container store data on the host in /var/lib/docker/volumes. With the Per-Host method, a host directory is mounted into the container. For multi-host setups, solutions such as Ceph, GlusterFS, or NFS can be used.
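The four approaches above can be sketched as Docker CLI invocations. This is an illustrative sketch, not the speaker's exact commands: the image, volume names, mount paths, and the NFS server address are all assumptions, and the commands require a running Docker daemon.

```shell
# 1. Implicit per-container: data lives only in the container's filesystem
docker run -d --name db1 couchbase

# 2. Explicit per-container: a named volume under /var/lib/docker/volumes
docker volume create db-data
docker run -d --name db2 -v db-data:/opt/couchbase/var couchbase

# 3. Per-host: bind-mount a host directory into the container
docker run -d --name db3 -v /data/couchbase:/opt/couchbase/var couchbase

# 4. Multi-host: a volume backed by shared storage, e.g. NFS
#    (server address and export path are hypothetical)
docker volume create --driver local \
  --opt type=nfs --opt o=addr=nfs.example.com,rw \
  --opt device=:/exports/couchbase nfs-data
```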
If a persistent container fails, the storage becomes inaccessible in the first two cases and remains accessible in the last two. However, in the first case you can still reach the storage through the Docker host running on the virtual machine, and in the second case the data is not lost either, because you created Explicit storage.
If the host fails, the storage directory is inaccessible in the first three cases; only in the last case does the connection to the storage survive. Finally, sharing is impossible in the first case and possible in the rest. In the second case, you can share storage depending on whether your database supports distributed storage. With Per-Host, data distribution is possible only on that host, while for multi-host it is provided by scaling out the cluster.
Keep this in mind when creating stateful containers. Another useful Docker tool is the Volume plugin, which works on the principle of "batteries included, but replaceable". When a container starts, Docker effectively says: "Hey, you are starting a database container, you can store your data inside this container!" That is the default behavior, but you can change it. The plugin lets you use a network drive or something similar instead of in-container storage. It ships a default driver for host-based storage and allows integrating containers with external storage systems such as Amazon EBS, Azure Storage, and GCE Persistent Disk.
The following slide shows the architecture of the Docker Volume plugin.
The Docker client is shown in blue, connected to the blue Docker host, whose Local storage engine provides default storage for containers. Shown in green are the Plugin Client and Plugin Daemon, also connected to the host; they make it possible to store data on network storage of the desired Storage Backend type.
The Docker Volume plugin can be used with Portworx storage. The PX-Dev module is actually a container you launch; it connects to the Docker host and makes it easy to save data to Amazon EBS.
The Portworx client lets you monitor the status of containers with the various storage volumes attached to your host. On my blog you can read how to get the most out of Portworx with Docker.
The concept of storage in Kubernetes is similar to Docker's and is represented by volumes: directories available to the containers in your pod. Their lifetime does not depend on the lifetime of any individual container. The most common volume types are hostPath, nfs, awsElasticBlockStore, and gcePersistentDisk. Let's see how these volumes work in Kubernetes. The process of connecting them usually consists of 3 steps.
First, someone on the network side, usually the administrator, provisions persistent storage for you; this is described by a PersistentVolume configuration file. Then the application developer writes a configuration file called PersistentVolumeClaim, a PVC storage request, which says: "50 GB of distributed storage has been provisioned, but so that other people can use its capacity too, this PVC declares that I currently need only 10 GB of it." Finally, in the third step, your claim is mounted as storage, and the application, for example one with a replica set, starts using it. It is important to remember that the process consists of these 3 steps and that it scales.
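The three steps above can be sketched as a trio of configuration files. This is a minimal illustration, not the talk's actual files: the names, sizes, the hypothetical EBS volume ID, and the mount path are all assumptions.

```yaml
# Step 1: admin provisions a PersistentVolume (EBS volume ID is hypothetical)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-pv
spec:
  capacity:
    storage: 50Gi
  accessModes: ["ReadWriteOnce"]
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0
    fsType: ext4
---
# Step 2: developer claims part of that capacity
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
# Step 3: a pod mounts the claim as a volume
apiVersion: v1
kind: Pod
metadata:
  name: couchbase
spec:
  containers:
  - name: couchbase
    image: couchbase
    volumeMounts:
    - name: data
      mountPath: /opt/couchbase/var
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc
```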
The following slide shows the architecture of a Kubernetes persistent container on AWS.
Inside the brown rectangle representing the Kubernetes cluster, there is one master node and two worker nodes marked in yellow. One of the worker nodes contains an orange pod, storage, a replica controller, and a green Docker Couchbase container. Above the nodes inside the cluster, a purple rectangle denotes an externally accessible Service. This architecture is recommended when data is stored on the device itself. If necessary, I can store my data in EBS outside the cluster, as the next slide shows. This is a typical scaling model, but when applying it you need to consider the financial aspect: storing data somewhere on the network can be more expensive than on the host. When choosing containerization solutions, this is one of the strongest arguments.
Just like with Docker, you can use Kubernetes persistent containers with Portworx.
What Kubernetes 1.6 calls a "StatefulSet" is a way to run stateful applications that handles pod shutdown and graceful-shutdown events. In our case, such applications are databases. On my blog you can read how to create a StatefulSet in Kubernetes using Portworx.
Let's talk about the development side. As I said, Docker has 2 versions, CE and EE: Community Edition is the stable version, updated every 3 months, as opposed to the monthly updated EE. You can download Docker for Mac, Linux, or Windows. Once installed, Docker updates automatically, and getting started with it is very easy.
For Kubernetes, I prefer Minikube: a good way to get started with the platform by creating a cluster on a single node. For clusters of several nodes, the choice is wider: kops, kube-aws (CoreOS + AWS), and kube-up (deprecated). If you plan to run Kubernetes on AWS, I recommend joining the AWS SIG, which meets online every Friday and publishes various interesting materials on working with Kubernetes on AWS.
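A minimal Minikube workflow might look like the following sketch, assuming minikube and kubectl are already installed locally:

```shell
# Create a single-node Kubernetes cluster in a local VM
minikube start

# Verify the node is up
kubectl get nodes

# Open the Kubernetes dashboard in a browser
minikube dashboard

# Shut the cluster down when finished
minikube stop
```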
Now consider how a rolling update is performed on these platforms. Suppose a cluster of several nodes runs a specific image version, for example WildFly:1. A rolling update means the image version is replaced with a new one sequentially, node after node.
For this, the docker service update <service name> command is used, in which I specify the new image version, WildFly:2, and the update options --update-parallelism 2 and --update-delay 10s. The number 2 means the system updates 2 application images at a time, then a 10-second delay follows, after which the next 2 images are updated on 2 more nodes, and so on. This simple rolling-update mechanism is built into Docker.
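Put together, the Swarm rolling update described above might look like this sketch; the service name, replica count, and image tags are illustrative, and a Swarm cluster must already be initialized.

```shell
# Create a service running the old image version
docker service create --name wildfly-app --replicas 6 jboss/wildfly:1

# Roll out the new version two tasks at a time, pausing 10s between batches
docker service update --image jboss/wildfly:2 \
  --update-parallelism 2 --update-delay 10s wildfly-app
```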
In Kubernetes, a rolling update works like this. The replication controller rc creates a set of replicas of one version, and every pod of this webapp-rc carries a label stored in etcd. When I need a pod, the Application Service asks the etcd store, which returns the pod with the specified label.
In this case, we have 3 pods in the replication controller running version 1 of the WildFly application. During a background update, another replication controller is created with the same name plus a -xxxxx suffix, where x are random characters, and with the same labels. The Application Service now has three pods with the old application version and three pods with the new version in the new replication controller. After that, the old pods are deleted, and the replication controller with the new pods is renamed and takes over.
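This replication-controller flow corresponds to the old kubectl rolling-update command (since deprecated in favor of Deployments). The controller name and image tag below are illustrative:

```shell
# Replace the pods of webapp-rc one by one with the new image.
# Behind the scenes, a temporary controller webapp-rc-xxxxx is created
# with the same labels, scaled up as the old one is scaled down,
# and then renamed back to webapp-rc.
kubectl rolling-update webapp-rc --image=jboss/wildfly:2
```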
Let's move on to monitoring. Docker has many built-in monitoring commands. For example, docker container stats prints container status to the console every second: CPU, disk, network load. The Docker Remote API shows how the client communicates with the server; it uses simple commands but is based on the Docker REST API. Here, Remote and REST refer to the same thing: when you communicate with a host, it is the REST API. The Remote API provides more information about running containers. My blog details using this monitoring with Windows Server.
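The two mechanisms just mentioned can be sketched as follows; both require a running Docker daemon, and the container id is a placeholder:

```shell
# Live per-container CPU / memory / network / block-I/O figures
docker container stats

# The same kind of data over the Remote (REST) API via the Unix socket
curl --unix-socket /var/run/docker.sock \
  "http://localhost/containers/<container-id>/stats?stream=false"
```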
Monitoring system events with docker system events in a multi-host cluster lets you get data about a host crash or a container crash on a particular host, service scaling, and the like. Starting with Docker version 1.2?, Prometheus support is included, which integrates endpoints into existing applications. This lets you receive metrics over HTTP and display them on dashboards.
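As a sketch, the event stream and the engine's Prometheus endpoint can be queried like this; the metrics address is an assumption and must first be enabled in the daemon configuration (e.g. a metrics-addr entry in /etc/docker/daemon.json):

```shell
# Stream daemon events: container start/stop/kill, service scaling, etc.
docker system events

# Scrape engine metrics in Prometheus format (if metrics-addr is enabled)
curl http://127.0.0.1:9323/metrics
```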
Another monitoring feature is cAdvisor (short for container Advisor). It analyzes and exposes resource-usage and performance data from running containers, providing Prometheus metrics out of the box. The peculiarity of this tool is that it only keeps data for the last 60 seconds, so you need to collect that data and put it in a database yourself in order to monitor a long-running process. The metrics can also be displayed graphically on a dashboard using Grafana or Kibana. My blog has a detailed description of using cAdvisor to monitor containers with a Kibana dashboard.
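cAdvisor itself is typically launched as a container. The sketch below is abbreviated; the exact set of required mounts varies by host distribution, so treat the flags as illustrative and check the cAdvisor documentation:

```shell
# Run cAdvisor with read-only access to host and Docker state
docker run -d --name cadvisor -p 8080:8080 \
  -v /:/rootfs:ro \
  -v /var/run:/var/run:rw \
  -v /sys:/sys:ro \
  -v /var/lib/docker/:/var/lib/docker:ro \
  google/cadvisor

# Prometheus-format metrics (covering roughly the last 60 seconds)
# are then served at http://localhost:8080/metrics
```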
The next slide shows what the result of the Prometheus endpoint and the metrics available for display look like.
At the bottom left you see metrics for HTTP requests, responses, and so on; on the right is their graphical display.
Kubernetes also has built-in monitoring tools. This slide shows a typical cluster with one master and three worker nodes.
An automatically launched cAdvisor runs on each worker node. In addition, there is Heapster, a performance-monitoring and metrics-collection system compatible with Kubernetes version ??? and higher. Heapster collects not only the performance indicators of workloads, pods, and containers, but also events and other signals generated by the whole cluster. To collect data, it talks to the kubelet on each node, automatically saves the information to an InfluxDB database, and displays it as metrics on a Grafana dashboard. Note, however, that in Minikube this function is not available by default, so you will have to use add-ons for monitoring. It all depends on where you run your containers: some monitoring tools are available by default, and others must be installed as separate add-ons.
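In Minikube, such monitoring is enabled through the add-on mechanism. A sketch, noting that available add-on names vary by Minikube version:

```shell
# List available add-ons and their status
minikube addons list

# Enable the Heapster-based monitoring stack (name may differ by version)
minikube addons enable heapster
```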
The following slide shows Grafana dashboards with the working status of my containers; there is a lot of interesting data there. Of course, there are also many commercial monitoring tools for Docker and Kubernetes, for example SysDig, DataDog, and NewRelic. Some of them offer a 30-day free trial, so you can try them and choose the one that suits you best. Personally, I prefer SysDig and NewRelic, which integrate nicely with Kubernetes; there are also tools that integrate equally well with both platforms.