Managing microservices with Kubernetes and Istio
A short story about the advantages and disadvantages of microservices, the Service Mesh concept, and the Google tools that let you run microservice applications without cluttering your head with endless policies, access rules and certificates, and quickly find errors that hide not in the code but in the microservice logic.
The article is based on Craig Box's talk at last year's DevOops 2017 conference. The video and transcript of the talk are below the cut.
Craig Box is a DevRel at Google, responsible for microservices and the Kubernetes and Istio tools. His talk is about managing microservices on these platforms.
Let's start with a relatively new concept called Service Mesh. The term describes a network of interoperating microservices that make up an application.
At a high level we treat the network as a pipe that simply moves bits. We don't want to think about bits or MAC addresses in our applications; we want to think about services and the connections they need. In OSI terms, what we have is a layer-3 network (routing and logical addressing), but we want to reason at layer 7 (access to network services).
What should a real layer-7 network look like? We probably want things like routing traffic around problematic services, so that connecting to a service no longer means thinking at layer 3. We want visibility into what is happening in the cluster: finding unintended dependencies, finding the root causes of failures. We also need to avoid unnecessary overhead, such as high-latency connections or requests sent to servers with a cold, not yet warmed-up cache.
We must be sure that traffic between services is protected from trivial attacks. We need mutual TLS authentication, but without embedding the corresponding modules into every application we write. It is important to be able to manage what surrounds our applications, not only at the connection level but also at a higher level.
Service Mesh is a layer that allows us to solve the above problems in a micro-service environment.
Monolith and microservices: pros and cons
But first, let's ask why we should solve these problems at all. How did we develop software before? We had an application that looked something like this: a monolith.
And that's great: all the code is right at our fingertips. Why not keep using this approach?
Because the monolith has problems of its own. The main one is that to rebuild the application we must redeploy every module, even if nothing in it has changed. We are forced to write the application in one language, or in compatible languages, even when different teams work on it. And its individual parts cannot really be tested independently of each other. Time to change that: time for microservices.
So, we split the monolith into parts. Note that in this example we removed some unnecessary dependencies and stopped calling internal methods from other modules. We turned the modules we used before into services, introducing abstractions where state needs to be kept. Each service should own its state independently, so that calling it does not require worrying about what is happening in the rest of the environment.
What happened in the end?
We left the world of giant applications and got something that really looks great. We sped up development, stopped using internal methods, created services, and can now scale them independently: we can grow one service without touching everything else. But what was the price of these changes? What did we lose?
Inside the application, calls were reliable: you simply invoked a function or a module. We replaced that reliable in-module call with an unreliable remote procedure call, and the service on the other side is not always available.
We were safe: everything ran in one process on one machine. Now we connect to services that may live on different machines, across an unreliable network.
On the new network there may also be other users trying to connect to our services. Latency went up, and at the same time our ability to measure it went down. What used to be a single module call is now a cascade of calls across services, and we can no longer just attach a debugger to the application and find out what exactly caused a failure. This problem has to be solved somehow; obviously, we need a new set of tools.
What can be done?
There are several options. We can teach our application that if an RPC fails the first time, it should try again, and again: just wait and retry, adding jitter between attempts. We can add entry and exit traces to record when a call started and ended, which to me amounts to debugging. We can build infrastructure for authenticating connections and teach every application to speak TLS. Then we take on the burden of maintaining this in every team, and of constantly keeping in mind the various problems that can arise in SSL libraries.
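The retry-with-jitter idea can be sketched in a few lines. This is a minimal illustration, not any library's API: `call_with_retries` and `flaky` are made-up names, and the delays are shortened so the example runs instantly.

```python
import random
import time

def call_with_retries(rpc, attempts=3, base_delay=0.01):
    """Retry an unreliable remote call with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return rpc()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Jitter: sleep a random fraction of the backoff window so
            # synchronized clients do not all retry in lockstep.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))

# A stand-in service that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("service unavailable")
    return "ok"

result = call_with_retries(flaky)  # succeeds on the third attempt
```

The point of the jitter is exactly the problem mentioned above: without it, many clients that failed at the same moment all retry at the same moment, too.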
Keeping all of this consistent across different platforms is a thankless task. What we would like is to make the space between the applications smart: to get tracing, and the ability to change configuration at runtime without recompiling or restarting the applications. This is exactly the niche that Service Mesh fills.
Let's talk about Istio.
Istio is a complete framework for connecting, managing and monitoring a microservice architecture. Istio is designed to run on top of Kubernetes. It does not deploy your software itself and does not concern itself with making it available on machines; that is what we use Kubernetes and containers for.
In the figure we see three different sets of machines and the blocks that make up our microservices. We have a way to group them using the mechanisms Kubernetes provides: we can declare that a particular group, which may autoscale or use other deployment strategies, will contain our web service. We don't need to think about machines; we operate in terms of network services.
This situation can be represented as a diagram. Consider an example where we have a service that does some image processing. On the left is a user, whose traffic arrives at our microservice.
To accept payments from the user, we have a separate payments microservice that calls an external API outside the cluster.
To handle user logins, there is an authentication microservice, and its state is again stored outside our cluster, in a Cloud SQL database.
What does Istio do? Istio extends Kubernetes. You install it using an alpha feature of Kubernetes called an Initializer. When you deploy software, Kubernetes notices it and asks whether we want to modify it, adding another container inside each pod. That container will handle routing and be aware of all changes to the application.
This is what the diagram looks like with Istio.
Next to every service we now have a proxy handling its incoming and outgoing traffic. We can offload onto it the functions we have already talked about: the application no longer needs to be taught how to emit telemetry, trace requests, or use TLS. And we can add other things on top: circuit breaking, rate limiting, canary releases.
All traffic now passes through these proxies rather than going directly to the services. Kubernetes keeps everything on the same IP address, so we can intercept traffic that would otherwise go straight to the frontend or backend services.
The proxy that Istio uses is called Envoy.
Envoy is actually older than Istio; it was developed at Lyft, where it has been running in production for over a year, powering their entire microservice infrastructure. We chose Envoy for the Istio project in collaboration with the community, so Google, IBM and Lyft are the three companies now working on it.
Envoy is written in C++11 and had been in production for over 18 months before becoming an open source project. It consumes few resources when you attach it to your services.
Here are a few of the things Envoy can do. It proxies HTTP, including HTTP/2 and protocols built on top of it such as gRPC, and can also forward other protocols at the binary level. It is aware of your infrastructure zones, so it can keep traffic local to them. It handles a large number of network connections, with retries and timeouts: you can set how many connection attempts to make before giving up on a request and reporting to your servers that the service is not responding.
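The "stop trying after a certain number of attempts" behaviour is usually called circuit breaking. Here is a toy sketch of the idea based on counting consecutive failures; Envoy's real implementation is richer, and `CircuitBreaker` is an illustrative class, not Envoy's API.

```python
class CircuitBreaker:
    """After `max_failures` consecutive errors the circuit opens and
    further calls fail fast instead of waiting on a dead backend."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    def call(self, rpc):
        if self.failures >= self.max_failures:
            raise RuntimeError("circuit open: service is not responding")
        try:
            result = rpc()
        except ConnectionError:
            self.failures += 1
            raise
        self.failures = 0  # any success closes the circuit again
        return result

cb = CircuitBreaker(max_failures=2)

def down():
    raise ConnectionError("backend unreachable")

tripped = False
for _ in range(2):
    try:
        cb.call(down)
    except ConnectionError:
        pass  # individual failures pass through while the circuit is closed
try:
    cb.call(down)
except RuntimeError:
    tripped = True  # third call fails fast without touching the backend
```

The value of doing this in the proxy, as Envoy does, is that no application has to implement the state machine itself.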
You don't have to worry about restarting the application to change its configuration. You simply connect to Envoy through an API very similar to the Kubernetes one and change the configuration at runtime.
The Istio team has contributed a lot upstream to Envoy. For example, fault injection: we made it possible to see how an application behaves when requests to a dependency fail or are delayed. We also implemented traffic mirroring and traffic splitting to handle canary deployments.
The figure shows the architecture of the Istio system, with just the two microservices we mentioned earlier. Ultimately the diagram looks a lot like a software-defined network. The Envoy proxies deployed alongside the applications have traffic redirected to them with iptables rules in the pod's namespace. The control plane manages the configuration, but does not handle traffic itself.
There are three components. Pilot creates the configuration: it watches the rules, which can be changed through the Istio control plane API, and then updates Envoy, acting as a cluster discovery service. Istio-Auth serves as a certificate authority and distributes TLS certificates to the proxies. Applications don't need SSL at all: they can talk plain HTTP, and the proxies handle all of that for them.
Mixer processes requests, making sure your security policies are observed, and then passes on telemetry. Without any changes to the application, we can see everything that happens inside our cluster.
Advantages of Istio
So, let's look in more detail at five things we get from Istio. First, traffic control. Previously, traffic control was tied to infrastructure scaling: to send 5% of traffic to a new version, we might run 20 instances of the application, 19 on the old version and one on the new. With Istio, we can deploy however many instances we need and separately specify how much traffic goes to the new versions, with a simple splitting rule.
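Conceptually, such a splitting rule is just a weighted choice made per request. Here is a hedged sketch of what the proxy effectively does; `route` is an illustrative helper, not Istio's actual mechanism.

```python
import random

def route(version_weights, rng=random):
    """Pick a backend version according to percentage weights,
    mimicking a weighted routing rule."""
    versions = list(version_weights)
    weights = [version_weights[v] for v in versions]
    return rng.choices(versions, weights=weights)[0]

rng = random.Random(42)  # seeded so the example is reproducible
sample = [route({"v1": 95, "v2": 5}, rng) for _ in range(10_000)]
share_v2 = sample.count("v2") / len(sample)  # close to 0.05
```

Because the split happens per request in the proxy, the 95/5 ratio holds regardless of how many instances of each version are actually running.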
Everything can be programmed "on the fly" with rules. Envoy is updated periodically as the configuration changes, and this does not cause service failures. Since we operate at the level of access to network services, we can look inside the traffic: for example, at the User-Agent header, which lives at layer 7 of the network.
For example, we can say that any traffic from an iPhone follows a different rule, and send a certain share of it to a new version we want to test on that particular device. The calling microservice keeps addressing the same service, and the rule decides which specific version, say 2.0, it actually reaches.
The second advantage is observability. When you can see inside the cluster, you can understand how it actually behaves. We don't need to build metrics into every component during development: the metrics already exist in every component.
Some believe that collecting logs is enough for monitoring. In fact, what we need is a universal set of indicators that can be fed to any monitoring service.
This is what the Istio dashboard looks like, built with the help of Prometheus. Don't forget to deploy it inside the cluster.
The screenshot shows a number of monitored parameters for the whole cluster. You can also surface more interesting things, for example the percentage of requests that return 500-series errors, i.e. failures. Response time is aggregated across all calling and responding service instances in the cluster, and none of this requires any setup: Istio knows what Prometheus supports and which services run in your cluster, so Istio-Mixer can send metrics to Prometheus without additional configuration.
Let's see how this works. When you call a particular service, its proxy sends information about the call to Mixer, which captures parameters such as response latency, status code and IP. It normalizes them and sends them to whatever backends you have configured. There are ready-made adapters for Prometheus and InfluxDB, but you can also write your own adapter and output the data in any format for any other application. And you don't have to change anything in the infrastructure to add a new metric.
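The normalization step can be pictured as turning a bag of call attributes into the target backend's format. Below is a sketch of what a Prometheus-style adapter might emit; the metric name and labels are invented for illustration, and a real adapter would also handle metric types and batching.

```python
def to_prometheus_line(metric, labels, value):
    """Render one normalized call record as a Prometheus text-format sample."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{metric}{{{label_str}}} {value}"

line = to_prometheus_line(
    "request_duration_ms",
    {"source": "frontend", "destination": "payments", "code": "200"},
    42,
)
# → request_duration_ms{code="200",destination="payments",source="frontend"} 42
```

Because the proxy reports the same normalized attributes for every call, adding a new backend means adding one such adapter, not touching any service.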
If you want to dig deeper, use the distributed tracing system Zipkin. Information about all calls routed through Istio-Mixer can be sent to Zipkin. There you will see the whole chain of microservice calls involved in responding to the user, and easily find the service that drags out the processing time.
At the application level, you don't need to worry about generating traces. Envoy itself sends all the necessary information to Mixer, which forwards it to a tracing backend: Zipkin, Google's Stackdriver Trace, or any other custom application.
Let's talk about fault tolerance and efficiency.
Timeouts between service calls are needed, first of all, to test availability through the load balancers. We inject errors into these relationships and watch what happens. Consider an example: there is a link between service A and service B. Say we wait 100 milliseconds for a response from the video service and allow only 3 attempts if no result arrives. In the worst case we will spend 300 milliseconds before reporting a failed connection.
Next, suppose our video service must look up a film's rating through another microservice. The rating call has a 200-millisecond timeout and two attempts, so a call to the ratings service can take up to 400 milliseconds when the star rating is unreachable. But remember: after 300 ms the video service's own caller has already declared it failed, and we will never learn the real cause of the failure. Using timeouts and testing what happens when they fire is a great way to find subtle bugs in your microservice architecture.
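The arithmetic behind this example is worth spelling out. A small sketch, with the numbers taken from the text above; the real worst case would also include any retry backoff delays.

```python
def worst_case_ms(timeout_ms, attempts):
    """Worst case before a caller reports failure: every attempt times out."""
    return timeout_ms * attempts

video_budget = worst_case_ms(100, 3)    # caller waits for the video service
ratings_budget = worst_case_ms(200, 2)  # video service waits for ratings
# Ratings can take up to 400 ms, but the video service's caller gives up
# after 300 ms, so a slow ratings backend is misreported as a video failure.
budgets_conflict = ratings_budget > video_budget
```

The bug is not in either service's code: each timeout looks reasonable on its own, and only the combination of budgets is broken, which is exactly why fault injection finds it.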
Now about efficiency. The Kubernetes load balancer itself acts only at layer 4. For layer-7 load balancing, Kubernetes introduced the Ingress resource. Istio, in turn, implements load balancing at the application layer, with full access to network services.
We perform TLS offloading: Envoy contains a modern, well-tested TLS implementation, so you don't have to worry about its vulnerabilities yourself.
Another advantage of Istio is security.
What are the basic security tools in Istio? The Istio-Auth service works in several directions. It builds on SPIFFE, an open framework and set of identity standards for services. As for traffic, the Istio certificate authority issues certificates for the service accounts we run inside the cluster. These certificates comply with the SPIFFE standard and are delivered to Envoy via the Kubernetes secrets mechanism. Envoy uses the keys for mutual TLS authentication, so the backend applications receive identities on which policy can then be based.
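A SPIFFE identity is just a structured URI that ends up in the workload's certificate. Here is a minimal sketch of how such an identity is composed; `cluster.local` is the default trust domain in a Kubernetes setup, and the `payments` service account is a made-up example.

```python
def spiffe_id(trust_domain, namespace, service_account):
    """Compose a SPIFFE identity URI for a Kubernetes service account."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"

ident = spiffe_id("cluster.local", "default", "payments")
# → spiffe://cluster.local/ns/default/sa/payments
```

Because the identity names a service account rather than an IP address or hostname, policy survives pods moving, scaling, or being rescheduled.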
Istio maintains the root certificates itself, so you don't have to worry about revocation or expiry dates. The system reacts to autoscaling: when a new instance appears, it receives a new certificate, with no manual setup. You don't need to configure a firewall either: network policy in Kubernetes implements the firewalls between containers.
Finally, policy enforcement. Mixer is the integration point with infrastructure backends, through which you can extend the Service Mesh. Services can move freely within the cluster and be deployed in several environments, in the cloud or on-premises. Everything is designed for online checking of the calls that pass through Envoy: we can allow or deny specific calls, set preconditions for a call to go through, and limit their rate and count. For example, you allow 20 free requests per day to one of your services; once a user has made 20 requests, subsequent ones are rejected.
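The 20-requests-per-day example boils down to a counter checked before each call is admitted. Here is an in-memory toy sketch of such a quota; Mixer's real quota handling is external and distributed, and `DailyQuota` is an illustrative name, not its API.

```python
class DailyQuota:
    """Per-user request quota: the first `limit` calls in a period
    succeed, the rest are rejected until the period resets."""

    def __init__(self, limit=20):
        self.limit = limit
        self.used = {}

    def allow(self, user):
        count = self.used.get(user, 0)
        if count >= self.limit:
            return False  # quota exhausted: reject the call
        self.used[user] = count + 1
        return True

quota = DailyQuota(limit=20)
results = [quota.allow("alice") for _ in range(21)]
# first 20 calls pass, the 21st is rejected
```

Enforcing this in the mesh rather than in the service means every service gets quota handling without writing a line of it.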
Preconditions can include things like authentication, an ACL check, or presence on a whitelist. Quota management is useful when all users of a service should get the same request rate. Finally, Mixer collects the results of processing requests and responses for telemetry, which lets both producers and consumers of a service look at that telemetry through their monitoring services.
Remember the photo application from the first slide, where we started studying Istio? All of the above hides behind that simple picture. At the top level everything happens automatically: you deploy the application without worrying about defining a security policy or configuring routing rules, and it works exactly as you expect.
How to start working with Istio
Istio supports earlier versions of Kubernetes, but the new Initializer feature I talked about appears in versions 1.7 and higher, as an alpha feature. I recommend using Google Container Engine alpha clusters: you can spin up a cluster for a certain number of days and still use all the production capabilities in it.
First of all, Istio is an open source project on GitHub. The version we have just released lets you manage objects within a single Kubernetes namespace; since version 0.2 we support running in our own namespace and managing the Kubernetes cluster. We also added the ability to manage services running on virtual machines: you can deploy Envoy in a VM and secure the services running there. In the future Istio will support other platforms, such as Cloud Foundry.
A short installation guide for the framework is here. If you have a Google Container Engine cluster running 1.8 with alpha features enabled, installing Istio takes just one command.
If you liked this talk, come to the DevOops 2018 conference in St. Petersburg on October 14: there you can not only listen to the talks but also chat with any speaker in the discussion area.