Three kinds of memory leaks

 
Hello colleagues.

Our long search for timeless bestsellers on code optimization is only beginning to bear fruit, but we are happy to announce that a translation of Ben Watson's legendary book "Writing High-Performance .NET Code" is on the way. Expect it in stores around April; watch for announcements.

Today we offer you a thoroughly practical article on the most pressing types of memory leaks, written by Nelson Elhage of Stripe.

So you have a program whose execution takes longer and longer the more it runs. It is probably not hard for you to recognize this as a likely sign of a memory leak.

However, what exactly do we mean by "memory leak"? In my experience, obvious memory leaks fall into three major categories, each with its own characteristic behavior, and each category requires its own tools and techniques for debugging. In this article I want to describe all three classes and suggest how to correctly recognize which class you are dealing with and how to find the leak.

Type (1): an unreachable memory fragment is allocated

This is the classic C/C++ memory leak. Someone allocated memory with new or malloc and never called free or delete to release it once they were done with it.

    void leak_memory() {
        char *leaked = malloc(4096);
        use_a_buffer(leaked);
        /* Whoops, I forgot to call free() */
    }

How to determine that the leak belongs to this category

  •  If you are writing in C or C++, especially C++ without pervasive use of smart pointers to manage the lifetime of memory segments, this is the first option to consider.
  •  If the program runs in a garbage-collected environment, it is possible that a leak of this type is caused by a native code extension; however, you should first rule out leaks of types (2) and (3).

How to find such a leak

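Leaks of this type are typically hunted with a leak detector such as valgrind or LeakSanitizer, which track every allocation and its matching release (a general recommendation of mine, not spelled out above). From a garbage-collected language, such a leak usually hides in native code. A deliberately leaky sketch using ctypes (purely illustrative):

```python
# Illustrative sketch: a type (1) leak driven from Python through native
# code. The buffers come from libc's malloc, so Python's garbage collector
# never sees them; only a native-level tool (valgrind, LeakSanitizer, ...)
# or the process's growing RSS will reveal them.
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.restype = ctypes.c_void_p

leaked = []
for _ in range(1000):
    p = libc.malloc(4096)   # allocated in native code...
    leaked.append(p)        # ...and never passed to libc.free()

print(len(leaked))
```

Running this under valgrind or an ASan-instrumented interpreter would report the 1000 unreleased 4 kB blocks, even though Python's own heap statistics look perfectly healthy.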
Type (2): unplanned long-lived memory allocations

Situations of this kind are not "leaks" in the classical sense of the word, because a reference to the memory is still held from somewhere, so it can eventually be released (if the program manages to get there before exhausting all of its memory).

Situations in this category can arise for many specific reasons. The most common are:

  •  Inadvertent accumulation of state in a global structure; for example, an HTTP server that appends every Request object it receives to a global list.
  •  Caches without a well-thought-out expiration policy. For example, an ORM cache that caches every loaded object and is active during a migration in which every record in a table gets loaded.
  •  Too much state captured in a closure. This case is especially common in JavaScript, but can occur in other environments as well.
  •  More broadly, inadvertent retention of every element of an array or stream that was supposed to be processed in a streaming fashion.

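The first of these causes fits in a few lines. A minimal sketch (the handler and the global list are hypothetical names, not from any real framework):

```python
# Sketch of a type (2) "leak": nothing is unreachable, but a global list
# quietly retains every request the process has ever served, so memory
# grows in proportion to traffic. All names here are illustrative.
requests_seen = []          # global structure that is never trimmed

def handle_request(request):
    requests_seen.append(request)   # oops: retained for the process lifetime
    return "ok"

for i in range(10_000):
    handle_request({"id": i, "body": "x" * 100})

print(len(requests_seen))   # all 10,000 requests are still alive
```

The usual fix is to drop the global entirely, or to bound it explicitly (for example, with collections.deque(maxlen=...)).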
How to determine that the leak belongs to this category

  •  If the program runs in a garbage-collected environment, this is the option to consider first.
  •  Compare the heap size reported in the garbage collector's statistics with the amount of free memory reported by the operating system. If the leak falls into this category, the numbers will be comparable and, most importantly, will track each other over time.

How to find such a leak

Use the profilers or heap-dump tools available in your environment. I know of guppy for Python and memory_profiler for Ruby, and I have also written my own tools directly on top of ObjectSpace in Ruby.

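If none of those tools is at hand, even the standard library's gc module can produce a crude heap summary, which is often enough to spot which type of object is accumulating. A sketch (heap_summary is my own helper name, not a library function):

```python
# Crude stand-in for a heap profiler, using only the stdlib: count live
# GC-tracked objects by type to see what is accumulating. (Objects the GC
# does not track, such as plain ints, will not show up in this summary.)
import gc
from collections import Counter

def heap_summary(top=10):
    counts = Counter(type(o).__name__ for o in gc.get_objects())
    return counts.most_common(top)

accumulated = [[i] for i in range(5000)]    # simulate unintended retention
summary = dict(heap_summary())
print(summary["list"] >= 5000)              # the retained lists dominate
```

Real profilers add what this sketch lacks: per-object sizes and reference paths back to the roots that keep each object alive.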
Type (3): free but unused or unusable memory

This category is the hardest to characterize, but it is the most important one to understand and take into account.

This type of leak arises in the gray area between memory that is considered "free" from the point of view of the allocator inside the VM or runtime, and memory that is "free" from the point of view of the operating system. The most common (but not the only) cause of this phenomenon is heap fragmentation. Some allocators simply never return memory to the operating system once it has been allocated.

A case of this kind can be seen in the following short Python program:

    from guppy import hpy
    hp = hpy()
    def rss():
        return 4096 * int(open('/proc/self/stat').read().split(' ')[23])
    def gcsize():
        return hp.heap().size
    rss0, gc0 = (rss(), gcsize())
    buf = [bytearray(1024) for i in range(200*1024)]
    print("start rss={} gcsize={}".format(rss()-rss0, gcsize()-gc0))
    buf = buf[::2]
    print("end rss={} gcsize={}".format(rss()-rss0, gcsize()-gc0))

We allocate 200,000 one-kilobyte buffers and then keep only every other one, printing the state of memory from the operating system's point of view and from the point of view of Python's own garbage collector.

On my laptop I get something like this:

    start rss=232222720 gcsize=11667592
    end rss=232222720 gcsize=5769520

We can see that Python really did free half of the buffers, because the gcsize figure dropped by almost half from its peak value, but it could not return a single byte of that memory to the operating system. The freed memory remains available to this same Python process, but to no other process on the machine.

Such free-but-unused chunks of memory can be either problematic or harmless. If the Python program next allocates a pile of 1 kB fragments, that space simply gets reused and all is well.

But if we did this during initial setup and afterwards allocated very little, or if every fragment allocated later were 1.5 kB and did not fit into these previously freed buffers, then all of the memory allocated this way would sit idle forever, wasted.

Problems of this kind are especially relevant in one particular environment: multi-process server systems for languages such as Ruby or Python.

Suppose we set up a system in which:

  •  Each server runs N single-threaded workers that handle requests concurrently. Let's take N = 10 for concreteness.
  •  Normally, each worker uses an almost constant amount of memory. For concreteness, let's say 500 MB.
  •  At some low rate we receive requests that need much more memory than the median request. For concreteness, suppose that once a minute we receive a request that needs an extra 1 GB of memory while it is being processed, and that this memory is released when processing completes.

Once a minute such a "whale" request arrives, and we assign it to one of the 10 workers, say, at random. Ideally, while processing this request the worker should allocate 1 GB of RAM and, when done, return that memory to the operating system so it can be reused later. To process requests indefinitely under this scheme, the server would need only 10 * 500 MB + 1 GB = 6 GB of RAM.

However, let's assume that because of fragmentation, or for some other reason, the virtual machine can never return this memory to the operating system. That is, the amount of RAM it demands from the OS equals the largest amount of memory it has ever had allocated at any one time. In that case, whenever a particular worker serves one of these resource-hungry requests, that process's memory footprint swells by a whole gigabyte, forever.

When you start the server, you will see total memory usage of 10 * 500 MB = 5 GB. As soon as the first big request arrives, the first worker grabs 1 GB of memory and never gives it back; total usage jumps to 6 GB. Subsequent incoming requests will sometimes land on a process that has already handled a "whale", in which case memory usage does not change. But sometimes such a large request goes to a different worker, making memory swell by another 1 GB, and so on, until every worker has handled at least one such large request. At that point you will be using up to 10 * (500 MB + 1 GB) = 15 GB of RAM, far more than the ideal 6 GB! Moreover, watching the server fleet over time, you will see memory usage gradually climb from 5 GB to 15 GB, which will look very much like a "real" leak.

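The arithmetic above can be checked with a tiny simulation (purely illustrative; the worker count and sizes are the ones assumed in the text):

```python
# Simulating the high-water-mark effect: each of 10 workers needs 500 MB
# at baseline; once a worker has ever served a 1 GB "whale" request, its
# footprint stays inflated because the allocator never returns the memory.
import random

random.seed(1)
BASELINE_MB, WHALE_MB, WORKERS = 500, 1000, 10
inflated = set()        # workers that have served at least one whale

def fleet_rss_mb():
    return WORKERS * BASELINE_MB + len(inflated) * WHALE_MB

samples = []
for minute in range(2000):                  # one whale request per minute
    inflated.add(random.randrange(WORKERS))  # assigned to a random worker
    samples.append(fleet_rss_mb())

print(samples[0], samples[-1])  # creeps from 6000 MB toward 15000 MB
```

The sampled series is monotonically non-decreasing, which is exactly why this pattern is so easy to mistake for a type (1) or type (2) leak on a dashboard.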
How to determine that the leak belongs to this category

 
3r33333. Compare the heap size displayed in the garbage collector statistics with the free memory size reported by the operating system. If the leak falls into this (third) category, then the numbers will diverge over time. 3r33333.  
3r33333. I like to configure my application servers so that both of these numbers periodically beat off in my time series infrastructure, so it’s convenient to display graphics on them. 3r33333.  
3r33333. In Linux, view the state of the operating system in field 24 of 3r3334. /proc /self /stat , and view the memory allocator through a language-specific or virtual machine-specific API. 3r33333.  
3r33333.  
3r33333.  
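On Linux, the OS side of that comparison is a one-liner. A sketch (the 50 MB allocation exists only to make the effect visible):

```python
# Sketch: reading the OS's view of this process's memory on Linux.
# Field 24 of /proc/self/stat (index 23 after splitting on spaces) is the
# resident set size in pages; multiply by the page size to get bytes.
import os

def rss_bytes():
    with open('/proc/self/stat') as f:
        return int(f.read().split(' ')[23]) * os.sysconf('SC_PAGE_SIZE')

before = rss_bytes()
buf = bytearray(50 * 1024 * 1024)   # touch 50 MB so RSS visibly grows
after = rss_bytes()
print(after > before)               # RSS grew once the pages were touched
```

Emit this number alongside your VM's heap statistic at a fixed interval, and a type (3) leak shows up as the gap between the two curves widening over time.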
How to find such a leak

As already mentioned, this category is a bit trickier than the previous ones, since the problem often arises even when every component is working "as intended". However, there are a number of good practices that help mitigate or reduce the impact of such "virtual leaks":

 
3r33333. Restart your processes more often. If the problem grows slowly, then perhaps restarting all the processes of the application once every 15 minutes or once an hour may not be difficult. 3r33333.  
3r33333. An even more radical approach: you can teach all processes to restart on their own as soon as the memory space they occupy exceeds a certain threshold value or grows by a specified amount. However, try to ensure that your entire server park cannot start up in a spontaneous synchronous restart. 3r33333.  
3r33333. Change the memory allocator. In the long run
tcmalloc and 3r33333. jemalloc usually cope with fragmentation much better than the default allocator, and experimenting with them is very convenient using the variable. LD_PRELOAD . 3r33333.  
3r33333. Find out if you have individual requests that consume much more memory than others. In Stripe, API servers measure RSS (constant memory consumption) before and after servicing each API request and log the delta. Then, we can easily query our log aggregation systems to determine if there are such terminals and users (and if patterns are traced) on which memory consumption bursts can be written off. 3r33333.  
Adjust the garbage collector /memory allocator. Many of them have customizable parameters that allow you to specify how actively such a mechanism will return memory to the operating system, how optimized it is to eliminate fragmentation; There are other useful options. Everything is also quite difficult here: make sure that you understand exactly what you are measuring and optimizing, and also try to find an expert on the relevant virtual machine and consult with it. 3r33333.  
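The Stripe-style per-request measurement from the fourth point fits in a small decorator. A sketch (the decorator and handler names are mine, and it assumes Linux's /proc/self/stat):

```python
# Sketch: logging the per-request RSS delta, Linux-only. Names are
# illustrative; a real server would send deltas to its logging pipeline.
import os

PAGE = os.sysconf('SC_PAGE_SIZE')

def rss_bytes():
    with open('/proc/self/stat') as f:
        return int(f.read().split(' ')[23]) * PAGE

deltas = []   # stand-in for the log-aggregation pipeline

def with_rss_logging(handler):
    def wrapped(request):
        before = rss_bytes()
        result = handler(request)
        deltas.append((request["path"], rss_bytes() - before))
        return result
    return wrapped

@with_rss_logging
def handle(request):
    request["buf"] = bytearray(10 * 1024 * 1024)  # a memory-hungry endpoint
    return "ok"

handle({"path": "/whale"})
print(deltas[0][0], deltas[0][1] > 0)
```

Aggregating these deltas by endpoint and user is what lets you find the "whale" requests described earlier.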