How STACKLEAK improves the security of the Linux kernel

STACKLEAK is a Linux kernel security feature originally developed by the creators of grsecurity/PaX. I decided to bring STACKLEAK to the official vanilla kernel (Linux kernel mainline). This article describes the internals of this security feature, its properties, and its long and difficult path to the mainline.
Available on the project wiki .
STACKLEAK is available as PAX_MEMORY_STACKLEAK in the grsecurity/PaX patch. However, the grsecurity/PaX patch has not been freely distributed since April 2017. Therefore, having STACKLEAK in the vanilla kernel would be valuable for Linux users with heightened information security requirements.
The plan of work:
extract STACKLEAK from the grsecurity/PaX patch,
carefully study the code and prepare a patch,
send it to LKML, get feedback, improve it, and repeat until it is accepted into the mainline.
At the time of writing (September ?, 2018), version 15 of the patch series has been sent. It contains an architecture-independent part and code for x86_64 and x86_32. STACKLEAK support for arm6?, developed by Laura Abbott of Red Hat, has already made it into the vanilla kernel ???.

STACKLEAK: security features


Clearing the residual information in the kernel stack

This measure reduces the useful information that some leaks from the kernel stack to user space can reveal.
An example of information leakage from the kernel stack is shown in Scheme 1.
Scheme 1.
However, leaks of this type become useless if at the end of the system call the used part of the kernel stack is filled with a fixed value (Scheme 2).
Scheme 2.
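The leak and its mitigation can be modeled in user space. The following is a minimal sketch, not kernel code: the array sim_stack, the helper names, and the toy stack size are all hypothetical, and the poison constant reuses the -0xBEEF value that this article gives for STACKLEAK_POISON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_STACK_WORDS 64                 /* hypothetical toy stack size */
#define SIM_POISON ((uint64_t)-0xBEEF)     /* value the article gives for STACKLEAK_POISON */

/* Toy model of one thread's kernel stack; index 0 is the lowest address. */
static uint64_t sim_stack[SIM_STACK_WORDS];

/* First "system call": leaves a secret in its stack frame. */
static void sim_syscall_store_secret(size_t frame_start, uint64_t secret)
{
	assert(frame_start < SIM_STACK_WORDS);
	sim_stack[frame_start] = secret;
}

/* Second "system call": a buggy handler returns an uninitialized local. */
static uint64_t sim_syscall_leak_local(size_t frame_start)
{
	assert(frame_start < SIM_STACK_WORDS);
	return sim_stack[frame_start];   /* whatever the previous call left behind */
}

/* Model of the erase at syscall exit: poison the used part of the stack. */
static void sim_erase(size_t lowest_used)
{
	assert(lowest_used <= SIM_STACK_WORDS);
	for (size_t i = lowest_used; i < SIM_STACK_WORDS; i++)
		sim_stack[i] = SIM_POISON;
}
```

Without sim_erase() between the two calls, the second call returns the secret stored by the first. With it, the second call sees only the poison value.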
As a consequence, STACKLEAK blocks some attacks on uninitialized variables in the kernel stack. Examples of such vulnerabilities are CVE-2017-1771? and CVE-2010-2963. A description of the technique for exploiting CVE-2010-2963 can be found in the article by Kees Cook.
The essence of an attack on an uninitialized variable in the kernel stack is shown in Scheme 3.
Scheme 3.
STACKLEAK blocks attacks of this type, since the value written over the kernel stack at the end of the system call points to an unused region of the virtual address space (Scheme 4).
Scheme 4.
An important limitation is that STACKLEAK does not protect against similar attacks carried out within a single system call.
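To see why a stale pointer filled with the poison value is of little use to an attacker, it helps to look at the value as an address. The sketch below only does the arithmetic. It assumes 64-bit addresses and the -0xBEEF value that this article gives for STACKLEAK_POISON. The user-space upper bound is an assumption for x86_64 with 4-level paging, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value the article quotes for STACKLEAK_POISON. */
#define TOY_POISON (-0xBEEFL)

/* Assumption: highest user-space address on x86_64 with 4-level paging. */
#define TOY_USER_ADDR_MAX 0x00007FFFFFFFFFFFULL

/* View the poison as the 64-bit address a stale pointer would hold. */
static uint64_t toy_poison_as_address(void)
{
	return (uint64_t)TOY_POISON;
}

/* True if the address lies in the range an attacker can map from user space. */
static bool toy_is_user_address(uint64_t addr)
{
	return addr <= TOY_USER_ADDR_MAX;
}
```

Here toy_poison_as_address() yields 0xFFFFFFFFFFFF4111, which is far outside the user-space range, so an attacker cannot place controlled data at the address an uninitialized pointer would hold after the erase.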

Detecting kernel stack depth overflow

In the Linux kernel mainline, STACKLEAK is effective against kernel stack depth overflow only in combination with CONFIG_THREAD_INFO_IN_TASK and CONFIG_VMAP_STACK. Both of these measures were implemented by Andy Lutomirski.
The simplest way to exploit this type of vulnerability is shown in Scheme 5.
Scheme 5.
Overwriting certain fields of the thread_info structure at the bottom of the kernel stack makes it possible to escalate the privileges of the process. However, when the CONFIG_THREAD_INFO_IN_TASK option is enabled, this structure is moved out of the kernel stack, which eliminates the described exploitation technique.
A more advanced version of this attack overwrites data in a neighboring memory region by going beyond the stack boundary. More details about this approach:
in the presentation "The Stack is Back" by Jon Oberheide,
in the article "Exploiting Recursion in the Linux Kernel" by Jann Horn.
An attack of this type is shown in Scheme 6.
Scheme 6.
The protection in this case is CONFIG_VMAP_STACK. When this option is enabled, a special memory page (guard page) is placed next to the kernel stack, and any access to it causes an exception (Scheme 7).
Scheme 7.
Finally, the most interesting kind of stack depth overflow is the Stack Clash attack. The idea was put forward by Gael Delalleau back in 2005.
In 201? it was revisited by researchers from Qualys, who named the technique Stack Clash. The point is that there is a way to jump over the guard page and overwrite data in the neighboring memory region (Scheme 8). This is done using a variable length array (VLA) whose size is controlled by the attacker.
Scheme 8.
More information about STACKLEAK and Stack Clash can be found in the grsecurity blog.
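The jump over the guard page comes down to simple pointer arithmetic: allocating a VLA only moves the stack pointer, and nothing is touched until the first write. The following is a rough sketch of that arithmetic, assuming a single 4096-byte guard page directly below the stack. The function and enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed guard page size */

enum toy_landing { LAND_IN_STACK, LAND_IN_GUARD_PAGE, LAND_BEYOND_GUARD };

/*
 * The stack starts at stack_begin and grows down toward it; the guard page
 * occupies [stack_begin - TOY_PAGE_SIZE, stack_begin). Report where the
 * stack pointer ends up after a VLA of `size` bytes is allocated.
 */
static enum toy_landing toy_classify_vla(uintptr_t sp, uintptr_t stack_begin,
					 uintptr_t size)
{
	assert(sp >= stack_begin);
	assert(stack_begin >= TOY_PAGE_SIZE);
	assert(size <= sp);              /* keep the subtraction from wrapping */

	uintptr_t new_sp = sp - size;

	if (new_sp >= stack_begin)
		return LAND_IN_STACK;
	if (new_sp >= stack_begin - TOY_PAGE_SIZE)
		return LAND_IN_GUARD_PAGE;   /* the next access faults */
	return LAND_BEYOND_GUARD;        /* guard page skipped entirely */
}
```

With an attacker-controlled size larger than the remaining stack plus one page, the result is LAND_BEYOND_GUARD: subsequent writes go to the neighboring memory region and the guard page is never accessed.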
How does STACKLEAK protect against Stack Clash in the kernel stack? Before each call to alloca(), a stack depth overflow check is performed. Here is the corresponding code from the 14th version of the patch series:
void __used stackleak_check_alloca(unsigned long size)
{
	unsigned long sp = (unsigned long)&sp;
	struct stack_info stack_info = {0};
	unsigned long visit_mask = 0;
	unsigned long stack_left;

	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));

	stack_left = sp - (unsigned long)stack_info.begin;
	if (size >= stack_left) {
		/*
		 * Kernel stack depth overflow is detected, let's report that.
		 * If CONFIG_VMAP_STACK is enabled, we can safely use BUG().
		 * If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt
		 * the neighbor memory. CONFIG_SCHED_STACK_END_CHECK calls
		 * panic() in a similar situation, so let's do the same if that
		 * option is on. Otherwise just use BUG() and hope for the best.
		 */
#if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
		panic("alloca() over the kernel stack boundary");
#else
		BUG();
#endif
	}
}
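The decision logic described in the comment of that function can be restated as a small user-space model. This is only a sketch under assumptions: the two configuration options are passed as run-time booleans instead of being resolved at compile time, the stack bounds are plain numbers, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_action { TOY_CONTINUE, TOY_PANIC, TOY_BUG };

/*
 * Decide what the check would do for an alloca() of `size` bytes when the
 * stack pointer is `sp` and the stack's lowest address is `stack_begin`.
 */
static enum toy_action toy_check_alloca(unsigned long sp,
					unsigned long stack_begin,
					unsigned long size,
					bool vmap_stack,
					bool sched_stack_end_check)
{
	assert(sp >= stack_begin);

	unsigned long stack_left = sp - stack_begin;

	if (size < stack_left)
		return TOY_CONTINUE;     /* the allocation fits */

	/* Without a guard page, BUG() handling itself may corrupt memory. */
	if (!vmap_stack && sched_stack_end_check)
		return TOY_PANIC;

	return TOY_BUG;
}
```

Note that a request exactly equal to the remaining space is also rejected, which matches the `size >= stack_left` comparison in the patch.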

However, this functionality was excluded from the 15th version. This was done primarily because of Linus Torvalds's controversial ban on using BUG_ON() in Linux kernel security patches.
In addition, the 9th version of the patch series led to a discussion that resulted in the decision to eliminate all variable length arrays from the mainline kernel. About 15 developers joined this work, and it will soon be completed.

Effect of STACKLEAK on performance

Here are the results of performance testing on x86_64. Hardware: Intel Core i7-477?, 16 GB RAM.
Test number ?, attractive: building the Linux kernel on a single CPU core
# time make
Result on ???:
real 12m???s
user 11m???s
sys 1m???s
Result on ??? + stackleak:
real 12m???s (+ ???%)
user 11m???s
sys 1m???s

Test number ?, unattractive:
# hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on ???: ??? seconds
Average result on ??? + stackleak: ??? seconds (+4.3%)

Thus, the effect of STACKLEAK on system performance depends on the type of load. In particular, a large number of short system calls increases the overhead. It is therefore necessary to evaluate STACKLEAK's performance impact on the planned workload before production use.

STACKLEAK internals

STACKLEAK consists of:
the code that clears the kernel stack at the end of a system call (originally written in assembly),
a GCC plugin that instruments kernel code at compile time.
The kernel stack is cleared in the stackleak_erase() function. This function runs before returning to user space after a system call. STACKLEAK_POISON (-0xBEEF) is written to the used part of the thread stack. The starting point of the erase is given by the lowest_stack variable, which is constantly updated in stackleak_track_stack().
The stages of stackleak_erase() are shown in Schemes 9 and 10.
Scheme 9.
Scheme 10.
Thus, stackleak_erase() clears only the used part of the kernel stack. That is why STACKLEAK is so fast. By contrast, if all 16 KB of the kernel stack on x86_64 are cleared at the end of every system call, hackbench shows a performance drop of 40%.
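The difference in cost between erasing only the used part and erasing the whole stack can be illustrated with a toy model. This sketch is a simplification and not the kernel's stackleak_erase(): it simply writes poison from a tracked lowest index up to the top of the stack and counts the words written. The array layout and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_WORDS 2048               /* 16 KB of 8-byte words, as on x86_64 */
#define TOY_POISON ((uint64_t)-0xBEEF)     /* value the article gives for STACKLEAK_POISON */

/*
 * Fill the stack with poison from index `lowest_used` (the deepest word
 * touched during the system call) up to the stack top. Index 0 is the
 * lowest address. Returns the number of words written.
 */
static size_t toy_erase(uint64_t *stack, size_t lowest_used)
{
	size_t written = 0;

	assert(stack != NULL);
	assert(lowest_used <= TOY_STACK_WORDS);

	for (size_t i = lowest_used; i < TOY_STACK_WORDS; i++) {
		stack[i] = TOY_POISON;
		written++;
	}
	return written;
}
```

For a short system call that touched only the top 100 words, toy_erase() writes 100 words, whereas a full erase (lowest_used set to 0) writes all 2048 every time.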
Kernel code is instrumented at compile time by the STACKLEAK GCC plugin.
GCC plugins are project-specific loadable modules for the GCC compiler. They register new passes with the GCC Pass Manager and provide callbacks for these passes.
For STACKLEAK to work fully, calls to stackleak_track_stack() are inserted into the code of functions with a large stack frame. In addition, a call to the already mentioned stackleak_check_alloca() is inserted before each alloca(), and a call to stackleak_track_stack() is inserted after it.
As already mentioned, in the 15th version of the patch series the insertion of stackleak_check_alloca() calls was removed from the GCC plugin.
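The effect of the inserted tracking calls can be pictured with a user-space model. This is a sketch under assumptions, not the plugin's output: the struct toy_task, the explicit sp parameter, and the 512-byte frame are hypothetical stand-ins, and the real function reads the stack pointer itself instead of taking it as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-task state; only the tracking field is modeled. */
struct toy_task {
	uintptr_t stack_base;     /* lowest valid address of the stack */
	uintptr_t lowest_stack;   /* deepest stack pointer seen so far */
};

/* Model of the tracking call: remember sp if it is the deepest one so far. */
static void toy_track_stack(struct toy_task *t, uintptr_t sp)
{
	assert(t != NULL);
	if (sp < t->lowest_stack && sp >= t->stack_base)
		t->lowest_stack = sp;
}

/* Roughly what a function with a large frame does once it is instrumented. */
static void toy_big_frame_function(struct toy_task *t, uintptr_t sp_on_entry)
{
	uintptr_t sp = sp_on_entry - 512;   /* pretend the frame takes 512 bytes */

	toy_track_stack(t, sp);             /* the call the plugin would insert */
	/* ... original function body would follow here ... */
}
```

Because lowest_stack only ever moves down during a system call, the erase at syscall exit knows how deep the stack actually went and can skip everything below that point.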

The path to the Linux kernel mainline

The path of STACKLEAK to the mainline has been very long and difficult (Scheme 11).
Scheme 11. Progress of STACKLEAK on its way into the Linux kernel mainline.
In April 201? the creators of grsecurity closed their patches to the community and began distributing them only on a commercial basis. In May 201? I decided to take on the task of bringing STACKLEAK into the vanilla kernel. Thus began a journey that has lasted more than a year. Positive Technologies, the company I work for, lets me spend part of my working time on this task, but mostly I spend my "free" time on it.
Since last May, my patch series has been reviewed many times, has undergone significant changes, and has twice been criticized by Linus Torvalds. Many times I wanted to give the whole thing up. But at some point a strong desire arose to see it through to the end. At the time of this writing (September 2?, 2018), the 15th version of the patch series is in the linux-next branch, meets all of Linus's stated requirements, and is ready for the merge window of the ???/5.0 kernel.
A month ago I gave a talk about this work at the Linux Security Summit. Here are links to the slides and video:



STACKLEAK is a very useful Linux kernel security feature that blocks the exploitation of several types of vulnerabilities at once. In addition, its original author, the PaX Team, managed to make it fast and elegant from an engineering standpoint. Therefore, having STACKLEAK in the vanilla kernel would be valuable for Linux users with heightened information security requirements. Moreover, work in this direction draws the Linux development community's attention to kernel self-protection.