As I tried to make a static GLSL analyzer (and what went wrong)

3r3761. 3r3-31. 3r33737. Once I was preparing for Ludum Dare and made a simple game where I used pixel shaders (I didn't bring others into the Phaser engine). 3r33750. 3r33748. 3r3761. 3r3154.

What are shaders? [/b]

3r33737. Shaders are programs in a si-like language GLSL, which are executed on a video card. There are two types of shaders, in this article we are talking about pixel (they are also “fragment”, fragment shaders), which can be very roughly represented in this form: 3r3373750. 3r33748. 3r3761. 3r???. 3r3734. color = pixelShader (x, y, other attributes) 3r33582. 3r33748. 3r3761. 3r33737. Those. the shader is executed for each pixel of the displayed image, determining or specifying its color. 3r33748. 3r3761. Introductory can be read on another article on Habré - https://habr.com/post/333002/ 3r33750. 3r3757. 3r3757. 3r33748. 3r3761. 3r33737. Having tested it, I threw the link to a friend, and I received a screenshot from him with the question "is this normal?" 3r33750. 3r33748. 3r3761. 3r33737. 3r3335. 3r3757. 3r33748. 3r3761. 3r33737. No, it was not normal. Looking carefully at the shader code, I found an error in the calculations: 3r-3750. 3r33748. 3r3761. 3r???. 3r37777. if (t < M) {

realColor = mix (color? color? pow (1. - t /R? 0.5)); 3r3761.} 3r33537. 3r38282. 3r3748. 3r3761. 3r33737. Since Since the constant R1 was less than M, in some cases in the first argument of pow, a number less than zero was obtained. The square root of a negative number is mysterious, at least for the GLSL standard. My video card was not embarrassed, and it somehow got out of this position (it seems, by returning from pow 0), but at a friend it turned out to be more discriminating. 3r33750. 3r33748. 3r3761. 3r33737. And then I wondered: can I avoid such problems in the future? No one is immune from mistakes, especially those that are not reproduced locally. You can't write unit tests on GLSL. At the same time, the transformations inside the shader are quite simple - multiplications, divisions, sines, cosines Is it really impossible to track the values of each variable and make sure that under no circumstances does it go beyond the permissible limits of values? 3r33750. 3r33748. 3r3761. 3r33737. So I decided to try static analysis for GLSL. What came out of it - you can read under the cut. 3r33750. 3r33748. 3r3761. 3r33737. Immediately I will warn you: it was not possible to get some finished product, only a training prototype. 3r33750. 3r3365. 3r?656. 3r33748. 3r3761.

Preliminary analysis 3r3742. 3r33748. 3r3761. 3r33737. After a bit of study of existing articles on this topic (and at the same time finding out that the topic is called Value Range Analysis), I was glad that I had GLSL, and not some other language. Judge for yourself:

3r33748. 3r3761.

3r3761. 3r33737. no "dynamics" - references to functions, interfaces, automatically deducible types and other things 3r3736. 3r3761. 3r33737. no direct work with memory

3r3761. 3r33737. no modules, linking, late binding - the source code of the shader is available in its entirety

3r3761. for incoming values, as a rule, the ranges

are well known. 3r3761. 3r33737. few data types, and even those revolve around float. int /bool is rarely used, and it is not important to keep an eye on them

3r3761. 3r33737. Ifs and cycles are rarely used (due to performance issues). if cycles are used, then more often simple counters for traversing an array or repeating a certain effect several times. Nobody will write such horror in GLSL (I hope).

3r3761. 3r3738. 3r33748. 3r3761. 3r???. 3r3-300. //taken from the article - https://homepages.dcc.ufmg.br/~fernando/classes/dcc888/ementa/slides/RangeAnalysis.pdf

k = 0

while k < 100:

i = 0

j = k

while i < j:

i = i + 1

j = j - 1

k = k + 1 3r33582. 3r33748. 3r3761. 3r33737. In general, given the limitations of GLSL, the problem seems to be solved. The basic algorithm is as follows:

3r33748. 3r3761. 3r3143. 3r3761. 3r33737. parse the shader code and build a sequence of commands that change the values of any variables

3r3761. 3r33737. knowing the initial ranges for variables, go through the sequence, updating the ranges as they change to

3r3761. 3r33737. if the range violates any given bounds (for example, a negative number can come to pow, or gl_FragColor will come to the red color in the red component, something greater than 1. will appear), you should show a warning 3r3736. 3r3761. 3r3151. 3r33748. 3r3761.

Used technologies 3r3742. 3r33748. 3r3761. 3r33737. Here I had a long and painful choice. On the one hand, my main scoping is to check WebGL shaders, so why not jаvascript - to run everything in the browser during development. On the other hand, I have been planning to get off the Phaser for a long time and try another engine like Unity or LibGDX. There will also be shaders, but there will be no jаvascript anymore. 3r33750. 3r33748. 3r3761. 3r33737. And on the third hand, the task was done primarily for entertainment. And the best entertainment in the world is a zoo. Therefore: 3r350. 3r33748. 3r3761. 3r3143. 3r3761. 3r33737. parsing GLSL-code is made on jаvascript. It's just that I quickly found the GLSL parsing library in AST on it, and the test UI seems to be as common as web. AST turns into a sequence of commands that is sent to 3r3736. 3r3761. 3r33737. the second part, which is written in C ++ and compiled into WebAssembly. I decided this way: if I suddenly want to fasten this analyzer to some other engine, with the library in C ++ this should be done the easiest way.

3r3761. 3r3151. 3r33748. 3r3761. 3r3154.

A few words about the toolkit [/b]

3r3761. 3r33737. I took Visual Studio Code as the main IDE and I am generally satisfied with it. I need a little bit of happiness - the main thing is that Ctrl + Click works and autocomplete when typing. Both functions work fine in both C ++ and JS parts. Well, the ability to not switch between different IDEs among themselves is also great.

3r3761. 3r33737. is used to compile C ++ in WebAssembly. cheerp (it is paid, but free for open-source projects). I have not encountered any problems with its use, except that the code has optimized it is rather strange, but here I am not sure whose fault it is - the cheerp itself or the clang compiler it uses.

3r3761. 3r33737. for unit tests in C ++ took the good old gtest

3r3761. 3r33737. for the assembly of js in bundle took a certain microbundle. He met my requirements for “I want 1 npm-package and a couple of command line flags”, but not without problems, alas. Let's say watch falls on any error while parsing incoming jаvascript with the message 3r3734.[Object object]3r33737. that doesn't really help.

3r3761. 3r3738. 3r3757. 3r3757. 3r33748. 3r3761. 3r33737. Everything, now it is possible to go. 3r33750. 3r33748. 3r3761. 3r3185. Briefly about the model

3r33748. 3r3761. 3r33737. 3r3190. 3r3757. 3r33748. 3r3761. 3r33737. The analyzer keeps in memory a list of variables that are found in the shader, and for each stores the current possible range of values (such as 3r3734. W2w2w221. 3r3-33735. Or ` W2w2w22. 3rr3735. 3r3-33582. 3r3-33748. 3r3761. 3r33737. Here, we call the sin function, input variables with id = 3 and ? and the result is written to variables 1 and 2. This call corresponds to the GLSL value: 3r3750. 3r33748. 3r3761. 3r???. 3r3734. vec2 a = sin (b); 3r33737. 3r33582. 3r33748. 3r3761. 3r33737. Note the empty arguments (marked with "-"). In GLSL, almost all built-in functions are overloaded for different sets of input types, i.e. There are `

, 3r3734. sin (vec3) , 3r3734. sin (vec4) . For convenience, I bring all the overloaded versions to one form - in this case 3r3734. sin (vec4) . 3r33750. 3r33748. 3r3761. 3r33737. On the output, the analyzer gives a list of changes for each variable, like 3r33750. 3r33748. 3r3761. 3r???. 3r3734. cmdId: 10` sin (float) `

, 3r3734. sin (vec2)

branchId: 1

variable: 2

range:[-1,1]3r33737. 3r33582. 3r33748. 3r3761. 3r33737. What does "variable 2 on line 10 in branch 1 have a range from -1 to 1 inclusive" (what is a branch (branch) we will talk about later). Now you can beautifully highlight the ranges of values in the source code. 3r33750. 3r33748. 3r3761.

## Good start

3r33748. 3r3761. 3r33737. When the AST-tree has already begun to somehow turn into a list of commands, it is time to implement the standard functions and methods. There are quite a few of them (and they still have loads of overloads, as I wrote above), but in general they have predictable range conversions. Say, for such an example, everything is pretty obvious: 3r33750. 3r33748. 3r3761. 3r???. 3r3734. uniform float angle; //-> (-∞, ∞)//3r3761. float y = sin (angle); //->[-1,1]3r3761. float ynorm = 1 + y; //->[0,2]3r3761. gl_FragColor.r = ynorm /2 .; //->[0,1]3r33737. 3r33582. 3r33748. 3r3761. 3r33737. Branching 3r33748. 3r3761. 3r33737. Take for example such a shader. 3r33750. 3r33748. 3r3761. 3r???. 3r3734. uniform sampler2D uSampler; 3r3761. uniform vec2 uv; //[0,1]3r3761. 3r3761. void main () {

float a = texture2D (uSampler, uv) .a; //->[0,1]3r3761. float k; //->? 3r3761. if (a < 0.5) {

k = a * 2;

} else {3r37676. k = 1. - a; 3r3761.) 3r3761. 3r33737. Variable

` a `

we are taken from the texture, and therefore the value of this variable is from 0 to 1. And this is what values 3r3734 can take. k 3r3373735 ? 3r33750. 3r33748. 3r3761. 3r33737. You can go on a simple path and "merge branches" - count the range in each case and give the total. For the if branch, we get ` k =[0,2]3r33737. , and for the else branch `` k =[0,1]3r33737. . If you combine, it turns out `` [0,2]3r33737. , and it is necessary to give an error, because in the output color `` gl_FragColor `

values greater than 1. fall. 3r33748. 3r3761. 3r33737. However, this is an obvious false alarm, and for a static analyzer there is nothing worse than false positives - if it is not disabled after the first shout of "wolves", then after the tenth exactly. 3r33750. 3r33748. 3r3761. 3r33737. This means that we need to process both branches separately, and in both branches we should clarify the range of the variable ` a `

(although formally no one changed it). Here is what it might look like: 3r33748. 3r3761. 3r33737. Branch 1: 3r350. 3r33748. 3r3761. 3r???. 3r37777. if (a <0.5) {//a =[0, 0.5)

k = a * 2.;

//k =[0, 1)

gl_FragColor = vec4(1.) * k;

}[/code]

`Ветка 2:`

`if (a >= 0.5) {`

//a =[0.5, 1]3r3761. k = 1. - a; //k =[0, 0.5]3r3761. gl_FragColor = vec4 (1.) * k; 3r3761.}

3r33582. 3r33748. 3r3761. 3r33737. Thus, when the analyzer meets a certain condition that behaves differently depending on the range, it creates branches (branches) for each of the cases. In each case, he refines the range of the original variable and moves on through the list of commands. 3r33750. 3r33748. 3r3761. 3r33737. 3r33333. 3r3757. 3r33748. 3r3761. 3r33737. It should be clarified that the branches in this case are not related to the if-else construct. Branches are created when a range is split into a sub-range of a variable, and the reason may be an optional conditional statement. For example, the step function also creates branches. The next GLSL shader does the same thing as the previous one, but does not use branching (which, by the way, is better in terms of performance). 3r33750. 3r33748. 3r3761. 3r???. 3r3734. float a = texture2D (uSampler, uv) .a; 3r3761. float k = mix (a * 2., 1. - a, step (0.? a)); 3r3761. gl_FragColor = vec4 (1.) * k; 3r33737. 3r33582. 3r33748. 3r3761. 3r33737. The step function should return 0 if a < 0.5 и 1 в противном случае. Поэтому здесь тоже будут созданы ветки — аналогичные предыдущему примеру. 3r33748. 3r3761. 3r33393. Refinement of other variables 3r33748. 3r3761. 3r33737. Consider the slightly modified previous example: 3r33748. 3r3761. 3r???. 3r3734. float a = texture2D (uSampler, uv) .a; //->[0,1]3r3761. float b = a - 0.5; //->[-0.5, 0.5]3r3761. if (b < 0.) {

k = a * 2 .; //k, a ->? 3r3761.} else {3r3761. k = 1. - a; 3r3761.} 3r33535. 3r3582. 3r34848. 3r3761. 3r33737. Here the nuance is as follows: branching occurs in the variable ` b `

, and calculations occur with variable ` a `

. That is, within each branch there will be a correct value of the range ` b `

, but completely unnecessary, and the original value of the range ` a `

, completely incorrect. 3r33750. 3r33748. 3r3761. 3r33737. However, the analyzer saw that the range was ` b `

was obtained by computing from ` a `

. If this information is memorized, then with branching, the analyzer can go through all the initial variables and refine their range by performing the reverse calculation. 3r33750. 3r33748. 3r3761. 3r33737. 3r3r434. 3r3757. 3r33748. 3r3761.

` Functions and cycles 3r3742. 3r33748. 3r3761. 3r33737. In GLSL, there are no virtual methods, function pointers, or even recursive calls, so each function call is unique. Therefore, it is easiest to insert the body of the function at the call site (zainlaynit, in other words). This will fully comply with the sequence of command execution. 3r33750. 3r33748. 3r3761. 3r33737. With cycles all the more difficult, because Formally, GLSL is completely underHolds a C-like for loop. However, most often cycles are used in the simplest version, like this: 3r33748. 3r3761. 3r???. 3r3734. for (int i = 0; i < 12; i++) {}`

` Functions and cycles 3r3742. 3r33748. 3r3761. 3r33737. In GLSL, there are no virtual methods, function pointers, or even recursive calls, so each function call is unique. Therefore, it is easiest to insert the body of the function at the call site (zainlaynit, in other words). This will fully comply with the sequence of command execution. 3r33750. 3r33748. 3r3761. 3r33737. With cycles all the more difficult, because Formally, GLSL is completely underHolds a C-like for loop. However, most often cycles are used in the simplest version, like this: 3r33748. 3r3761. 3r???. 3r3734. for (int i = 0; i < 12; i++) {}`

3r3-3582.

3r3761. 3r33737. Such cycles are easy to “deploy”, i.e. insert the loop body 12 times in succession. As a result, thinking, I decided so far to support only this option. 3r33750. 3r33748. 3r3761. 3r33737. The advantage of this approach is that commands can be issued by the stream to the analyzer, without requiring it to memorize some fragments (such as function bodies or cycles) for further reuse. 3r33750. 3r33748. 3r3761. 3r33464. Emerging issues 3r33748. 3r3761. 3r33434. Problem # 1: Difficulty or impossibility to clarify 3r33535. 3r33748. 3r3761. 3r33737. Above, we considered cases when, when specifying the value for one variable, we made conclusions about the values of another variable. And this problem is solved when addition /subtraction operations are involved. But, say, what about trigonometry? For example, the following condition: 3r33750. 3r33748. 3r3761. 3r???. 3r37777. float a = getSomeValue (); 3r3761. if (sin (a)> 0.) {

//What is then considered equal to a? 3r3761.}

3r33582. 3r33748. 3r3761. 3r33737. How to calculate the range ` a `

inside if? It turns out an infinite set of ranges with step pi, with which then it will be very inconvenient to work. 3r33750. 3r33748. 3r3761. 3r33737. And maybe this situation: 3r33748. 3r3761. 3r???. 3r3734. float a = getSomeValue (); //[-10,10]3r3761. float b = getAnotherValue (); //[-20, 30]3r3761. float k = a + b; 3r3761. if (k> 0) {

//a? b? 3r3761.}

3r33582. 3r33748. 3r3761. 3r33737. Refine ranges ` a `

and 3r3734. b In general, it will be unrealistic. And, therefore, false positives are possible. 3r33750. 3r33748. 3r3761. 3r33737. 3r33515. 3r3757. 3r33748. 3r3761. 3r? 3519. Problem # 2: Dependent ranges 3r33748. 3r3761. 3r33737. Consider this example: 3r33748. 3r3761. 3r???. 3r3734. uniform float value //->[0,1]; 3r3761. void main () {float val2 = value - 1 .; 3r3761. gl_FragColor = vec4 (value - val2); 3r3761.} 3r33582. 3r33748. 3r3761. 3r33737. 3r33538. 3r3757. 3r33748. 3r3761. 3r33737. To begin with, the analyzer considers the range to be

` val2 3r33737. - and it is expected to be `` [0,1]- 1 ==[-1, 0]3r33737. 3r33750. 3r33748. 3r3761. 3r33737. However then, counting 3r3734. value - val2 `

The analyzer does not take into account that ` val2 3r33737. was derived from 3r3734. value `

, and works with bands as if they were independent of each other. Receives ` [0,1]-[-1,0]=[0,2]3r33737. , and reports an error. Although in reality he was supposed to get a constant of 1. 3r33748. 3r3761. 3r33737. A possible solution: to store for each variable not just a history of ranges, but also the whole "pedigree" - from which variables was dependency, which operations, and so on. Another thing is that "expand" this genealogy will be difficult. 3r33750. 3r33748. 3r3761. 3r33737. 3r? 3567. 3r3757. 3r33748. 3r3761. 3r33571. Problem # 3: Implicitly dependent ranges 3r33748. 3r3761. 3r33737. Here is an example: 3r33750. 3r33748. 3r3761. 3r???. 3r3734. float k = sin (a) + cos (a); 3r33737. 3r33582. 3r33748. 3r3761. 3r33737. Here the analyzer will assume that the range is `` k =[-1,1]+[-1,1]=[-2,2]3r33737. . What is wrong, because 3r3734. sin (a) + cos (a) `

for any 3r3734. a

lies in the range ` [-√2, √2]3r33737. . 3r33750. 3r33748. 3r3761. 3r33737. The result of the calculation is 3r3734. sin (a) `

it does not formally depend on the result of the calculation ` cos (a) `

. However, they depend on the same ` range. a `

. 3r33750. 3r33748. 3r3761. 3r33737. 3r3608. 3r3757. 3r33748. 3r3761. 3r3612. Results and conclusions 3r33748. 3r3761. 3r33737. As it turned out, making value range analysis even for such a simple and highly specialized language as GLSL is not an easy task. Covering the features of the language can still be strengthened: support for arrays, matrices, and all embedded operations is a purely technical task that simply requires time-consuming. But how to solve situations with dependencies between variables is still an unclear question for me. Without solving these problems, false alarms are inevitable, the noise from which may ultimately outweigh the benefits of static analysis. 3r33750. 3r33748. 3r3761. 3r33737. Given what I’ve run into, I’m not particularly surprised by the lack of any well-known tools for value range analysis in other languages - there are obviously more problems than in relatively simple GLSL. At the same time in other languages, you can at least write unit tests, and then - no way. 3r33750. 3r33748. 3r3761. 3r33737. An alternative solution could be compiling from other languages in GLSL - here recently there was 3r33625. compilation article from kotlin . Then you can write unit tests on the source code and cover all boundary conditions. Or make a “dynamic analyzer”, which will drive the same data that goes to the shader through the original code on kotlin and warn about possible problems. 3r33750. 3r33748. 3r3761. 3r33737. So at this stage I stopped. Alas, the libraries did not work out, but maybe this prototype will be useful to someone. 3r33750. 3r33748. 3r3761. 3r33737. Repository on github, for review: 3r33737. 3r33748. 3r3761.

` 3r3761. 3r33737. https://github.com/AlexeyGrishin/glsl-value-range-analysis 3r3761. 3r3738. 3r33748. 3r3761. 3r33737. Try: 3r33748. 3r3761. `

` 3r3761. 3r33737. 3r33655. https://alexeygrishin.github.io/glsl-value-range-analysis/html/ 3r3761. 3r3738. 3r33748. 3r3761. 3r3662. Bonus: features assembly webassembly with different flags compiler 3r33748. 3r3761. 3r33737. Initially, I did the analyzer without using stdlib - in the old way, with arrays and pointers. At that moment I was very worried about the size of the output wasm file, I wanted it to be small. But starting from a certain moment I began to experience discomfort and therefore decided to transfer everything to stdlib - smart pointers, normal collections, that's all. 3r33750. 3r33748. 3r3761. 3r33737. Accordingly, I was able to compare the results of building two versions of the library — with and without stdlib. Well and still to look, how good /badly cheerp (and clang used by it) optimizes the code. 3r33750. 3r33748. 3r3761. 3r33737. Therefore, I compiled both versions with different sets of optimization flags ( `` -O0 `

But, 3r?734. -O1

, ` -O2 `

, 3rr3734. -O3 , , 3r3r.335., 3r3734. -O3 , 3r???. ), And for some of these versions I made measurements of the speed of analysis of 3000 operations with 1000 branches. It agree, not the biggest example, but for the comparative analysis it is enough. 3r33750. 3r33748. 3r3761. 3r33737. What happened in the size of the wasm file:3r33748. 3r3761. 3r33737. 3r369595. 3r3757. 3r33748. 3r3761. 3r33737. Surprisingly, the size with the option of "zero" optimization is better than almost everyone else. I will assume that in

` O3 3r33537. there is an aggressive inline of everything in the world that inflates a binary. Expected version without stdlib turned out more compact, but not so much that`

endure such humiliation

deprive yourself of the pleasure of working with convenient collections. 3r33750. 3r33748. 3r3761. 3r33737. By speed of execution:

3r33748. 3r3761. 3r33737. 3r33712. 3r3757. 3r33748. 3r3761. 3r33737. Now I can see that ` -O3 3r3373735. knowingly eating his own bread, when compared with `` -O0 3r3373735. . At the same time, the difference between the versions with and without stdlib is practically absent (I did 10 measurements each time, I think the difference would have disappeared altogether). 3r33750. 3r33748. 3r3761. 3r33737. It is worth noting 2 points:`

3r33748. 3r3761.

3r3761. 3r33737. The graph shows the average values of 10 consecutive runs of the analysis, but on all tests the very first analysis lasted 2 times longer than the others (that is, say, 120ms, and the next ones are already around 60ms). Some kind of initialization of WebAssembly probably occurred.

3r3761. 3r33737. With the flag of 3r3734. -O3 3r3373735. I grabbed off some terribly strange bugs that I did not catch for other flags. For example, the min and max functions suddenly started working in the same way - like min.

3r3761. 3r3738. 3r33748. 3r3761. 3r3741. Conclusion 3r3742. 3r33748. 3r3761. 3r33737. Thank you all for your attention. 3r33748. 3r3761. Let the values of your variables never go beyond the allotted frame. 3r33748. 3r3761. But you - go out. 3r33750. 3r3757. 3r3761. 3r3761. 3r3754. ! function (e) {function t (t, n) {if (! (n in e)) {for (var r, a = e.document, i = a.scripts, o = i.length; o-- ;) if (-1! == i[o].src.indexOf (t)) {r = i[o]; break} if (! r) {r = a.createElement ("script"), r.type = "text /jаvascript", r.async =! ? r.defer =! ? r.src = t, r.charset = "UTF-8"; var d = function () {var e = a.getElementsByTagName ("script")[0]; e.parentNode.insertBefore (r, e)}; "[object Opera]" == e.opera? a.addEventListener? a.addEventListener ("DOMContentLoaded", d,! 1): e.attachEvent ("onload", d ): d ()}}} t ("//mediator.mail.ru/script/2820404/"""_mediator") () (); 3r3755. 3r3761. 3r3757. 3r3761. 3r3761. 3r3761. 3r3761.

It may be interesting

#### weber

Author**30-10-2018, 05:12**

Publication Date
#### WebGL / Abnormal programming

Category- Comments: 0
- Views: 439