The story of what not to do during development

Prologue: To begin with, I will describe the project, so that you have an idea of how we worked on it and can relive the pain we felt.

As a developer, I joined the project in 2015-201? (I don't remember exactly), although it had already been running for 2-3 years by then. The project was very popular in its field, namely game servers. Strange as it may sound, game server projects are still being developed to this day: I recently saw job openings and even worked briefly in one such team. Since game servers are built on top of an existing game, development uses the scripting language built into the game engine.

We are developing the project for Garry's Mod (Gmod) almost from scratch. It is worth noting that, at the time of writing, Garry is already creating a new project, S&Box, on Unreal Engine. We are still sitting on Source, which is generally not well suited to our server's theme.

“What is so scary about your story?” you ask.

Our game server has a demanding theme, namely “Stalker”, and with role-play (RP) elements on top of that. The question immediately arises: “How do you implement all of that on one server?”

Given that the Source engine is old (Gmod uses the 2013 version, and a 32-bit one at that), you cannot make large maps, and there are tight limits on the number of entities, meshes, and many other things. Anyone who has worked with the engine will understand.

So the task looks practically impossible: to make a pure multiplayer Stalker with quests, RPG elements from the original, and preferably a small storyline.

To begin with, the initial code was hard to write (many actions, such as dropping an item or picking one up, were written from scratch). We hoped it would get easier from there, but the requirements kept growing. The game mechanics were ready; what remained was the AI, upgrades, and all sorts of other things. In general, everyone coped as best they could.

The problems started as soon as release version one went live: lag and server delays.

It seemed that a powerful server should easily handle the requests and carry the entire gamemode.

A simple description of a gamemode: this is the name for the set of scripts that describe the mechanics of the server itself. For example, suppose we want the now-popular "battle royale" theme; then the gamemode has to match those game mechanics: “players spawn from a plane, items can be picked up, players can communicate, you cannot wear more than one helmet, etc.” All of this is described by the game mechanics on the server.

The lag was on the server side, due to the large number of players, since one player eats up a lot of RAM, about 80-120 MB (not counting inventory items, skills, etc.), and on the client side, where there was a sharp drop in FPS.

There was not enough CPU power to process physics, so we had to use fewer objects with physical properties.

On top of that, there were our home-grown scripts, which were not optimized in any way.

First of all, of course, we read the article on optimization in Lua. It even got to the point where we wanted to write DLLs in C++, but the problem was getting clients to download the DLLs from the server. Since a C++ DLL can quietly intercept data, the Gmod developers added that extension to the list of files clients are not allowed to download (for security, although in reality there never was any). It would have been convenient and would have made Gmod more flexible, but also more dangerous.

Next we looked at the profiler (luckily, smart people had written one) and were horrified by what we saw in the functions: it turned out that some functions in the Gmod library itself were very slow.

If you have ever written code for Gmod, you know perfectly well that it has a built-in library called math.

And the slowest functions in it are, of course, math.Clamp and math.Round.

Digging through other people's code, we noticed that these functions are scattered all over the place; they are used almost everywhere, but incorrectly!

Let's get to practice. For example, we want to round the coordinates of a position vector in order to move an entity (for example, the player).

    local x = ???
    local y = ???
    local z = ???
    LocalPlayer():SetPos(Vector(math.Round(x), math.Round(y), math.Round(z)))

That is three expensive rounding calls, but nothing serious unless it sits in a loop or is called often. Clamp, however, is even heavier.

The following code is often used in projects, and nobody wants to change anything.

    self:setLocalVar("hunger", math.Clamp(current + ?, ?, 100))

Here self refers to the player object, which has a local variable we invented that is reset when the player rejoins the server. math.Clamp keeps the assigned value within the given bounds, which is why it is also popular for things like smooth interface animations.

Problems arise when this runs for every player who joins the server. It rarely happens, but if 5-15 players (depending on the server configuration) join at the same moment and this small, simple function runs for each of them, the server will get noticeable CPU delays. It is even worse if math.Clamp sits in a loop.

The optimization is actually very simple: localize the heavily used functions. It seems primitive, but I have seen this slow code in 3 gamemodes and many add-ons.

If you need to get a value and use it later, do not fetch it again if it does not change. After all, a player joining the server will in any case get a hunger value of 10?, so this code is several times faster.

    local value = math.Clamp(current + ?, ?, 100)
    self:setLocalVar("hunger", value)

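As a minimal sketch of what "localizing" a function means in Lua: in the standard interpreter, writing math.min looks up the global math and then indexes it on every call, while a local alias is a plain register access. The clamp helper below is a hypothetical stand-in for Gmod's math.Clamp so that the snippet runs in a stock Lua interpreter, and the hunger values are made up for illustration.

```lua
-- Cache library functions in locals once, at file scope.
local min, max = math.min, math.max

-- Hypothetical stand-in for Gmod's math.Clamp (assumption: same semantics,
-- i.e. the value is limited to the range [low, high]).
local function clamp(value, low, high)
  return min(max(value, low), high)
end

-- Illustrative values only; they are not taken from the original project.
local hunger = { 95, 40, -5 }
for i = 1, #hunger do
  -- Compute the clamped value once and reuse the local.
  local value = clamp(hunger[i] + 10, 0, 100)
  hunger[i] = value
end
```

Whether this gives a measurable gain depends on how hot the call site is; it matters most inside loops and per-tick hooks.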

All was well, so we went on to look at what else works and how. As a result, we started optimizing everything.

We noticed that the standard for loop was slow, so we decided to reinvent the wheel with one that would be faster (blackjack included), and this is where the fun began.

Spoiler: we even managed to make the fastest loop in Gmod Lua, provided there are more than 100 elements.

Judging by the time spent on our loop versus how much it is actually used in the code, the effort was in vain: it found a use only in spawning anomalies on the map after an emission and in cleaning them up.

Now to the code. For example, we need to find all entities whose class name begins with anom; that is how our anomaly classes are named.

Here is the usual Gmod Lua script:

    local anomtable = ents.FindByClass("anom_*")
    for k, v in pairs(anomtable) do
        v:Remove()
    end

And here is the hacky version. You would immediately assume that such cr*p code is slower than the standard “for in pairs”, but as it turned out, it is not.

    local b, key = ents.FindByClass("anom_*"), nil
    repeat
        key = next(b, key)
        b[key]:Remove()
    until key != nil


For a complete analysis, these loop variants need to be translated into plain Lua scripts. Suppose anomtable has 5 elements, and removal is replaced by ordinary addition. The main thing is to see the difference in the number of instructions between the two loop implementations.

Vanilla loop:

    local anomtable = {1, 2, 3, 4, 5}
    for k, v in pairs(anomtable) do
        v = v + 1
    end

Our great one:

    local b, key = {1, 2, 3, 4, 5}, nil
    repeat
        key = next(b, key)
        b[key] = b[key] + 1
    until key ~= nil

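One point to check when reproducing this comparison: a repeat ... until key ~= nil loop stops as soon as next returns its first non-nil key, so as written the body runs only once. The sketch below first counts the iterations of that shape, then shows one way to write a next-based loop that visits every element, assuming that was the intent. It is not the project's actual code, and increment_all is a hypothetical name.

```lua
-- Shape from the article: count how many times the body runs.
local c, k = {1, 2, 3, 4, 5}, nil
local iterations = 0
repeat
  k = next(c, k)
  c[k] = c[k] + 1
  iterations = iterations + 1
until k ~= nil                      -- true right after the first next() call

-- Visit every element of a table with next(), without pairs().
-- increment_all is a hypothetical helper name used only for this sketch.
local function increment_all(t)
  local key, value = next(t, nil)   -- first key/value pair, or nil if empty
  local visited = 0
  while key ~= nil do               -- stop only when next() runs out of keys
    t[key] = value + 1              -- assigning to an existing key is allowed during traversal
    visited = visited + 1
    key, value = next(t, key)
  end
  return visited
end

local b = {1, 2, 3, 4, 5}
local count = increment_all(b)
```

Any timing comparison against pairs is only meaningful when both loops do the same amount of work, so it is worth asserting the visit count before measuring.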

Let's look at the interpreter bytecode (it looks like assembly; high-level programmers are advised not to look under the spoilers).

Just in case, keep junior developers away from the screen. You have been warned.

Disassembly of the vanilla loop:

    ; Name: for1.lua
    ; Defined at line: 0
    ; #Upvalues: 0
    ; #Parameters: 0
    ; Is_vararg: 2
    ; Max Stack Size: 7

     1 [-]: NEWTABLE  R0 5 0    ; R0 := {}
     2 [-]: LOADK     R1 K0     ; R1 := 1
     3 [-]: LOADK     R2 K1     ; R2 := 2
     4 [-]: LOADK     R3 K2     ; R3 := 3
     5 [-]: LOADK     R4 K3     ; R4 := 4
     6 [-]: LOADK     R5 K4     ; R5 := 5
     7 [-]: SETLIST   R0 5 1    ; R0[(1-1)*FPF+i] := R(0+i), 1 <= i <= 5
     8 [-]: GETGLOBAL R1 K5     ; R1 := pairs
     9 [-]: MOVE      R2 R0     ; R2 := R0
    10 [-]: CALL      R1 2 4    ; R1, R2, R3 := R1(R2)
    11 [-]: JMP       13        ; PC := 13
    12 [-]: ADD       R5 R5 K0  ; R5 := R5 + 1
    13 [-]: TFORLOOP  R1 2      ; R4, R5 := R1(R2, R3); if R4 ~= nil then begin PC = 12; R3 := R4 end
    14 [-]: JMP       12        ; PC := 12
    15 [-]: RETURN    R0 1      ; return

Disassembly of our loop:

    ; Name: for2.lua
    ; Defined at line: 0
    ; #Upvalues: 0
    ; #Parameters: 0
    ; Is_vararg: 2
    ; Max Stack Size: 6

     1 [-]: NEWTABLE  R0 5 0    ; R0 := {}
     2 [-]: LOADK     R1 K0     ; R1 := 1
     3 [-]: LOADK     R2 K1     ; R2 := 2
     4 [-]: LOADK     R3 K2     ; R3 := 3
     5 [-]: LOADK     R4 K3     ; R4 := 4
     6 [-]: LOADK     R5 K4     ; R5 := 5
     7 [-]: SETLIST   R0 5 1    ; R0[(1-1)*FPF+i] := R(0+i), 1 <= i <= 5
     8 [-]: LOADNIL   R1 R1     ; R1 := nil
     9 [-]: GETGLOBAL R2 K5     ; R2 := next
    10 [-]: MOVE      R3 R0     ; R3 := R0
    11 [-]: MOVE      R4 R1     ; R4 := R1
    12 [-]: CALL      R2 3 2    ; R2 := R2(R3, R4)
    13 [-]: MOVE      R1 R2     ; R1 := R2
    14 [-]: GETTABLE  R2 R0 R1  ; R2 := R0[R1]
    15 [-]: ADD       R2 R2 K0  ; R2 := R2 + 1
    16 [-]: SETTABLE  R0 R1 R2  ; R0[R1] := R2
    17 [-]: EQ        1 R1 K6   ; if R1 == nil then PC := 9
    18 [-]: JMP       9         ; PC := 9
    19 [-]: RETURN    R0 1      ; return

At a glance, an inexperienced reader would say the normal loop is faster, since it has fewer instructions (15 vs 19).

But we must not forget that each interpreter instruction costs its own number of processor cycles. Judging by the disassembly, the first loop uses a TFORLOOP instruction written in advance for iterating over a table: the array is loaded into memory, pairs is fetched as a global, and we jump through the elements adding a constant.

The second variant takes a different approach that leans more on memory: it gets the table entry, changes the element, stores it back in the table, checks for nil, and calls next again.

Our second loop is fast because the TFORLOOP instruction packs too many conditions and actions into one (R4, R5 := R1(R2, R3); if R4 ~= nil then begin PC = 12; R3 := R4 end), so it eats a lot of CPU ticks to execute, while our loop, again, is tied more to memory.

With a large number of elements, TFORLOOP loses to our loop in the time it takes to traverse them all. This is because accessing an element directly by key is faster and comes without the extra baggage of pairs. (And we have no negation.)

In general, between us, any use of negation in the code slows it down; this has already been verified by tests and by time. Negative logic runs slower because the processor's ALU has a separate “inverter” unit: to apply a unary operator (not, !) you have to go through the inverter, and that takes additional time.

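Claims like these are easiest to check by timing both variants on the same data. Below is a minimal sketch of such a measurement in plain Lua using os.clock. The helper names (time_it, sum_pairs, sum_next), the element count, and the repeat count are all assumptions for illustration, and the numbers will differ between stock Lua and the LuaJIT build that Gmod uses.

```lua
-- Build a test table; N is an arbitrary size chosen for this sketch.
local N = 1000
local data = {}
for i = 1, N do data[i] = i end

-- Variant 1: the standard generic for with pairs.
local function sum_pairs(t)
  local total = 0
  for _, v in pairs(t) do total = total + v end
  return total
end

-- Variant 2: manual traversal with next.
local function sum_next(t)
  local total = 0
  local key, value = next(t)
  while key ~= nil do
    total = total + value
    key, value = next(t, key)
  end
  return total
end

-- Run fn(t) `reps` times; return CPU seconds used and the last result.
local function time_it(fn, t, reps)
  local result
  local start = os.clock()
  for _ = 1, reps do result = fn(t) end
  return os.clock() - start, result
end

local t_pairs, r_pairs = time_it(sum_pairs, data, 1000)
local t_next, r_next = time_it(sum_next, data, 1000)
print(string.format("pairs: %.4fs  next: %.4fs", t_pairs, t_next))
```

Comparing r_pairs and r_next before looking at the timings confirms that both loops actually visited every element.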
Conclusion: the standard way is not always the best, and reinventing the wheel can pay off, but on a real project you should not do it if you care about release speed. In total, full development of our project has been going on from 2014 to the present day, yet another never-ending build of sorts. It may look like an ordinary game server that can be installed in a day and fully configured for play in two, but you have to be able to bring something new to it.

This long-running project did eventually see a second version, in which the code is heavily optimized, but I will cover the other optimizations in the following articles. Criticism and comments are welcome; correct me if I am mistaken.