As I moved from Python to Julia (and why)

3r33352. 3r3-31. 3r33336. A little background about Python

3r33338. 3r33352. Python is a great language. I tried several languages before it: Pascal at school; C, C with classes, C ++ - at the university. The last two (three) have instilled a persistent aversion to programming: instead of solving a problem, you are busy with allocations and destructors (scary words from the past), thinking in terms of low-level primitives. My opinion is that C is not suitable for solving educational and scientific problems (in any case, in the field of mathematics). I am sure that they will object to me, but I am not trying to impose anything on anyone, I just express my opinion. 3r33338. 3r33352. 3r33338. 3r33352. Python has become a revelation in its time. For the first time in my life, I wrote several levels of abstraction higher than is customary in Si. The distance between the task and the code has decreased as never before. 3r33338. 3r33352. 3r33338. 3r33352. I would probably have written it all my life in Python if I didn’t have to suddenly implement NIST statistical tests. It would seem that the task is very simple: there is an array of several lengths (> = 10) megabytes, there is a set of tests that need to be applied to this array. 3r33338. 3r33352.

3r33333. 3r33338. 3r33352. 3r33336. What is numpy good at? 3r33337. 3r33338. 3r33352. In python for working with arrays there is a good numpy package, which is essentially a high-level interface on fast libraries like LAPACK. It would seem that the entire gentleman’s set for scientific computing is available (Sympy, Numpy, Scipy, Matplotlib), what more could you ask for? 3r33338. 3r33352. 3r33338. 3r33352. Everyone who has dealt with numpy knows that he is good, but not in everything. It is necessary to try to make the operations vectorized (as in R), i.e. in a sense, uniform for the whole array. If suddenly your task for some reason does not fit into this paradigm, then you have problems. 3r33338. 3r33352. 3r33338. 3r33352. What kind of non-vectorized tasks are I talking about? Yes, at least take the same NIST: calculate the length of the linear shift register according to the Berlekamp-Messi algorithm; calculate the length of the longest subsequence of consecutive units, and so on. I do not exclude the possibility that there is some ingenious solution that will reduce the task to a vectorized one. 3r33338. 3r33352. 3r33338. 3r33352.

Clever? [/b]

As an example from the same NIST: it was necessary to count the number of “switching” sequences, where by “switching” I mean changing successively going units ( 1111 ) to consecutive zeroes ( 000 ), and vice versa. To do this, you can take the original sequence without the last element (x[: -1]) And subtract the sequence shifted by 1 (x[2: ]) From it, and then calculate the number of nonzero elements in the resulting new sequence. All together will look like:

3r33352. 3r33338. 3r33352.` np.count_nonzero (x[:-1]- x[1:]) `

3r33333. 3r33338. 3r33352. It may look like an entertaining warm-up for the mind, but in fact here is something unnatural, some kind of trick that will not be obvious to the reader after a short time. Not to mention the fact that it is still slow - no one has canceled the allocation of memory. 3r33338. 3r33352.

3r33338. 3r33352. Non-vectorized operations in Python are long. How to deal with them if numpy is not enough? Usually they offer several solutions:

3r33352. 3r33338. 3r33352. 3r3171. 3r33352. 3r3182. 3r3104. Numba jit. [/b] If she worked the way it is described on the main page of Numba, then I think it would be worth it to finish the story. Over the prescription I had forgotten a little what had gone wrong with her; the bottom line is that the acceleration was not as impressive as I expected, alas. 3r3183. 3r33352. 3r3182. 3r3104. Cython. [/b] OK, raise your hands to those who believe that cython is a really beautiful, elegant solution that does not destroy the very meaning and spirit of Python? I do not think so; if you’ve reached Cython, you’ll be able to stop deceiving yourself and move on to something less sophisticated like C ++ and C. 3r3183. 3r33352. 3r3182. Rewrite the bottlenecks on C and pull them from your favorite Python. OK, but what if I have the whole program - it is a solid calculation and bottlenecks? C does not offer! I'm not talking about the fact that in this decision you need to know well not one, but two languages - Python and Xi. 3r3183. 3r33352. 3r3185. 3r33338. 3r33352. 3r33336. Here comes the JULIA! 3r33337. 3r33338. 3r33352. Namayavshis with existing solutions and not finding (having failed to program) anything good, I decided to rewrite it with something “faster”. In fact, if you write the “number thrasher” in the 21st century with normal array support, vectorized out-of-box operations, etc. etc., then the choice you have is not particularly dense:

3r33352. 3r33338. 3r33352. 3r3171. 3r33352. 3r3182. 3r3104. Fortran [/b] . And do not laugh, which of us did not pull the BLAS /LAPACK of their favorite languages? Fortran is a really good (better SI!) Language for SCIENTIFIC calculations. I did not take it for the reason that since the times of Fortran they have already invented a lot of things and added them to programming languages; I was hoping for something more modern. 3r3183. 3r33352. 3r3182. 3r3104. R [/b] suffers from the same problems as Python (vectorization). 3r3183. 3r33352. 3r3182. 3r3104. Matlab [/b] - probably yes, unfortunately I have no money to check. 3r3183. 3r33352. 3r3182. 3r3104. Julia [/b] - the horse is dark, it will take off - it will not take off (and it was natural until the stable-version 1.0) 3r3183. 3r33352. 3r3185. 3r33338. 3r33352. 3r33336. Some obvious advantages of Julia

3r33338. 3r33352. 3r3171. 3r33352. 3r3182. It looks like Python, at least the same “high-level” one, with the possibility, if necessary, to go down to bits in numbers. 3r3183. 3r33352. 3r3182. There is no messing around with memory allocations and things like that. 3r3183. 3r33352. 3r3182. Powerful type system. Types are prescribed optionally and very dosage. You can write a program without specifying a single type - and if you do it STRONGLY, then the program will be fast. But there are nuances. 3r3183. 3r33352. 3r3182. It's easy to write custom types that are as fast as built-in types. 3r3183. 3r33352. 3r3182. Multiple dispatch. You can talk about it for hours, but in my opinion, this is the best thing about Julia, it is the basis for the design of all programs and in general the basis of the philosophy of language. 3r33338. 3r33352. Thanks to multiple dispatch, many things are written very uniformly. 3r33338. 3r33352. 3r33338. 3r33352.

Examples with multiple dispatch [/b]` rand () # give a random number in the range from 0 to 1 r3r3352. rand (10) # an array of length 10 of random numbers from 0 to 1 r3r3352. rand (?5) # is a 3 x 5 matrix of random ones.`

3r33333. 3r33338. 3r33352. That is, the first argument can be any (one-dimensional) distribution from Distributions, but the function call will remain the same. And not only the distribution (you can transfer the RNG itself as a MersenneTwister object and much more). Another (in my opinion indicative) example is navigation in DataFrames without loc /iloc. 3r33338. 3r33352. 3r3183. 3r33352. 3r3182. 6. Arrays are native, embedded. Vectorization out of the box. 3r33338. 3r33352. 3r3183. 3r33352. 3r3185. 3r33338. 3r33352. 3r33336. Cons, silent about which would be a crime! 3r33337. 3r33338. 3r33352. 3r3171. 3r33352. 3r3182. New language. On the one hand, of course, a minus. In some ways, perhaps immature. On the other hand, he took into account the rakes of many past languages, stands “on the shoulders of giants,” absorbed many interesting and new things. 3r3183. 3r33352. 3r3182. Immediately fast programs are unlikely to write. There are some unobvious things that are very easy to deal with: you step on a rake, ask the community for help, you step in again etc. Basically it is type instability, type instability and global variables. In general, as far as I can judge by myself, a programmer who wants to write quickly on Julia goes through several stages: a) he writes like in Python. This is great and can be done, but sometimes there will be performance problems. b) writes as in C: prescribing manic types wherever possible. c) gradually understands where it is necessary (very dosagely) to prescribe types, and where they interfere. 3r3183. 3r33352. 3r3182. Ecosystem. Some packages are raw, in the sense that something is constantly falling off somewhere; some are mature, but not compatible with each other (conversion to other types is necessary, for example). On the one hand, this is obviously bad; on the other hand, there are many interesting and bold ideas in Julia just because “we are standing on the shoulders of giants” - a tremendous experience has been gained in attacking the rake and “how not to do it”, and this is taken into account (partially) by the developers of the packages. 3r3183. 3r33352. 3r3182. The numbering of the arrays from 1. Ha, kidding, this is of course a plus! No, well, seriously, what's the matter, with what index does the numbering begin? You get used to it in 5 minutes. Nobody complains that integers are called integer, not whole. These are all matters of taste. And yes, take at least the same Cormen on the algorithms - everything is numbered from one, and it is sometimes inconvenient to rework in Python. 3r3183. 3r33352. 3r3185. 3r33338. 3r33352. 3r33336. Well, what about the speed of something? 3r33337. 3r33338. 3r33352. It is time to remember for the sake of which everything was started. 3r33338. 3r33352. 3r33338. 3r33352. Note: I safely forgot Python, so write your improvements in the comments, I will try to run them on my laptop and see the execution time. 3r33338. 3r33352. 3r33338. 3r33352. So, two small examples with microbench marks. 3r33338. 3r33352. 3r33338. 3r33352.

using Distributions

d = Normal () # Normal distribution with parameters ? 1

rand (d) # random normally distributed number

rand (d, 10) # array of length 10 and so on

**Something vectorized [/b]**

**The task: the input of the function is a vector X from 0 and 1. It is necessary to convert it into a vector from 1 and (-1) (1 -> ? 0 -> -1) and calculate how many coefficients from the Fourier transform of this vector by the absolute value exceeds the boundary. 3r33338. 3r33352. 3r33338. 3r33352.**

`3r33352. def process_fft (x, boundary): 3r333335. return np.count_nonzero (np.abs (np.fft.fft (2 * x-1))> boundary) 3r33352. 3r33352. % timeit process_fft (x, 2000)`

3r33333. 3r33338. 3r33352.

84.1 ms ± 335 µs per loop (mean ± std. Dev. Of 7 runs, 10 loops each)

`function process_fft (x, boundary)`

3r33333. 3r33338. 3r33352. We won’t see anything surprising here - they still don’t consider it themselves, but transfer it to well-optimized Fortran programs. 3r33338. 3r33352.

count (t -> t> boundary, abs. (fft (2 * x.-1)))

end

3r33352. @benchmark process_fft (x, 2000)

BenchmarkTools.Trial:

memory estimate: ??? MiB

allocs estimate: 61

--------------

minimum time: ??? ms (???% GC)

median time: ??? ms (???% GC)

Mean time: ??? ms (???% GC)

maximum time: ??? ms (???% GC)

--------------

samples: 59

evals /sample: 1

**Something non-vectorized [/b]**

**The input is an array of 0 and 1. Find the length of the longest subsequence of consecutive units. 3r33338. 3r33352. 3r33338. 3r33352.**

`3r33352. def longest (x):`

3r33333. 3r33338. 3r33352.

maxLen = 0

currLen = 0

3r33352. #R3r3352 # this will count the number of ones in the block. for bit in x:

if bit == 1:

currLen + = 1

maxLen = max (maxLen, currLen)

else:

maxLen = max (maxLen, currLen)

currLen = 0

return max (maxLen, currLen)

3r33352. % timeit longest (x)

599 ms ± 639 µs per loop (mean ± std. Dev. Of 7 runs, 1 loop each)

`function longest (x)`

3r33333. 3r33338. 3r33352. The difference is obvious "naked" view. Tips for doping and /or vectorizing the numpy version are welcome. I also want to note that the programs are almost identical. For example, I did not register any type in Julia (compare with the previous one) - it still works quickly. 3r33338. 3r33352.

maxLen = 0

currLen = 0

3r33352. #R3r3352 # this will count the number of ones in the block. for bit in x

if bit == 1

currLen + = 1

maxLen = max (maxLen, currLen)

else

maxLen = max (maxLen, currLen)

currLen = 0

end

end

return max (maxLen, currLen)

end

3r33352. @benchmark longest (x)

BenchmarkTools.Trial:

memory estimate: 0 bytes

allocs estimate: 0

--------------

minimum time: ??? ms (???% GC)

median time: ??? ms (???% GC)

Mean time: ??? ms (???% GC)

maximum time: ??? ms (???% GC)

--------------

samples: 536

evals /sample: 1

3r33338. 3r33352. I also want to note that the versions presented were not included in the final program, but were further optimized; here, they are given as an example and without unnecessary complications (forwarding additional memory in Julia to do rfft in-place, etc.). 3r33338. 3r33352. 3r33338. 3r33352. 3r33336. What happened in the end? 3r33337. 3r33338. 3r33352. I will not show the final code for Python and for Julia: this is a secret (at least for now). At the time when I quit to finish the python version, she ran all the NIST tests on an array of 16 megabytes of binary characters in ~ 50 seconds. On Julia, at the moment, all tests are run on the same volume of ~ 20 seconds. It may well be that I am a fool, and there was space for optimizations, etc. etc. But this is me, what I am, with all its advantages and disadvantages, and in my opinion, it is necessary to consider not the spherical speed of programs in vacuum in programming languages, but how I can program it (without any really gross errors). 3r33338. 3r33352. 3r33338. 3r33352. 3r33336. Why did I write all this? 3r33337. 3r33338. 3r33352. People 3r340. here

became interesting; I decided to write when the time comes. In the future I think to go through Julia more thoroughly, with an example of solving some typical problems. What for? More languages, good and different, and let everyone who wants to find something that suits him personally.

3r33352. 3r33352. 3r33333. ! function (e) {function t (t, n) {if (! (n in e)) {for (var r, a = e.document, i = a.scripts, o = i.length; o -;) if (-1! == i[o].src.indexOf (t)) {r = i[o]; break} if (! r) {r = a.createElement ("script"), r.type = "text /jаvascript", r.async =! ? r.defer =! ? r.src = t, r.charset = "UTF-8"; var d = function () {var e = a.getElementsByTagName ("script")[0]; e.parentNode.insertBefore (r, e)}; "[object Opera]" == e.opera? a.addEventListener? a.addEventListener ("DOMContentLoaded", d,! 1): e.attachEvent ("onload", d ): d ()}}} t ("//mediator.mail.ru/script/2820404/"""_mediator") () (); 3r33333. 3r33352.

3r33352. 3r33352. 3r33352. 3r33352.

It may be interesting

Pleasant data, significant and magnificent plan, as offer great stuff with smart thoughts and ideas, bunches of incredible data and motivation, the two of which I need, on account of offer such an accommodating data here.

Cheers!