Spark SQL. A bit about the query optimizer

Hello. By way of introduction, I want to tell you how I ended up here.
 
 
Before I encountered Big Data, and Spark in particular, I optimized SQL queries often and extensively, first for MSSQL, then for Oracle, and now I am facing Spark SQL.
 
 
While there are already plenty of good books describing the methodology and the "knobs" you can turn in a DBMS to get the optimal query plan, I have not seen such books for Spark. What I came across were mostly articles and collections of practices, geared more toward working through the RDD/Dataset API than toward pure SQL. For me, one of the reference books on SQL optimization ...

Optimizing the placement of virtual machines on servers

Some time ago, one of my colleagues said that space in the data center was running out, there was nowhere to put new servers, the load kept growing, and it was unclear what to do; we would probably have to replace all the existing servers with more powerful ones.
 
 
At that time I was busy with the task of drawing up optimal schedules, and I thought: what if we used optimization algorithms to increase server utilization in the data center? That is how the project I want to write about was born.
 
 
For advanced readers, I will say right away that this article is about bin packing; as for the rest, those who want to learn ...

BIF pattern: clean front-end code and convenient work with server data

The article we are publishing in translation today discusses what to do when the data received from the server is not in the shape the client needs. First we will look at a typical problem of this kind, and then we will go through several ways to solve it.
 
 
 
 


The problem of an ill-suited server API


 
Consider a hypothetical example based on several real projects. Suppose we are developing a new website for an organization that has existed for some time. It already has REST endpoints, but they are not quite designed for what we are going to create...

Popular antipatterns: pagination

Hello, my name is Dmitry Karlovsky, and I do not like reading books, because turning the page pulls you out of a fascinating narrative. Hesitate for a moment and you forget where the last sentence of the previous page broke off, and you have to flip back to re-read it. With physical books this is not so terrible, but with the output of a REST server things are much sadder, because one moment the page holds one set of data and a second later it holds something completely different. Let's think about how this happened, who is to blame, and most importantly, what to do about it.
 
 
Problem
 
So, we...

Trace and Javascript

Have you ever traced your application at runtime? Do you know how many queries that gray endpoint makes, and which ones? How long does it take to compute the cross-references to a similar resource type from each entity page returned for a query? Have you tried to measure how long the user has to wait because of the optional query fields they add from time to time? Have you ever wondered what would happen if you parallelized those six queries to those two databases?
 
If any of the above sounds interesting, or at least familiar, read on below the cut.
 
chrome: ...

The desired HTTP headers

Our customers at Fastly love to manipulate HTTP headers. Choosing the right combination of headers is one of the best things you can do for the security of your site, and it contributes significantly to its performance.
 
 
Most developers know about the important and necessary HTTP headers. The best known are Content-Type and Content-Length, which are almost universal. But lately headers such as Content-Security-Policy and Strict-Transport-Security have been used to improve security, and Link rel=preload to improve performance. Despite the wide ...

The patented dream of programmers of the 80-90's

This spring, I finally managed to realize an old dream of those who build application constructors: to find, among the American patents of a quarter century ago, a very simple solution to all their problems. In essence, it is an emulation of the application database whose construction factors out all of the programmer's rough, routine work.
 
In terms of the data storage system and the way the data is processed, the result is an alternative to existing ORMs. The stated advantages: increased database reliability through fewer errors when adding new data and creating ...

Compact serializer for the cache using System.Reflection.Emit

 
 
Modern services cannot do without a cache: accessing data in a persistent database is slow and expensive, so adding intermediate storage for the most frequently used data speeds things up significantly. Information can be stored in a cache in a variety of forms: strings, lists, session state, and much more. In this article we will talk about one way to cache "flat" objects, that is, objects with no nested classes or cyclic references.
 
does not guarantee returning properties ...

The Nchan module of the nginx web server. Working with Websocket, EventSource (Server-Sent Events), Long-Polling

This article reviews the capabilities of Nchan, a module for the nginx web server that replaced the deprecated NGiNX_HTTP_Push_Module. Nchan supports the main technologies for delivering messages: Websocket, EventSource (Server-Sent Events), and Long-Polling. A cluster of Redis servers is used for horizontal scaling.
 
statistics. Yes, only 6% of web browsers are unsupported. However, if the client's contract includes a clause on Opera Mini support, you cannot do without a Long-Polling fallback. There is one more thing that reduces the availability ...

How to improve performance using a serverless architecture

Photo: Jesse Darland, Unsplash

In this article, we will talk about how to move image preprocessing from the application server to a fully serverless architecture on the AWS platform.

Paperclip or Dragonfly, which use ImageMagick for image processing.

This is a fairly simple approach, but it has its drawbacks:

Images are processed on the application server. This can increase the overall response time because of the higher CPU load.

The application server has limited performance and is not suitable ...