Data-mining and Twitter

Among social networks, Twitter is especially well suited to text mining because of its hard limit on message length, which forces users to pack in only the essentials.
Can you guess which technology this word cloud describes?
Using the Twitter API, you can extract and analyze a wide variety of information. This article shows how to do it in the R programming language.
dressing in the US Congress in the wake of the investigation ...

Hackathon on Data Science in SIBUR: how it was

Hello!

Since the beginning of the year, we have run about 10 hackathons and workshops across the country. In May, together with the AI-community, we organized a hackathon on "digitization of production." Nobody before us had run a data science hackathon for manufacturing, so today we decided to describe in detail how it went.

The goal was simple. We needed to digitize our business at every stage (from the supply of raw materials to production and direct sales). Of course, applied ...

Implementing minimization of logic functions by the Quine-McCluskey method with an incomplete input set

This article is, to some extent, a continuation of my article on minimizing logic functions by the Quine-McCluskey method. That article dealt with completely defined logic functions (although this was only implied, not stated explicitly). In practice, such a case is fairly rare, or arises only when the number of input variables is small. Partially (incompletely) defined logic functions are those whose values are given only for a part Q of the complete set P =
possible sets (terms) of ...

Pancakes with ICO in Python, or how to measure people and ICO projects

Friends, good afternoon.
 
There is a clear understanding that most ICO projects are essentially intangible assets. An ICO project is not a Mercedes-Benz, which drives regardless of who likes or dislikes it. The main influence on an ICO is people's mood, both toward the ICO's founder and toward the project itself.
 
It would be good to somehow measure people's attitude toward the ICO's founder and/or the ICO project. So that is what was done. The report is below.
 
The result was a tool for collecting positive and negative sentiment from the Internet, in particular ...

If you want to create something really cool, you need to dig deeper and know how your code works in the system, on hardware

Hello, Habr! It is interesting how many programmers and developers have discovered data science or data engineering and are building successful careers in big data. Ilya Markin, Software Engineer at Directual, is one such developer who switched to data engineering. We talked with Ilya about his experience as a team lead, his favorite data engineering tool, conferences and interesting specialized channels for Java developers, Directual from both the user and the technical side, computer games, and more.
 
 
 
 
-...

Data mining the PubMed and PubChem databases of medical and biochemical information

PubMed comprises more than 28 million citations (abstracts and titles) of biomedical literature from life science journals, online books, and MEDLINE. A citation may also include the full text of the article.
 
A typical PubMed query is: type 2 diabetes natural compound
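As a rough illustration, a query like the one above could be sent to PubMed programmatically. The sketch below only builds the search URL and makes no network call. It assumes the public NCBI E-utilities ESearch endpoint, which the original text does not mention; the function name `build_pubmed_search_url` and the `max_results` parameter are hypothetical names chosen for this example.

```python
from urllib.parse import urlencode

# Assumption: the NCBI E-utilities ESearch endpoint (not named in the original text).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(query: str, max_results: int = 20) -> str:
    """Return an ESearch URL that looks up `query` in the PubMed database."""
    params = {
        "db": "pubmed",          # search PubMed rather than another NCBI database
        "term": query,           # free-text query, URL-encoded by urlencode
        "retmax": max_results,   # maximum number of record IDs to return
        "retmode": "json",       # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_pubmed_search_url("type 2 diabetes natural compound"))
```

Fetching the resulting URL with any HTTP client should return a list of matching PubMed IDs, which can then be used to retrieve the abstracts.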
 
 
PubChem is a database of more than 100 million chemical compounds and 236 million substances. The database also contains bioactivity results for ??? million compounds (e.g., activity of compounds against cancer or inhibition of a particular gene).
 
At the moment, about 9 million organic chemical compounds ...

A couple of thoughts about the features of the Russian Data Science

 
Today at Moscow Data Science Major I talked about privacy, ethical data science, and many interesting technical innovations. People listened attentively, asked questions, and thanked me. But what happened next was very telling. More on that below the cut.
 
 
 
 
And then there was a talk about new Russian developments in NLP, featuring this slide.
 
 
 
 
The only change I made before publishing it here is the gray boxes covering the first name, last name, quality, and address of a living person. A person whose personal data and confidential medical information were so calmly and casually disclosed to a thousand people who are not bound by any nondisclosure agreement.
 
 
And the most terrible thing is not even that a whole series of federal laws were violated in the process (No. 323, Article 13, and No. 152 at the very least). The worst thing, in my opinion, is that almost no one saw anything unexpected or wrong in it.
 
 
I really want to believe that I'm wrong, and the author changed the ...

What do data analysis experts really do? Conclusions from 35 interviews

The author conducted a series of interviews with experts in data analysis and processing and drew conclusions about the prospects and development trends for data scientists.
 
 
 
The theory and methods of data processing have simplified a wide range of technology tasks, including optimizing Google's search results, recommendations on LinkedIn, and generating headlines for Buzzfeed articles. However, working with data can also significantly affect many sectors of the economy, from retail, telecommunications, and agriculture to health...

A data-driven decision: choosing a wall paint color as an example

When I started choosing a paint color for the walls of a room, I noticed something interesting. From the very start, the whole process resembled work on some IT-ML-blah-blah-blah analytics project.
 
 
There is a customer who does not really understand what exactly he wants, but wants everything to be good and to his liking. There are several more stakeholders on the customer's side who cannot agree on what "good" means. There are several reformulations of the problem whose relevance to that "good" is highly questionable, but which are at least somehow solvable. There ...

KDD 201?, day three: the main program

 
Today, at last, the main program of the conference began. The acceptance rate this year was only 8%, i.e., only the best of the best of the best are presenting. The applied and research tracks are explicitly separated, plus there are several separate side events. The applied tracks look more interesting; the talks there come mainly from the majors (Google, Amazon, Alibaba, etc.). I'll tell you about the talks we managed to attend.
 
a professor at the University of California (it should be noted that there are a lot of women at KDD, both among attendees and among speakers). All this is summed up in the acronym FATES:
 
 
 
Fairness - n...