How and why we won the Big Data track at the Urban Tech Challenge

My name is Dmitry, and I want to talk about how our team reached the final of the Urban Tech Challenge hackathon on the Big Data track. I’ll say right away that this is not the first hackathon I have taken part in, and not the first in which I have won prizes. So in this story I also want to share some general observations and conclusions about the hackathon industry as a whole, and to offer my point of view as a counterweight to the negative reviews that appeared online immediately after the Urban Tech Challenge ended.
So, first some general observations.
task list and assessing their complexity, we focused on two tasks: a catalog of innovative enterprises from DPIiR and a chatbot from EFKO. The backend developer chose the DPIiR task, and I chose the EFKO task, since I had experience writing chatbots on node.js and DialogFlow. The EFKO task also involved ML, and I have some, though not much, experience with it. Judging by the problem statement, it seemed unlikely to me that the task could be solved with ML. This feeling grew stronger when I went to the Urban Tech Challenge meetup, where the organizers showed me the EFKO dataset: about 100 photos of product layouts (taken from different angles) and about 20 classes of layout errors. At the same time, the task owners wanted a classification accuracy of 90%. In the end, I prepared a presentation of a solution without ML, the backend developer prepared a presentation on the catalog, and after finalizing both together we sent them to the Urban Tech Challenge. Already at this stage, each participant's level of motivation and contribution became clear. Our designer did not take part in discussions, answered with delays, and even filled in the information about himself in the presentation at the last moment. In short, there were doubts.
As a result, we were selected for the DPIiR task, and we were not upset about being rejected for EFKO, because that task seemed to us, to put it mildly, strange.
5. Preparing for the hackathon. When we finally learned that we had been accepted to the hackathon, we started preparing the groundwork. I am not urging anyone to start writing code a week before the hackathon begins. But at a minimum you should have a boilerplate ready, so that you can get to work immediately without having to set up tools and without running into bugs in something you decided to try for the first time at the hackathon. I know a story about an Angular developer who came to a hackathon and spent the whole 2 days setting up the project build, so everything should be prepared in advance. We planned to split the duties as follows: the backend developer writes the crawlers, which search the Internet and put everything they collect into the database, and I write the API on node.js, which queries this database and sends the data to the frontend. Accordingly, I prepared a server boilerplate on express.js and a frontend boilerplate on react in advance. I do not use CRA; I always configure webpack myself, and I know perfectly well what risks this can carry (remember the story about the Angular developer). At that point I asked our designer for an interface template, or at least a mockup, to have an idea of what I would be laying out. In theory he, too, should have prepared his own groundwork and coordinated it with us, but I never received an answer. In the end, I borrowed the design from one of my old projects. That made things go even faster, since all the styles for that project had already been written. Hence the conclusion: a designer is not always needed in a team))). With this groundwork we came to the hackathon.
6. Work at the hackathon. I saw my team in person for the first time only at the opening of the hackathon in the CDP. We met and discussed the solution and the stages of work on the task. Although after the opening we were supposed to go by bus to Red October, we went home to sleep instead, having agreed to be at the venue by ???. Why? The organizers apparently wanted to squeeze the maximum out of the participants, so they arranged just such a schedule. In my experience, you can code normally after one sleepless night, but I am not so sure about a second one. A hackathon is a marathon: you need to estimate and plan your strength sensibly. Especially since we had our groundwork ready.
So, having slept properly, at ??? we were sitting on the sixth floor of Deworkacy. Then our designer unexpectedly announced that he did not have a laptop, that he would work from home, and that we would communicate by phone. This was the last straw. So our team of four turned into a team of three, although the team name did not change. Again, this was not a heavy blow for us, since I already had a design from an old project. At first everything went quite smoothly and according to plan. We loaded the organizers' dataset of innovative companies into the database (we had decided to use neo4j). I started on the layout, then took up node.js, and that is where things began to go wrong. I had never worked with neo4j before. First I looked for a working driver for this database, then I figured out how to write a query, and then I was surprised to find that this database returns entities as an array of node objects and their edges. That is, when I requested an organization by its TIN together with all the data on it, instead of a single organization object I got back a long array of objects containing the data on this organization and the relations between them. I wrote a mapper that walked through the entire array and glued all the objects of an organization into one object. But under real conditions, when querying a database of ?000 organizations, this ran extremely slowly, taking about 20–30 seconds. I started thinking about optimization, but we stopped in time and moved to MongoDB, which took us about 30 minutes. In total, we lost about 5 hours on neo4j.
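To make the mapper idea concrete, here is a minimal sketch of gluing a flat array of node records into one object per organization. It is not the code we wrote at the hackathon. The row shape (`tin`, `label`, `props`) and the function name `mergeOrganizationRows` are assumptions made up for illustration, and they do not reproduce the actual output format of any neo4j driver.

```javascript
// Hypothetical row shape: { tin, label, props }.
// "Organization" rows carry the organization's own fields; every other
// label is treated as a related node and collected into an array.
function mergeOrganizationRows(rows) {
  const byTin = new Map(); // one entry per organization, keyed by TIN

  for (const row of rows) {
    let org = byTin.get(row.tin);
    if (!org) {
      org = { tin: row.tin };
      byTin.set(row.tin, org);
    }

    if (row.label === "Organization") {
      // Copy the organization's own properties onto the merged object.
      Object.assign(org, row.props);
    } else {
      // Group related nodes under a key derived from their label.
      const key = row.label.toLowerCase();
      if (!org[key]) org[key] = [];
      org[key].push(row.props);
    }
  }

  return [...byTin.values()];
}

// Usage with made-up data:
const rows = [
  { tin: "1", label: "Organization", props: { name: "Acme" } },
  { tin: "1", label: "Site", props: { url: "acme.example" } },
  { tin: "2", label: "Organization", props: { name: "Beta" } },
  { tin: "1", label: "Site", props: { url: "shop.acme.example" } },
];
const merged = mergeOrganizationRows(rows);
console.log(JSON.stringify(merged, null, 2));
```

The sketch keeps a `Map` keyed by TIN, so each row is visited once and the rows of one organization do not have to be adjacent in the array.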
Remember: never bring a technology you are not familiar with to a hackathon, because there may be surprises. But in general, apart from this failure, everything went according to plan. By the morning of December ? we had a fully working application, and we planned to spend the rest of the day adding extra features to it. From then on everything went relatively smoothly for me, but the backend developer had a whole bunch of problems: his crawlers were being banned by search engines, and the search results for each specific company were spammed by legal-entity aggregators, which took the top places. But he can tell that story better himself. The first extra feature I bolted on was a search for the CEO on VKontakte by full name. It took a few hours.
As a result, the company page in our application got the director's avatar, a link to his VK page and some other data. It was a nice cherry on the cake, although it is probably not what secured our victory. Next, I wanted to bolt on some analytics. But after a long search through the options (there were many nuances with the UI) I settled on the simplest one: aggregating organizations by economic activity code. In the evening, in the last hours, I put together a stub page for displaying innovative products (our application is meant to have a Products and Services section), although the backend for it was not ready. Meanwhile, the database was rising like dough on yeast, the crawlers kept working, and the backend developer was experimenting with NLP in order to distinguish innovative texts from non-innovative ones))). But it was already time to hand in the final presentation.
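The aggregation by economic activity code mentioned above can be sketched roughly like this. It is an illustration, not the code from our application: the field name `activityCode`, the `"unknown"` bucket and the function name `countByActivityCode` are assumptions, and it runs over plain objects in memory rather than inside the database.

```javascript
// Counts organizations per economic activity code and returns the
// buckets sorted by count, largest first (ties broken by code).
// `activityCode` is a hypothetical field name.
function countByActivityCode(organizations) {
  const counts = new Map();

  for (const org of organizations) {
    const code = org.activityCode || "unknown"; // no code: separate bucket
    counts.set(code, (counts.get(code) || 0) + 1);
  }

  return [...counts.entries()]
    .map(([code, count]) => ({ code, count }))
    .sort((a, b) => b.count - a.count || a.code.localeCompare(b.code));
}

// Usage with made-up data:
const sampleOrganizations = [
  { name: "Acme", activityCode: "62.01" },
  { name: "Beta", activityCode: "72.19" },
  { name: "Gamma", activityCode: "62.01" },
  { name: "Delta" },
];
const activityStats = countByActivityCode(sampleOrganizations);
console.log(activityStats);
```

A sorted list of `{ code, count }` pairs like this is enough to feed a simple bar chart or table.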
7. Presentation. From my own experience, I can say that you should switch to preparing the presentation somewhere between 3 and 4 hours before the deadline. This is especially true if it is supposed to include a video, since shooting and editing one takes a lot of time. We were supposed to have a video. We had a dedicated person who took care of this and also handled a number of other organizational issues. Thanks to that, we were not distracted from coding until the very last moment.
8. Pitch. I did not like that the pitches and the final were held on a separate weekday (Monday). Here, most likely, the organizers continued their policy of squeezing the maximum out of the participants. I had not planned to take time off work and wanted to come only for the final, although the rest of my team took the day off. However, my emotional immersion in the hackathon was already so deep that at 8 am I wrote in my team's chat (my team at work, not the hackathon team) that I was taking an unpaid day off, and went to the CDP for the pitches. Our task turned out to have attracted a lot of pure data scientists, and this strongly affected how teams approached the problem. Many had good data science, but no one had a working prototype, and many could not get around the search engine bans on their crawlers. We were the only team with a working prototype, and we knew how to solve the problem. In the end we won the track, although we were very lucky to have chosen the least competitive task. Looking at the pitches in other tracks, we realized that we would have had no chance there. I also want to say that we were very lucky with the jury: they checked the code meticulously. Judging by the reviews, this was not the case on all tracks.
9. Final. After being summoned to the jury several times for a code review, we thought we had finally settled all the questions and went to have dinner at Burger King. There the organizers called us again, so we had to quickly get our orders packed to go and head back.
The organizer showed us which room to go to, and once inside we found ourselves at a public speaking training session for the winning teams. The guys who were supposed to perform on stage got well fired up, and everyone came out like real showmen.
And I have to admit that in the final, against the background of the strongest teams from other tracks, we looked pale. The victory in the government customer nomination deservedly went to the team from the real estate tech track. I think the key factors that contributed to our victory on the track were: having ready-made groundwork, thanks to which we were able to build a prototype quickly; the presence of "highlights" in the prototype (the search for the general director in social networks); and the NLP skills of our backend developer, which the jury was also very interested in.

And in conclusion, the traditional thanks to everyone who supported us: the jury of our track, Evgeny Evgrafyev (the author of the task we solved at the hackathon) and, of course, the organizers of the hackathon. It was probably the largest and coolest hackathon of all those I have taken part in. It only remains to wish the guys that they keep the bar this high in the future!