How users teach Yandex to warn about telephone spam

Anyone who has exposed their number on the Internet, filled in a dubious offline questionnaire, or simply had the bad luck to end up in one of the numerous databases is familiar with telephone spam. Today we will tell Habrahabr readers how, with the help of user feedback and machine learning, we taught the Yandex app to warn about unwanted calls.
 
 
 
 
Calls from unfamiliar numbers always pose a difficult choice. Is it the long-awaited courier or yet another operator calling with a "unique" promotional offer? Mobile apps that rely on directories of well-known organizations address this problem, but only in part: the most aggressive spammers, dubious debt collectors and fraudsters do not end up in such directories. What can be done?
 
 
Toloka, whose users helped to sort and classify the reviews.
 
 
 
 
So we began collecting data not only about well-known organizations with a relatively good reputation, but also about spammers, scammers, aggressive debt collectors, pranksters and even callers who like to stay silent on the line. Not every category can safely be labeled as unwanted, though. For example, calls from courier services are usually useful.
 
 
Directory data and the first user reviews formed the basis of the Yandex caller ID, which launched last year in the web version of Search. Yandex began returning verdicts for many queries containing phone numbers.
 
 
 
 
Soon an early version of the caller ID was built into the Yandex.Mart application. It relied only on the Directory, because there were still too few reviews in the other categories for it to work well. This led us to the next stage in the caller ID's development: reviews should be collected on the mobile device, immediately after a call from an unknown number, rather than waited for on the web. But how? The first internal attempts to ask for feedback after every call ran into problems. Overly frequent requests annoy users. Moreover, if any user can leave a review about any incoming call, that invites and simplifies review manipulation. We needed to act smarter.
 
 
Yandex specializes in machine learning. With its help, Search builds its results page, Browser identifies malicious sites, and Music recommends tracks. Machine learning lets us find non-obvious patterns when analyzing a large number of heterogeneous factors, so we applied it in the new version of the caller ID, which now works in the Yandex app for Android. Our technology, based on the CatBoost library, analyzes more than two hundred factors when deciding whether to ask for a review, for example the frequency and duration of calls. We will understandably keep quiet about the other factors, but this approach has made the requests less intrusive and made it harder to fake reviews.
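
To make the idea of "factors in, decision out" concrete, here is a minimal sketch. It is not CatBoost and not our production model: a hand-written logistic function stands in for the trained gradient-boosted trees, only the two factors named above are used, and the names (CallFactors, prompt_probability), the units and the weights are all invented for illustration. The assumption that frequent, short calls deserve a review request more often is also ours for this example only.

```python
import math
from dataclasses import dataclass


@dataclass
class CallFactors:
    """Two illustrative factors; the real model looks at more than two hundred."""
    calls_per_day: float      # hypothetical: how often this number calls
    duration_seconds: float   # hypothetical: how long the call lasted


# Made-up weights for illustration only; a trained model would learn its own.
BIAS = -1.0
W_FREQUENCY = 0.05
W_DURATION = -0.02


def prompt_probability(factors: CallFactors) -> float:
    """Map call factors to a probability of asking the user for a review."""
    score = (BIAS
             + W_FREQUENCY * factors.calls_per_day
             + W_DURATION * factors.duration_seconds)
    return 1.0 / (1.0 + math.exp(-score))  # logistic squashing into (0, 1)


mass_dialer = CallFactors(calls_per_day=100, duration_seconds=10)
long_chat = CallFactors(calls_per_day=1, duration_seconds=600)
print(prompt_probability(mass_dialer), prompt_probability(long_chat))
```

With these made-up weights the frequent short caller gets a much higher probability than the long one-off conversation, which is the kind of separation a real model would have to learn from data.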
 
 
A few words about how it works now. If the user has enabled caller ID in the Yandex app settings, a call from an unknown number sends a request to our cloud, which returns a verdict.
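
The flow just described can be sketched roughly as follows. This is an assumption-laden illustration, not the app's code: the cloud is replaced by an in-memory dictionary (FAKE_CLOUD), "unknown" is assumed to mean "not in the user's contacts", and the function names, phone numbers and verdict texts are hypothetical.

```python
from typing import Optional, Set

# Hypothetical stand-in for the cloud service: number -> verdict text.
FAKE_CLOUD = {
    "+70000000001": "Possibly unwanted call",
    "+70000000002": "Courier service",
}


def lookup_verdict(number: str) -> Optional[str]:
    """Pretend network request; returns None when no verdict is available."""
    return FAKE_CLOUD.get(number)


def on_incoming_call(number: str, contacts: Set[str],
                     caller_id_enabled: bool) -> Optional[str]:
    """Return the text to show on the call screen, or None to show nothing."""
    if not caller_id_enabled:
        return None  # the feature is switched on in the app settings
    if number in contacts:
        return None  # assumption: numbers from contacts are not looked up
    return lookup_verdict(number)


contacts = {"+70000000009"}
print(on_incoming_call("+70000000001", contacts, caller_id_enabled=True))
```

Keeping the settings check first means that nothing is sent anywhere unless the user has turned the feature on.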
 
 

 
 
By the way, you can also see the verdict for missed calls. This is handy when you are not sure whether it is worth calling back.
 
 
If Yandex does not know for certain who is calling, then at the end of the call the user may see a prompt to leave a review. The probability that this prompt appears depends on the cloud's analysis of all the factors.
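
A minimal sketch of that end-of-call decision, under stated assumptions: the probability is taken as an input that the cloud has already computed, a missing verdict is represented by None, and the function name should_ask_for_review is hypothetical.

```python
import random
from typing import Optional


def should_ask_for_review(verdict: Optional[str], probability: float,
                          rng: random.Random) -> bool:
    """Decide at the end of a call whether to show the review prompt."""
    if verdict is not None:
        return False  # the caller is already known, no need to ask
    # Ask only some of the time, in proportion to the supplied probability.
    return rng.random() < probability


rng = random.Random(0)  # seeded only to make the demonstration repeatable
shown = sum(should_ask_for_review(None, 0.3, rng) for _ in range(1000))
print(f"prompt shown in {shown} of 1000 simulated calls")
```

Drawing a random number instead of applying a hard threshold keeps the prompts rare for the user and makes it harder to predict which call will accept a review.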
 
 

 
 
We are now collecting new reviews, which will inevitably shape the future development of the caller ID technology. If you have experience building such systems, or you see an alternative solution to the problem of telephone spam and other unwanted calls, we would be glad to discuss it. Thank you.