Every word has several meanings, and they shift with context. A sentence made up of neutral words can be offensive (for example, “Only whites should have rights”), while a sentence full of swear words may simply be an expressive line from a song by Leningrad.
People can tell contexts apart, but machines are generally bad at it. However, last month Facebook announced that it had built a text classification engine that helps machines interpret words in context.
The new system, called DeepText, uses advanced artificial intelligence and the concept of word embeddings, which tries to mimic the way language works in the human brain. When the system encounters a new word, it does what we do and tries to work out the meaning from the context.
For example, it recognizes that “white” can mean something completely different when words such as “power” or “house” appear nearby. And DeepText does not just reason like a human; it can also learn.
DeepText is an internal tool for Facebook's engineers that helps them process large volumes of text, create classification rules, and serve users relevant content. If you write something about the White House, the system will suggest the latest news. But if the word “white” appears next to the word “snow” in your comment, the algorithm will suggest where to buy winter boots.
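The idea of reading “white” through its neighbors can be illustrated with a toy sketch. This is not how DeepText itself is implemented, which the article does not describe. The two-dimensional vectors, the topic names, and the function `guess_topic` are all invented for illustration; real word embeddings are learned from data and have hundreds of dimensions.

```python
import math

# Hypothetical 2-D "embeddings", invented for this example only.
# Axis 0 roughly means politics/news, axis 1 roughly means winter/weather.
TOY_VECTORS = {
    "house": (0.9, 0.1),
    "power": (0.8, 0.0),
    "snow": (0.0, 1.0),
    "winter": (0.1, 0.9),
    "boots": (0.1, 0.8),
}

# Hypothetical topic directions that a recommendation could be tied to.
TOPICS = {"news": (1.0, 0.0), "winter_gear": (0.0, 1.0)}


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def guess_topic(comment, target="white"):
    """Guess a topic for `target` from the known words around it.

    Returns None if the target word is absent or no context word is known.
    """
    words = [w.strip(".,!?").lower() for w in comment.split()]
    if target not in words:
        return None
    context = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not context:
        return None
    # Average the context vectors and pick the closest topic direction.
    avg = tuple(sum(v[i] for v in context) / len(context) for i in range(2))
    return max(TOPICS, key=lambda t: cosine(avg, TOPICS[t]))
```

Calling `guess_topic("News from the White House")` and `guess_topic("White snow everywhere")` lands on different topics even though both comments contain the same word.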
As soon as Instagram learned about DeepText, it realized that the system could help solve its most annoying problem: spam. People come to Instagram for photos but often leave quickly because of the huge number of bots (and sometimes people) advertising products, begging for likes and follows, or simply insulting everyone.
First, Instagram hired a dedicated team of moderators to sort through comments and flag the ones that were spam. Now a specially trained machine will help them with this monotonous work. 80% of the data will be passed to DeepText, whose algorithms identify spam and remove it.
The system analyzes the semantics of each sentence in a comment and also checks the account that sent it. If a comment on your photo comes from someone you do not follow, the algorithm considers it likely to be spam. The same goes for duplicate comments: the system understands that a bot probably sent them.
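The two account-level signals mentioned above can be sketched as simple checks. This is only an illustration under stated assumptions: the `Comment` class, the function `spam_signals`, and the signal names are hypothetical, the semantic analysis is left out entirely, and Instagram has not published how it actually combines these signals.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Comment:
    author: str
    text: str


def spam_signals(comments, followed_accounts):
    """Return, for each comment, the list of heuristic signals that fired.

    `followed_accounts` is the set of accounts the photo owner follows.
    """
    # Normalize case and whitespace so trivially edited copies still match.
    normalized = [" ".join(c.text.lower().split()) for c in comments]
    counts = Counter(normalized)

    results = []
    for comment, norm in zip(comments, normalized):
        signals = []
        if comment.author not in followed_accounts:
            signals.append("author_not_followed")
        if counts[norm] > 1:
            signals.append("duplicate_text")
        results.append(signals)
    return results
```

A signal firing does not by itself mean a comment is spam; in a real system such features would feed a trained classifier rather than decide on their own.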
To check how well the machine analyzes text compared with people, DeepText was allowed to process comments that human moderators had not yet reviewed. Instagram was pleased with the results of the experiment and, in October last year, began using the new system openly. Comment spam started to disappear as DeepText's algorithms swept it up like a robot vacuum cleaner.
The company does not say how much spam has decreased thanks to the new tool, nor does it reveal how the tool works internally. After all, if Instagram described its defenses, spammers would find a way around them.
Instagram CEO Kevin Systrom was so pleased with the new algorithm that he assigned it a harder problem: moderating comments that are offensive or that otherwise violate Instagram's community guidelines.
Such comments are currently handled as follows: a moderator examines a questionable comment and assigns it to one of the categories of unacceptable behavior (for example, insults, racism, or sexual harassment). The moderators usually know at least two languages. In total, they have analyzed about two million comments, and each comment was rated at least twice.
Initially, Instagram staff tested the algorithm only on their own phones because it still needed refinement. The system gives each comment a score from 0 to 1 that reflects how confident it is that the comment is offensive or unacceptable. Comments that score above a certain threshold are removed automatically.
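The scoring step can be sketched in a few lines. The threshold value of 0.9 and the function name `visible_comments` are assumptions made for illustration; the article does not say what threshold Instagram uses or how the scores are produced.

```python
def visible_comments(scored_comments, threshold=0.9):
    """Keep comments whose score does not exceed the threshold.

    `scored_comments` is an iterable of (text, score) pairs, where score is
    the model's confidence in [0, 1] that the comment is offensive.
    The default threshold is a made-up value, not Instagram's.
    """
    kept = []
    for text, score in scored_comments:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {text!r}: {score}")
        # Only comments scoring strictly above the threshold are removed.
        if score <= threshold:
            kept.append(text)
    return kept
```

Raising the threshold removes fewer comments and lets more borderline ones through; lowering it removes more, at the cost of deleting more innocuous comments.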
On June 29, Instagram announced the official launch of a new spam filter and moderation of inappropriate comments. Now, if you post a rude comment, the system will remove it (the author will still see it, but nobody else will). The filter is applied to users' feeds automatically, but it can be turned off in settings.
For now, the new moderation system processes only English-language comments, but the company says it intends to expand it. In the future, the algorithm will be able to analyze comments in nine languages: English, Spanish, Portuguese, Arabic, French, German, Russian, Japanese, and Chinese.
However, it would be naive to hope that the system can completely solve the problem of negative comments; this is the Internet, after all. There is also a chance that the algorithm will delete innocuous comments. According to Thomas Davidson, who worked on a similar algorithm for Twitter, this is a very difficult problem. Machines are certainly smart, but sometimes they miss the subtleties of context.
According to statistics, the algorithm makes a mistake in 1% of cases, so it is still not perfect.
Kevin Systrom, CEO of Instagram. Photo: Mashable.
“This is a classic problem,” Kevin Systrom commented. “If we build a system that reacts strictly to curse words, it will misread innocuous phrases. If you and a friend swear at each other in jest, Instagram needs to understand that. We do not want the system to block something it should not. That will happen, but perhaps such errors can be forgiven given the number of truly horrible comments that get deleted? We don't want to deprive people of freedom of speech. We don't want to stop friends from joking with each other. We just want to solve the problem of negative comments on Instagram.”
If Systrom is right and his algorithm really works, Instagram may become the friendliest place on the Internet. Or perhaps it will seem too proper. Who knows, maybe the algorithm will then start deleting political discussions or friendly banter.
Systrom himself is curious how the algorithm will behave in the future. “Machine learning allows the system to understand all the nuances better than previous algorithms and even better than people,” Systrom said. “We need to acknowledge the uncertainties and, after some time, re-evaluate the algorithm's effectiveness to see whether it is doing any good. If we see that the system is causing trouble and working badly, we will fix it and come up with something new.”