Yandex database leak: Everything we know

The latest code fragments leaked from Russian search engine giant Yandex have sent shockwaves through the SEO community worldwide. As reported by news agencies, nearly 50 GB of stolen data from the world’s fourth-largest search engine was leaked to the public. According to experts, the leak will offer insights into how search engines work and could affect the SEO market.
The leak took place on January 25. It consisted of files dated February 2022 that were allegedly stolen from the company’s repository last July. Incidentally, the date of the repository snapshot coincides with Russia’s invasion of Ukraine. The source code files were allegedly leaked by a disgruntled former employee of the Russian tech giant.
The leaker posted a magnet link described as “Yandex git sources”. The repositories allegedly contained the source code for all of Yandex’s major services. In response, the company issued a statement saying: “Yandex was not hacked. Our security service found code fragments from an internal repository in the public domain, but the content is different from the current version of the repository used in Yandex services.” The company also said it is conducting an internal investigation into what led to the leak.
What is the Yandex leak about?
Although the company continues to brush off the code leak, which was distributed via torrent, the files may reveal a lot about how Yandex operates its search engine. The torrent contains no data other than the source code of Yandex services. Even so, several SEO experts have taken to social media to share their findings.
On his website, Arseniy Shestakov, co-founder of the game development company Hack The Publisher, published a list of the major Yandex services whose source code was part of the leak. The list includes the search engine and indexing robots; Maps, similar to Google Maps and Street View; Alice, a voice assistant like Alexa; Taxi, an Uber-like service; Direct, similar to Google Ads; Mail, an email service; Disk, a file storage service; Travel, a travel service similar to Booking.com; Yandex 360, a service similar to Google Workspace; Pay, a payment processing service like Stripe; and Metrika, a service similar to Google Analytics. The leak reportedly includes code from all of these services.
According to publicly available documentation, Yandex’s codebase was combined into a single large repository called Arcadia in 2013. The leaked codebase is essentially a subset of the projects that fall under Arcadia. Components related to the search engine, such as Kernel, Search, Robot and Library, were found among the leaked files.
How might the Yandex leak affect the SEO industry?
Since the leak, the SEO industry has sent mixed signals, with some hailing it and others calling it hardly consequential. The leak contained 1,922 search ranking factors which, according to SEO expert Alex Buraks, are the most interesting part for the SEO community.
“You’ve probably heard of Yandex, it’s the fourth largest search engine by market share worldwide. Yesterday, the proprietary source code of Yandex was leaked. The most interesting part for the SEO community is: the list of all 1922 ranking factors used in the search algorithm”
Alex Buraks (@alex_buraks) on Twitter, 27 January 2023
Igor Rudnyk, an SEO expert from Ukraine, took to Twitter to list his top takeaways on backlinks from the leaked Yandex files. Among them: growth in referring domains and backlinks is emphasized; the number of links from main pages matters; anchor text and exact word order in URLs are important; long text without links is unfavorable; traffic from Wikipedia is important; and local backlinks are key to country-specific SERPs.
“What can we learn about backlinks from Yandex leaked files? Yandex is a TOP-5 search engine in the world and now you can check their public list of 1922 ranking factors. I have read them all. So let’s dive deep into my list”
Igor Rudnyk🇺🇦 (@IRudnyk) on Twitter, 29 January 2023
Yandex vs. Google
Yandex and Google are similar in principle, as they rely on comparable algorithms. According to Buraks, Yandex uses PageRank much as Google does and employs several similar text algorithms. Yandex was built as an analogue of Google, and SEO specialists in Russia have applied the same white hat SEO techniques to both.
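For readers unfamiliar with PageRank, the sketch below shows the textbook version of the algorithm, computed by repeated iteration over a small link graph. It is not code from the leak and says nothing about how Yandex or Google actually implement it; the function name `pagerank`, the damping value of 0.85 and the four-page example graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    All names and default values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page site: "c" receives links from both "b" and "d".
    graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 3))
```

In this toy graph, page "c" ends up with the highest score because two pages link to it, which is the intuition behind treating backlinks as a ranking signal.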
Although there are many technical differences, the approach and key ranking factors appear to be similar, according to Buraks. Search results on Google and Yandex reportedly match about 70 percent of the time. In terms of market share, however, Yandex is closer to Yahoo and Bing.
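The article does not say how the 70 percent figure was measured. One plausible reading, offered here purely as an assumption, is the share of URLs that two engines have in common among their top results for the same query. The helper `top_n_overlap` and the example URLs below are hypothetical.

```python
def top_n_overlap(results_a, results_b, n=10):
    """Fraction of the top-n results that two ranked URL lists share.

    This is one possible way to quantify a "match" between two result
    pages; it ignores the order of results within the top n.
    """
    top_a = set(results_a[:n])
    top_b = set(results_b[:n])
    if not top_a and not top_b:
        return 0.0
    return len(top_a & top_b) / max(len(top_a), len(top_b))


if __name__ == "__main__":
    # Hypothetical result lists: seven shared URLs, three unique to each engine.
    engine_a = [f"https://example.com/page{i}" for i in range(10)]
    engine_b = engine_a[:7] + [f"https://example.org/page{i}" for i in range(3)]
    print(top_n_overlap(engine_a, engine_b))  # prints 0.7
```

A measure like this would need to be averaged over many queries before it could support a claim about two search engines as a whole.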
Yandex was founded by Arkady Volozh, Arkady Borkovsky and Ilya Segalovich in 1997. In addition to being a search engine, it offers several other Internet-related products and services.
The leak, from a Russian company comparable in prominence to Google, Amazon or Netflix, comes at a time when Russia is facing an unprecedented increase in cyber attacks. In a recent survey released by Swedish VPN service company Surfshark, Russia was found to be the nation with the most cyber breaches in the world in 2022.