Apple Conducted AI Testing to Improve App Store Search Results

Publication Date: 07.03.2026

Apple researchers conducted an A/B test to measure the impact of AI-generated relevance labels on App Store search rankings and app downloads. Here are the results they found.

AI-generated relevance labels slightly improved App Store search conversions

In a new study titled "Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments," a group of Apple researchers investigated whether LLMs could help improve App Store search results by generating the relevance labels used to train the ranking system.

As mentioned in the study, relevance is a key element in helping users find the apps they are searching for. While there are many signals that can contribute to search ranking, the researchers focused on two main signals:

Behavioral relevance reflects how users interact with the results; for example, whether they click on or download an app.
Textual relevance measures how meaningfully an app's metadata (such as name, description, and keywords) matches a user's search query.

In the study, the researchers note that while there is a wealth of data available on behavioral relevance (as it can be easily measured), the same is not true for textual relevance:

“While behavioral relevance labels are abundant, textual relevance labels produced by human judgments are much rarer. This creates a fundamental problem: high-quality textual relevance labels are scarce and expensive to produce, creating a bottleneck in scalability and giving weak power to the textual relevance objective.”

To overcome this issue, the researchers fine-tuned a 3-billion-parameter LLM on existing human judgments so that it could learn to assign relevance labels to apps based on a user's search query and the app's metadata.

Subsequently, they generated millions of new relevance labels with this model and retrained the App Store ranking system using both the original data and the labels generated by the LLM.
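The augmentation step described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' pipeline: the `LabeledPair` type, the 0 to 3 relevance scale, the rule that a human judgment takes precedence over an LLM one, and the `toy_judge` function (a crude word-overlap stand-in for the fine-tuned LLM) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class LabeledPair:
    query: str
    app_id: str
    relevance: int  # hypothetical graded scale: 0 (irrelevant) to 3 (highly relevant)
    source: str     # "human" or "llm"


def augment_labels(
    human_labels: List[LabeledPair],
    unlabeled_pairs: Iterable[Tuple[str, str, str]],  # (query, app_id, metadata)
    judge: Callable[[str, str], int],
) -> List[LabeledPair]:
    """Return human labels plus judge-generated labels for pairs not yet labeled."""
    seen = {(p.query, p.app_id) for p in human_labels}
    combined = list(human_labels)
    for query, app_id, metadata in unlabeled_pairs:
        if (query, app_id) in seen:
            continue  # assumption: an existing human judgment is kept as-is
        seen.add((query, app_id))
        combined.append(LabeledPair(query, app_id, judge(query, metadata), "llm"))
    return combined


def toy_judge(query: str, metadata: str) -> int:
    """Placeholder for the fine-tuned LLM: scores by query-word overlap."""
    words = query.lower().split()
    if not words:
        return 0
    text = metadata.lower()
    hits = sum(1 for w in words if w in text)
    return (3 * hits) // len(words)


human = [LabeledPair("photo editor", "app1", 3, "human")]
candidates = [
    ("photo editor", "app1", "Photo editor pro"),    # already labeled by a human
    ("photo editor", "app2", "Edit any photo fast"),
    ("photo editor", "app3", "Weather radar"),
]
training_set = augment_labels(human, candidates, toy_judge)
```

The combined list would then be fed to whatever trains the ranker; in the study that role is played by the App Store ranking system, whose internals the article does not describe.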

After completing this process, they conducted an offline evaluation and then performed a global A/B test on live App Store traffic:

“(…) The LLM-augmented model showed a statistically significant +0.24% increase in conversion rate, defined as the proportion of search sessions with at least one app download, our primary metric. While this number may seem small, it is considered a significant improvement for a mature industrial ranking system. This gain was observed in 89% of storefronts.”

In other words, search sessions served by the LLM-augmented model ended in at least one app download 0.24% more often than sessions served by the traditional ranking model.
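To make the metric concrete, here is a small sketch of how such a comparison could be computed. It assumes conversion rate means the share of search sessions that include at least one app download, and that the +0.24% figure is a relative lift over the control group; the article does not say whether the figure is relative or absolute. The function names and session data are invented for illustration.

```python
from typing import Iterable


def conversion_rate(downloads_per_session: Iterable[int]) -> float:
    """Share of search sessions that contain at least one app download."""
    sessions = list(downloads_per_session)
    if not sessions:
        raise ValueError("need at least one session")
    converted = sum(1 for downloads in sessions if downloads >= 1)
    return converted / len(sessions)


def relative_lift(control_rate: float, treatment_rate: float) -> float:
    """Relative change of the treatment rate over the control rate."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return (treatment_rate - control_rate) / control_rate


# Toy data: each number is the download count of one search session.
control_sessions = [0, 1, 0, 0, 2, 0, 0, 1, 0, 0]
treatment_sessions = [1, 0, 0, 1, 0, 2, 0, 1, 0, 0]

control = conversion_rate(control_sessions)
treatment = conversion_rate(treatment_sessions)
print(f"control={control:.2f} treatment={treatment:.2f} "
      f"lift={relative_lift(control, treatment):+.1%}")
```

Note that a session with two downloads counts once, the same as a session with one, so this metric tracks how many searches succeed rather than the raw number of downloads.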

And while a 0.24% increase may seem very small, it adds up quickly given that total App Store downloads are estimated at around 38 billion for 2025. In practice, this could mean tens of millions of additional downloads from App Store searches, which developers would certainly appreciate.
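As a back-of-the-envelope check on that estimate, the snippet below applies the 0.24% lift to the 38 billion figure. The `search_share` parameter (the fraction of downloads that originate from search) is a hypothetical input: the article gives no such number, so two arbitrary values are shown. The calculation also treats the conversion lift as if it carried over directly to download counts, which is a simplification.

```python
def extra_downloads(total_downloads: float, search_share: float, lift: float) -> float:
    """Additional downloads if search-driven downloads grow by `lift`."""
    if not 0.0 <= search_share <= 1.0:
        raise ValueError("search_share must be between 0 and 1")
    return total_downloads * search_share * lift


TOTAL_DOWNLOADS = 38_000_000_000  # estimate quoted in the article
LIFT = 0.0024                     # +0.24% from the A/B test

# search_share values are illustrative assumptions, not figures from the study.
for search_share in (1.0, 0.5):
    extra = extra_downloads(TOTAL_DOWNLOADS, search_share, LIFT)
    print(f"search share {search_share:.0%}: about {extra / 1e6:.1f} million extra downloads")
```

Even if only half of all downloads came from search, the result would stay in the tens of millions, which is consistent with the article's estimate.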

Follow this link to read the full study.


Comments

(10 Comments)

Tuna Yıldırım

The results of this AI test are really interesting. Even a small increase can make a big difference.

Seda Korkmaz

The development of the ranking system in the App Store can enhance the user experience. I am looking forward to it!

Mavi Şimşek

I am more curious about how the labeling process works with AI. Is there detailed information available?

Elif Aydın

A 0.24% increase may seem small, but it is a significant gain on such a large platform.

Kerem Güngör

These kinds of innovations offer great opportunities for developers. I hope more improvements come.

Zeynep Arslan

The increase in app download rates may lead to better applications emerging for users.

Ekin Çelik

AI's role in this field excites me. How will it affect future application development processes?

Berkay Demirtaş

I need more information to understand how the ranking system in the App Store works.

Derya Kılıç

Such research shows how fast changes are happening in the tech world.

Mira Yalçın

The results are very interesting; I hope these kinds of tests continue and better results are achieved.