Why Trust & Safety Matters in Audio Ads

Aya
Mar 3

Updated: Mar 7

I recently downgraded my Spotify membership from Premium to clean up my long list of subscriptions. I love trying new apps, and it is always easy to sign up for subscriptions when I find one I like. Spotify has been one of the few apps I have paid for consistently over the years. But recently I became curious about what the ad-supported experience would be like. Honestly, it felt a little like betraying a longtime friend :( Still, it actually ended up making me think about audio ads, a format I am much less familiar with than display or video ads.

Before getting into audio ads, it is probably worth mentioning that I am generally quite positive about online advertising. Over the years, I have discovered countless brands, products, and apps through social media ads and genuinely added them to my favorites. I am not saying this only because of my background, although I did once work in the social media advertising space. Even recently, my feeds seem to be increasingly filled with health-related ads, almost as if the algorithms are a little concerned about my health. Thanks to those ads, I discovered some amazing fitness apps and converted on two app install ads within the past month.

As you can probably tell, my attitude toward advertising is generally optimistic. If ads are going to appear anyway, I would much rather see ones that align with my interests. I am therefore quite open to sharing certain data if it helps improve ad relevance. Advertising often involves a trade-off between privacy and personalization. Personally, I hate irrelevant ads dominating my screen and capturing my attention, so I lean slightly toward optimizing the experience and allow certain forms of tracking. As a result, my overall advertising experience has become fairly predictable: it rarely surprises me or provokes any reaction.

Spotify ads felt different. Surprisingly, the very first ad I got was for sports gambling. If I encounter random ads in places where tracking is limited, I usually expect something like gaming or dating apps. Gambling felt like a new and much grayer category for me, so seeing that kind of ad piqued my curiosity.

My first thought was that the ad targeting must not be optimized, so I went straight to my account settings. On Spotify, users can configure data sharing for ad personalization directly on the account settings page.

My setting was ON. Interesting. Even with personalization enabled, gambling ads were still being shown. Spotify explains what the data category “Your use of Spotify over time” covers in its privacy policy. As the list of usage data suggests, the signals Spotify can derive from listening behavior are somewhat limited compared with platforms like Instagram or YouTube, where user interests are reflected through a much wider range of content consumption. Music and podcast listening histories simply reveal fewer explicit preference signals.

At the same time, it is also possible that the advertising ecosystem on Spotify is still relatively narrow compared with large social platforms. Fewer advertisers competing for highly specific audience segments could mean that targeting remains coarse in certain areas. Where optimization is still imperfect, ads from aggressive categories such as gambling may occasionally surface.

Discussions about gambling ads on Spotify have surfaced on platforms like X and Reddit as well.

However, there is a small piece of good news. In the account settings, Spotify provides an Ad preferences option that allows users to hide certain categories of ads. Some sensitive categories, including gambling, can be disabled manually if users prefer not to see them. My guess is that these categories are ON by default.

That said, this setting is buried fairly deep in the account settings, which suggests there may be room for improvement from a UI perspective. That is a discussion for another time.

What caught my attention instead was a different question: why audio ads might require stricter consideration when it comes to more sensitive or aggressive categories. That is what I want to discuss in this post.

Audio ad environments are fundamentally different from visual and text platforms. Audio ads are sensitive to timing and placement: they can interrupt a listening flow abruptly (e.g., mid-playlist) and cannot be easily skipped. Listeners are also often multitasking or without visual cues, for example while driving, exercising, or doing household tasks. As a result, when users encounter an unpleasant ad, options such as skipping it or sending feedback like “Not interested in this ad” are not always readily available. Users are placed in a much more passive position when consuming audio ads.

In audio-only media, context is hidden in spoken words and tone rather than in visible images or text. Content moderators and algorithms cannot "see" the ad; they must rely on transcripts or metadata, which may not be comprehensive or accurate. This makes harmful or misleading content harder to detect. Overall, these differences complicate brand-safety enforcement.

User Perception: Studies show audio ads can have strong recall and engagement due to the immersive listening environment. But this also means any brand-safety mishap can strongly impact listener trust in the brand and platform.
Harmful Claims Detection: In text or video, algorithms can scan for banned words/images. In audio, harmful claims must be detected via automatic speech recognition (ASR) and language models, which are still error-prone. For example, a host saying “this supplement cured my diabetes” in passing could slip past a keyword blocklist if the ASR system mis-transcribes it.
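
To make the blocklist example concrete, here is a minimal Python sketch of how a single transcription error can defeat exact keyword matching, and how approximate matching might catch it. Everything in it is an assumption for illustration: the blocklist phrase, the simulated ASR error ("diabeties"), the 0.8 similarity threshold, and the function names. It is not how Spotify or any real moderation system works.

```python
from difflib import SequenceMatcher

# Hypothetical blocklist of claim phrases; not taken from any real policy.
BLOCKLIST = ["cured my diabetes"]


def exact_match(transcript: str) -> bool:
    """Flag only when a blocklisted phrase appears verbatim."""
    text = transcript.lower()
    return any(phrase in text for phrase in BLOCKLIST)


def fuzzy_match(transcript: str, threshold: float = 0.8) -> bool:
    """Flag when any word window is close enough to a blocklisted phrase."""
    words = transcript.lower().split()
    for phrase in BLOCKLIST:
        n = len(phrase.split())
        # Slide a window of n words across the transcript.
        for i in range(max(len(words) - n + 1, 1)):
            window = " ".join(words[i:i + n])
            if SequenceMatcher(None, window, phrase).ratio() >= threshold:
                return True
    return False


clean = "this supplement cured my diabetes"
garbled = "this supplement cured my diabeties"  # simulated ASR error

print(exact_match(clean), exact_match(garbled))  # exact matching misses the garbled one
print(fuzzy_match(clean), fuzzy_match(garbled))  # fuzzy matching flags both
```

The trade-off is that a looser match will also flag more harmless phrases, which is one reason automated detection on its own is not enough.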

Together, these factors make audio advertising enforcement harder, not easier, than for visual media. The lack of on-screen context and the fluid nature of spoken content remove many automated safety nets. Brand-safety systems must compensate by adding layers of automated detection and human review specifically adapted for audio.
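
As a rough illustration of what "layers" could mean here, the sketch below routes an audio ad to human review whenever the automated layer is unsure. The category names, the 0.85 confidence threshold, and the `AudioAd` fields are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass

# Hypothetical category names and threshold, for illustration only.
SENSITIVE_CATEGORIES = {"gambling", "health_claims"}
MIN_ASR_CONFIDENCE = 0.85


@dataclass
class AudioAd:
    category: str              # advertiser-declared category (metadata)
    asr_confidence: float      # 0.0 to 1.0, how far the transcript can be trusted
    flagged_by_detector: bool  # result of automated claim detection


def route(ad: AudioAd) -> str:
    """Return 'human_review' or 'approve' for an ad."""
    if ad.flagged_by_detector:
        return "human_review"  # the automated layer found a possible issue
    if ad.asr_confidence < MIN_ASR_CONFIDENCE:
        return "human_review"  # transcript too unreliable to trust a clean result
    if ad.category in SENSITIVE_CATEGORIES:
        return "human_review"  # sensitive categories always get a second look
    return "approve"


print(route(AudioAd("fitness_app", 0.95, False)))
print(route(AudioAd("fitness_app", 0.60, False)))
print(route(AudioAd("gambling", 0.95, False)))
```

The design choice in this sketch is that a clean automated result only counts when the transcript itself is trustworthy; anything else falls back to a person.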
