During Spotify’s second-quarter earnings call this morning, CEO Daniel Ek teased some of the ways the streaming service could introduce additional AI-powered functionality. Ek alluded to how AI could be used to create more personalized experiences, summarize podcasts and generate ads.
Earlier this year, the company launched a DJ feature that brings you a curated selection of music along with AI-powered spoken commentary about the tracks and artists you love. Ek said consumers can expect to see similar AI-powered features aimed at contextualizing and personalizing content across the streaming service in the future.
“DJ is a phenomenal product,” Ek said on the call. “This is probably one of my personal favorites over the last few years that we’ve developed, and we’ve seen very strong consumer engagement with it. And that just speaks to our ability to contextualize and personalize all of the amazing content that we have on the Spotify platform. So I think you’re going to see a lot more things where we can contextualize and personalize content across platforms to make it more accessible.”
One way Spotify could use generative AI is to summarize what podcasts are about, since jumping into a new podcast can be difficult. Ek said doing so could make it easier for consumers to try new podcasts, which in turn would lead to higher engagement and more growth for content creators.
Ek said another way AI could make Spotify more efficient is through AI-generated audio ads.
“By using our generative AI and tools here, I think you’ll be able to see that we can significantly reduce the cost advertisers need to develop new ad formats,” said Ek. “And that obviously means that you as an advertiser instead of having one ad, you can imagine having thousands of ads and being tested across the Spotify network, things that you can easily do today with text but you can’t do yet with video or audio.”
Ek’s comments come as Spotify is seeking a patent for an AI-powered “text-to-speech synthesis” system. The patent application was filed in February and published on July 20. The technology can take text and turn it into human-like speech audio that conveys emotion and intention. The system can create realistic utterances that express emotions such as anger, happiness or sadness, along with intentions such as sarcasm. It can also do so in a whisper or a shout, and with an accent.
The patent indicates that Spotify wants to go beyond its DJ feature, which simply speaks a few AI-generated lines between songs. Text-to-speech synthesis systems could be used for things like narrating audiobooks in a natural-sounding way. It’s worth noting that Apple rolled out AI-powered audio narration for select titles in Apple Books earlier this year.
The patent filing and Ek’s comments come as Spotify has invested heavily in AI voice technology. Last year, the streaming service acquired Sonantic, a London-based startup that has built an AI engine to create realistic-sounding human voices from text, for an undisclosed amount. Spotify leverages the acquisition to power its AI DJ feature.
Spotify’s second-quarter earnings reveal that there are currently 220 million paying Spotify subscribers worldwide, a figure that is up 17% year over year. Overall, Spotify now has 551 million monthly active users. The company reported revenue of nearly €3.2 billion ($3.5 billion at today’s exchange rates) for the most recent quarter, up 11% year over year. However, Spotify also reported an operating loss of €247 million ($274 million).
Spotify also announced a price increase for its premium plans yesterday. In the US, the individual premium plan will now cost $10.99 per month instead of $9.99. The duo plan will cost $14.99 instead of $12.99, and the family plan will cost $16.99 instead of $15.99.