Instant Translation, Lookahead Scrubbing, and More: The Future of Google Podcasts
Google has a new podcasting strategy that completely reimagines how people find and listen to shows. And Google’s podcasting team has set a bold and game-changing goal: to double podcast audiences globally. This is the final installment in our exclusive five-part series.
In Case You Missed It:
- Part one: the bold plan of Google’s podcasting team to double worldwide audiences by focusing on audio search and Android
- Part two: Google’s plan to make audio a first-class citizen
- Part three: Google’s new way to find your next podcast
- Part four: Google’s plan to deliver the right audio at the right time
Understanding Podcasts
According to Google Podcasts Product Manager Zack Reneau-Wedeen, in the future, Google will have the ability to “transcribe the podcast and use that to understand more details about the podcast, including when they are discussing different topics in the episode.
“It’s important to say that this technology is still improving, and some of our vision here is probably a little more long-term than what we’ve talked about so far. Still, it’s an exciting motivator for us to try to make these experiences possible.”
Imagine this:
Google’s AI “listens” to every podcast published, converts all spoken-word content into timestamped, searchable text, and indexes the contents of every episode. All the content of all episodes of all podcasts becomes searchable, much like a text article. And not just at the level of the entire episode: by analyzing podcast transcripts and/or publisher-created chapter markers, Google could begin to understand specific segments or topics within episodes.
In the future, Google Search and Google Assistant could allow listeners to go beyond finding the right episode of a podcast. They could help listeners jump straight to the section that interests them. This could be particularly useful on a smart speaker like Google Home, where a user may want a specific answer to a voice query and might prefer a specific piece of audio content as an answer instead of an entire podcast episode.
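To make the idea concrete, here is a minimal sketch of a timestamped transcript index. It is not how Google does it; the `Word` type, the function names, and the sample episodes are all made up for illustration. It assumes a speech recognizer has already produced each spoken word with its start time.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the start of the episode


def build_index(transcripts):
    """Map each lowercased word to the (episode, timestamp) pairs where it is spoken."""
    index = defaultdict(list)
    for episode, words in transcripts.items():
        for word in words:
            index[word.text.lower()].append((episode, word.start))
    return index


def find_moments(index, query):
    """Return (episode, timestamp) pairs for episodes that mention every query term.

    The timestamp is the earliest mention of any query term in that episode,
    which is where a player could start playback.
    """
    terms = query.lower().split()
    if not terms:
        return []
    # Episodes that mention every query term at least once.
    episodes = set.intersection(
        *({ep for ep, _ in index.get(term, [])} for term in terms)
    )
    return sorted(
        (ep, min(ts for term in terms for e, ts in index[term] if e == ep))
        for ep in episodes
    )


# Hypothetical sample data: two tiny "episodes" with word-level timestamps.
transcripts = {
    "episode-a": [
        Word("the", 1830.0),
        Word("impossible", 1830.5),
        Word("burger", 1831.0),
        Word("was", 1831.4),
        Word("tasty", 1832.0),
    ],
    "episode-b": [Word("burger", 12.0), Word("night", 12.5)],
}
index = build_index(transcripts)
moments = find_moments(index, "impossible burger")  # [("episode-a", 1830.5)]
```

A real system would need ranking, stemming, and phrase matching, but the core idea is the same: once words carry timestamps, a search result can point inside an episode and not just at it.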
Zack gave an example:
“There’s this great episode of You Made it Weird with Pete Holmes, where [Green Bay Packers’ Quarterback] Aaron Rodgers talks with Pete about all sorts of things, including that he tried ‘The Impossible Burger’ and thought it was very tasty.
“Suppose you’re a Packers fan and you asked a smart speaker, ‘How does The Impossible Burger taste?’ What if you actually got Aaron Rodgers telling you what he thinks of The Impossible Burger? Right now most smart speakers have computerized voices, which are getting better and are already much better than they used to be, but hearing it from a voice that you recognize and a personality that you’re familiar with and trust could be a really cool experience.”
If desired, podcasters could help Google make the impossible (burger) possible. Thinking about your episodes in ‘segments’ might be a start. For example, are there clear starting and ending points to different topics, and would those segments make sense to a listener who heard only that isolated piece of audio? This would hypothetically make it easier for Google to help listeners find the specific content they are looking for inside an individual episode.
This is an amazing opportunity for podcast discovery.
Google’s speech-to-text technology will likely open up new possibilities for podcasters, but not all publishers will want Google to index and automatically transcribe their episodes. Zack says Google’s podcast technology already obeys industry-standard robots.txt files, so publishers could easily opt out if they so chose.
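For a sense of how a robots.txt opt-out is evaluated, here is a small sketch using Python’s standard `urllib.robotparser` module. The crawler name `ExamplePodcastBot`, the `example.com` URLs, and the paths are hypothetical; the article does not say which user agent Google’s podcast crawler uses, so check Google’s own documentation before writing real rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one crawler is kept out of /private-feed/,
# every other crawler may fetch anything.
ROBOTS_TXT = """\
User-agent: ExamplePodcastBot
Disallow: /private-feed/

User-agent: *
Allow: /
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = may_fetch(
    ROBOTS_TXT, "ExamplePodcastBot", "https://example.com/private-feed/rss.xml"
)
allowed = may_fetch(
    ROBOTS_TXT, "ExamplePodcastBot", "https://example.com/public/rss.xml"
)
```

Here `blocked` is `False` and `allowed` is `True`: the named crawler is excluded only from the path the publisher listed.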
Google’s developer documentation explains the best practices today for structuring your podcast website and RSS feed.
Aside from adding metadata that would make it easier for search to include segments of your shows as results, there is also a content strategy decision to evaluate:
Podcasters might start thinking about the questions they want their podcast episodes to be the answer for.
New Feature: “Lookahead Scrubbing”
When you have powerful speech-to-text AI and you can transcribe every podcast episode into text, then map that text to exact timecodes in the podcast audio, new user interface options become possible. For example:
“My favorite little feature in my head right now that we’re interested in exploring is ‘lookahead scrubbing,’” Zack explains. “If you watch YouTube videos or videos on any platform, when you scrub forward, it’ll show you a preview frame of where you’re scrubbing to. If you’re browsing through or seeking for a specific part in an hour-long podcast, you’re kind of scrubbing blind, and so you have to stop to hear what they’re saying. If we can know the words being said at each point, we can give you a preview as you scrub, sort of as though it’s a video, so you can navigate more precisely and more quickly to your desired location. It sounds like a tiny thing, but this is actually a pretty common action. We think it could resonate with people if we can make it easier in an intuitive and futuristic way.”
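The lookup behind a feature like this could be quite small. The sketch below is an assumption about one way to do it, not a description of Google’s player: given word-level timestamps sorted by time, it uses a binary search to find the word being spoken at the scrub position and returns a few words around it as the preview. The function name and the sample transcript are invented for illustration.

```python
from bisect import bisect_right


def preview_at(words, position, context=3):
    """Return the words spoken around `position` (in seconds).

    `words` is a list of (start_seconds, text) pairs sorted by start time.
    `context` is how many words to show on each side of the current word.
    """
    if not words:
        return ""
    starts = [start for start, _ in words]
    # Index of the last word that began at or before the scrub position.
    current = max(bisect_right(starts, position) - 1, 0)
    first = max(current - context, 0)
    last = min(current + context + 1, len(words))
    return " ".join(text for _, text in words[first:last])


# Hypothetical word-level transcript of the first few seconds of an episode.
words = [
    (0.0, "welcome"), (0.4, "to"), (0.6, "the"), (0.8, "show"),
    (1.5, "today"), (2.0, "we"), (2.3, "talk"), (2.8, "burgers"),
]
preview = preview_at(words, 1.6, context=1)  # "show today we"
```

Because the lookup is a binary search, it stays fast enough to run on every movement of the scrub bar, even for an hour-long episode with thousands of words.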
Translation
Unless you speak every language, you are going to miss out on a lot of the world’s great audio content. Enter Google Translate and another future possibility from Zack:
“Imagine if, as a podcaster, you could opt into publishing your podcast in every language, automatically. The technology is still improving, but the way it would work is speech recognition could generate a transcript of the podcast, which would then be translated into another language, and synthesized into speech in the language of the user’s choice. A young girl in Jakarta would be able to listen to Radiolab, read in Bahasa by Jad and Robert’s voices!”
Google Translate and Google Pixel Buds already exist. Google Home and Google Assistant understand real-time voice commands already. The groundwork is being laid.
“This one sounds pretty futuristic, but all three of the required technologies are improving, and Google is at the center of things in each case. Speech recognition error rates have dropped precipitously over the past few years, Google Translate is using more and more AI, and WaveNet has made its way into production services like Google Assistant. At Google, it’s exciting how much we’re investing in hard, long-term problems like these. We think it can transform the listening experience for a lot of people, and help further democratize the podcast space. When we think about making the world’s information accessible, ideas like this are energizing.”
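The three stages Zack describes (speech recognition, translation, speech synthesis) chain together naturally. The sketch below shows only that shape. The stage functions are toy stand-ins written for this example; they do not call Google Translate, WaveNet, or any real speech service, and the names are hypothetical.

```python
def translate_episode(audio, target_language, transcribe, translate, synthesize):
    """Run audio through the three stages: speech -> text -> translated text -> speech."""
    transcript = transcribe(audio)
    translated = translate(transcript, target_language)
    return synthesize(translated, target_language)


# Toy stand-ins so the pipeline can run end to end. A real system would plug in
# a speech recognizer, a machine translation model, and a speech synthesizer.
def toy_transcribe(audio):
    # Pretend the "audio" bytes are already UTF-8 text.
    return audio.decode("utf-8")


def toy_translate(text, language):
    # Tag the text with the target language and leave the words untouched.
    return f"[{language}] {text}"


def toy_synthesize(text, language):
    # Pretend that encoding the text produces spoken audio.
    return text.encode("utf-8")


result = translate_episode(
    b"welcome to the show", "id", toy_transcribe, toy_translate, toy_synthesize
)
```

Passing the stages in as functions keeps each one swappable, which matters here because, as Zack notes, all three technologies are still improving independently.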
If automated translations of podcasts and contextual search become realities, Zack’s goal of doubling global podcast audiences might end up being pretty conservative.
What should podcasters ask themselves?
Well, to be honest, there is nothing podcasters should be asking themselves today about this stuff. Dynamic segmentation, lookahead scrubbing, and automatic translation are all very exciting future visions, but they are not actual Google products yet. However, if and when these ideas do become reality, these are some of the questions we will be asking our future selves:
- Should I think about formatting my podcast into discrete segments or topics to make it easier for audiences to understand if they hear it as an audio search result?
- Are there segments or portions of my show that would be a great fit for how people interact with smart speakers like Google Home, HomePod, Amazon Echo, etc.? If not, should I create segments or tag my show into segments so that this is possible?
- What are the implications of my podcast someday being listened to by people in other languages, countries, and cultures? How much context would I need to provide to make it accessible?
Summary: What Works Today and What’s Still To Come
Here’s the Google Podcasts strategy as it stands today:
- The Google Podcasts team wants to double worldwide podcast audiences in the next couple of years.
- On Android, podcasts are now included in Google Search results with native play buttons.
- You can already subscribe to shows and add a shortcut to your home screen.
- There is already seamless ‘device interoperability’ for podcast listening between some Google devices like Google Home and Android phones. Over time the goal is to have this expand to all devices.
- Android will be a major source of growth for new podcast listeners and increasing podcast consumption. This will mean more diverse podcast audiences, and subsequently, more diverse content types and formats.
- Google is committed to supporting publishers and helping them succeed using a business model that works well for them.
There are a lot of big, exciting ideas that have the potential to transform the industry. And it’s just the beginning.
“We’ve started with a simple podcast experience for Android, integrated with Google Search and Google Assistant, and we think it’s an ideal jumping off point over the coming months and years to build toward our future vision.”
Again, that future vision could translate podcasts into any language, pioneer lookahead scrubbing in audio, and help smart speakers provide authentic answers to questions about Impossible Burgers.
This is bold, visionary, and very exciting.
One last bit of awesomeness about all this…
Apart from all the forward-thinking and non-traditional ideas in this strategy, I’m grateful and very pleasantly surprised that Zack and the team at Google are being so transparent with what they are building and why. It feels like an iterative, agile approach that will evolve with feedback from real users and the podcasting industry.