How ChatGPT’s new model could transform podcasting and public media


Illustration: Mike Janssen, using DALL-E 3

In a groundbreaking announcement this week, OpenAI introduced GPT-4o, the latest iteration of its language model, now boasting enhanced capabilities across multiple modalities. The “o” in GPT-4o stands for “omni,” emphasizing its proficiency in real-time reasoning across audio, vision and text.

This development marks a significant leap forward for public media newsrooms and the podcast industry, offering the potential for streamlined multilingual communication and audio analysis in the near future.

Scenario 1: KUNM Audio Story

I tested GPT-4o with an English-language news story from KUNM’s website, asking it to translate the piece into Spanish and German. The AI handled the task capably, showcasing its potential to facilitate real-time multilingual communication. Even so, humans need to stay involved at every step, with native speakers checking the translations.
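To make that workflow concrete, here is a minimal sketch of how a producer might script this kind of translation request with the OpenAI Python SDK. The prompts and the placeholder story text are illustrative assumptions, not a recommended production setup.

```python
# Minimal sketch: ask GPT-4o to translate a station's English-language story
# into Spanish and German. Prompts and the placeholder text are assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

story_text = "..."  # paste the English-language story from the station's CMS

for language in ("Spanish", "German"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "You are a careful news translator. Preserve names, "
                        "quotes and factual details exactly."},
            {"role": "user",
             "content": f"Translate this news story into {language}:\n\n{story_text}"},
        ],
    )
    # Every draft still goes to a native speaker before publication.
    print(f"--- {language} draft ---")
    print(response.choices[0].message.content)
```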

For public radio producers, this capability can be transformative, enabling content to reach and resonate with a broader, more diverse audience almost instantly. While these are early days, the potential is significant. After initial trials, some organizations may choose third-party AI tools, while others might invest in dedicated teams. 

It’s crucial to select languages that resonate with your community and to collaborate with language experts over several months of testing different tools. That groundwork helps ensure the solution you choose is both accurate and ethical. By "ethical," we mean ensuring that the voice actors behind synthetic voices are compensated, or that the synthetic-voice effort is community-led and open source.

Scenario 2: Helen, My Grandmother

I shared a cherished 1958 recording of my late grandmother and asked GPT-4o to translate it into Persian. The AI not only translated the content but also preserved some of the emotional nuance. While the accent wasn’t perfect, the overall result was impressive. This highlights GPT-4o’s potential to preserve and share oral histories and personal stories across languages and cultures, a capability invaluable for public radio stations focused on global heritage and cultural preservation.
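For stations sitting on archival tape, one scripted path is sketched below under assumptions: transcribe the recording first, then ask GPT-4o for a Persian translation that keeps the speaker’s tone. The file name and prompts are hypothetical, and the spoken-voice playback described above happened in the ChatGPT app rather than through this API route.

```python
# Rough sketch: transcribe an archival recording with Whisper, then translate
# the transcript into Persian with GPT-4o. File name and prompts are hypothetical.
from openai import OpenAI

client = OpenAI()

# Step 1: transcribe the 1958 recording (hypothetical file name).
with open("helen_1958.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Step 2: translate into Persian, asking the model to keep the emotional nuance.
translation = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Translate this oral-history transcript into Persian, "
                    "preserving the speaker's tone and emotional nuance."},
        {"role": "user", "content": transcript.text},
    ],
)
print(translation.choices[0].message.content)
```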

Scenario 3: ‘The Long Sought Podcast’

For the third example, I tested GPT-4o with a podcast episode, requesting a summary and translation into French. The AI adeptly provided both, demonstrating its versatility in handling various audio formats and content types. 

Scenario 4: ‘The Times of Karachi’

I used a text news story from the Times of Karachi, asking GPT-4o to translate it into Urdu. The AI delivered a translation and even suggested ways to verify and improve its output through feedback from native speakers. This collaborative approach is crucial for maintaining journalistic integrity, particularly when working with international partners.
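One way to operationalize that feedback loop is a second pass that feeds a native speaker’s notes back to the model for a revised draft. The sketch below is an assumption-laden illustration with placeholder variables, not a tested newsroom workflow.

```python
# Hedged sketch of the feedback loop: a native Urdu speaker's notes are fed
# back to GPT-4o for a revised draft. Variables and prompts are illustrative.
from openai import OpenAI

client = OpenAI()

urdu_draft = "..."      # the first AI-generated Urdu translation
reviewer_notes = "..."  # corrections and notes from a native Urdu speaker

revision = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user",
         "content": (
             "Here is a draft Urdu translation of a news story:\n\n" + urdu_draft
             + "\n\nA native Urdu speaker left these notes:\n\n" + reviewer_notes
             + "\n\nProduce a revised translation that addresses every note."
         )},
    ],
)
print(revision.choices[0].message.content)
```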

Scenario 5: Reciting Rumi

In the final test, the AI was able to communicate with me in Persian, sharing a famous poem by Rumi.

Piloting GPT-4o in public radio newsrooms

Public media companies should start piloting audio AI technologies, which can process and translate content almost instantaneously. As translations become more reliable, GPT-4o can help media outlets break language barriers and engage with a global audience. Public radio stations could offer content in multiple languages, broadening their reach and impact.

However, while GPT-4o is very good, it is not infallible. Human oversight is necessary to ensure translations are contextually appropriate and culturally sensitive. Newsrooms must implement quality checks to maintain high standards. 
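One lightweight way to formalize those checks, sketched below under assumptions, is a record that keeps every AI-generated translation marked as a draft until both an editor and a native speaker have signed off. The class and field names are illustrative, not an existing newsroom tool.

```python
# Illustrative sketch of a newsroom quality gate for AI translations.
from dataclasses import dataclass, field

@dataclass
class TranslationDraft:
    story_id: str
    language: str
    ai_text: str
    editor_approved: bool = False
    native_speaker_approved: bool = False
    notes: list[str] = field(default_factory=list)

    @property
    def publishable(self) -> bool:
        # Both human checks must pass before the draft can be published.
        return self.editor_approved and self.native_speaker_approved

draft = TranslationDraft("kunm-2024-05-17", "Spanish", "...")
draft.notes.append("Check the place names in paragraph 3.")
print(draft.publishable)  # False until both reviewers approve
```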

Additionally, AI models like GPT-4o consume significant energy and water resources for data center cooling. Adopting sustainable practices, such as limiting how often and when you use these tools, is essential to mitigate these impacts.

Finally, while automation might displace traditional roles, it also creates new opportunities in AI oversight and integration. Newsrooms can explore new roles focused on managing and optimizing AI tools, ensuring a balanced approach to technology adoption.

With that in mind, here’s a basic team structure for incorporating AI in your newsroom. The New York Times recently announced a similar core team, though without audio as a standout role:

  1. Editorial Director of AI Initiatives
    • Collaborations: Works with machine-learning engineers, design editors and audio specialists to integrate AI tools.
    • Leadership: Aligns AI projects with editorial goals.
  2. Senior Machine-Learning Engineers
    • Collaborations: Coordinate with the editorial director and design editor to develop functional, visually appealing AI tools, integrate AI-enhanced audio, and keep projects aligned with business goals.
    • Technical Leadership: Develop AI prototypes addressing journalistic challenges.
  3. Senior Design Editor
    • Collaborations: Works with engineers and audio specialists to ensure AI tools are user-friendly and seamlessly integrated.
    • Creative Input: Designs AI applications that align with the newsroom’s brand.
  4. Audio Specialist with Synthetic Voice Expertise
    • Collaborations: Works with journalists, engineers and design editors to implement synthetic voice content.
    • Innovative Applications: Utilizes audio tech for diverse and engaging storytelling.

This structure emphasizes the collaborative dynamics essential for leveraging each member’s strengths in developing innovative and ethically sound AI applications in journalism. It fosters interdisciplinary cooperation and ensures meticulous alignment of AI integration with the newsroom’s goals.

A personal note

As I navigate the intersection of technology and media, tools like GPT-4o reveal the transformative potential we hold. They not only help us reach global audiences but also preserve and share voices and stories that matter.

The AI and media startup I founded, TulipAI, offers training programs for newsrooms that emphasize responsible AI application, hands-on experience and advanced techniques for AI-driven insights. We focus on AI-driven audio innovations, leadership in AI implementation, and sustainable practices to minimize environmental impact. Upcoming sessions include virtual and in-person workshops in collaboration with the Centre for Excellence in Journalism and the Online News Association.

Join us:

  • Online News Association July 24: AI Audio Innovations: Transforming Content Creation and Engagement
  • Online News Association Aug. 15: Mini Lab: AI Tools for Audio Journalists

Davar Ardalan is an AI and Media Specialist known for her leadership in Cultural AI initiatives. With experience at National Geographic, NPR News, The White House Presidential Innovation Fellowship Program, IVOW AI and TulipAI, she has championed principles like fairness and cultural preservation in AI. Ardalan has also developed AI training and educational workshops. As the leader of TulipAI, she focuses on AI and cultural heritage preservation. She serves as a Webby Award Judge for AI and is researching a book on AI and community, underscoring the importance of harnessing AI not only for innovation but also for fostering stronger, more connected communities. 

This content was created with the help of artificial intelligence, which helped organize the narrative, check grammar, and summarize important information to improve clarity and flow.
