How AI tools could be used to reimagine live audio


Mike Janssen, using DALL-E 3

As a longtime public media producer, storytelling technologist and now AI strategist, it’s only natural for me to imagine how AI tools can help the audio craft I honed for many years. 

Given my interest in AI, I revisited my notes from when I was in charge of NPR’s Weekend Edition Sunday and compared them to the show’s rundown to understand our process. The early morning of Aug. 17, 2008, was a whirlwind at NPR as the team raced to get Weekend Edition Sunday on the air. 

The director for the day was stretched thin as he juggled directing and mixing a reporter’s piece, a task normally handled by another producer. Production Assistant (PA) 1 was deep into finalizing a package on race and politics when a problem with tracks for a story by another reporter forced her to retrack and remix on the fly. 

PA 2 was similarly swamped, working to finalize multiple segments, while a producer managed the DACS and stepped in to assist with other pieces. Meanwhile, our editorial assistant faced a critical moment when the tracks for a third reporter’s story went missing, causing panic in the newsroom. 

As the deadline loomed, another producer’s computer stalled, forcing her to switch to mine, while the second producer frantically trimmed a reporter piece. The pressure peaked as the team searched for a reporter’s tracks, finally locating them just in time. The team’s quick thinking and collaboration pulled them through, demonstrating the relentless effort required to deliver a live show on time.

This experience underscores how AI, whether through off-the-shelf products or proprietary tools, could significantly reduce pressure in live production environments by automating routine tasks such as track retrieval and mixing, which are often time-consuming and prone to last-minute issues. 

Piloting and experimenting with AI tools can show you how they might streamline workflows, prevent crises and free your team to focus on the creative and strategic aspects of production. That, in turn, can lead to smoother broadcasts and more innovative content, enhancing both productivity and program quality.

| Time | Event | Human Role |
| --- | --- | --- |
| Overnight | Five pieces arriving overnight | Managing and organizing incoming pieces |
| Early Morning | Host piece finished Sunday morning | Finishing and filing host piece |
| 6:30 AM | One reporter piece mixed | Mixing piece, staff overloaded |
| 6:45 AM | Second reporter piece filed | Managing file tracking and ensuring accuracy |
| 7:00 AM | Problem with 2nd reporter tracks - re-tracking | Re-tracking due to file error |
| 7:15 AM | Third reporter piece edited and filed | Editing and filing piece |
| 7:30 AM | DACS finished and sent | Managing DACS and sending |
| 7:45 AM | Fourth reporter re-tracks to fix error | Correcting tracking errors and ensuring accuracy |
| 7:55 AM | Move to mix on Davar's computer due to technical issues | Troubleshooting technical issues with mix |
| 8:15 AM | Reporter tracks found after technical issue | Locating and organizing tracks |

This intense morning highlights the importance of meticulously documenting processes. By carefully tracking each step of the workflow, hosts, producers, managers and audio engineers can collaborate to identify areas where generative AI tools could streamline routine tasks.

AI has the potential to optimize many repetitive tasks, such as scheduling, file management and audio editing, reducing the risk of technical glitches and freeing up energy for crafting compelling stories.

By streamlining these processes, AI allows producers to focus more on the creative and strategic aspects of their work, ultimately leading to smoother broadcasts and more innovative content. For public media managers, this means a more efficient workflow and a better allocation of team resources, enhancing both productivity and program quality.

| Time | Event | Human role | AI efficiency | Suggestions for producers and managers |
| --- | --- | --- | --- | --- |
| Overnight | Five pieces arriving overnight | Managing and organizing incoming pieces | Automatically manage and organize incoming pieces with human oversight for quality control | Identify and document routine tasks in detail to understand where AI can be beneficial |
| Early Morning | Host piece finished Sunday morning | Finishing and filing host piece | | Collaborate with technical teams to integrate AI in areas where it can handle repetitive tasks |
| 6:30 AM | One reporter piece mixed | Mixing piece, staff overloaded | Assist in mixing, preventing staff overload while ensuring human editorial oversight | Ensure that AI tools are used to augment the team's work, not replace critical human elements |
| 6:45 AM | Second reporter piece filed | Managing file tracking and ensuring accuracy | Manage file tracking and error detection, allowing humans to focus on storytelling and narrative | Continuously monitor AI performance and adjust its role to align with editorial standards |
| 7:00 AM | Problem with 2nd reporter tracks - re-tracking | Re-tracking due to file error | Identify and correct file errors with voice AI before air, with humans approving the final output | Permissions from reporters to use their voice AI for last minute retracks/Reducing workload while maintaining quality |
| 7:15 AM | Third reporter piece edited and filed | Editing and filing piece | Assist in editing and automatically file pieces, with humans making the final creative decisions | Foster a culture of innovation where AI is seen as a partner in the creative process |
| 7:30 AM | DACS finished and sent | Managing DACS and sending | Automate DACS management and sending in real-time, while humans finalize and ensure accuracy | Regularly assess the impact of AI tools on production efficiency and content quality |
| 7:45 AM | Fourth reporter re-tracks to fix error | Correcting tracking errors and ensuring accuracy | Automatically correct tracking errors, with humans verifying fixes for accuracy | Incorporate feedback from producers and editors to refine AI's role in the workflow |
| 7:55 AM | Move to mix on Davar's computer due to technical issues | Troubleshooting technical issues with mix | Prevent technical issues by managing resources, with humans troubleshooting if needed | Stay informed on emerging AI technologies to continuously improve the production process |
| 8:15 AM | Reporter tracks found after technical issue | Locating and organizing tracks | Automatically find and organize tracks, allowing humans to focus on content and narrative structure | Develop guidelines to ensure AI is used ethically and effectively in content creation |

Integrating AI successfully requires a shared commitment to using it as a tool that enhances, rather than replaces, the human touch in content creation. As I’ve discussed, integrating AI ethically is imperative. We must ensure that it enhances storytelling, upholds journalistic integrity and respects cultural diversity.

Public media must lead by example, creating proprietary AI tools that reflect community values. The controversy surrounding OpenAI’s “Sky” voice, which mimicked Scarlett Johansson without the actor’s consent, underscored the need for transparency and ethical considerations in AI. 

Here is a summary of how generative AI could be used for show production:

  1. Automated Transcription and Editing:
    • Tools like Otter.ai and Descript provide instant transcriptions and suggest edits, speeding up production. They reduce manual effort, allowing producers more time for creativity.
  2. Audio Quality Optimization:
    • AI tools like Podcastle.io can automatically adjust sound levels and detect issues. They prevent disruption by ensuring high audio quality from the start.
  3. Track and File Management:
    • A proprietary AI could organize and label tracks, ensuring that files are easily found and correctly managed. It could prevent delays like the one caused by missing tracks.
  4. System Monitoring:
    • A proprietary AI could monitor editing stations and suggest alternatives if hardware fails, reducing the risk of technical issues.
  5. Voice AI Tools for Text-to-Speech:
    • We could work with correspondents and hosts to create AI voices, ensuring proper compensation. These voices could be used for retracks on tight deadlines, maintaining quality even under time constraints.
    • AI-generated voices could be created from recordings of hosts and reporters, with their permission. Funders’ messages and commercials could also be produced using voice AI tools, maintaining quality even when time is limited. For example, NBC used AI, with Al Michaels’ permission, to recreate his voice for daily Summer Olympic Games recaps for subscribers. 
  6. AI Scheduling Assistants: 
    • Tools like Google Calendar’s AI features could automate scheduling and reminders for interviews and pre-tapes, ensuring timely preparation and coordination.
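
As a rough illustration of the track and file management idea in item 3, here is a minimal Python sketch of the simplest possible version: comparing the show rundown against the audio files that have been filed and flagging any piece with no tracks. This is a plain rule-based check, not an AI system, and it is only a starting point a proprietary tool might build on. The `RundownItem` type, the `<slug>_<description>.wav` file-naming convention, the slugs and the air times are all hypothetical, invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RundownItem:
    slug: str       # hypothetical short name for a piece in the rundown
    air_time: str   # hypothetical scheduled air time, for display only


def find_missing_tracks(rundown, filed_names):
    """Return rundown items that have no filed audio file.

    Assumes (hypothetically) that every filed file is named
    "<slug>_<description>.wav"; matching ignores letter case.
    """
    filed = [name.lower() for name in filed_names]
    missing = []
    for item in rundown:
        prefix = item.slug.lower() + "_"
        if not any(name.startswith(prefix) for name in filed):
            missing.append(item)
    return missing


# Invented example data: three pieces expected, two sets of tracks filed.
rundown = [
    RundownItem("race-politics", "8:10 AM"),
    RundownItem("host-essay", "8:20 AM"),
    RundownItem("reporter-three", "8:40 AM"),
]
filed_names = ["race-politics_tracks.wav", "Host-Essay_mix.wav"]

for item in find_missing_tracks(rundown, filed_names):
    print(f"MISSING: {item.slug} (airs {item.air_time})")
```

Run against the invented data above, the check flags only the third piece. In a real newsroom, the file list would come from the shared audio system and a person would still decide what to do about the alert.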

While the potential of AI is exciting, I approach it with a healthy dose of skepticism. All of this is new, and because the possibilities are promising, the time to pilot and test these tools is now. 

It’s essential that we not only imagine how AI can support our work but also rigorously evaluate its impact to ensure that it truly enhances our processes without compromising the quality that audiences expect. This is an evolving landscape, and our role is to lead with caution, innovation and a steadfast commitment to the craft we’ve spent years developing.

| Time | Event | Human Role |
| --- | --- | --- |
| Overnight | Three pieces arriving overnight, AI-assisted | Supervising AI file management, ensuring quality |
| Early Morning | Host piece finalized with AI editing tools | Finalizing content, focusing on narrative flow |
| 6:30 AM | Reporter story mixed with AI support | Overseeing AI-supported mixing, making creative decisions |
| 6:45 AM | Second piece auto-filed by AI | Reviewing AI filing, ensuring accuracy |
| 7:00 AM | AI identifies minor error in reporter tracks, automatically corrects w/voiceAI | Approving AI corrections, making final content decisions |
| 7:15 AM | Another reporter piece edited and filed with AI assistance | Finalizing edits, concentrating on storytelling |
| 7:30 AM | DACS automatically generated and sent | Reviewing AI-generated DACS, confirming all elements |
| 7:45 AM | AI rechecks piece, final adjustments made by the producer | Final editorial supervision, making creative adjustments |
| 7:55 AM | AI resolves minor technical issue with reporter's mix | Troubleshooting with AI, focusing on narrative integrity |
| 8:15 AM | Reporter tracks auto-organized by AI, ready for final review | Final review and approval, planning for future projects |

To set the stage for the discussion of AI and audio production, it’s important to recognize that AI has made significant strides in recent years, particularly in generating realistic audio and music. These advancements allow AI to produce everything from lifelike voiceovers to complex musical compositions, transforming industries like entertainment and advertising. However, these new capabilities also introduce challenges that must be addressed to ensure that the technology serves all communities equitably.

One of the most pressing issues is the lack of diverse training data for AI models, which has significant implications for public media stations across the country. Stations represent diverse communities, yet the audio training data used by AI systems often does not represent this diversity. This means that when you use an audio generation or music generation platform, you may notice a lack of cultural variety in the outputs, which can result in content that fails to accurately reflect the richness of global cultures.

Moreover, there is often a troubling lack of transparency regarding the sources of training data. Was it pulled from platforms like YouTube? Was it scraped from across the internet without regard for cultural context or ethical considerations? The opacity surrounding where and how this data is collected raises concerns about the authenticity and ethical integrity of AI-generated content. This gap in diversity and transparency is not just a technical oversight; it carries significant cultural implications, as it risks perpetuating stereotypes, erasing important cultural nuances and ultimately diminishing the quality of content produced for diverse audiences.

Recognizing this, while leading my startup TulipAI, I initiated CulturaFX, a research collaboration with Florida Gulf Coast University software engineering students. This project was designed to directly address the issue of cultural authenticity in AI-generated audio by ethically sourcing and curating diverse datasets. For example, in the case of mariachi music, we proved that the AI model was not trained on recordings that truly reflect the cultural depth and variation within this genre.

This is a great opportunity for public media. By fostering collaboration with cultural experts, public media can come together to create an open-source platform to preserve the authenticity and richness of these sounds, enhancing the quality and relevance of AI-generated audio in public media and beyond.

The risk of cultural insensitivity or inaccuracy in AI-generated content is real, and it’s our responsibility to ensure that these tools are used thoughtfully and responsibly. As we continue to integrate AI tools like GPT-4o into our workflows, we must remain vigilant about these ethical concerns. GPT-4o’s enhanced multimodal capabilities offer exciting possibilities for multilingual communication and cultural preservation, but they also underscore the need for human oversight. 

The evolution of our work with AI is not just about efficiency; it’s about enhancing our ability to tell stories that are true to the diverse cultures and communities we serve. Whether you’re directly involved in AI projects or just starting to explore these tools, it’s essential to engage with these ethical challenges. Don’t just be enamored by these tools or afraid of them — get involved and do the hard work of making them better, more ethical and more transparent.

To cultivate a more thoughtful and responsible integration of AI in creative processes, we must engage in pilot projects, foster meaningful debate and ask critical questions. This approach will ensure that our technological advancements enhance inclusivity and cultural sensitivity in public media.

Davar Ardalan is an AI and Media Specialist known for her leadership in Cultural AI initiatives. With experience at National Geographic, NPR News, The White House Presidential Innovation Fellowship Program, IVOW AI and TulipAI, she has championed principles like fairness and cultural preservation in AI. Ardalan has also developed AI training and educational workshops. As the leader of TulipAI, she focuses on AI and cultural heritage preservation. She is offering an upcoming course about AI and audio that will cover how to leverage AI for content creation, historical reenactments and more while mastering techniques to enhance sound quality and produce multilingual, ethical and culturally rich audio. 

This content was crafted with the assistance of artificial intelligence, which contributed to structuring the narrative, ensuring grammatical accuracy, summarizing key points and enhancing the readability and coherence of the material. 

One thought on “How AI tools could be used to reimagine live audio”

  1. You should also add to the list of AI tasks that of curating and creating playlists of past archived materials on a given subject.

    Also, you might want to add to metadata GPS coordinates relevant to the story…so if you do a later archival search, you don’t just search for a city name, but rather an area around a city as you expand the diameter of the search.
