How NPR’s Research, Archives & Data Strategy team is saving sounds of the past for the future


Wanyu Zhang/NPR

RAD Chief Laura Soto-Barra holding a reel-to-reel tape in the NPR RAD digitization lab.

Librarian of Congress Carla Hayden recently observed that the “sounds of the past enrich our understanding of the nation’s cultural history and our history in general.” For over 46 years, NPR has produced high-quality journalism and programming in partnership with member stations located throughout the country. Members of NPR’s Research, Archives & Data Strategy team (RAD), known in the past as the NPR Library, have spent decades preserving and providing access to this important record of American cultural history.

NPR has been using sound to tell the stories of major events in American politics, culture and foreign affairs ever since the debut of its first program, All Things Considered, on May 3, 1971. On that day, over 20,000 protesters gathered in Washington, D.C., to demonstrate against the Vietnam War. NPR reporters used portable analog recorders to capture the voices of protesters and the sound of helicopters, motorcycle engines and police sirens.


The on-the-spot reporting introduced a “new and different kind of voice on the radio.” NPR’s Susan Stamberg, who was in the newsroom that day, recalled, “There we were, from the beginning, just ‘guerilla-radioing it’ in a way that had really never been heard before.”

Earlier this year, Hayden selected the debut episode of ATC to be added to the National Recording Registry in recognition of its “historical importance to American society and the nation’s audio heritage.” Thankfully this broadcast, along with thousands of other hours of programming in the NPR archives, was saved by generations of NPR librarians and data strategists.

Before the NPR Library was RAD

NPR RAD was formerly known as the NPR Library. There were in fact three different libraries: the Broadcast, Reference and Music libraries. The NPR Library began as a critical resource for NPR’s journalists even before ATC went on the air. The research provided by NPR reference librarians gave authority and context to NPR reporting and storytelling.

In 1972, NPR received a grant from the Corporation for Public Broadcasting to establish a broadcast tape library with the mandate to provide for the preservation of all NPR programs. NPR President Donald Quayle wrote in a memo, “It is our intention to make this library an active depository which will make programs and program materials available for production and broadcast purposes. It is also intended to be an archive in the academic sense for historical and scholarly purposes.”

NPR took a step towards the second goal when it decided in 1976 to make its broadcast archives publicly accessible through collaborations with the Library of Congress (LOC) and the National Archives and Records Administration. Variety reported that it marked “the first time any broadcast organization’s entire collection will be available to the public.” In addition to ATC, the NPR broadcast tape archive included coverage of congressional hearings, speeches recorded at the National Press Club, the weekly arts show Voices in the Wind and recordings of orchestral, operatic and jazz performances. In November 1979, Morning Edition joined ATC as one of NPR’s flagship news and information programs.

The sounds of all NPR programs were captured on 10 1/2-inch magnetic open-reel tape. Every day, NPR librarians retrieved the master recordings, placed a paper rundown inside the reel-to-reel box and cataloged each program. Catalogers listened to each story and assigned subject terms from NPR’s own list, capturing information such as names, bylines and geographic locations. Over the next three decades, NPR librarians migrated program metadata from typed index cards to microfiche to database, ensuring that the archives remained discoverable and usable for NPR producers and journalists.

Today, the LOC holds NPR cultural programming produced between 1971 and 1992, and the University of Maryland (UMD) is the official repository for the rest of the NPR audio archives. The UMD Libraries’ National Public Broadcasting Archives hold wide-ranging resources documenting the history of radio and television public broadcasting. This collection includes NPR’s institutional archives: physical papers, documents and photographs that provide valuable context and information about the network and its programming. The curators at UMD administer the NPR collection for public research and study.

As NPR grew and evolved, the NPR Library added transcription and music librarianship to its core services. NPR began transcribing select programs on Sept. 1, 1990. As NPR embraced digital technologies, the NPR Library began the search for a content management system that could keep up with the changing needs of the newsroom and other stakeholders at NPR. An ambitious music digitization project started in 2007, inspiring the NPR Library to develop its own digital tools. Thus Orpheus, NPR’s internal music database, was born. In 2008, with their in-depth knowledge of NPR information needs and archival workflows, NPR librarians created a vision for a digital asset management system. This vision served as the basis for the creation of the first version of Artemis, the internal database for NPR-produced stories.

NPR’s 2013 move to its current headquarters spurred the next step in the evolution of the NPR Library. The new building intentionally lacked separate physical space for a library. Instead, NPR created a media collection room, with the librarians’ workspaces located throughout the building to be in closer proximity to colleagues in different departments.

A 1972 map of NPR’s interconnection system, discovered by the RAD team and preserved in the NPR Historical Archive.

During the moving process, a group of NPR librarians noticed that colleagues were leaving behind objects, papers and photographs. Some of the finds were of great importance to NPR history, such as a 1972 map of NPR’s interconnection system. Others represented NPR culture and helped to tell the NPR story from a different perspective, such as Audrey, a stuffed parrot that hung in the ME space for many years. NPR librarians saved these items and other notable documents and ephemera in the NPR Historical Archive.

In 2015, the NPR Library was rebranded as the Research, Archives & Data Strategy team (RAD) to describe more accurately the team’s broader and deeper core duties and functions. Today, RAD team members are product owners, taxonomists, researchers, archivists, historians, trainers, data analysts and developers. Led by RAD Chief Laura Soto-Barra and Deputy RAD Chief Mary Glendinning, RAD staff are embedded in NPR’s newsroom, and RAD’s products are integrated with NPR’s core workflows and production tools.

From analog to Artemis

NPR’s digital audio archive, named Artemis in honor of the goddess of the hunt, was developed and designed by RAD to support the unique needs of NPR. In 2016, RAD restructured the Artemis database to be faster, more nimble and more agile. Artemis includes metadata, audio and transcripts for NPR programming dating back to NPR’s first broadcast of the Senate Foreign Relations Committee hearings on the Vietnam War on April 20, 1971.

Artemis contains hundreds of unique program titles: NPR-produced newsmagazines, as well as the network’s first Spanish-language national news program, Enfoque Nacional, distributed by NPR from 1979–88; programs produced by NPR member stations and distributed by NPR; and live events and specials such as NPR’s 1981 radio dramatization of Star Wars. At the heart of the archive are the more than 850,000 story records that unite metadata, full-text searchable transcripts and audio.

A digital scan of the handwritten rundown for the May 3, 1971 broadcast of “All Things Considered.”

Transcripts make NPR’s audio discoverable, allowing for full-text search of stories and segments. NPR transcripts have a separate life from the audio in syndication; scholars, educators, students and researchers are able to use them to discover and cite NPR journalism and content.

But not all story records include transcripts or digital audio. Thousands of hours of NPR audio are stored on obsolete physical formats, including reel-to-reel tape and optical disc. NPR’s earliest programming, produced between 1971 and 1983, is stored on magnetic tape, which is at risk of degradation and susceptible to sticky-shed syndrome and chemical leaching.

In 2015, RAD invested in staff and equipment to establish two state-of-the-art digitization labs at NPR headquarters and began developing a workflow for transferring archival audio from physical formats to digital preservation files. The lab is outfitted with more than a dozen vintage Studer playback machines donated to NPR by radio stations located across the country eager to help this tremendous effort. RAD staff have digitized and reformatted thousands of hours of audio produced by NPR. It used to take up to three days for NPR journalists or researchers at UMD or LOC to request and receive audio stored in physical formats. Now these sounds and voices from the past are instantly accessible in Artemis.

To paraphrase Susan Stamberg, the NPR audio archive is not a history book, but history is “etched in the voices that gave these decades their vitality.” The NPR archives provide a valuable glimpse into the everyday life of the American past. The 46-year collection also demonstrates the changing landscape of American media. The artifacts from NPR’s coverage of the 1971 May Day protests include the broadcast recording and a handwritten rundown of the day’s stories. When NPR journalists covered the 2017 Women’s March on Washington, they captured audio, photographs, videos and social media posts in their reporting.

The May 3, 1971 broadcast of “All Things Considered” as it appears today in NPR’s digital archive Artemis.

The RAD team is expanding its efforts to preserve and archive other forms of content and data. Product owners and developers on the team are currently investigating what it means and what it will take to store and analyze videos, images, blog posts and social media content. Next steps in preserving these media will also include identifying and creating the metadata appropriate for each platform.

Other initiatives include equipping Artemis with auto-tagging solutions that take full advantage of RAD’s unique taxonomy, making more stories and other content more rapidly discoverable and retrievable.

Today RAD continues to provide the in-depth research and fact-checking that NPR reporters have relied upon since 1971, supporting NPR storytelling from conception to preservation. The RAD team is actively seeking solutions to preserve and make accessible present-day reporting for the future.

Follow NPR RAD on Twitter and Instagram @npr_rad. For a selection of curated posts from the archives, follow @nprchives on Tumblr and Twitter.

Ayda Pourasad is a researcher and Julie Rogers is a historian with NPR’s Research, Archives & Data Strategy team.

This essay appears as part of Rewind: The Roots of Public Media, Current’s series of commentaries about the history of public media. The series is created in partnership with the RPTF, an initiative of the Library of Congress. Josh Shepperd, assistant professor of media studies at Catholic University in Washington, D.C., and national research director of the Radio Preservation Task Force, is Faculty Curator of the Rewind series.
