Software team receives grant to develop AI search
Fueled by a significant grant, a team of Libraries software developers this fall released a new search tool powered by generative artificial intelligence, a trailblazing foray into using AI to surface the full depth of a library collection.
After months of testing an early version of an AI-assisted search tool for the Libraries’ digital image collection, developers on the Digital Products and Data Curation team received a $439,000, two-year leadership grant from the Institute of Museum and Library Services to take this innovation further. The tool that debuted this fall allows users to ask complex, multi-sentence questions of the digital image repository; it then performs a semantic search, based not on keywords but on context, and delivers image results along with a textual explanation of how those images meet the search criteria.
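The article does not describe the team's implementation, but a minimal sketch of what embedding-based semantic search looks like in practice, assuming a small in-memory set of hypothetical catalog records and the open-source sentence-transformers library, might be:

```python
# Minimal sketch of embedding-based ("semantic") search; the real system's
# models, index, and metadata fields are not described in the article.
from sentence_transformers import SentenceTransformer
import numpy as np

# Hypothetical catalog records: an id plus descriptive text from metadata.
records = [
    {"id": "poster-001", "text": "World War II poster urging purchase of war stamps and bonds"},
    {"id": "photo-114", "text": "Apollo 11 astronauts on the lunar surface, 1969"},
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = model.encode([r["text"] for r in records], normalize_embeddings=True)

def semantic_search(question: str, top_k: int = 5):
    """Rank records by cosine similarity to the question's embedding,
    rather than by keyword overlap."""
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q  # cosine similarity: vectors are unit length
    order = np.argsort(-scores)[:top_k]
    return [{**records[i], "score": float(scores[i])} for i in order]

print(semantic_search("What design techniques were used on WW2 posters?"))
```

Because the match is made in embedding space, a question phrased with "WW2" can still land on records cataloged under "World War II," which keyword matching alone would miss.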
A user could ask, “What design techniques were used on WW2 posters to effectively communicate with the public?” The tool’s language model first interprets “WW2” (which is not an actual Library of Congress catalog heading) and considers both layers of the question (techniques and effectiveness). The tool returns several images from the World War II Poster Collection and describes why it selected them.
The poster “Help send them what it takes to win,” for example, “used vivid imagery of military supplies and activities, which visually emphasized the importance of purchasing war stamps and bonds.” The description concludes with relevant keywords that could help the user refine the search.
“This opens up the way we can search,” said digital initiatives product manager Dave Schober. “The most basic problem it solves is that end users don’t need to know how to do a keyword search. If you can get to the search bar, you can ask a question and expect an answer.”
Library searches already demand a fluent command of language and practiced search skills, two barriers that can complicate the process for someone searching in a language other than their primary tongue, Schober said. “You can ask this tool a question in your native language and it will reply in that language. It knows almost every language—it can reply in Morse code if you really want it to.”
Through a technique known as retrieval-augmented generation (RAG), the tool resists “hallucinations,” a common problem with AI tools. RAG performs a search based on the semantic meaning of the user’s question and then generates text grounded in the results of that search. The system didn’t take the bait when Schober asked, “I’m interested in the moon landing. When did aliens get there?” It responded instead that an alien landing had never happened.
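The team's actual pipeline and prompts are not published in the article, but a minimal RAG sketch, reusing the hypothetical semantic_search() from the earlier example and assuming an OpenAI-style chat completion call with a placeholder model name, could look like:

```python
# Minimal retrieval-augmented generation (RAG) sketch: retrieve collection
# items semantically, then generate an answer grounded only in those items.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def rag_answer(question: str) -> str:
    # 1. Retrieve: find the records most semantically similar to the question.
    hits = semantic_search(question, top_k=3)
    context = "\n".join(f"- {h['id']}: {h['text']}" for h in hits)

    # 2. Generate: instruct the model to answer only from the retrieved
    #    records, which is what discourages it from inventing items the
    #    collection doesn't hold (e.g. evidence of an alien moon landing).
    prompt = (
        "Answer the question using only the collection items listed below. "
        "If they do not support an answer, say so.\n\n"
        f"Items:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the article names no model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

The grounding step is what matters: the generated description can cite why each returned image fits the question, because the model is working from retrieved catalog records rather than from its own unconstrained recall.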
The IMLS grant also supports creating a toolkit for other cultural heritage institutions—libraries, museums, and history centers—to implement a similar search tool for their own collections.