Before Yucca Mountain can be licensed as a storage facility for spent nuclear fuel and high-level radioactive waste, DOE must certify that it has made available to the Nuclear Regulatory Commission (NRC), the licensing agency, all necessary “documentary material,” including emails, reports, and other correspondence.
But how do you sift through the thousands of documents and email records associated with the Yucca Mountain Project to determine which are relevant to licensure?
Yucca Mountain personnel found the solution in algorithms developed by Sandia’s Cognitive Science and Technology Program.
They turned to a text analysis system built for the program that can quickly differentiate relevant documents from nonrelevant ones and determine relationships between documents. Specifically, they are using two software systems that perform different functions and generate different types of data: the Licensing Support Network Archive Assistant (LSNAA) and the Data Trace Tool. Both are part of STACY, a suite of tools used for document analysis.
LSNAA is helping validate the way text-based materials, like emails, are identified as relevant — i.e., pertinent to the licensing process. All of the documents are originally categorized by members of the workforce using guidance provided by DOE. LSNAA provides an automated means of validating that individuals are applying the guidance consistently and correctly.
“A good [human] reviewer can look at 500 emails a day,” says Justin Basilico (6341), who led design and development of the algorithms used in the STACY LSNAA. “That means to review 10,000 emails a day requires 20 person-days. LSNAA saves time and money by reducing the effort as much as 90 percent.”
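The arithmetic behind that quote can be checked in a few lines. This is only a worked illustration: the figures of 500 emails per reviewer per day, 10,000 emails, and a reduction of up to 90 percent come from the quote, while the function names are hypothetical and the 90 percent figure is treated as the best case.

```python
import math

EMAILS_PER_REVIEWER_DAY = 500  # figure quoted in the article


def person_days(num_emails, emails_per_day=EMAILS_PER_REVIEWER_DAY):
    """Reviewer-days needed to read num_emails, rounded up to whole days."""
    return math.ceil(num_emails / emails_per_day)


def remaining_effort(days, percent_saved):
    """Effort left after a given percentage reduction (hypothetical helper)."""
    return days * (100 - percent_saved) / 100


manual_days = person_days(10_000)                  # fully manual review
assisted_days = remaining_effort(manual_days, 90)  # best case quoted: 90 percent saved
print(manual_days, assisted_days)
```

Under those assumptions, 20 person-days of manual review shrink to about 2 in the best case.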
The LSNAA software analyzes messages that have been categorized by subject-matter experts and learns how to differentiate relevant from nonrelevant email messages. When applied to a database of emails, for example, LSNAA shows the user which messages appear to have been categorized inconsistently with the guidance, making it faster to find potential errors in categorization. The tool also lets users search for specific information by keyword, date, and categorization.
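The general idea of learning from expert-labeled messages and then flagging labels that look inconsistent can be sketched as follows. The article does not say which algorithms LSNAA uses, so this sketch substitutes a small multinomial naive Bayes classifier with add-one smoothing; the function names and the toy messages are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def train(labeled):
    """Count words per label from (text, label) pairs categorized by experts."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in labeled:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def flag_inconsistent(model, labeled):
    """List messages whose human label disagrees with the model's prediction."""
    return [(t, lab) for t, lab in labeled if predict(model, t) != lab]


# Hypothetical expert-labeled training messages.
training = [
    ("waste package corrosion test results", "relevant"),
    ("repository license application draft", "relevant"),
    ("corrosion model report for license", "relevant"),
    ("lunch menu for friday", "nonrelevant"),
    ("parking lot closed friday", "nonrelevant"),
    ("holiday party lunch signup", "nonrelevant"),
]
model = train(training)

# Hypothetical messages as categorized by their originators.
to_check = [
    ("corrosion test report", "nonrelevant"),
    ("friday lunch signup", "nonrelevant"),
]
flagged = flag_inconsistent(model, to_check)
print(flagged)
```

Here only the first message is flagged, because its wording resembles the relevant examples while its label says otherwise. A flagged message is a candidate for a second look, not a verdict.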
Justin says the cognitive software performs the second of three categorization reviews of the emails. Human originators make the first categorization, and human reviewers always make the final decision as to which emails are truly relevant.
The other Sandia software tool used at Yucca Mountain to prepare the license defense is the Data Trace Tool.
Data Trace watches analysts while they trace from high-level analysis model reports down to raw data collected in lab notebooks, representing that work as a graph with “nodes.” The resulting graph can be saved and accessed again later, providing a means to qualify and support the validity of the model reports.
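A trace of this kind can be pictured as a directed graph in which each document is a node and each step an analyst follows is an edge. The sketch below is not the Data Trace Tool's actual design, which the article does not describe; the class name, method names, and document titles are all hypothetical.

```python
import json


class TraceGraph:
    """Directed graph of documents; an edge means 'was traced to'."""

    def __init__(self, edges=None):
        self.edges = edges if edges is not None else {}

    def record(self, source, target):
        """Record one tracing step from a document to something it relies on."""
        self.edges.setdefault(source, [])
        self.edges.setdefault(target, [])
        if target not in self.edges[source]:
            self.edges[source].append(target)

    def raw_sources(self, start):
        """Return the nodes with no outgoing edges that are reachable from start."""
        seen, stack, leaves = set(), [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            children = self.edges.get(node, [])
            if not children:
                leaves.append(node)
            stack.extend(children)
        return sorted(leaves)

    def save(self):
        """Serialize the trace so it can be reopened later."""
        return json.dumps(self.edges, sort_keys=True)

    @classmethod
    def load(cls, text):
        """Rebuild a trace from the output of save()."""
        return cls(json.loads(text))


# Hypothetical tracing session.
trace = TraceGraph()
trace.record("model report A", "analysis report B")
trace.record("model report A", "data set C")
trace.record("analysis report B", "lab notebook 1")
trace.record("data set C", "lab notebook 1")
trace.record("data set C", "lab notebook 2")

sources = trace.raw_sources("model report A")
restored = TraceGraph.load(trace.save())
print(sources)
```

In this toy session the model report traces back to two lab notebooks, and the saved trace reloads with the same structure.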
“Previously everything had to be done by hand,” says Zach Benz (6341), Data Trace Tool lead developer. “We provided a new tool that delivers a visual representation of the user’s tracing history.”
The tool is in use by the analysts now.
Wendy Shaneyfelt (6341), a member of the cognition team and project manager for development of the two tools, says she is pleased that tools developed as part of Sandia’s augmented cognition research benefited the Yucca Mountain Project.
“This represents tech transfer in its best sense from our cognitive research to a real-world application,” she says.