Publications / Conference Poster

Improving analysis and decision-making through intelligent web crawling

McClain, Jonathan T.; Avina, Glory E.; Trumbo, Derek T.; Kittinger, Robert

Analysts across national security domains must sift through large amounts of data to find and compile relevant information in a form that enables decision makers to take action in high-consequence scenarios. However, even the most experienced analysts cannot be 100% consistent and accurate across an entire dataset, cannot remain unbiased toward familiar documentation, and cannot synthesize and process large amounts of information in a short time. Sandia National Laboratories has attempted to solve this problem by developing an intelligent web crawler called Huntsman. Huntsman acts as a personal research assistant: it browses the internet or offline datasets in a way similar to the human search process, only much faster (millions of documents per day), submitting queries to search engines and assessing the usefulness of page results by analyzing full-page content with a suite of text analytics. This paper will discuss Huntsman’s capability to both mirror and enhance human analysts using intelligent web crawling with analysts in the loop. The goal is to demonstrate how weaknesses in human cognitive processing can be compensated for by fusing human processes with text analytics and web crawling systems, ultimately reducing analysts’ cognitive burden and increasing mission effectiveness.
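
The crawl-and-assess loop described in the abstract can be illustrated with a minimal sketch. This is not Huntsman's implementation, which the abstract does not detail. It assumes a hypothetical offline dataset held in memory as a mapping from page id to page text and outgoing links, and it substitutes a simple bag-of-words cosine similarity for the unspecified suite of text analytics. The names `term_counts`, `cosine`, `focused_crawl`, and the `threshold` and `max_pages` parameters are all illustrative.

```python
import heapq
import math
import re
from collections import Counter


def term_counts(text):
    """Return lowercased bag-of-words term counts for a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def focused_crawl(pages, seeds, query, threshold=0.1, max_pages=100):
    """Crawl a hypothetical offline dataset, scoring full-page text.

    pages: dict mapping page id -> (text, list of linked page ids).
    Links found on higher-scoring pages are visited first.
    Returns (page id, score) pairs at or above threshold, best first.
    """
    query_vec = term_counts(query)
    # heapq is a min-heap, so priorities are negated to pop the best first.
    frontier = [(-1.0, seed) for seed in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, page_id = heapq.heappop(frontier)
        if page_id not in pages:
            continue  # dangling link in the dataset
        visited += 1
        text, links = pages[page_id]
        score = cosine(query_vec, term_counts(text))
        if score >= threshold:
            relevant.append((page_id, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the score of the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return sorted(relevant, key=lambda pair: -pair[1])
```

In an analyst-in-the-loop setting, the query and threshold are the natural points at which a human could steer the crawl, for example by refining the query after reviewing the top-ranked pages. That interaction is an assumption of this sketch, not something the abstract specifies.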