Publications Details
The Bird project: Using Big Data tools to support Search Analytics
Herzer, John A.; Zhang, Pengchu Z.
The Bird project explored the use of big data analytics tools to improve the findability of information within the Sandia internal network. We performed query classification using the supervised learning algorithms in the Apache Spark library. By relying on the distributed processing capabilities of the Apache Hadoop framework, we were able to process the large query log files needed to train the models. The capabilities developed in this project are being used to enhance the effectiveness of the enterprise search engine.
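
To make the idea of supervised query classification concrete, the following is a minimal sketch only. It is not the project's implementation and does not use Apache Spark or Hadoop: it substitutes a small multinomial naive Bayes classifier written with the Python standard library. The class labels ("hr", "it"), the training queries, and all function names are hypothetical and chosen purely for illustration; the record does not state which algorithms, features, or categories the project used.

```python
import math
from collections import Counter, defaultdict


def tokenize(query):
    """Lowercase a query and split it on whitespace."""
    return query.lower().split()


def train(labeled_queries):
    """Count class frequencies and per-class token frequencies."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for query, label in labeled_queries:
        class_counts[label] += 1
        for token in tokenize(query):
            token_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, token_counts, vocabulary


def classify(query, model):
    """Return the label with the highest naive Bayes log-posterior."""
    class_counts, token_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total)
        # Denominator for Laplace (add-one) smoothing.
        denom = sum(token_counts[label].values()) + len(vocabulary)
        for token in tokenize(query):
            score += math.log((token_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled queries standing in for a real query log.
TRAINING = [
    ("vacation leave policy", "hr"),
    ("benefits enrollment form", "hr"),
    ("timesheet submission deadline", "hr"),
    ("reset vpn password", "it"),
    ("install printer driver", "it"),
    ("email password reset", "it"),
]

model = train(TRAINING)
print(classify("vpn password", model))       # it
print(classify("leave policy form", model))  # hr
```

In a distributed setting such as the one described above, the counting step in `train` is the part that would be spread across a cluster, since it is a simple aggregation over log records.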