Machine learning methodologies can provide insight into Brønsted-Guggenheim-Scatchard specific ion interaction theory (SIT) parameter values where experimental data are limited. This study develops and executes machine learning frameworks to model the SIT interaction coefficient, ε. Key findings include successful estimations of ε via artificial neural networks using clustering and value prediction approaches. Applicability to other chemical parameters is also briefly assessed. The models developed here support a use case for machine learning in geologic nuclear waste disposal research, namely predicting the chemical behavior of high ionic strength solutions (i.e., subsurface brines).
The Bird project explored the use of big data analytics tools to improve the findability of information within the Sandia internal network. We were able to perform query classification using the supervised learning algorithms in the Apache Spark library. By relying on the distributed processing capabilities provided by the Apache Hadoop framework, we successfully processed the large query log files needed to train the models in this effort. The capabilities developed in this project are being used to enhance the effectiveness of the enterprise search engine.
For applications such as force protection, an effective decision maker needs to maintain an unambiguous grasp of the environment. Opportunities exist to leverage computational mechanisms for the adaptive fusion of diverse information sources. The current research employs neural networks and Markov chains to process information from sources including sensors, weather data, and law enforcement. Furthermore, the system operator's input is used as a point of reference for the machine learning algorithms. More detailed features of the approach are provided, along with an example force protection scenario.
As information systems become increasingly complex and pervasive, they become inextricably intertwined with the critical infrastructure of national, public, and private organizations. The problem of recognizing and evaluating threats against these complex, heterogeneous networks of cyber and physical components is a difficult one, yet a solution is vital to ensuring security. In this paper we investigate profile-based anomaly detection techniques that can be used to address this problem. We focus primarily on the area of network anomaly detection, but the approach could be extended to other problem domains. We investigate using several data analysis techniques to create profiles of network hosts and perform anomaly detection using those profiles. The "profiles" reduce multi-dimensional vectors representing "normal behavior" into fewer dimensions, thus allowing pattern and cluster discovery. New events are compared against the profiles, producing a quantitative measure of how "anomalous" the event is. Most network intrusion detection systems (IDSs) detect malicious behavior by searching for known patterns in the network traffic. This approach suffers from several weaknesses, including a lack of generalizability, an inability to detect stealthy or novel attacks, and a lack of flexibility regarding alarm thresholds. Our research focuses on enhancing current IDS capabilities by addressing some of these shortcomings. We identify and evaluate promising techniques for data mining and machine learning. The algorithms are "trained" by providing them with a series of data points from "normal" network traffic. A successful algorithm can be trained automatically and efficiently, will have a low error rate (low false alarm and miss rates), and will be able to identify anomalies in "pseudo real-time" (i.e., while the intrusion is still in progress, rather than after the fact).
We also build a prototype anomaly detection tool that demonstrates how the techniques might be integrated into an operational intrusion detection framework.
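The profile idea described above (reduce vectors of normal behavior to fewer dimensions, then score new events against that reduced representation) can be illustrated with a minimal sketch. This is not the method or the prototype tool from the report: it assumes a PCA-style linear projection with reconstruction error as the anomaly measure, and the function names (`fit_profile`, `anomaly_score`), the three synthetic features, and the 99th-percentile alarm threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def anomaly_score(profile, events):
    """Distance between each event and its projection onto the profile subspace."""
    z = (np.atleast_2d(events) - profile["mean"]) / profile["std"]
    projected = z @ profile["components"].T        # reduce to fewer dimensions
    reconstructed = projected @ profile["components"]
    return np.linalg.norm(z - reconstructed, axis=1)


def fit_profile(normal, n_components=2, percentile=99.0):
    """Learn a low-dimensional profile from rows of 'normal' feature vectors."""
    mean = normal.mean(axis=0)
    std = normal.std(axis=0)
    std[std == 0] = 1.0                            # guard against constant features
    z = (normal - mean) / std
    # Rows of vt are the principal directions of the standardized data.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    profile = {"mean": mean, "std": std, "components": vt[:n_components]}
    # Assumed alarm threshold: a high percentile of the training scores.
    profile["threshold"] = np.percentile(anomaly_score(profile, normal), percentile)
    return profile


# Synthetic "normal" traffic (hypothetical features): the third feature is
# roughly the sum of the first two, so normal behavior lies near a 2-D plane.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + b + 0.05 * rng.normal(size=500)
normal_traffic = np.column_stack([a, b, c])

profile = fit_profile(normal_traffic, n_components=2)

typical_event = np.array([1.0, 1.0, 2.0])    # consistent with the learned relation
odd_event = np.array([1.0, 1.0, -2.0])       # violates the learned relation

typical_score = anomaly_score(profile, typical_event)[0]
odd_score = anomaly_score(profile, odd_event)[0]
is_alarm = odd_score > profile["threshold"]
```

In this sketch the alarm threshold is an explicit, adjustable parameter, which corresponds to the flexibility regarding alarm thresholds that the abstract notes is lacking in signature-based systems. Training uses only normal data, so no examples of attacks are required.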