Publications

Results 26–50 of 52

Search results

Machine learning models of errors in large eddy simulation predictions of surface pressure fluctuations

47th AIAA Fluid Dynamics Conference, 2017

Barone, Matthew F.; Fike, Jeffrey A.; Chowdhary, Kamaljit S.; Davis, Warren L.; Ling, Julia L.; Martin, Shawn

We investigate a novel application of deep neural networks to modeling of errors in prediction of surface pressure fluctuations beneath a compressible, turbulent flow. In this context, the truth solution is given by Direct Numerical Simulation (DNS) data, while the predictive model is a wall-modeled Large Eddy Simulation (LES). The neural network provides a means to map relevant statistical flow features within the LES solution to errors in prediction of wall pressure spectra. We simulate a number of flat plate turbulent boundary layers using both DNS and wall-modeled LES to build up a database with which to train the neural network. We then apply machine learning techniques to develop an optimized neural network model for the error in terms of relevant flow features.

Visualizing Wind Farm Wakes Using SCADA Data

Martin, Shawn; Westergaard, Carsten H.; White, Jonathan; Karlson, Benjamin K.

As wind farms scale to include more and more turbines, questions about turbine wake interactions become increasingly important. Turbine wakes reduce wind speed and downwind turbines suffer decreased performance. The cumulative effect of the wakes throughout a wind farm will therefore decrease the performance of the entire farm. These interactions are dynamic and complicated, and it is difficult to quantify the overall effect of the wakes. This problem has attracted some attention in terms of computational modelling for siting turbines on new farms, but less attention in terms of empirical studies and performance validation of existing farms. In this report, Supervisory Control and Data Acquisition (SCADA) data from an existing wind farm is analyzed in order to explore methods for documenting wake interactions. Visualization techniques are proposed and used to analyze wakes in a 67-turbine farm. The visualizations are based on directional analysis using power measurements, and can be considered to be normalized capacity factors below rated power. Wind speed measurements are not used in the analysis except for data pre-processing. Four wake effects are observed: wake deficit, channel speed up, and two potentially new effects, single and multiple shear point speed up. In addition, an attempt is made to quantify wake losses using the same SCADA data. Power losses for the specific wind farm investigated are relatively low, estimated to be in the range of 3-5%. Finally, a simple model based on the wind farm geometrical layout is proposed. Key parameters for the model have been estimated by comparing wake profiles at different ranges and making some ad hoc assumptions. A preliminary comparison of six selected profiles shows excellent agreement with the model. Where discrepancies are observed, reasonable explanations can be found in multi-turbine speedup effects and landscape features, which are yet to be modelled.

Continuous Reliability Enhancement for Wind (CREW). Program Update

Karlson, Benjamin K.; Carter, Charles M.; Martin, Shawn; Westergaard, Carsten

Sandia's Continuous Reliability Enhancement for Wind (CREW) Program is a follow-on project to the Wind Plant Reliability Database and Analysis Program. The goal of CREW is to characterize the reliability performance of the US fleet to serve as a basis for improved reliability and increased availability of turbines. This document states the objectives of CREW and describes how data collected for CREW will be used in analysis. Data input from participating owner/operators is critical to the success of the CREW project. The level of detail and the quality of the input data provided dictate the type of analysis that can be accomplished. Options for analysis range from high-level availability summaries to detailed analysis of failure modes for individual equipment items. Specific types of input data are identified, followed by samples of the type of output that can be expected and a discussion of benefits to the user community.

Interactive visualization of multivariate time series data

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Martin, Shawn; Quach, Tu T.

Organizing multivariate time series data for presentation to an analyst is a challenging task. Typically, a dataset contains hundreds or thousands of datapoints, and each datapoint consists of dozens of time series measurements. Analysts are interested in how the datapoints are related, which measurements drive trends and/or produce clusters, and how the clusters are related to available metadata. In addition, interest in particular time series measurements will change depending on what the analyst is trying to understand about the dataset. Rather than providing a monolithic, single-use machine learning solution, we have developed a system that encourages analyst interaction. This system, Dial-A-Cluster (DAC), uses multidimensional scaling to provide a visualization of the datapoints depending on distance measures provided for each time series. The analyst can interactively adjust (dial) the relative influence of each time series to change the visualization (and resulting clusters). Additional computations are provided which optimize the visualization according to metadata of interest and rank time series measurements according to their influence on analyst-selected clusters. The DAC system is a plug-in for Slycat (slycat.readthedocs.org), a framework which provides a web server, database, and Python infrastructure. The DAC web application allows an analyst to keep track of multiple datasets and interact with each as described above. It requires no installation, runs on any platform, and enables analyst collaboration. We anticipate an open source release in the near future.

Encoding and analyzing aerial imagery using geospatial semantic graphs

Rintoul, Mark D.; Watson, Jean-Paul W.; McLendon, William C.; Parekh, Ojas D.; Martin, Shawn

While collection capabilities have yielded an ever-increasing volume of aerial imagery, analytic techniques for identifying patterns in and extracting relevant information from this data have seriously lagged. The vast majority of imagery is never examined, due to a combination of the limited bandwidth of human analysts and limitations of existing analysis tools. In this report, we describe an alternative, novel approach to both encoding and analyzing aerial imagery, using the concept of a geospatial semantic graph. The advantages of our approach are twofold. First, intuitive templates can be easily specified in terms of the domain language in which an analyst converses. These templates can be used to automatically and efficiently search large graph databases for specific patterns of interest. Second, unsupervised machine learning techniques can be applied to automatically identify patterns in the graph databases, exposing recurring motifs in imagery. We illustrate our approach using real-world data for Anne Arundel County, Maryland, and compare the performance of our approach to that of an expert human analyst.

Innovative high pressure gas MEM's based neutron detector for ICF and active SNM detection

Chandler, Gordon A.; Renzi, Ronald F.; Derzon, Mark S.; Martin, Shawn

An innovative helium-3 high-pressure gas detection system, made possible by Sandia's expertise in micro-electro-mechanical fluidic systems, is proposed. It appears to have many beneficial performance characteristics for making neutron measurements in the high bremsstrahlung and electrical noise environments found in High Energy Density Physics experiments, and especially in the very high noise environment generated by the fast pulsed power experiments performed at Sandia. The same system may also dramatically improve active WMD and contraband detection when employed with ultrafast (10-50 ns) pulsed neutron sources.

Predicting building contamination using machine learning

Proceedings - 6th International Conference on Machine Learning and Applications, ICMLA 2007

Martin, Shawn; Mckenna, Sean A.

Potential events involving biological or chemical contamination of buildings are of major concern in the area of homeland security. Tools are needed to provide rapid, onsite predictions of contaminant levels given only approximate measurements in limited locations throughout a building. In principle, such tools could use calculations based on physical process models to provide accurate predictions. In practice, however, physical process models are too complex and computationally costly to be used in a real-time scenario. In this paper, we investigate the feasibility of using machine learning to provide easily computed but approximate models that would be applicable in the field. We develop a machine learning method based on Support Vector Machine regression and classification. We apply our method to problems of estimating contamination levels and contaminant source location. © 2007 IEEE.

Boolean dynamics of genetic regulatory networks inferred from microarray time series data

Bioinformatics

Martin, Shawn; Zhang, Zhaoduo Z.; Martino, Anthony M.; Faulon, Jean-Loup M.

Motivation: Methods available for the inference of genetic regulatory networks strive to produce a single network, usually by optimizing some quantity to fit the experimental observations. In this article we investigate the possibility that multiple networks can be inferred, all resulting in similar dynamics. This idea is motivated by theoretical work which suggests that biological networks are robust and adaptable to change, and that the overall behavior of a genetic regulatory network might be captured in terms of dynamical basins of attraction. Results: We have developed and implemented a method for inferring genetic regulatory networks from time series microarray data. Our method first clusters and discretizes the gene expression data using k-means and support vector regression. We then enumerate Boolean activation-inhibition networks to match the discretized data. Finally, the dynamics of the Boolean networks are examined. We have tested our method on two immunology microarray datasets: an IL-2-stimulated T cell response dataset and an LPS-stimulated macrophage response dataset. In both cases, we discovered that many networks matched the data, and that most of these networks had similar dynamics. © The Author 2007. Published by Oxford University Press. All rights reserved.

Developing algorithms for predicting protein-protein interactions of homology modeled proteins

Roe, Diana C.; Sale, Kenneth L.; Faulon, Jean-Loup M.; Martin, Shawn

The goal of this project was to examine the protein-protein docking problem, especially as it relates to homology-based structures, identify the key bottlenecks in current software tools, and evaluate and prototype new algorithms that may be developed to address these bottlenecks. This report describes the current challenges in the protein-protein docking problem: correctly predicting the binding site for the protein-protein interaction and correctly placing the sidechains. Two different and complementary approaches are taken that can help with the protein-protein docking problem. The first approach is to predict interaction sites prior to docking, and uses bioinformatics studies of protein-protein interactions to predict these interaction sites. The second approach is to improve validation of predicted complexes after docking, and uses an improved scoring function for evaluating proposed docked poses, incorporating a solvation term. This scoring function demonstrates significant improvement over current state-of-the-art functions. Initial studies on both of these approaches are promising, and argue for full development of these algorithms.

Application of multidisciplinary analysis to gene expression

Davidson, George S.; Haaland, David M.; Martin, Shawn

Molecular analysis of cancer, at the genomic level, could lead to individualized patient diagnostics and treatments. The developments to follow will signal a significant paradigm shift in the clinical management of human cancer. Despite our initial hopes, however, it seems that simple analysis of microarray data cannot elucidate clinically significant gene functions and mechanisms. Extracting biological information from microarray data requires a complicated path involving multidisciplinary teams of biomedical researchers, computer scientists, mathematicians, statisticians, and computational linguists. The integration of the diverse outputs of each team is the limiting factor in the progress to discover candidate genes and pathways associated with the molecular biology of cancer. Specifically, one must deal with sets of significant genes identified by each method and extract whatever useful information may be found by comparing these different gene lists. Here we present our experience with such comparisons, and share methods developed in the analysis of an infant leukemia cohort studied on Affymetrix HG-U95A arrays. In particular, spatial gene clustering, hyper-dimensional projections, and computational linguistics were used to compare different gene lists. In spatial gene clustering, different gene lists are grouped together and visualized on a three-dimensional expression map, where genes with similar expressions are co-located. In another approach, projections from gene expression space onto a sphere clarify how groups of genes can jointly have more predictive power than groups of individually selected genes. Finally, online literature is automatically rearranged to present information about genes common to multiple groups, or to contrast the differences between the lists. The combination of these methods has improved our understanding of infant leukemia. 
While the complicated reality of the biology dashed our initial, optimistic hopes for simple answers from microarrays, we have made progress by combining very different analytic approaches.
