Open Source Software Project Review
Abstract not provided.
Abstract not provided.
Our aim is to determine the network of events, or the regulatory network, that defines an immune response to a bio-toxin. As a model system, we are studying the T cell regulatory network triggered by tyrosine kinase receptor activation, using a combination of pathway stimulation and time-series microarray experiments. Our approach is composed of five steps: (1) microarray experiments and data error analysis, (2) data clustering, (3) data smoothing and discretization, (4) network reverse engineering, and (5) network dynamics analysis and fingerprint identification. The technological outcome of this study is a suite of experimental protocols and computational tools that reverse engineer regulatory networks given gene expression data. The practical biological outcome of this work is an immune response fingerprint expressed in terms of gene expression levels. Inferring regulatory networks from microarray data is a new field of investigation that is no more than five years old. To the best of our knowledge, this work is the first attempt to integrate experiments, error analyses, data clustering, inference, and network analysis to solve a practical problem. Our systematic approach of counting, enumerating, and sampling networks that match experimental data is new to the field of network reverse engineering. The resulting mathematical analyses and computational tools lead to new results in their own right and should be useful to others who analyze and infer networks.
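As an illustration of steps (3) and (4), the sketch below (a toy example of the general approach, not the project's tools) discretizes a short expression time series and enumerates single-regulator Boolean rules consistent with the observed state transitions. The gene names, expression values, and thresholds are all hypothetical.

```python
# Toy illustration of discretization and consistency counting for network
# reverse engineering; not the project's code. Gene names and data are invented.
import itertools
import numpy as np

# Hypothetical smoothed expression time series: rows = time points, cols = genes.
expr = np.array([
    [0.2, 0.9, 0.1],
    [0.8, 0.7, 0.2],
    [0.9, 0.3, 0.8],
    [0.7, 0.2, 0.9],
])
genes = ["lck", "zap70", "il2"]          # assumed gene labels

# Step 3 (toy version): discretize around each gene's median expression level.
states = (expr > np.median(expr, axis=0)).astype(int)

# Step 4 (toy version): for each target gene, enumerate single-regulator
# Boolean rules (identity or negation) and keep those consistent with every
# observed transition t -> t+1; counting such rules is the core idea.
rules = {"copy": lambda x: x, "not": lambda x: 1 - x}
consistent = {}
for tgt in range(states.shape[1]):
    hits = []
    for reg, (name, f) in itertools.product(range(states.shape[1]), rules.items()):
        if all(f(states[t, reg]) == states[t + 1, tgt] for t in range(len(states) - 1)):
            hits.append((genes[reg], name))
    consistent[genes[tgt]] = hits

print(consistent)   # candidate regulator/rule pairs per target gene
```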
The thermal challenge problem has been developed at Sandia National Laboratories as a testbed for demonstrating various types of validation approaches and prediction methods. This report discusses one particular methodology to assess the validity of a computational model given experimental data. This methodology is based on Bayesian Belief Networks (BBNs) and can incorporate uncertainty in experimental measurements, in physical quantities, and in the model itself. The approach uses the prior and posterior distributions of model output to compute a validation metric based on Bayesian hypothesis testing (a Bayes factor). This report discusses various aspects of the BBN, specifically in the context of the thermal challenge problem. A BBN is developed for a given set of experimental data in a particular experimental configuration. The development of the BBN and the method for "solving" the BBN to obtain the posterior distribution of model output through Markov chain Monte Carlo (MCMC) sampling are discussed in detail. The use of the BBN to compute a Bayes factor is demonstrated.
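A minimal sketch of this kind of metric, assuming a toy one-parameter thermal model, a simple Metropolis sampler, and a Savage-Dickey-style prior/posterior density ratio at the observed value. The model, prior, and measurement numbers are invented for illustration and are not the challenge-problem specification.

```python
# Sketch of a Bayes-factor style validation metric: compare the prior and
# posterior densities of a model output at the experimentally observed value.
# The model, priors, and data below are hypothetical.
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(0)

def model(k):                  # toy "thermal" model: output temperature vs conductivity k
    return 300.0 + 50.0 / k

obs, obs_sigma = 340.0, 2.0    # assumed measurement and its uncertainty

# Prior over the uncertain physical quantity and the induced prior on model output.
k_prior = rng.normal(1.3, 0.2, 20000)
y_prior = model(k_prior)

# Metropolis MCMC for the posterior of k given the measurement.
def log_post(k):
    return norm.logpdf(k, 1.3, 0.2) + norm.logpdf(obs, model(k), obs_sigma)

k, chain = 1.3, []
for _ in range(10000):
    prop = k + rng.normal(0, 0.05)
    if np.log(rng.uniform()) < log_post(prop) - log_post(k):
        k = prop
    chain.append(k)
y_post = model(np.array(chain[1000:]))   # discard burn-in

# Bayes factor approximated as the posterior/prior density ratio of the
# model output evaluated at the observed value.
B = gaussian_kde(y_post)(obs)[0] / gaussian_kde(y_prior)(obs)[0]
print(f"Bayes factor ~ {B:.2f}  (>1 favors agreement with the data)")
```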
The formation and functions of living materials and organisms are fundamentally different from those of synthetic materials and devices. Synthetic materials tend to have static structures and are not capable of adapting to the functional needs of changing environments. In contrast, living systems utilize energy to create, heal, reconfigure, and dismantle materials in a dynamic, non-equilibrium fashion. The overall goal of the project was to organize and reconfigure functional assemblies of nanoparticles using strategies that mimic those found in living systems. Active assembly of nanostructures was studied using active biomolecules to drive the organization and assembly of nanocomposite materials. In this system, kinesin motor proteins and microtubules were used to direct the transport and interactions of nanoparticles at synthetic interfaces. In addition, the kinesin/microtubule transport system was used to actively assemble nanocomposite materials capable of storing significant elastic energy. Novel biophysical measurement tools were also developed for measuring the collective force generated by kinesin motor proteins, which will provide insight into the mechanical constraints of active assembly processes. Responsive reconfiguration of nanostructures was studied by using active biomolecules to modulate the optical properties of quantum dot (QD) arrays through changes in inter-particle spacing and the associated energy transfer interactions. Design rules for kinesin-based transport of a wide range of nanoscale cargo (e.g., nanocrystal quantum dots, micron-sized polymer spheres) were developed. Three-dimensional microtubule organizing centers were assembled in which the polar orientation of the microtubules was controlled by a multi-staged assembly process. Overall, a number of enabling technologies were developed over the course of this project, and they will drive the exploitation of energy-driven processes to regulate the assembly, disassembly, and dynamic reorganization of nanomaterials.
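As a worked illustration of the spacing dependence exploited here, Förster-type energy transfer efficiency falls off as the sixth power of the donor-acceptor separation, so small changes in inter-dot spacing strongly modulate the optical response. The Förster radius and spacings below are assumed values, not measurements from this project.

```python
# Illustration (not project data) of why modulating inter-particle spacing
# changes energy transfer in a QD array: E = 1 / (1 + (r/R0)^6).
import numpy as np

R0 = 6.0                                  # assumed Forster radius, nm
r = np.array([3.0, 5.0, 6.0, 8.0, 12.0])  # hypothetical inter-dot spacings, nm
E = 1.0 / (1.0 + (r / R0) ** 6)
for ri, Ei in zip(r, E):
    print(f"spacing {ri:4.1f} nm -> transfer efficiency {Ei:.2f}")
```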
Understanding the properties and behavior of biomembranes is fundamental to many biological processes and technologies. Microdomains in biomembranes, or "lipid rafts," are now known to be an integral part of cell signaling, vesicle formation, fusion processes, protein trafficking, and viral and toxin infection processes. Understanding how microdomains form, how they depend on membrane constituents, and how they act not only has biological implications but will also impact Sandia's effort in the development of membranes that structurally adapt to their environment in a controlled manner. To provide such understanding, we created physically based models of biomembranes. Molecular dynamics (MD) simulations and classical density functional theory (DFT) calculations using these models were applied to phenomena such as microdomain formation, membrane fusion, pattern formation, and protein insertion. Because lipid dynamics and self-organization in membranes occur on length and time scales beyond atomistic MD, we used coarse-grained models of double-tail lipid molecules that spontaneously self-assemble into bilayers. DFT provided equilibrium information on membrane structure. Experimental work was performed to further help elucidate the fundamental membrane organization principles.
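A small analysis sketch, assuming coarse-grained tail bead coordinates are available from such a simulation, of the tail order parameter commonly used to quantify bilayer ordering. The coordinates here are randomly generated stand-ins; this is not the project's code.

```python
# Quantify bilayer ordering in a coarse-grained double-tail lipid simulation via
# the bond order parameter S = <(3 cos^2(theta) - 1) / 2>, where theta is the
# angle between a tail bond vector and the bilayer normal (z axis here).
# The tail bead positions below are random stand-ins for MD output.
import numpy as np

rng = np.random.default_rng(1)
n_lipids, beads_per_tail = 200, 4
# Fake coarse-grained tail bead positions: (lipid, bead, xyz), roughly aligned with z.
tails = np.cumsum(
    rng.normal([0.0, 0.0, 0.5], 0.15, size=(n_lipids, beads_per_tail, 3)), axis=1
)

bonds = np.diff(tails, axis=1)                          # consecutive bead-to-bead vectors
cos_t = bonds[..., 2] / np.linalg.norm(bonds, axis=-1)  # cos(theta) w.r.t. z normal
S = 0.5 * (3.0 * cos_t**2 - 1.0)
print(f"mean tail order parameter S = {S.mean():.2f}  (1 = fully aligned, 0 = isotropic on average)")
```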
This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian, or that have an inaccurate Jacobian, to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, the Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, when the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed. We identify conditions under which Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods that compute each step from a local quadratic model rather than a linear model. The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write and easily portable. However, the method usually takes twice as long to solve as Newton-GMRES on general problems because it solves two linear systems at each iteration. We discuss modifications to Bouaricha's method for a practical implementation, including a special globalization technique and other modifications for greater efficiency. We present numerical results showing computational advantages over Newton-GMRES on some realistic problems. We further discuss a new approach for dealing with singular (or ill-conditioned) matrices. In particular, we modify an algorithm for identifying a turning point so that an increasingly ill-conditioned Jacobian does not prevent convergence.
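A minimal dense-matrix sketch of Broyden's method (not the limited-memory, large-scale implementation described here), showing how the secant update avoids evaluating a Jacobian after the initial one. The two-equation test system and starting guess are arbitrary.

```python
# Basic "good" Broyden's method on a small dense system; illustrative only.
import numpy as np

def fd_jacobian(F, x, eps=1e-7):
    """Finite-difference Jacobian used only to seed the Broyden approximation."""
    Fx = F(x)
    J = np.empty((len(Fx), len(x)))
    for j in range(len(x)):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (F(xp) - Fx) / eps
    return J

def broyden(F, x0, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float)
    B = fd_jacobian(F, x)                       # initial Jacobian approximation
    Fx = F(x)
    for _ in range(max_iter):
        s = np.linalg.solve(B, -Fx)             # quasi-Newton step: B s = -F(x)
        x = x + s
        F_new = F(x)
        y = F_new - Fx
        B += np.outer(y - B @ s, s) / (s @ s)   # rank-one secant update of B
        Fx = F_new
        if np.linalg.norm(Fx) < tol:
            break
    return x

# Example: a small system with no hand-coded Jacobian, from a rough initial guess.
F = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, np.exp(x[0]) + x[1] - 1.0])
print(broyden(F, [-1.8, 0.8]))
```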
Proposed for publication in the Journal of Fluid Mechanics.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This SAND report provides the technical progress through April 2005 of the Sandia-led project, "Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling," funded by the DOE Office of Science Genomics:GTL Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO2 are important terms in the global environmental response to anthropogenic atmospheric inputs of CO2 and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacterium known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and will include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, and identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes. In addition, we will develop a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. The ultimate goal of this effort is to develop and apply the new experimental and computational methods needed to generate a new level of understanding of how the Synechococcus genome affects carbon fixation at the global scale. Anticipated experimental and computational methods will provide ever-increasing insight into the individual elements and steps in the carbon fixation process; however, relating an organism's genome to its cellular response under varying environmental conditions will require systems biology approaches. Thus, a primary goal for this effort is to integrate the genomic data generated from experiments and lower level simulations with data from the existing body of literature into a whole cell model. We plan to accomplish this by developing and applying a set of tools for capturing the complex carbon fixation behavior of Synechococcus at different levels of resolution.
Finally, the explosion of data being produced by high-throughput experiments requires data analysis tools and models that are more computationally complex and more heterogeneous, and that must be coupled to ever-increasing amounts of experimentally obtained data in varying formats. These challenges are unprecedented in high performance scientific computing and necessitate the development of a companion computational infrastructure to support this effort.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report describes the test and evaluation methods by which the Teraflops Operating System (TOS), which resides on Sandia's massively parallel computer Janus, is verified for production release. Also discussed are the methods used to build TOS before testing and evaluation, miscellaneous utility scripts, a sample test plan, and a proposed post-test method for quickly examining the large number of test results. The purpose of the report is threefold: (1) to provide a guide to T&E procedures, (2) to aid and guide others who will run T&E procedures on the new ASCI Red Storm machine, and (3) to document some of the history of evaluating and testing TOS. This report is not intended to serve as an exhaustive manual for testers to conduct T&E procedures.
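A sketch of the kind of post-test summarizer proposed here, assuming a hypothetical directory of per-test log files with PASSED/FAILED markers; the actual TOS harness layout and markers may differ.

```python
# Hypothetical post-test summarizer: scan a tree of test logs, tally results,
# and list the logs that need a closer look. File layout and markers are assumed.
import sys
from pathlib import Path

def summarize(results_dir):
    """Tally PASSED/FAILED markers across all *.log files under results_dir."""
    passed, failed = [], []
    for log in sorted(Path(results_dir).glob("**/*.log")):
        text = log.read_text(errors="replace")
        if "PASSED" in text and "FAILED" not in text:
            passed.append(log)
        else:
            failed.append(log)                 # failed, or no clear marker
    print(f"{len(passed)} passed, {len(failed)} failed or inconclusive")
    for log in failed:
        print(f"  check: {log}")

if __name__ == "__main__":
    summarize(sys.argv[1] if len(sys.argv) > 1 else ".")
```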
Threats to water distribution systems include release of contaminants and Denial of Service (DoS) attacks. A better understanding, and validated computational models, of the flow in water distribution systems would enable determination of sensor placement in real water distribution networks, allow source identification, and guide mitigation/minimization efforts. Validation data are needed to evaluate numerical models of network operations. Some data can be acquired in real-world tests, but these are limited by (1) unknown demand, (2) lack of repeatability, (3) too many sources of uncertainty (demand, friction factors, etc.), and (4) expense. In addition, real-world tests have limited numbers of network access points. A scale-model water distribution system was therefore fabricated, and validation data were acquired over a range of flow (demand) conditions. Standard operating variables included system layout, demand at various nodes in the system, and pressure drop across various pipe sections. In addition, the location of contaminant (salt or dye) introduction was varied. Measurements of pressure, flow rate, and concentration at a large number of points, along with overall visualization of dye transport through the flow network, were completed. Scale-up issues that were incorporated in the experiment design include Reynolds number, pressure drop across nodes, and pipe friction and roughness. The scale was chosen to be 20:1, so the 10 inch main was modeled with a 0.5 inch pipe in the physical model. Controlled tracer tests were run to provide validation data for flow and transport models, especially of the degree of mixing at pipe junctions. Results of the pipe mixing experiments showed large deviations from predicted behavior, and these deviations have a large impact on standard network operations models.
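A worked example of the Reynolds-number scale-up consideration, assuming both prototype and model carry water; the prototype demand is an invented value, and matching Re at 20:1 scale drives the model velocity up by the same factor of 20.

```python
# Reynolds-number matching between a 10 in. prototype main and the 0.5 in.
# model pipe: Re = rho*V*D/mu = 4*rho*Q/(pi*mu*D), so equal Re with the same
# fluid requires Q_model = Q_proto * (D_model / D_proto). Values are illustrative.
import math

rho, mu = 998.0, 1.0e-3                        # water at ~20 C: kg/m^3, Pa*s
D_proto, D_model = 10 * 0.0254, 0.5 * 0.0254   # pipe diameters, m
Q_proto = 0.02                                 # assumed prototype demand, m^3/s (~317 gpm)

Re_proto = 4 * rho * Q_proto / (math.pi * mu * D_proto)
Q_model = Q_proto * (D_model / D_proto)        # = Q_proto / 20
Re_model = 4 * rho * Q_model / (math.pi * mu * D_model)
V_model = Q_model / (math.pi * D_model**2 / 4) # velocity scales up by the 20:1 ratio

print(f"Re_proto = {Re_proto:.3g}, Re_model = {Re_model:.3g} (matched)")
print(f"model flow {Q_model*1000:.2f} L/s at {V_model:.1f} m/s")
```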
Abstract not provided.
This work focuses on different methods to generate confidence regions for nonlinear parameter identification problems. Three methods for confidence region estimation are considered: a linear approximation method, an F-test method, and a log-likelihood method. Each of these methods is applied to three case studies. One case study is a problem with synthetic data, and the other two case studies identify hydraulic parameters in groundwater flow problems based on experimental well-test results. The confidence regions for each case study are analyzed and compared. Although the F-test and log-likelihood methods result in similar regions, there are differences between these regions and the regions generated by the linear approximation method for nonlinear problems. The differing results, capabilities, and drawbacks of all three methods are discussed.
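A sketch, on synthetic data with a hypothetical two-parameter model, of how an F-test region and a linearized region can be constructed and compared on a parameter grid. It illustrates the general techniques, not the report's case studies.

```python
# Compare a linearized confidence region with an F-test region for the
# two-parameter nonlinear model y = a * (1 - exp(-b * t)); data are synthetic.
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import f

rng = np.random.default_rng(2)
t = np.linspace(0.1, 5.0, 25)
a_true, b_true, sigma = 2.0, 1.2, 0.05
y = a_true * (1 - np.exp(-b_true * t)) + rng.normal(0, sigma, t.size)

resid = lambda p: p[0] * (1 - np.exp(-p[1] * t)) - y
fit = least_squares(resid, x0=[1.0, 1.0])
p_hat, n, k = fit.x, t.size, 2
sse_hat = np.sum(fit.fun**2)

# F-test region: all theta with SSE(theta) <= SSE_hat * (1 + k/(n-k) * F_crit).
F_crit = f.ppf(0.95, k, n - k)
thresh = sse_hat * (1 + k / (n - k) * F_crit)

# Linear-approximation region: ellipse from (J^T J)^{-1} at the optimum.
J = fit.jac
cov = np.linalg.inv(J.T @ J) * sse_hat / (n - k)

# Evaluate both regions on a parameter grid around the estimate.
aa, bb = np.meshgrid(np.linspace(1.8, 2.2, 121), np.linspace(0.9, 1.5, 121))
sse_grid = np.array([[np.sum(resid([ai, bi])**2) for ai, bi in zip(ra, rb)]
                     for ra, rb in zip(aa, bb)])
in_ftest = sse_grid <= thresh
d = np.stack([aa - p_hat[0], bb - p_hat[1]], axis=-1)
maha = np.einsum("...i,ij,...j->...", d, np.linalg.inv(cov), d)
in_linear = maha <= k * F_crit   # same critical value applied to the quadratic form

print(f"grid points inside F-test region: {in_ftest.sum()}")
print(f"grid points inside linear region: {in_linear.sum()}")
```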
International Journal for Numerical Methods in Engineering
The goal of our paper is to compare several algorithms for computing a large number of eigenvectors of the generalized symmetric eigenvalue problem arising from a modal analysis of elastic structures. The shift-invert Lanczos algorithm has emerged as the workhorse for the solution of this generalized eigenvalue problem; however, a sparse direct factorization is required for the resulting set of linear equations. Instead, our paper considers the use of preconditioned iterative methods. We present a brief review of available preconditioned eigensolvers, followed by a numerical comparison on three problems using a scalable algebraic multigrid (AMG) preconditioner. Copyright © 2005 John Wiley & Sons, Ltd.
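A sketch of the preconditioned-iterative alternative discussed here, assuming the pyamg package for the AMG preconditioner and using a 2-D Poisson matrix with a scaled identity as stand-ins for the stiffness and mass matrices of an elastic model; this is not the paper's test suite.

```python
# Compute the lowest eigenpairs of a generalized problem K x = lambda M x with
# LOBPCG preconditioned by algebraic multigrid, instead of shift-invert Lanczos
# with a sparse direct factorization. K and M below are illustrative stand-ins.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lobpcg
import pyamg

n = 100
K = pyamg.gallery.poisson((n, n), format="csr")             # stand-in stiffness matrix
M = sp.identity(K.shape[0], format="csr") * (1.0 / n**2)    # stand-in (lumped) mass matrix

ml = pyamg.smoothed_aggregation_solver(K)                   # AMG hierarchy for K
precond = ml.aspreconditioner()                             # apply V-cycles as the preconditioner

rng = np.random.default_rng(3)
X = rng.standard_normal((K.shape[0], 10))                   # block of 10 starting vectors

vals, vecs = lobpcg(K, X, B=M, M=precond, largest=False, tol=1e-8, maxiter=200)
print(np.sort(vals))                                        # approximate lowest eigenvalues
```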
Semantic graphs offer one promising avenue for intelligence analysis in homeland security. They provide a mechanism for describing a wide variety of relationships between entities of potential interest. The vertices are nouns of various types, e.g., people, organizations, and events. Edges in the graph represent different types of relationships between entities, e.g., 'is friends with', 'belongs to', etc. Semantic graphs offer a number of potential advantages as a knowledge representation system. They allow information of different kinds, and collected in differing ways, to be combined in a seamless manner. A semantic graph is a very compressed representation of the relationship information in the data; it has been reported that a semantic graph can be two orders of magnitude smaller than the processed intelligence data. This allows much larger portions of the data universe to be resident in computer memory. Many intelligence queries that are relevant to the terrorist threat are naturally expressed in the language of semantic graphs. One example is the search for 'interesting' relationships between two individuals, or between an individual and an event, which can be phrased as a search for short paths in the graph. Another example is the search for an analyst-specified threat pattern, which can be cast as an instance of subgraph isomorphism. It is important to note that many kinds of analysis are not relationship based, so these are not good candidates for semantic graphs. Thus, a semantic graph should always be used in conjunction with traditional knowledge representation and interface methods. Operations that involve looking for chains of relationships (e.g., friend of a friend) are not efficiently executable in a traditional relational database. However, the semantic graph can be thought of as a pre-join of the database, and it is ideally suited for these kinds of operations. Researchers at Sandia National Laboratories are working to facilitate semantic graph analysis. Since intelligence datasets can be extremely large, the focus of this work is on the use of parallel computers. We have been working to develop scalable parallel algorithms that will be at the core of a semantic graph analysis infrastructure. Our work has involved two different thrusts, corresponding to two different computer architectures. The first architecture of interest is distributed-memory, message-passing computers. These machines are ubiquitous and affordable, but they are challenging targets for graph algorithms. Much of our distributed-memory work to date has been collaborative with researchers at Lawrence Livermore National Laboratory and has focused on finding short paths on distributed-memory parallel machines. Our implementation on 32K processors of BlueGene/Light finds shortest paths between two specified vertices in just over a second for random graphs with 4 billion vertices.
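A serial sketch of the core shortest-path query (the distributed-memory BlueGene implementation is far beyond this), over a tiny hypothetical semantic graph with labeled relationships.

```python
# Breadth-first search for one shortest relationship path between two entities
# in a toy semantic graph. Entities and relationship labels are hypothetical.
from collections import deque

# Adjacency list: entity -> list of (neighbor entity, relationship label).
graph = {
    "alice": [("acme corp", "works-for"), ("bob", "is-friends-with")],
    "bob": [("conference X", "attended"), ("alice", "is-friends-with")],
    "acme corp": [("conference X", "sponsored")],
    "conference X": [],
}

def shortest_path(graph, src, dst):
    """Return one shortest src -> dst path as a list of labeled edges, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v, rel in graph.get(u, []):
            if v not in parent:
                parent[v] = (u, rel)
                queue.append(v)
    if dst not in parent:
        return None
    path, node = [], dst
    while parent[node] is not None:
        u, rel = parent[node]
        path.append(f"{u} --{rel}--> {node}")
        node = u
    return list(reversed(path))

print(shortest_path(graph, "alice", "conference X"))
```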
Abstract not provided.