The maximum contact map overlap (MAX-CMO) between a pair of protein structures can be used as a measure of protein similarity. It is a purely topological measure and does not depend on the sequence of the pairs involved in the comparison. More importantly, the MAX-CMO present a very favorable mathematical structure which allows the formulation of integer, linear and Lagrangian models that can be used to obtain guarantees of optimality. It is not the intention of this paper to discuss the mathematical properties of MAX-CMO in detail as this has been dealt elsewhere. In this paper we compare three algorithms that can be used to obtain maximum contact map overlaps between protein structures. We will point to the weaknesses and strengths of each one. It is our hope that this paper will encourage researchers to develop new and improve methods for protein comparison based on MAX-CMO.
We introduce a filter-based evolutionary algorithm (FEA) for constrained optimization. The filter used by an FEA explicitly imposes the concept of dominance on a partially ordered solution set. We show that the algorithm is provably robust for both linear and nonlinear problems and constraints. FEAs use a finite pattern of mutation offsets, and our analysis is closely related to recent convergence results for pattern search methods. We discuss how properties of this pattern impact the ability of an FEA to converge to a constrained local optimum.
The authors describe a naturalistic behavioral model for the simulation of small unit combat. This model, Klein's recognition-primed decision making (RPD) model, is driven by situational awareness rather than a rational process of selecting from a set of action options. They argue that simulated combatants modeled with RPD will have more flexible and realistic responses to a broad range of small-scale combat scenarios. Furthermore, they note that the predictability of a simulation using an RPD framework can be easily controlled to provide multiple evaluations of a given combat scenario. Finally, they discuss computational issues for building an RPD-based behavior engine for fully automated combatants in small conflict scenarios, which are being investigated within Sandia's Next Generation Site Security project.
A key strategy for protecting municipal water supplies is the use of sensors to detect the presence of contaminants in associated water distribution systems. Deploying a contamination warning system involves the placement of a limited number of sensors—placed in order to maximize the level of protection afforded. Researchers have proposed several models and algorithms for generating such placements, each optimizing with respect to a different design objective. The use of disparate design objectives raises several questions: (1) What is the relationship between optimal sensor placements for different design objectives? and (2) Is there any risk in focusing on specific design objectives? We model the sensor placement problem via a mixed-integer programming formulation of the well-known p-median problem from facility location theory to answer these questions. Our model can express a broad range of design objectives. Using three large test networks, we show that optimal solutions with respect to one design objective are often highly sub-optimal with respect to other design objectives. However, it is sometimes possible to construct solutions that are simultaneously near-optimal with respect to a range of design objectives. The design of contamination warning systems thus requires careful and simultaneous consideration of multiple, disparate design objectives.
We describe COLIN, a Common Optimization Library INterface for C++. COLIN provides C++ template classes that define a generic interface for both optimization problems and optimization solvers. COLIN is specifically designed to facilitate the development of hybrid optimizers, for which one optimizer calls another to solve an optimization subproblem. We illustrate the capabilities of COLIN with an example of a memetic genetic programming solver.
Biofouling, the unwanted growth of biofilms on a surface, of water-treatment membranes negatively impacts in desalination and water treatment. With biofouling there is a decrease in permeate production, degradation of permeate water quality, and an increase in energy expenditure due to increased cross-flow pressure needed. To date, a universal successful and cost-effect method for controlling biofouling has not been implemented. The overall goal of the work described in this report was to use high-performance computing to direct polymer, material, and biological research to create the next generation of water-treatment membranes. Both physical (micromixers - UV-curable epoxy traces printed on the surface of a water-treatment membrane that promote chaotic mixing) and chemical (quaternary ammonium groups) modifications of the membranes for the purpose of increasing resistance to biofouling were evaluated. Creation of low-cost, efficient water-treatment membranes helps assure the availability of fresh water for human use, a growing need in both the U. S. and the world.
The U.S. Department of Energy recently announced the first five grants for the Genomes to Life (GTL) Program. The goal of this program is to ''achieve the most far-reaching of all biological goals: a fundamental, comprehensive, and systematic understanding of life.'' While more information about the program can be found at the GTL website (www.doegenomestolife.org), this paper provides an overview of one of the five GTL projects funded, ''Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling.'' This project is a combined experimental and computational effort emphasizing developing, prototyping, and applying new computational tools and methods to elucidate the biochemical mechanisms of the carbon sequestration of Synechococcus Sp., an abundant marine cyanobacteria known to play an important role in the global carbon cycle. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO(2) are important terms in the global environmental response to anthropogenic atmospheric inputs of CO(2) and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. The project includes five subprojects: an experimental investigation, three computational biology efforts, and a fifth which deals with addressing computational infrastructure challenges of relevance to this project and the Genomes to Life program as a whole. Our experimental effort is designed to provide biology and data to drive the computational efforts and includes significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Our computational efforts include coupling molecular simulation methods with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes and developing a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. Furthermore, given that the ultimate goal of this effort is to develop a systems-level of understanding of how the Synechococcus genome affects carbon fixation at the global scale, we will develop and apply a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. Finally, because the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats, we have also established a companion computational infrastructure to support this effort as well as the Genomes to Life program as a whole.
We consider the convergence properties of a non-elitist self-adaptive evolutionary strategy (ES) on multi-dimensional problems. In particular, we apply our recent convergence theory for a discretized (1,{lambda})-ES to design a related (1,{lambda})-ES that converges on a class of seperable, unimodal multi-dimensional problems. The distinguishing feature of self-adaptive evolutionary algorithms (EAs) is that the control parameters (like mutation step lengths) are evolved by the evolutionary algorithm. Thus the control parameters are adapted in an implicit manner that relies on the evolutionary dynamics to ensure that more effective control parameters are propagated during the search. Self-adaptation is a central feature of EAs like evolutionary stategies (ES) and evolutionary programming (EP), which are applied to continuous design spaces. Rudolph summarizes theoretical results concerning self-adaptive EAs and notes that the theoretical underpinnings for these methods are essentially unexplored. In particular, convergence theories that ensure convergence to a limit point on continuous spaces have only been developed by Rudolph, Hart, DeLaurentis and Ferguson, and Auger et al. In this paper, we illustrate how our analysis of a (1,{lambda})-ES for one-dimensional unimodal functions can be used to ensure convergence of a related ES on multidimensional functions. This (1,{lambda})-ES randomly selects a search dimension in each iteration, along which points generated. For a general class of separable functions, our analysis shows that the ES searches along each dimension independently, and thus this ES converges to the (global) minimum.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a developers manual for the DAKOTA software and describes the DAKOTA class hierarchies and their interrelationships. It derives directly from annotation of the actual source code and provides detailed class documentation, including all member functions and attributes.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a reference manual for the commands specification for the DAKOTA software, providing input overviews, option descriptions, and example specifications.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, analytic reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a user's manual for the DAKOTA software and provides capability overviews and procedures for software execution, as well as a variety of example studies.