Model Discrepancy Calibration and Propagation Across Experimental Settings
Abstract not provided.
Abstract not provided.
This report summarizes the work performed under the project "Linear Programming in Strongly Polynomial Time." Linear programming (LP) is a classic combinatorial optimization problem heavily used directly and as an enabling subroutine in integer programming (IP). Specifically IP is the same as LP except that some solution variables must take integer values (e.g. to represent yes/no decisions). Together LP and IP have many applications in resource allocation including general logistics, and infrastructure design and vulnerability analysis. The project was motivated by the PI's recent success developing methods to efficiently sample Voronoi vertices (essentially finding nearest neighbors in high-dimensional point sets) in arbitrary dimension. His method seems applicable to exploring the high-dimensional convex feasible space of an LP problem. Although the project did not provably find a strongly-polynomial algorithm, it explored multiple algorithm classes. The new medial simplex algorithms may still lead to solvers with improved provable complexity. We describe medial simplex algorithms and some relevant structural/complexity results. We also designed a novel parallel LP algorithm based on our geometric insights and implemented it in the Spoke-LP code. A major part of the computational step is many independent vector dot products. Our parallel algorithm distributes the problem constraints across processors. Current commercial and high-quality free LP solvers require all problem details to fit onto a single processor or multicore. Our new algorithm might enable the solution of problems too large for any current LP solvers. We describe our new algorithm, give preliminary proof-of-concept experiments, and describe a new generator for arbitrarily large LP instances.
Abstract not provided.
Abstract not provided.
Abstract not provided.
The Dakota toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. Dakota contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the Dakota toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a user’s manual for the Dakota software and provides capability overviews and procedures for software execution, as well as a variety of example studies.
Abstract not provided.
Proceedings of IPDRM 2020: 4th Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis
As network speeds increase, the overhead of processing incoming messages is becoming onerous enough that many manufacturers now provide network interface cards (NICs) with offload capabilities to handle these overheads. This increase in NIC capabilities creates an opportunity to enable computation on data in-situ on the NIC. These enhanced NICs can be classified into several different categories of SmartNICs. SmartNICs present an interesting opportunity for future runtime software designs. Designing runtime software to be located in the network as opposed to the host level leads to new radical distributed runtime possibilities that were not practical prior to SmartNICs. In the process of transitioning to a radically different runtime software design for SmartNICs there are intermediary steps of migrating current runtime software to be offloaded onto a SmartNIC that also present interesting possibilities. This paper will describe SmartNIC design and how SmartNICs can be leveraged to offload current generation runtime software and lead to future radically different in-network distributed runtime systems.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
The Computer Science Research Institute (CSRI) brings university faculty and students to Sandia for focused collaborative research on Department of Energy (DOE) computer and computational science problems. The institute provides an opportunity for university researchers to learn about problems in computer and computational science at DOE laboratories. Participants conduct leading-edge research, interact with scientists and engineers at the laboratories, and help transfer results of their research to programs at the labs. Some specific CSRI research interest areas are: scalable solvers, optimization, adaptivity and mesh refinement, graph-based, discrete, and combinatorial algorithms, uncertainty estimation, mesh generation, dynamic load-balancing, virus and other malicious-code defense, visualization, scalable cluster computers, data-intensive computing, environments for scalable computing, parallel input/output, advanced architectures, and theoretical computer science. The CSRI Summer Program is organized by CSRI and typically includes the organization of a weekly seminar series and the publication of a summer proceedings. In 2020, the CSRI summer program was executed completely virtually; all student interns worked from home, due to the COVID-19 pandemic.
Proceedings of IA3 2020: 10th Workshop on Irregular Applications: Architectures and Algorithms, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis
Graph coloring is often used in parallelizing scientific computations that run in distributed and multi-GPU environments; it identifies sets of independent data that can be updated in parallel. Many algorithms exist for graph coloring on a single GPU or in distributed memory, but hybrid MPI+GPU algorithms have been unexplored until this work, to the best of our knowledge. We present several MPI+GPU coloring approaches that use implementations of the distributed coloring algorithms of Gebremedhin et al. and the shared-memory algorithms of Deveci et al. The on-node parallel coloring uses implementations in KokkosKernels, which provide parallelization for both multicore CPUs and GPUs. We further extend our approaches to solve for distance-2 coloring, giving the first known distributed and multi-GPU algorithm for this problem. In addition, we propose novel methods to reduce communication in distributed graph coloring. Our experiments show that our approaches operate efficiently on inputs too large to fit on a single GPU and scale up to graphs with 76.7 billion edges running on 128 GPUs.
Abstract not provided.
Abstract not provided.
Optimization Online Repository
Abstract not provided.