The CCR collaborates on innovations in tool development, component development, and scalable algorithm research with partners and customers around the world through open source projects. Current software projects focus on enabling technologies for scientific computing in areas such as machine learning, graph algorithms, cognitive modeling, visualization, optimization, large-scale multi-physics simulation, HPC miniapplications, HPC system simulation and HPC system software.
CIME is the Common Infrastructure for Modelling the Earth. It is the full-featured software engineering system for global Earth system or climate models. CIME is a set of Python scripts configured with XML data files, together with Fortran source code. It owns the model configuration, build system, test harness, test suites, portability to many specific HPC platforms, input data set management, and results archiving. CIME is jointly developed by NCAR, Sandia, and Argonne, and is the software engineering system for several important climate/weather models: E3SM, CESM, and UFS.
CrossSim is a crossbar simulator designed to model resistive memory crossbars for both neuromorphic computing and (in a future release) digital memories. It provides a clean Python API so that different algorithms can be built upon crossbars while modeling realistic device properties and variability. The crossbar can be modeled using multiple fast approximate numerical models, including both analytic noise models and experimentally derived lookup tables. A slower but more accurate circuit simulation of the devices using the parallel SPICE simulator Xyce is also being developed and will be included in a future release.
D2T – Doubly Distributed Transactions
Typical distributed transactions involve a single client and multiple servers. In supercomputing simulations, multiple clients must coordinate with multiple servers to perform atomic actions, hence doubly distributed transactions.
This project created a library that scalably provides two-phase-commit-style transactions, both for storage-oriented operations and for more complex system operations such as reconfiguring or redeploying services atomically. The code has been released and is hosted on GitHub.
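As a rough illustration of the two-phase commit idea mentioned above, the following minimal sketch shows the textbook single-coordinator protocol. It is not the D2T library's API and does not model its multi-client extension; the `Participant` class and `two_phase_commit` function are hypothetical names used only for this example.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Participant:
    """A hypothetical server taking part in one transaction."""

    def __init__(self, name, will_succeed=True):
        self.name = name
        self.will_succeed = will_succeed
        self.staged = None      # value prepared but not yet visible
        self.committed = None   # value visible after a commit

    def prepare(self, value):
        # Phase 1: stage the change and vote on the outcome.
        if not self.will_succeed:
            return Vote.ABORT
        self.staged = value
        return Vote.COMMIT

    def commit(self):
        # Phase 2 (success): make the staged change visible.
        self.committed = self.staged
        self.staged = None

    def abort(self):
        # Phase 2 (failure): discard the staged change.
        self.staged = None


def two_phase_commit(participants, value):
    """Return True if every participant committed, False if all rolled back."""
    votes = [p.prepare(value) for p in participants]
    if all(v is Vote.COMMIT for v in votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

The key property is that the change becomes visible on every participant or on none of them; a single abort vote in the first phase rolls back all staged work.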
Dakota: Optimization and Uncertainty Quantification Algorithms for Design Exploration and Simulation Credibility.
The Dakota toolkit provides a flexible, extensible interface between analysis codes and iterative systems analysis methods. Dakota contains algorithms for:
- optimization with gradient and nongradient-based methods;
- uncertainty quantification with sampling, reliability, stochastic expansion, and epistemic methods;
- parameter estimation with nonlinear least squares methods; and
- sensitivity/variance analysis with design of experiments and parameter study methods.
These capabilities may be used on their own or as components within advanced strategies such as hybrid optimization, surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty.
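To make the sampling-based uncertainty quantification item above concrete, here is a minimal sketch of the general idea: draw random values for an uncertain input, run the analysis code on each, and summarize the outputs. It does not use Dakota or its input format; `sample_outputs`, `summarize`, and the toy `model` are hypothetical stand-ins.

```python
import random
import statistics


def sample_outputs(model, sample_input, num_samples, seed=0):
    """Evaluate the model on num_samples random draws of its input."""
    rng = random.Random(seed)  # seeded so the study is repeatable
    return [model(sample_input(rng)) for _ in range(num_samples)]


def summarize(outputs):
    """Return the sample mean and sample standard deviation."""
    return statistics.mean(outputs), statistics.stdev(outputs)


# Hypothetical stand-in for an expensive analysis code.
def model(x):
    return 2.0 * x + 1.0


# The uncertain input is assumed to be uniform on [0, 1].
outputs = sample_outputs(model, lambda rng: rng.uniform(0.0, 1.0), 10_000)
mean, spread = summarize(outputs)
```

For this linear toy model the output mean should be close to 2 and the spread close to 0.58; for a real simulation the same loop applies, with each model call being a full analysis run.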
DGM provides a software framework for forward modeling and optimization of partial differential equations using the discontinuous Galerkin method (DGM). Examples are provided for the solution of generic Navier-Stokes (with and without subgrid-scale turbulence models), Euler, Burgers, Poisson, Helmholtz, Darcy, linear transport, and advection-diffusion equations.
DGM supports a discontinuous Galerkin discretization in space using mixed nodal and modal representations with a variety of time advancement methods. It includes modeling capabilities for boundary conditions, source terms, and material models, with the ability to solve optimization problems for the parameters describing each. It can also perform variational multiscale modeling for turbulence simulation.
For more information, send email to the contact listed below.
Digital Image Correlation Engine (DICe)
DICe (pronounced /dis/ as in "roll the dice") is an open source digital image correlation (DIC) tool intended for use as a module in an external application or as a standalone analysis code. Its primary capabilities are computing full-field displacements and strains from sequences of digital images and rigid body motion tracking of objects. The images analyzed are typically of a material sample undergoing a characterization experiment, but DICe is also useful for other applications (for example, trajectory tracking). DICe is machine portable (Windows, Linux, and Mac) and can be effectively deployed on a high performance computing platform (DICe uses MPI parallelism as well as threaded on-core parallelism). Capabilities from DICe can be invoked through a customized library interface, via source code integration of DICe classes or through a standalone executable.
Dynamic is a simple software tool, written in Python 3, for simulating networks of continuous dynamic variables linked by interaction Hamiltonians.
It was used to generate the results shown in the "Chaotic Logic" presentation at the ICRC 2016 conference; the concept for that work was described in the associated paper.
E3SM is an Earth System Model being developed by the DOE Energy Exascale Earth System Model (E3SM) project. E3SM Version 1 was released in 2018. E3SM Version 2 was released in 2021. The E3SM atmosphere model runs with the spectral element dynamical core from HOMME, upgraded to include new aerosol and cloud physics, improved convection and treatment of the pressure gradient term, and a formulation for elevation classes to handle atmosphere and land processes more realistically in the vicinity of topography. The E3SM Land Model (ALM) is based on Community Land Model (CLM) version 4.5, updated to include a full suite of new biochemistry and VIC hydrology and dynamic land units and extensions to couple to land ice sheets. The E3SM ocean, sea ice and land ice components are built on the MPAS framework.
EMPRESS – Metadata management for scientific simulations
With the growth of simulation data, finding relevant data within the sea of raw data can be a much more daunting task than running the simulation at scale in the first place. EMPRESS provides a separate metadata system that stores detailed run information as well as data characteristics that users define at runtime, and associates those characteristics with a variable, a timestep, or a run.
It has been published at PDSW-DISCS @ SC17 and SC18 and is available as open source on github.com.
Genten: Software for Generalized Tensor Decompositions
Tensors, or multidimensional arrays, are a powerful mathematical means of describing multiway data. This software provides computational means for decomposing or approximating a given tensor in terms of smaller tensors of lower dimension, focusing on decomposition of large, sparse tensors. These techniques have applications in many scientific areas, including signal processing, linear algebra, computer vision, numerical analysis, data mining, graph analysis, neuroscience and more. The software is designed to take advantage of parallelism present in emerging computer architectures such as multi-core CPUs, many-core accelerators such as the Intel Xeon Phi, and computation-oriented GPUs to enable efficient processing of large tensors.
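One way to picture "approximating a tensor in terms of smaller tensors" is the canonical polyadic (CP) form, assumed here purely for illustration: each entry is modeled as a sum over rank-one terms built from one small factor matrix per mode. The sketch below evaluates such a model at the stored entries of a sparse tensor and measures the fit. It is not Genten's API; `cp_entry` and `sparse_squared_error` are hypothetical names.

```python
def cp_entry(factors, index):
    """Value of a rank-R CP model at one multi-index.

    factors[m][i] is the length-R row of the mode-m factor matrix for index i.
    """
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for mode, i in enumerate(index):
            term *= factors[mode][i][r]
        total += term
    return total


def sparse_squared_error(entries, factors):
    """Sum of squared residuals over the stored entries only."""
    return sum((value - cp_entry(factors, idx)) ** 2
               for idx, value in entries.items())


# A 2 x 2 x 1 sparse tensor stored as {index: value}, and a rank-1 model.
entries = {(0, 0, 0): 15.0, (1, 1, 0): 41.0}
factors = [
    [[1.0], [2.0]],   # mode 0
    [[3.0], [4.0]],   # mode 1
    [[5.0]],          # mode 2
]
error = sparse_squared_error(entries, factors)
```

Note that this error only looks at stored entries; a full fit would also account for the implicit zeros, and a decomposition algorithm would adjust the factor rows to reduce the error.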
The Image Composition Engine for Tiles (IceT) is a high-performance sort-last parallel rendering library. In addition to providing accelerated rendering for a standard display, IceT provides the unique ability to generate images for tiled displays. The overall resolution of the display may be several times larger than any viewport that may be rendered by a single machine.
IceT is currently available for use in large scale, high performance visualization and graphics applications. It is used in multiple production products like ParaView and VisIt.
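In sort-last rendering, each process renders only its own share of the geometry into a full-size image, and the partial images are then combined per pixel. A minimal sketch of that compositing step, assuming opaque geometry and a depth value per pixel, is shown below. It is not IceT's API, and it runs serially, whereas a parallel library distributes this reduction across processes; all names are hypothetical.

```python
# A pixel is a (color, depth) pair; untouched pixels sit infinitely far away.
BACKGROUND = ("black", float("inf"))


def composite(image_a, image_b):
    """Depth-composite two images: keep the nearer fragment at each pixel."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of pixels")
    return [a if a[1] <= b[1] else b for a, b in zip(image_a, image_b)]


def composite_all(images):
    """Fold any number of partial images into one final image."""
    result = images[0]
    for image in images[1:]:
        result = composite(result, image)
    return result


# Three hypothetical ranks, each having rendered part of the scene.
rank0 = [("red", 1.0), BACKGROUND, ("red", 3.0)]
rank1 = [("blue", 2.0), ("blue", 2.0), BACKGROUND]
rank2 = [BACKGROUND, ("green", 5.0), ("green", 1.0)]
final = composite_all([rank0, rank1, rank2])
colors = [color for color, _ in final]
```

Because the per-pixel "nearest wins" rule is associative, the partial images can be combined in any grouping, which is what makes the operation amenable to parallel reduction.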
Kitten Lightweight Kernel
Kitten is a current-generation lightweight kernel (LWK) compute node operating system designed for large-scale parallel computing systems. Kitten is the latest in a long line of successful LWKs, including SUNMOS, Puma, Cougar, and Catamount. Kitten distinguishes itself from these prior LWKs by providing a Linux-compatible user environment, a more modern and extendable open-source codebase, and a virtual machine monitor capability via Palacios that allows full-featured guest operating systems to be loaded on-demand.
Modern high performance computing (HPC) nodes have diverse and heterogeneous types of cores and memory. For applications and domain-specific libraries/languages to scale, port, and perform well on these next generation architectures, their on-node algorithms must be re-engineered for thread scalability and performance portability. The Kokkos programming model and C++ library implementation help HPC applications and domain libraries implement intra-node thread-scalable algorithms that are performance portable across diverse manycore architectures such as multicore CPUs, Intel Xeon Phi, NVIDIA GPU, and AMD GPU.
MR-MPI is an open-source implementation of MapReduce written for distributed-memory parallel machines on top of standard MPI message passing.
MapReduce is the programming paradigm, popularized by Google, which is widely used for processing large data sets in parallel. Its salient feature is that if a task can be formulated as a MapReduce, the user can perform it in parallel without writing any parallel code. Instead the user writes serial functions (maps and reduces) which operate on portions of the data set independently. The data-movement and other necessary parallel operations can be performed in an application-independent fashion, in this case by the MR-MPI library.
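The division of labor described above can be sketched with a word count: the user supplies only the serial `map_words` and `reduce_counts` functions, and a driver groups values by key. The driver here is a serial stand-in written for this example; it is not the MR-MPI interface, and all function names are hypothetical.

```python
from collections import defaultdict


def map_words(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_counts(word, values):
    """Reduce: combine all values emitted for one key."""
    return word, sum(values)


def map_reduce(chunks, mapper, reducer):
    """Serial driver standing in for a library's parallel machinery."""
    grouped = defaultdict(list)
    for chunk in chunks:                   # each chunk could go to a different process
        for key, value in mapper(chunk):
            grouped[key].append(value)     # group values by key before reducing
    return dict(reducer(key, values) for key, values in grouped.items())


counts = map_reduce(["the cat", "The dog the end"], map_words, reduce_counts)
```

Neither user function refers to processes or messages, which is the point: only the driver would change when moving from this serial loop to a distributed-memory implementation.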
The MR-MPI library was developed to solve informatics problems on traditional distributed-memory parallel computers. It includes C++ and C interfaces callable from most high-level languages, and also a Python wrapper and our own OINK scripting wrapper, which can be used to develop and chain MapReduce operations together. MR-MPI and OINK are open-source codes, distributed freely under the terms of the modified Berkeley Software Distribution (BSD) license.
MultiThreaded Graph Library (MTGL)
The MTGL is a generic graph library in the Boost style that does not rely on Boost. It runs on multicore machines via Qthreads or OpenMP, and also on the Cray XMT supercomputer. We are currently developing a new version based upon Kokkos that will leverage both multi-core and GPU-accelerated compute nodes, but we do not have a release yet.
NimbleSM is a Lagrangian finite-element code for solid mechanics. Its primary application is the solution of mechanics problems on nonuniform, three-dimensional meshes using either explicit transient dynamic or implicit quasi-static time integration. The NimbleSM code base is designed for performance portability across varying hardware architectures.
Omega_h is a C++ library that implements tetrahedron and triangle mesh adaptivity, with a focus on scalable HPC performance using (optionally) MPI, OpenMP, or CUDA. It is intended to provide adaptive functionality to existing simulation codes. Mesh adaptivity allows one to minimize both discretization error and number of degrees of freedom live during the simulation, as well as enabling moving object and evolving geometry simulations. Omega_h will do this for you in a way that is fast, memory-efficient, and portable across many different architectures.
ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities.
ParaView was developed to analyze extremely large datasets using distributed memory computing resources. It can be run on supercomputers to analyze datasets of exascale size as well as on laptops for smaller data.
ParaView is maintained by Kitware, Inc. Sandia collaborates with Kitware to address our large-scale data analysis and visualization needs through ParaView.
ParseGen is a C++ library for generating parsers for text file formats.
Peridigm is an open-source computational peridynamics code. It is a massively-parallel simulation code for implicit and explicit multi-physics simulations centering on solid mechanics and material failure. Peridigm is a C++ code utilizing foundational software components from Sandia’s Trilinos project and is fully compatible with the Cubit mesh generator and ParaView visualization codes.
pMEMCPY – PMEM optimized IO library
Demonstration system for how to best use Persistent Memory devices, such as Optane, for IO.
Poblano is a Matlab toolbox of large-scale algorithms for unconstrained nonlinear optimization problems. The algorithms in Poblano require only first-order derivative information (e.g., gradients for scalar-valued objective functions), and therefore can scale to very large problems. The driving application for Poblano development has been tensor decompositions in data analysis applications (bibliometric analysis, social network analysis, chemometrics, etc.).
Poblano optimizers find local minimizers of scalar-valued objective functions taking vector inputs. The gradient (i.e., first derivative) of the objective function is required for all Poblano optimizers. The optimizers converge to a stationary point where the gradient is approximately zero. A line search satisfying the strong Wolfe conditions is used to guarantee global convergence of the Poblano optimizers. The optimization methods in Poblano include several nonlinear conjugate gradient methods (Fletcher-Reeves, Polak-Ribiere, Hestenes-Stiefel), a limited-memory quasi-Newton method using BFGS updates to approximate second-order derivative information, and a truncated Newton method using finite differences to approximate second-order derivative information.
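To illustrate the kind of first-order method described above, the following is a minimal Python sketch of a Fletcher-Reeves nonlinear conjugate gradient iteration. It is not Poblano code (Poblano is a Matlab toolbox), and for brevity it assumes a simple backtracking (Armijo) line search rather than a strong Wolfe line search. The function names and default parameters are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fletcher_reeves(f, grad, x0, tol=1e-8, max_iter=1000):
    """Minimize f from x0 using only function values and gradients."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                  # start with steepest descent
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:         # gradient is approximately zero
            break
        if dot(g, d) >= 0:                 # not a descent direction: restart
            d = [-gi for gi in g]
        # Backtracking line search: halve the step until f decreases enough.
        step, fx, slope = 1.0, f(x), dot(g, d)
        while f([xi + step * di for xi, di in zip(x, d)]) > fx + 1e-4 * step * slope:
            step *= 0.5
        x = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad(x)
        beta = dot(g_new, g_new) / dot(g, g)   # Fletcher-Reeves update
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x


# Toy objective with its minimizer at (1, -2).
def objective(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def objective_grad(x):
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


solution = fletcher_reeves(objective, objective_grad, [0.0, 0.0])
```

Only the `beta` line changes between the conjugate gradient variants named above; Polak-Ribiere and Hestenes-Stiefel use different formulas for the same quantity.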
Portals is a message passing interface intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals is designed to operate on scales ranging from a small number of commodity desktops connected via Ethernet to massively parallel platforms connected with custom designed networks.
Portals attempts to provide a cohesive set of building blocks with which a wide variety of upper layer protocols (such as MPI, SHMEM, or UPC) may be built, while maintaining high performance and scalability. Many interfaces today either provide a simple mapping to hardware which may not be conducive to building upper layer protocols (InfiniBand/Open Fabrics Enterprise Distribution) or which are only a good fit for a subset of upper layer protocols (PSM, UCX).
Prove-It uses a powerful yet simple approach to theorem-proving. It is not designed with automation as the primary goal. The primary goal is flexibility in order to be able to follow, ideally, any valid and complete (indivisible) chain of reasoning. To that end, users can write additional Python scripts to make their own mathematical operations and LaTeX formatting rules. Axioms are statements that are taken to be true without proof. They can be added by the user at will in order to define their new mathematical objects and operations. Users can also add theorems that are to be proven. Theorems may be proven in any order. Theorem proofs are constructed in Jupyter notebooks (interactive Python sessions in a web browser). These notebooks render LaTeX-formatted mathematical expressions inline. Proofs are constructed by invoking axioms and other theorems and employing fundamental derivation steps (modus ponens, deduction, instantiation, generalization, or axiom elimination). Axioms and theorems may be invoked indirectly via convenience methods or automation (methods that are automatically invoked when attempting to prove something or as side-effects when something is proven). Theorem proofs and their axiom/theorem dependencies are stored in a kind of database (filesystem based). This database is used to prevent circular logic. It also allows users to track axioms and unproven theorems required by any particular proof. Convenience methods and automation tools may be added which utilize new theorems and aid future proofs. Mathematical objects and operations, axioms, and theorems are organized into built-in and user-defined packages.
PyApprox provides flexible and efficient tools for credible data-informed decision making. PyApprox implements methods addressing various issues surrounding high-dimensional parameter spaces and limited evaluations of expensive simulation models with the goal of facilitating simulation-aided knowledge discovery, prediction and design. Methods are available for: low-rank tensor-decomposition; Gaussian processes; polynomial chaos expansions; sparse-grids; risk-averse regression; compressed sensing; Bayesian inference; push-forward based inference; optimal design of computer experiments for interpolation, regression, and compressed sensing; and risk-averse optimal experimental design.
pyGSTi is open-source software for modeling and characterizing noisy quantum information processors (QIPs), i.e., systems of one or more qubits. For more information and to download the code see: http://www.pygsti.info/ .
Source code is available on GitHub as follows:
1. Install Git on your machine from github.com
2. Initialize the target directory with "git init"
3. Download the software using "git pull git://github.com/dwbarne/PYLOTDB"
4. Read README_first.rtf file using Microsoft Word and follow directions.
PYLOTDB software is completely open source. PYLOTDB consists of two codes, PylotDB and Co-PylotDB. The software allows users to access either local or remote MySQL servers and easily display table data in an intuitive user-friendly GUI format. In PylotDB, data can be filtered and table columns are indexed with radiobuttons and checkboxes so data can be easily selected for plotting or statistical analysis. Plotting capability includes X-Y, semi-log, log-log, Kiviat (radar charts), scatter, and scatter with polynomial curve fits of the data. Entire databases, selected databases, or selected tables in selected databases can be backed up and restored as desired. Interfaces are provided for individual entry edits and additions.
PylotDB’s companion software Co-PylotDB is used to send data files to a user-selected database table. If data files are in YAML format, PylotDB can then automatically extract each entry in the file, expand the database table with new fields with names taken from the YAML entries, and insert the data in those fields for plotting and analysis. This is a tremendous time saver for analysts. This allows the cycle of "data capture to storage to analysis" to be completed in a matter of minutes.
Another powerful feature of PylotDB is the storage buffer where selected data from any table on any server can be stored and mathematically combined to generate new data not currently in any accessed table. The new data can then be plotted along with other data from the currently displayed table. This allows the user to generate desired data on the fly without the necessity of modifying any of the stored database tables.
PYLOTDB’s dependencies include matplotlib, numpy, MySQLdb, Python MegaWidgets (Pmw), and YAML. All of these dependencies are included in the download for Windows machines only. They can easily be found on the web for Mac or Linux machines.
Sample databases are also included to help users get up to speed quickly.
This software is being used at Sandia National Laboratories for analyzing results from our computer performance modeling analysis, but these codes can be used for any type of analysis where database storage and analysis is desired.
Pyomo is a Python-based open-source software package that supports a diverse set of capabilities for formulating, solving, and analyzing optimization models.
A core capability of Pyomo is modeling structured optimization applications. The Pyomo software package can be used to define general symbolic problems, create specific problem instances, and solve these instances using standard commercial and open-source solvers. Pyomo’s modeling objects are embedded within a full-featured high-level programming language with a rich set of supporting libraries that distinguishes it from other algebraic modeling languages such as AMPL, AIMMS and GAMS.
Pyomo supports a wide range of problem types, including:
- Linear programming
- Quadratic programming
- Nonlinear programming
- Mixed-integer linear programming
- Mixed-integer quadratic programming
- Mixed-integer nonlinear programming
- Mixed-integer stochastic programming
- Generalized disjunctive programming
- Differential algebraic equations
- Bilevel programming
- Mathematical programming with equilibrium constraints
Pyomo also supports iterative analysis and scripting within a full-featured programming language providing an effective framework for developing high-level optimization and analysis tools.
The Qthreads API is designed to make using large numbers of threads convenient and easy, and to allow portable access to threading constructs used in massively parallel shared memory environments. The API maps well to both MTA-style threading and PIM-style threading, and we provide an implementation of this interface in both a standard SMP context and the SST context. The Qthreads API provides access to full/empty-bit (FEB) semantics, where every word of memory can be marked either full or empty, and a thread can wait for any word to attain either state.
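The full/empty-bit idea can be sketched in a few lines: a word carries a flag, writers wait for it to be empty, and readers wait for it to be full. The sketch below emulates one such word with a Python condition variable. It is an illustration of the semantics only, not the Qthreads C API; the `FEBWord` class and its method names are hypothetical.

```python
import threading


class FEBWord:
    """One memory word with a full/empty bit, emulated with a condition variable."""

    def __init__(self):
        self._value = None
        self._full = False
        self._cond = threading.Condition()

    def write_ef(self, value):
        """Wait until the word is empty, write it, and mark it full."""
        with self._cond:
            self._cond.wait_for(lambda: not self._full)
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_fe(self):
        """Wait until the word is full, read it, and mark it empty."""
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            value = self._value
            self._full = False
            self._cond.notify_all()
            return value


def run_demo(n=5):
    """Hand n values from a producer thread to a consumer through one word."""
    word = FEBWord()
    received = []

    def producer():
        for i in range(n):
            word.write_ef(i)

    def consumer():
        for _ in range(n):
            received.append(word.read_fe())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The word itself acts as the synchronization point: no separate lock or queue is visible to the producer or the consumer, and values arrive in the order they were written.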
Rapid Optimization Library (ROL)
Rapid Optimization Library (ROL) is a C++ package for large-scale optimization. It is used for the solution of optimal design, optimal control and inverse problems in large-scale engineering applications. Other uses include mesh optimization and image processing.
SmartBlock reusable workflow components
SmartBlock offers a way to compose workflow glue components using generic functionality rather than having to write that code directly. The initial release has examples of a few different operators and how to compose them for different applications and data formats.
The underlying transport technology is ADIOS + FlexPath, but could be replaced with any other transport mechanism, such as Mercury or CCA.
SparTen provides capabilities for computing reduced-dimension representations of sparse multidimensional count value data. The software implements the data decomposition methods described in published journal papers. These decomposition methods comprise several numerical optimization methods (one based on a multiplicative update iterative approach, one based on quasi-Newton optimization, and one based on damped Newton optimization) for fitting the input data to a reduced-dimension model of the data with the lowest amount of error. The software also includes generalized computation that leverages Kokkos to compute the reduced-data representations on multiple computer architectures, including multicore and GPU systems.
Progressive data storage IO library. This library enables computing on a small part of the simulation domain at a time and then stitching together a coherent domain view based on a time epoch on request. Initial demonstration is for metal additive manufacturing. Paper at IPDPS 2020: DOI: 10.1109/IPDPS47924.2020.00016
The Trilinos Project is an effort to develop algorithms and enabling technologies within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific problems. A unique design feature of Trilinos is its focus on packages.
One of the biggest recent changes in high-performance computing is the increasing use of accelerators. Accelerators contain processing cores that independently are inferior to a core in a typical CPU, but these cores are replicated and grouped such that their aggregate execution provides a very high computation rate at a much lower power. Current and future CPU processors also require much more explicit parallelism. Each successive version of the hardware packs more cores into each processor, and technologies like hyperthreading and vector operations require even more parallel processing to leverage each core’s full potential.
VTK-m is a toolkit of scientific visualization algorithms for emerging processor architectures. VTK-m supports the fine-grained concurrency for data analysis and visualization algorithms required to drive extreme scale computing by providing abstract models for data and execution that can be applied to a variety of algorithms across many different processor architectures.
Zoltan is a toolkit of parallel algorithms for dynamic load balancing, geometric and hypergraph-based partitioning, graph coloring, matrix ordering, and distributed directories.
Zoltan is open-source software, distributed both as part of the Trilinos solver framework and as a stand-alone toolkit.
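As an illustration of geometric partitioning, the sketch below implements a simple form of recursive coordinate bisection: cut the point set in half along its longest coordinate direction, then recurse on each half. It is a serial toy, not Zoltan's interface, and it assumes the number of parts is a power of two and the point list is non-empty; `rcb` is a hypothetical name.

```python
def rcb(points, num_parts):
    """Split points into num_parts groups by recursive coordinate bisection."""
    if num_parts == 1:
        return [points]
    dims = range(len(points[0]))
    # Cut along the coordinate direction with the largest extent.
    axis = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                # equal point counts on both sides
    return rcb(ordered[:mid], num_parts // 2) + rcb(ordered[mid:], num_parts // 2)


points = [(0, 0), (1, 0), (10, 0), (11, 0), (0, 1), (1, 1), (10, 1), (11, 1)]
parts = rcb(points, 4)
```

Each part ends up with the same number of points and contains points that are close together in space, which is the goal of a geometric load-balancing method.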