The atomic cluster expansion is a general polynomial expansion of the atomic energy in multi-atom basis functions. Here we implement the atomic cluster expansion in the performant C++ code PACE, suitable for use in large-scale atomistic simulations. We briefly review the atomic cluster expansion and give detailed expressions for energies and forces, as well as efficient algorithms for their evaluation. We demonstrate that the atomic cluster expansion as implemented in PACE shifts a previously established Pareto front for machine-learning interatomic potentials toward faster and more accurate calculations. Moreover, general-purpose parameterizations are presented for copper and silicon and evaluated in detail. We show that the Cu and Si potentials significantly improve on the best available potentials for highly accurate large-scale atomistic simulations.
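As a schematic of what "a general polynomial expansion of the atomic energy in multi-atom basis functions" means, the expansion can be sketched as follows (the notation here is illustrative of the general ACE form; consult the paper for the exact basis construction and symmetrization):

```latex
% Schematic atomic cluster expansion. The total energy is a sum of atomic
% energies E_i, each a linear expansion in invariant many-atom basis
% functions B. The B are built as products of neighbor-summed one-particle
% functions A, so evaluation scales linearly in the number of neighbors.
\begin{align}
  E &= \sum_i E_i, &
  E_i &= \sum_{\nu} c_{\nu}\, B^{(i)}_{\nu}, \\
  A^{(i)}_{v} &= \sum_{j \in \mathcal{N}(i)} \phi_{v}(\mathbf{r}_{ji}), &
  B^{(i)}_{\nu} &= \prod_{t=1}^{K} A^{(i)}_{v_t}
  \quad \text{(symmetrized to be rotation invariant)}.
\end{align}
```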
We present a novel approach to information retrieval and document analysis based on graph-analytic methods. Traditional information retrieval methods use a set of terms to define a query that is applied against a document corpus to identify the documents most related to those terms. In contrast, we define a query as a set of documents of interest and apply the query by computing mean hitting times between this set and all other documents on a document-similarity graph that abstracts the semantic relationships between all pairs of documents. We present the steps of our approach along with a simple example application illustrating how it can be used to find documents related to two or more documents or topics of interest.
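One standard formulation of mean hitting times, which may differ in detail from the paper's, is to model the similarity graph as a random walk and solve a linear system for the expected number of steps to reach the query set. A minimal NumPy sketch under that assumption:

```python
import numpy as np

def mean_hitting_times(W, query):
    """Mean hitting times from every node to a query set on a similarity graph.

    W     : symmetric nonnegative similarity matrix (n x n)
    query : iterable of node indices forming the query set S

    Solves h(v) = 0 for v in S and h(v) = 1 + sum_u P(v,u) h(u) otherwise,
    where P is the row-normalized random-walk transition matrix.
    """
    n = W.shape[0]
    P = W / W.sum(axis=1, keepdims=True)   # random-walk transition matrix
    mask = np.ones(n, dtype=bool)
    mask[list(query)] = False              # nodes outside the query set
    # Restricted system: (I - P_TT) h_T = 1
    A = np.eye(mask.sum()) - P[np.ix_(mask, mask)]
    h = np.zeros(n)
    h[mask] = np.linalg.solve(A, np.ones(mask.sum()))
    return h

# Toy example: 4 documents; documents 0 and 1 form the query set.
W = np.array([[0, 3, 1, 0],
              [3, 0, 1, 1],
              [1, 1, 0, 2],
              [0, 1, 2, 0]], dtype=float)
print(mean_hitting_times(W, query=[0, 1]))  # smaller time = more related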
The Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on today's high-performance computing (HPC) systems. This dominance stems from MPI's powerful semantics for inter-process communication, which have enabled scientists to write applications that simulate important physical phenomena. MPI does not, however, specify how messages and synchronization should be carried out; those details typically depend on low-level architecture details and the message characteristics of the application. Therefore, analyzing an application's MPI resource usage is critical to tuning MPI's performance on a particular platform. The result of this analysis is typically a discussion of the mean message sizes, queue search lengths, and message arrival times for a workload or set of workloads. While the arithmetic mean may be the most intuitive summary statistic for MPI resource usage, it is not always the most accurate representation of the underlying data. In this paper, we analyze MPI resource usage for a number of key MPI workloads using an existing MPI trace collector and discrete-event simulator. Our analysis demonstrates that the average, while easy and efficient to calculate, is a useful metric for characterizing latency and bandwidth measurements, but may not be a good representation of application message sizes, match-list search depths, or MPI inter-operation times; in many such cases, the median and mode are superior choices. In particular, the arithmetic mean is not the best representation of central tendency for data drawn from distributions that are multi-modal or have heavy tails. The results and analysis of our work provide valuable guidance on how we, as a community, should discuss and analyze MPI resource usage data for scientific applications.
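A toy illustration of why the mean can mislead for heavy-tailed quantities such as message sizes (the numbers below are invented for illustration, not taken from the paper's workloads):

```python
import statistics

# Hypothetical message-size sample (bytes): mostly small messages with a
# heavy tail of rare large ones, the shape the paper warns about.
sizes = [64] * 70 + [256] * 20 + [1_048_576] * 10

print(statistics.mean(sizes))    # ~104954: dominated by the rare 1 MiB messages
print(statistics.median(sizes))  # 64: what a "typical" message actually is
print(statistics.mode(sizes))    # 64: the most common message size
```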
This report summarizes the guidance provided to Sustainable Engineering to help them learn about equation-oriented optimization and the Sandia-developed software packages Pyomo and IDAES-PSE. This was a short 10-week project (October 2021 – December 2021) whose goal was to help the company learn about the IDAES framework and how it could be used in their future projects. The company submitted an SBIR proposal related to developing a green ammonia process model with IDAES, and if that proposal is successful, this NMSBA project could lead to future collaboration opportunities.
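For flavor, a minimal equation-oriented model in Pyomo (illustrative only, not drawn from the project; the variables, constraint, and choice of the ipopt solver are assumptions, and IDAES-PSE layers process-modeling components on top of this style):

```python
# A toy equation-oriented optimization model in Pyomo: declare variables,
# equations, and an objective, then hand the whole system to a solver.
from pyomo.environ import (ConcreteModel, Var, Objective, Constraint,
                           SolverFactory, NonNegativeReals, value)

m = ConcreteModel()
m.x = Var(within=NonNegativeReals)
m.y = Var(within=NonNegativeReals)
m.balance = Constraint(expr=m.x + 2 * m.y == 10)  # an equation, stated declaratively
m.cost = Objective(expr=m.x**2 + m.y**2)          # minimized by default

SolverFactory("ipopt").solve(m)                   # assumes ipopt is installed
print(value(m.x), value(m.y))
```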
We report progress on a task to model transformers in ALEGRA using the “Transient Magnetics” option. We specifically evaluate the limits of an approach that resolves individual coil wires. There are practical limits to the number of turns in a coil that can be numerically modeled, but the calculated inductance can be scaled to the correct number of turns in a simple way. Our testing essentially confirmed this “turns scaling” hypothesis. To focus our analysis, we developed a conceptual transformer design representative of practical designs of interest. That design includes three coils wrapped around a rectangular ferromagnetic core. The secondary and tertiary coils have multiple layers: the tertiary has three layers of 13 turns each; the secondary has five layers of 44 turns; the primary has one layer of 20 turns. We validated the turns scaling of inductance for simple (one-layer) coils in air (no core) by comparison to available independent calculations for simple rectangular coils. These comparisons quantified the errors versus the reduced number of turns modeled. For more than 3 turns, the errors are <5%. The magnetic field solver failed to converge (within 5000 iterations) for >10 turns. Including the core introduced some complications: it was necessary to capture the core surfaces in thin grid sheaths to minimize errors in the computed magnetic energy. We do not yet have quantitative benchmarks with which to compare, but the calculated results are qualitatively reasonable.
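As a one-line statement of the "turns scaling" idea, assuming the standard result that for fixed coil geometry the inductance scales as the square of the turn count:

```latex
% A model run with a reduced number of turns N_model can be rescaled to the
% actual turn count N_actual via the usual L ~ N^2 relation:
\begin{equation}
  L(N_{\text{actual}}) \approx
  \left(\frac{N_{\text{actual}}}{N_{\text{model}}}\right)^{2} L(N_{\text{model}})
\end{equation}
```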
High-quality-factor resonant cavities are challenging structures to model in electromagnetics owing to their extreme sensitivity to minute parameter changes. Therefore, uncertainty quantification (UQ) strategies are pivotal to understanding the key parameters affecting the cavity response. We discuss here some of these strategies, focusing on the shielding-effectiveness (SE) properties of a canonical slotted cylindrical cavity, which will be used to develop credibility evidence in support of predictions made using computational simulations for this application.
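As an illustration of one common UQ strategy (plain Monte Carlo sampling; the study may use different methods), here is a sketch in which a toy closed-form function stands in for the full-wave cavity solver:

```python
import numpy as np

rng = np.random.default_rng(42)

def se_model(slot_length_mm):
    # Placeholder surrogate: SE (dB) decreasing with slot length.
    # NOT a physical model; a real study would call the cavity simulation here.
    return 80.0 - 25.0 * np.log10(slot_length_mm)

# Propagate an assumed manufacturing tolerance on the slot length (mm).
slot_samples = rng.normal(loc=10.0, scale=0.2, size=10_000)
se_samples = se_model(slot_samples)

print(f"mean SE = {se_samples.mean():.2f} dB")
print(f"std  SE = {se_samples.std():.2f} dB")
print(f"5th pct = {np.percentile(se_samples, 5):.2f} dB")
```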
Given an input stream S of size N, a φ-heavy hitter is an item that occurs at least φN times in S. The problem of finding heavy hitters is extensively studied in the database literature. We study a real-time heavy-hitters variant in which an element must be reported shortly after we see its T = φN-th occurrence (at which point it becomes a heavy hitter). We call this the Timely Event Detection (TED) problem. The TED problem models the needs of many real-world monitoring systems, which demand accurate (i.e., no false negatives) and timely reporting of all events from large, high-speed streams with a low reporting threshold (high sensitivity). Like the classic heavy-hitters problem, solving the TED problem without false positives requires large space (Ω(N) words). Thus, in-RAM heavy-hitters algorithms typically sacrifice accuracy (i.e., allow false positives), sensitivity, or timeliness (i.e., use multiple passes). We show how to adapt heavy-hitters algorithms to external memory to solve the TED problem on large high-speed streams while guaranteeing accuracy, sensitivity, and timeliness. Our data structures are limited only by I/O bandwidth (not latency) and support a tunable tradeoff between reporting delay and I/O overhead. With a small bounded reporting delay, our algorithms incur only a logarithmic I/O overhead. We implement and validate our data structures empirically using the Firehose streaming benchmark. Multi-threaded versions of our structures can scale to process 11M observations per second before becoming CPU-bound. In comparison, a naive adaptation of the standard heavy-hitters algorithm to external memory would be limited by the storage device's random I/O throughput, i.e., ≈100K observations per second.
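For context, here is a sketch of the classic in-RAM heavy-hitters approach the abstract contrasts with: the Misra-Gries summary, which guarantees no false negatives for sufficiently frequent items but admits false positives (illustrative Python, not the paper's external-memory structure):

```python
def misra_gries(stream, k):
    """Classic one-pass heavy-hitters summary using at most k-1 counters.

    Every item occurring more than len(stream)/k times is guaranteed to
    appear in the returned summary (no false negatives), but the summary
    may also contain infrequent items (false positives).
    """
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # Counters are full: decrement everything, dropping zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# Example: with k = 5, any item occurring > N/5 times must survive.
stream = list("aababcabdaeafaga")   # 'a' occurs 8 of 16 times
print(misra_gries(stream, k=5))     # 'a' is guaranteed to be present
```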
Compared to the classical Lanczos algorithm, the s-step Lanczos variant has the potential to improve performance by asymptotically decreasing the synchronization cost per iteration. However, this comes at a price: despite being mathematically equivalent, the s-step variant may behave quite differently in finite precision, potentially exhibiting greater loss of accuracy and slower convergence relative to the classical algorithm. It has previously been shown that the errors in the s-step version follow the same structure as the errors in the classical algorithm, but are amplified by a factor depending on the square of the condition number of the s-step Krylov bases computed in each outer loop. As the condition number of these s-step bases grows (in some cases very quickly) with s, this limits the values of s that can be chosen and thus can limit the attainable performance. In this work, we show that if a select few computations in s-step Lanczos are performed in double the working precision, the error terms then depend only linearly on the conditioning of the s-step bases. This has the potential to drastically improve the numerical behavior of the algorithm with little impact on per-iteration performance. Our numerical experiments demonstrate the improved numerical behavior possible with the mixed-precision approach, and also show that this improved behavior extends to mixed-precision s-step CG. We present preliminary performance results on NVIDIA V100 GPUs showing that the overhead of extra precision is minimal if one uses precisions implemented in hardware.
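For readers unfamiliar with the baseline, here is a minimal dense NumPy sketch of the classical Lanczos recurrence that the s-step variant reorganizes (the s-step blocking and mixed-precision machinery of the paper are not shown):

```python
import numpy as np

def lanczos(A, v0, m):
    """Classical Lanczos: build an orthonormal basis V of the Krylov space
    K_m(A, v0) and the tridiagonal projection T = V^T A V.

    Dense-matrix sketch for illustration; production codes use sparse
    matvecs and (full or selective) reorthogonalization.
    """
    n = v0.size
    V = np.zeros((n, m + 1))
    alpha = np.zeros(m)
    beta = np.zeros(m + 1)
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = A @ V[:, j]                   # one matvec per iteration
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]           # three-term recurrence
        if j > 0:
            w -= beta[j] * V[:, j - 1]
        beta[j + 1] = np.linalg.norm(w)
        if beta[j + 1] == 0:              # invariant subspace found
            break
        V[:, j + 1] = w / beta[j + 1]
    T = np.diag(alpha) + np.diag(beta[1:m], 1) + np.diag(beta[1:m], -1)
    return V[:, :m], T

# Example: eigenvalues of T approximate the extreme eigenvalues of A.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 300))
A = (X + X.T) / 2
V, T = lanczos(A, rng.standard_normal(300), m=40)
print(np.linalg.eigvalsh(T)[-1], np.linalg.eigvalsh(A)[-1])
```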
A new empirical potential for efficient, large-scale molecular dynamics simulation of water is presented. The HIPPO (Hydrogen-like Intermolecular Polarizable POtential) force field is based upon the model electron density of a hydrogen-like atom. This framework is used to derive and parametrize individual terms describing charge-penetration-damped permanent electrostatics, damped polarization, charge transfer, anisotropic Pauli repulsion, and damped dispersion interactions. Initial parameter values were fit to Symmetry-Adapted Perturbation Theory (SAPT) energy components for ten water dimer configurations, as well as the radial and angular dependence of the canonical dimer. The SAPT-based parameters were then systematically refined to extend the treatment to bulk water phases. The final HIPPO water model provides a balanced representation of a wide variety of properties of gas-phase clusters, liquid water, and ice polymorphs across a range of temperatures and pressures. This water potential yields a rationalization of water structure, dynamics, and thermodynamics explicitly correlated with an ab initio energy decomposition, while providing a level of accuracy comparable or superior to previous polarizable atomic multipole force fields. The HIPPO water model serves as a cornerstone around which similarly detailed physics-based models can be developed for additional molecular species.
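The term list above corresponds to a SAPT-style decomposition of the intermolecular energy; schematically (the symbols below are illustrative, with term names taken from the abstract):

```latex
% Schematic intermolecular energy decomposition underlying HIPPO:
\begin{equation}
  E_{\text{inter}} \;=\;
    E_{\text{elst}}^{\text{damped}}   % charge-penetration-damped permanent electrostatics
  + E_{\text{pol}}^{\text{damped}}    % damped polarization
  + E_{\text{ct}}                     % charge transfer
  + E_{\text{rep}}                    % anisotropic Pauli repulsion
  + E_{\text{disp}}^{\text{damped}}   % damped dispersion
\end{equation}
```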
Reverse engineering (RE) analysts struggle to answer critical questions about the safety of binary code accurately and promptly, and their supporting program-analysis tools are simply wrong sometimes. The analysis tools have to approximate in order to provide any information at all, but this means that they introduce uncertainty into their results, and those uncertainties chain from analysis to analysis. We hypothesize that exposing the sources, impacts, and control of uncertainty to human binary analysts will allow them to approach their hardest problems with high-powered analytic techniques that they know when to trust. Combining expertise in binary-analysis algorithms, human cognition, uncertainty quantification, verification and validation, and visualization, we pursue research that should benefit binary software analysis efforts across the board. We find a strong analogy between RE and exploratory data analysis (EDA); we begin to characterize the sources and types of uncertainty found in practice in RE (both in the process and in supporting analyses); we explore a domain-specific focus on uncertainty in pointer analysis, showing that more precise models do help analysts answer small information-flow questions faster and more accurately; and we test a general population with domain-general sudoku problems, showing that adding "knobs" to an analysis does not significantly slow down performance. This document describes our explorations of uncertainty in binary analysis.