Publications

Results 126–150 of 9,998

Mesostructure Evolution During Powder Compression: Micro-CT Experiments and Particle-Based Simulations

Conference Proceedings of the Society for Experimental Mechanics Series

Cooper, Marcia; Clemmer, Joel T.; Silling, Stewart; Bufford, Daniel C.; Bolintineanu, Dan S.

Powders under compression form mesostructures of particle agglomerations in response to both inter- and intra-particle forces. The ability to computationally predict the resulting mesostructures with reasonable accuracy requires models that capture the distributions associated with particle size and shape, contact forces, and mechanical response during deformation and fracture. The following report presents experimental data obtained for the purpose of validating emerging mesostructures simulated by discrete element method and peridynamic approaches. A custom compression apparatus, suitable for integration with our micro-computed tomography (micro-CT) system, was used to collect 3-D scans of a bulk powder at discrete steps of increasing compression. Details of the apparatus and of the nearly spherical microcrystalline cellulose particles, including their mean particle size, are presented. Comparative simulations were performed with an initial arrangement of particles and particle shapes directly extracted from the validation experiment. The experimental volumetric reconstruction was segmented to extract the relative positions and shapes of individual particles in the ensemble, including internal voids in the case of the microcrystalline cellulose particles. These computationally determined particles were then compressed within the computational domain, and the evolving mesostructures were compared directly to those in the validation experiment. The ability of the computational models to simulate the experimental mesostructures and particle behavior at increasing compression is discussed.
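
The segmentation-to-simulation handoff described above can be illustrated with a short, hedged sketch: a distance-transform watershed labels individual particles in a binarized micro-CT volume and extracts per-particle geometry that could seed a particle-based model. The file name, the marker threshold, and the workflow itself are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: label particles in a binarized micro-CT volume and
# extract per-particle centroids/volumes, as one might do before seeding a
# DEM or peridynamics simulation. Inputs and thresholds are assumptions.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed
from skimage.measure import regionprops

volume = np.load("binarized_scan.npy")          # 3-D boolean array (assumed input)

# Distance-transform watershed to split touching particles
distance = ndi.distance_transform_edt(volume)
markers, _ = ndi.label(distance > 0.6 * distance.max())
labels = watershed(-distance, markers, mask=volume)

# Per-particle geometry that could seed a particle-based simulation
for region in regionprops(labels):
    print(region.label, region.centroid, region.area)   # area == voxel count in 3-D
```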


In Their Shoes: Persona-Based Approaches to Software Quality Practice Incentivization

Computing in Science and Engineering

Mundt, Miranda R.; Milewicz, Reed M.; Raybourn, Elaine M.

Many teams struggle to adapt and right-size software engineering best practices for quality assurance to fit their context. Introducing software quality is not usually framed in a way that motivates teams to take action, so it becomes a "check the box for compliance" activity rather than a cultural practice that values software quality and the effort to achieve it. When and how can we provide effective incentives for software teams to adopt and integrate meaningful and enduring software quality practices? We explored this question through a persona-based ideation exercise at the 2021 Collegeville Workshop on Scientific Software, in which we created three unique personas that represent different scientific software developer perspectives.


Processing Particle Data Flows with SmartNICs

2022 IEEE High Performance Extreme Computing Conference, HPEC 2022

Liu, Jianshen; Maltzahn, Carlos; Curry, Matthew L.; Ulmer, Craig

Many distributed applications implement complex data flows and need a flexible mechanism for routing data between producers and consumers. Recent advances in programmable network interface cards, or SmartNICs, represent an opportunity to offload data-flow tasks into the network fabric, thereby freeing the hosts to perform other work. System architects in this space face multiple questions about the best way to leverage SmartNICs as processing elements in data flows. In this paper, we advocate the use of Apache Arrow as a foundation for implementing data-flow tasks on SmartNICs. We report on our experiences adapting a partitioning algorithm for particle data to Apache Arrow and measure the on-card processing performance for the BlueField-2 SmartNIC. Our experiments confirm that the BlueField-2's (de)compression hardware can have a significant impact on in-transit workflows where data must be unpacked, processed, and repacked.
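
As a rough illustration of the kind of data-flow task discussed above, the sketch below partitions a particle table into spatial bins with Apache Arrow in Python; the column names, bin count, and partitioning rule are assumptions for the demo, not the paper's algorithm or its SmartNIC offload path.

```python
# Hedged sketch (not the paper's code): partition a particle table into
# spatial bins using Apache Arrow tables. All columns and sizes are made up.
import numpy as np
import pyarrow as pa

n = 1_000_000
table = pa.table({
    "id": np.arange(n),
    "x": np.random.rand(n),
    "y": np.random.rand(n),
    "z": np.random.rand(n),
})

# Assign each particle to one of 8 slabs along x and split the table,
# yielding one Arrow batch per downstream consumer in the data flow.
bins = np.minimum((table["x"].to_numpy() * 8).astype(int), 7)
partitions = [table.filter(pa.array(bins == b)) for b in range(8)]

for b, part in enumerate(partitions):
    print(b, part.num_rows)
```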


FROSch Preconditioners for Land Ice Simulations of Greenland and Antarctica

SIAM Journal on Scientific Computing

Heinlein, Alexander; Perego, Mauro; Rajamanickam, Sivasankaran

Numerical simulations of Greenland and Antarctic ice sheets involve the solution of large-scale highly nonlinear systems of equations on complex shallow geometries. This work is concerned with the construction of Schwarz preconditioners for the solution of the associated tangent problems, which are challenging for solvers mainly because of the strong anisotropy of the meshes and wildly changing boundary conditions that can lead to poorly constrained problems on large portions of the domain. Here, two-level generalized Dryja-Smith-Widlund (GDSW)-type Schwarz preconditioners are applied to different land ice problems, i.e., a velocity problem, a temperature problem, as well as the coupling of the former two problems. We employ the message passing interface (MPI)-parallel implementation of multilevel Schwarz preconditioners provided by the package FROSch (fast and robust overlapping Schwarz) from the Trilinos library. The strength of the proposed preconditioner is that it yields out-of-the-box scalable and robust preconditioners for the single physics problems. To the best of our knowledge, this is the first time two-level Schwarz preconditioners have been applied to the ice sheet problem and a scalable preconditioner has been used for the coupled problem. The preconditioner for the coupled problem differs from previous monolithic GDSW preconditioners in the sense that decoupled extension operators are used to compute the values in the interior of the subdomains. Several approaches for improving the performance, such as reuse strategies and shared memory OpenMP parallelization, are explored as well. In our numerical study we target both uniform meshes of varying resolution for the Antarctic ice sheet as well as nonuniform meshes for the Greenland ice sheet. We present several weak and strong scaling studies confirming the robustness of the approach and the parallel scalability of the FROSch implementation. Among the highlights of the numerical results are a weak scaling study for up to 32K processor cores (8K MPI ranks and 4 OpenMP threads) and 566M degrees of freedom for the velocity problem as well as a strong scaling study for up to 4K processor cores (and MPI ranks) and 68M degrees of freedom for the coupled problem.
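
For context, the standard two-level additive Schwarz preconditioner of GDSW type has the form below; the notation is assumed here, and the paper's monolithic coupled variant differs in how the coarse basis is constructed (decoupled extension operators).

```latex
% Standard two-level additive GDSW-type Schwarz preconditioner (notation assumed):
% overlapping subdomain solves A_i plus a coarse solve A_0 built from the coarse
% basis \Phi of energy-minimizing extensions of interface values.
M^{-1}_{\mathrm{GDSW}} \;=\; \Phi\, A_{0}^{-1}\, \Phi^{T} \;+\; \sum_{i=1}^{N} R_{i}^{T} A_{i}^{-1} R_{i},
\qquad A_{0} = \Phi^{T} A\, \Phi, \quad A_{i} = R_{i} A R_{i}^{T}
```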


Reverse-mode differentiation in arbitrary tensor network format: with application to supervised learning

Journal of Machine Learning Research

Safta, Cosmin; Jakeman, John D.; Gorodetsky, Alex A.

This paper describes an efficient reverse-mode differentiation algorithm for contraction operations for arbitrary and unconventional tensor network topologies. The approach leverages the tensor contraction tree of Evenbly and Pfeifer (2014), which provides an instruction set for the contraction sequence of a network. We show that this tree can be efficiently leveraged for differentiation of a full tensor network contraction using a recursive scheme that exploits (1) the bilinear property of contraction and (2) the property that trees have a single path from root to leaves. While differentiation of tensor-tensor contraction is already possible in most automatic differentiation packages, we show that exploiting these two additional properties in the specific context of contraction sequences can improve efficiency. Following a description of the algorithm and computational complexity analysis, we investigate its utility for gradient-based supervised learning for low-rank function recovery and for fitting real-world unstructured datasets. We demonstrate improved performance over alternating least-squares optimization approaches and the capability to handle heterogeneous and arbitrary tensor network formats. When compared to alternating minimization algorithms, we find that the gradient-based approach requires a smaller oversampling ratio (number of samples compared to number of model parameters) for recovery. This increased efficiency extends to fitting unstructured data of varying dimensionality and when employing a variety of tensor network formats. Here, we show improved learning using the hierarchical Tucker method over the tensor-train in high-dimensional settings on a number of benchmark problems.
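
The bilinearity the recursion exploits is easiest to see in the simplest (matrix) case of a pairwise contraction; the adjoint rules below are standard reverse-mode identities, written in assumed notation rather than the paper's general multi-index form.

```latex
% Reverse-mode adjoints of a single pairwise contraction C = A B:
% each adjoint is itself a contraction, so a contraction tree can be
% differentiated by one sweep that reuses the same contraction machinery.
C_{ij} = \sum_{k} A_{ik} B_{kj}
\quad\Longrightarrow\quad
\bar{A}_{ik} = \sum_{j} \bar{C}_{ij}\, B_{kj}, \qquad
\bar{B}_{kj} = \sum_{i} A_{ik}\, \bar{C}_{ij}
```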


Nonlocal Kernel Network (NKN): a Stable and Resolution-Independent Deep Neural Network

D'Elia, Marta; Silling, Stewart; Yu, Yue; You, Huaiqian; Gao, Tian

Neural operators have recently become popular tools for designing solution maps between function spaces in the form of neural networks. Unlike classical scientific machine learning approaches that learn parameters of a known partial differential equation (PDE) for a single instance of the input parameters at a fixed resolution, neural operators approximate the solution map of a family of PDEs [6, 7]. Despite their success, the use of neural operators has so far been restricted to relatively shallow neural networks and confined to learning hidden governing laws. In this work, we propose a novel nonlocal neural operator, which we refer to as nonlocal kernel network (NKN), that is resolution independent, characterized by deep neural networks, and capable of handling a variety of tasks such as learning governing equations and classifying images. Our NKN stems from the interpretation of the neural network as a discrete nonlocal diffusion-reaction equation that, in the limit of infinite layers, is equivalent to a parabolic nonlocal equation, whose stability is analyzed via nonlocal vector calculus. The resemblance with integral forms of neural operators allows NKNs to capture long-range dependencies in the feature space, while the continuous treatment of node-to-node interactions makes NKNs resolution independent. The resemblance with neural ODEs, reinterpreted in a nonlocal sense, and the stable network dynamics between layers allow for generalization of NKN’s optimal parameters from shallow to deep networks. This fact enables the use of shallow-to-deep initialization techniques [8]. Our tests show that NKNs outperform baseline methods in both learning governing equations and image classification tasks and generalize well to different resolutions and depths.
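
A schematic of the interpretation described above, with assumed notation (not the paper's exact parameterization): one NKN layer updates the feature field h through a discretized nonlocal diffusion-reaction step with a learned kernel.

```latex
% One layer viewed as an explicit step of a nonlocal diffusion-reaction equation
% (schematic; K_theta is a learned kernel, tau a step size, f a bias/forcing term).
h^{(k+1)}(x) \;=\; h^{(k)}(x) \;+\; \tau \left( \int_{\Omega} K_{\theta}(x, y)\,
\bigl( h^{(k)}(y) - h^{(k)}(x) \bigr)\, \mathrm{d}y \;-\; \beta(x)\, h^{(k)}(x) \;+\; f(x) \right)
```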


Assessing the predictive impact of factor fixing with an adaptive uncertainty-based approach

Environmental Modelling and Software

Wang, Qian; Guillaume, Joseph; Jakeman, John D.; Yang, Tao; Iwanaga, Takuya; Croke, Barry; Jakeman, Tony

Despite widespread use of factor fixing in environmental modeling, its effect on model predictions has received little attention and is instead commonly presumed to be negligible. We propose a proof-of-concept adaptive method for systematically investigating the impact of factor fixing. The method uses global sensitivity analysis to identify groups of sensitive parameters and then quantifies which groups can be safely fixed at nominal values without exceeding a maximum acceptable error, demonstrated using the 21-dimensional Sobol’ G-function. Furthermore, three error measures are considered for quantities of interest, namely Relative Mean Absolute Error, Pearson Product-Moment Correlation and Relative Variance. Results demonstrate that factor fixing may unexpectedly cause large errors in model results even when preliminary analysis suggests otherwise, and that the default value selected affects the number of factors to fix. To improve the applicability and methodological development of factor fixing, a new research agenda encompassing five opportunities is discussed for further attention.
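
A toy, hedged illustration of the effect being studied (not the paper's adaptive method): evaluate the Sobol' G-function, fix the nominally unimportant factors at a default value, and measure the relative mean absolute error this introduces. The coefficients, the nominal value of 0.5, and the error definition are assumptions for the demo.

```python
# Toy illustration of factor fixing on the 21-dimensional Sobol' G-function.
# Coefficients a_i, the nominal value, and the error definition are assumptions.
import numpy as np

d = 21
a = np.concatenate([np.zeros(3), np.full(d - 3, 9.0)])   # 3 important, 18 unimportant factors

def g_function(x, a):
    return np.prod((np.abs(4.0 * x - 2.0) + a) / (1.0 + a), axis=-1)

rng = np.random.default_rng(0)
x = rng.random((10_000, d))
y_full = g_function(x, a)

x_fixed = x.copy()
x_fixed[:, 3:] = 0.5                      # fix the 18 "unimportant" factors at a nominal value
y_fixed = g_function(x_fixed, a)

rmae = np.mean(np.abs(y_full - y_fixed)) / np.mean(np.abs(y_full))
print(f"Relative MAE introduced by factor fixing: {rmae:.3f}")
```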


Performant implementation of the atomic cluster expansion (PACE) and application to copper and silicon

npj Computational Materials

Lysogorskiy, Yury; van der Oord, Cas; Bochkarev, Anton; Menon, Sarath; Rinaldi, Matteo; Hammerschmidt, Thomas; Mrovec, Matous; Thompson, A.P.; Csanyi, Gabor; Ortner, Christoph; Drautz, Ralf

The atomic cluster expansion is a general polynomial expansion of the atomic energy in multi-atom basis functions. Here we implement the atomic cluster expansion in the performant C++ code PACE that is suitable for use in large-scale atomistic simulations. We briefly review the atomic cluster expansion and give detailed expressions for energies and forces as well as efficient algorithms for their evaluation. We demonstrate that the atomic cluster expansion as implemented in PACE shifts a previously established Pareto front for machine learning interatomic potentials toward faster and more accurate calculations. Moreover, general purpose parameterizations are presented for copper and silicon and evaluated in detail. We show that the Cu and Si potentials significantly improve on the best available potentials for highly accurate large-scale atomistic simulations.
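
In schematic, simplified notation (an assumption here; see the paper for the full PACE expressions), the expansion writes the per-atom energy as a linear combination of rotation-invariant basis functions B built from products of one-particle neighbor-density projections A:

```latex
% Atomic cluster expansion, schematically (notation simplified and assumed):
% per-atom energy from invariant basis functions built from products of
% neighbor-density projections over one-particle basis functions phi.
E_{i} \;=\; \sum_{\nu} c_{\nu}\, B_{i\nu}, \qquad
A_{i v} \;=\; \sum_{j \in \mathcal{N}(i)} \phi_{v}(\mathbf{r}_{ji}), \qquad
\mathbf{A}_{i \mathbf{v}} \;=\; \prod_{t=1}^{K} A_{i v_{t}}
```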


Quantifying the unknown impact of segmentation uncertainty on image-based simulations

Nature Communications

Krygier, Michael; Labonte, Tyler; Martinez, Carianne; Norris, Chance; Sharma, Krish; Collins, Lincoln N.; Mukherjee, Partha P.; Foulk, James W.

Image-based simulation, the use of 3D images to calculate physical quantities, relies on image segmentation for geometry creation. However, this process introduces image segmentation uncertainty because different segmentation tools (both manual and machine-learning-based) will each produce a unique and valid segmentation. First, we demonstrate that these variations propagate into the physics simulations, compromising the resulting physics quantities. Second, we propose a general framework for rapidly quantifying segmentation uncertainty. Through the creation and sampling of segmentation uncertainty probability maps, we systematically and objectively create uncertainty distributions of the physics quantities. We show that physics quantity uncertainty distributions can follow a Normal distribution, but, in more complicated physics simulations, the resulting uncertainty distribution can be surprisingly nontrivial. We establish that bounding segmentation uncertainty can fail in these nontrivial situations. While our work does not eliminate segmentation uncertainty, it improves simulation credibility by making visible the previously unrecognized segmentation uncertainty plaguing image-based simulation.
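
A hedged sketch of the sampling idea summarized above: given a per-voxel segmentation probability map, draw candidate segmentations and push each through a physics calculation to build an uncertainty distribution for the quantity of interest. The input file and the placeholder physics quantity are assumptions, not the authors' framework.

```python
# Hedged sketch: sample segmentations from a probability map and propagate
# each through a (placeholder) physics quantity. Inputs are assumptions.
import numpy as np

prob_map = np.load("prob_map.npy")        # per-voxel probability of "solid", values in [0, 1]

def physics_quantity(segmentation):
    # Placeholder for an image-based simulation; here, just the solid volume fraction.
    return segmentation.mean()

rng = np.random.default_rng(0)
samples = []
for _ in range(100):
    seg = rng.random(prob_map.shape) < prob_map   # one plausible segmentation
    samples.append(physics_quantity(seg))

print("mean:", np.mean(samples), "std:", np.std(samples))
```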


Developing Uncertainty Quantification Strategies in Electromagnetic Problems Involving Highly Resonant Cavities

Journal of Verification, Validation and Uncertainty Quantification

Campione, Salvatore; Stephens, John A.; Martin, Nevin; Eckert, Aubrey; Warne, Larry K.; Huerta, Jose G.; Pfeiffer, Robert A.; Jones, Adam

High-quality factor resonant cavities are challenging structures to model in electromagnetics owing to their large sensitivity to minute parameter changes. Therefore, uncertainty quantification (UQ) strategies are pivotal to understanding key parameters affecting the cavity response. We discuss here some of these strategies focusing on shielding effectiveness (SE) properties of a canonical slotted cylindrical cavity that will be used to develop credibility evidence in support of predictions made using computational simulations for this application.


Evaluating MPI resource usage summary statistics

Parallel Computing

Ferreira, Kurt; Levy, Scott L.N.

The Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on today's high-performance computing (HPC) systems. This dominance stems from MPI's powerful semantics for inter-process communication that has enabled scientists to write applications for simulating important physical phenomena. MPI does not, however, specify how messages and synchronization should be carried out. Those details are typically dependent on low-level architecture details and the message characteristics of the application. Therefore, analyzing an application's MPI resource usage is critical to tuning MPI's performance on a particular platform. The result of this analysis is typically a discussion of the mean message sizes, queue search lengths and message arrival times for a workload or set of workloads. While a discussion of the arithmetic mean in MPI resource usage might be the most intuitive summary statistic, it is not always the most accurate in terms of representing the underlying data. In this paper, we analyze MPI resource usage for a number of key MPI workloads using an existing MPI trace collector and discrete-event simulator. Our analysis demonstrates that the average, while easy and efficient to calculate, is a useful metric for characterizing latency and bandwidth measurements, but may not be a good representation of application message sizes, match list search depths, or MPI inter-operation times. Additionally, we show that the median and mode are superior choices in many cases. We also observe that the arithmetic mean is not the best representation of central tendency for data that are drawn from distributions that are multi-modal or have heavy tails. The results and analysis of our work provide valuable guidance on how we, as a community, should discuss and analyze MPI resource usage data for scientific applications.
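
The central point is easy to reproduce on synthetic data: for a multi-modal, heavy-tailed mix of message sizes, the arithmetic mean falls between the modes and describes almost no actual message, while the median and mode track the bulk of the data. The sizes below are made-up stand-ins, not workload measurements from the paper.

```python
# Synthetic illustration: mean vs. median vs. mode for a multi-modal,
# heavy-tailed distribution of message sizes (all values are made up).
from collections import Counter
import numpy as np

rng = np.random.default_rng(1)
sizes = np.concatenate([
    np.full(90_000, 64),                                            # many small control messages
    rng.lognormal(mean=13.0, sigma=1.0, size=10_000).astype(int),   # few large payloads
])

mode_val, _ = Counter(sizes.tolist()).most_common(1)[0]
print("mean  :", sizes.mean())
print("median:", np.median(sizes))
print("mode  :", mode_val)
```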



Timely Reporting of Heavy Hitters Using External Memory

ACM Transactions on Database Systems

Singh, Shikha; Pandey, Prashant; Bender, Michael A.; Berry, Jonathan; Farach-Colton, Martin; Johnson, Rob; Kroeger, Thomas; Phillips, Cynthia A.

Given an input stream S of size N, a φ-heavy hitter is an item that occurs at least φN times in S. The problem of finding heavy hitters is extensively studied in the database literature. We study a real-time heavy-hitters variant in which an element must be reported shortly after we see its T = φN-th occurrence (and hence it becomes a heavy hitter). We call this the Timely Event Detection (TED) Problem. The TED problem models the needs of many real-world monitoring systems, which demand accurate (i.e., no false negatives) and timely reporting of all events from large, high-speed streams with a low reporting threshold (high sensitivity). Like the classic heavy-hitters problem, solving the TED problem without false positives requires large space (Ω(N) words). Thus in-RAM heavy-hitters algorithms typically sacrifice accuracy (i.e., allow false positives), sensitivity, or timeliness (i.e., use multiple passes). We show how to adapt heavy-hitters algorithms to external memory to solve the TED problem on large high-speed streams while guaranteeing accuracy, sensitivity, and timeliness. Our data structures are limited only by I/O-bandwidth (not latency) and support a tunable tradeoff between reporting delay and I/O overhead. With a small bounded reporting delay, our algorithms incur only a logarithmic I/O overhead. We implement and validate our data structures empirically using the Firehose streaming benchmark. Multi-threaded versions of our structures can scale to process 11M observations per second before becoming CPU bound. In comparison, a naive adaptation of the standard heavy-hitters algorithm to external memory would be limited by the storage device's random I/O throughput, i.e., ≈100K observations per second.
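
For the reporting semantics only, here is a minimal in-RAM sketch using exact counts in a dictionary; the paper's contribution is achieving the same accuracy, sensitivity, and timeliness in external memory for streams too large for RAM. The toy stream and threshold are assumptions.

```python
# Minimal in-RAM sketch of timely heavy-hitter reporting with exact counts.
# The paper's data structures do this in external memory; this demo does not.
from collections import defaultdict

def timely_heavy_hitters(stream, phi, n):
    """Yield (item, position) the moment an item reaches its phi*n-th occurrence."""
    threshold = max(1, int(phi * n))
    counts = defaultdict(int)
    for pos, item in enumerate(stream):
        counts[item] += 1
        if counts[item] == threshold:          # report exactly once, with no delay
            yield item, pos

stream = ["a", "b", "a", "c", "a", "b", "a"]   # made-up stream for the demo
for item, pos in timely_heavy_hitters(stream, phi=0.5, n=len(stream)):
    print(f"{item!r} became a heavy hitter at position {pos}")
```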


NMSBA Sustainable Engineering (Final Report)

Nicholson, Bethany L.; Siirola, John D.

This report summarizes the guidance provided to Sustainable Engineering to help them learn about equation-oriented optimization and the Sandia-developed software packages Pyomo and IDAES-PSE. This was a short 10-week project (October 2021 – December 2021), and the goal was to help the company learn about the IDAES framework and how it could be used for their future projects. The company submitted an SBIR proposal related to developing a green ammonia process model with IDAES, and if that proposal is successful this NMSBA project could lead to future collaboration opportunities.
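
For readers unfamiliar with the equation-oriented style mentioned above, here is a minimal, self-contained Pyomo example; the toy model and the choice of the IPOPT solver are assumptions for illustration and are unrelated to the Sustainable Engineering process models.

```python
# Minimal Pyomo example of equation-oriented modeling (toy model; assumes
# the IPOPT solver is installed for the nonlinear objective).
import pyomo.environ as pyo

m = pyo.ConcreteModel()
m.x = pyo.Var(bounds=(0, 10))
m.y = pyo.Var(bounds=(0, 10))
m.balance = pyo.Constraint(expr=m.x + 2 * m.y == 10)    # an "equation-oriented" constraint
m.obj = pyo.Objective(expr=(m.x - 3) ** 2 + (m.y - 2) ** 2, sense=pyo.minimize)

pyo.SolverFactory("ipopt").solve(m)
print(pyo.value(m.x), pyo.value(m.y))
```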


Document Retrieval and Ranking using Similarity Graph Mean Hitting Times

Dunlavy, Daniel M.; Chew, Peter A.

We present a novel approach to information retrieval and document analysis based on graph analytic methods. Traditional information retrieval methods use a set of terms to define a query that is applied against a document corpus to identify the documents most related to those terms. In contrast, we define a query as a set of documents of interest and apply the query by computing mean hitting times between this set and all other documents on a document similarity graph abstraction of the semantic relationships between all pairs of documents. We present the steps of our approach along with a simple example application illustrating how this approach can be used to find documents related to two or more documents or topics of interest.
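
A hedged sketch of one common formulation of the idea above: build a row-stochastic transition matrix from pairwise document similarities, treat the query documents as an absorbing set, and solve a small linear system for the mean hitting time from every other document to that set; shorter times rank higher. The similarity matrix below is random stand-in data, and the exact formulation may differ from the authors'.

```python
# Hedged sketch: rank documents by mean hitting time to a query set on a
# document-similarity graph. The similarity matrix is random stand-in data.
import numpy as np

rng = np.random.default_rng(0)
S = rng.random((8, 8))
S = (S + S.T) / 2                     # symmetric document-similarity matrix
np.fill_diagonal(S, 0.0)

P = S / S.sum(axis=1, keepdims=True)  # random-walk transition probabilities

query = [0, 3]                        # documents defining the query (absorbing set)
others = [i for i in range(len(S)) if i not in query]

# Mean hitting times h satisfy h_i = 1 + sum_j P_ij h_j for i outside the query
# set and h_i = 0 on it, i.e. (I - P_TT) h_T = 1 on the non-query block.
P_TT = P[np.ix_(others, others)]
h = np.zeros(len(S))
h[others] = np.linalg.solve(np.eye(len(others)) - P_TT, np.ones(len(others)))

ranking = sorted(others, key=lambda i: h[i])
print("documents ranked by mean hitting time to the query:", ranking)
```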


Discrete modeling of a transformer with ALEGRA

Rodriguez, Angel E.; Niederhaus, John H.J.; Greenwood, Wesley J.; Clutz, Christopher C.

We report progress on a task to model transformers in ALEGRA using the “Transient Magnetics” option. We specifically evaluate the limits of the approach when individual coil wires are resolved. There are practical limits to the number of turns in a coil that can be numerically modeled, but calculated inductance can be scaled to the correct number of turns in a simple way. Our testing essentially confirmed this “turns scaling” hypothesis. We developed a conceptual transformer design, representative of practical designs of interest, which focused our analysis. That design includes three coils wrapped around a rectangular ferromagnetic core. The secondary and tertiary coils have multiple layers. The tertiary has three layers of 13 turns each; the secondary has five layers of 44 turns; the primary has one layer of 20 turns. We validated the turns scaling of inductance for simple (one-layer) coils in air (no core) by comparison to available independent calculations for simple rectangular coils. These comparisons quantified the errors versus reduced number of turns modeled. For more than 3 turns, the errors are <5%. The magnetic field solver failed to converge (within 5000 iterations) for >10 turns. Including the core introduced some complications. It was necessary to capture the core surfaces in thin grid sheaths to minimize errors in computed magnetic energy. We do not yet have quantitative benchmarks with which to compare, but calculated results are qualitatively reasonable.
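
The “simple way” referred to above is presumably the standard quadratic dependence of inductance on turn count for a fixed coil geometry, which lets a model with a reduced number of turns n be rescaled to the design value N:

```latex
% Turns scaling of inductance (standard relation for fixed coil geometry;
% assumed here to be the scaling the report tests).
L(N) \;\approx\; \left( \frac{N}{n} \right)^{2} L(n)
```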
