Constructing and Accessing Tabulated Chemistry for Fire Scenarios
Abstract not provided.
The goal of the ExaWind project is to enable predictive simulations of wind farms comprising many megawatt-scale turbines situated in complex terrain. Predictive simulations will require computational fluid dynamics (CFD) simulations for which the mesh resolves the geometry of the turbines and captures the rotation and large deflections of blades. Whereas such simulations for a single turbine are arguably petascale class, multi-turbine wind farm simulations will require exascale-class resources. The primary physics codes in the ExaWind project are Nalu-Wind, which is an unstructured-grid solver for the acoustically incompressible Navier-Stokes equations, and OpenFAST, which is a whole-turbine simulation code. The Nalu-Wind model consists of the mass-continuity Poisson-type equation for pressure and a momentum equation for the velocity. For such modeling approaches, simulation times are dominated by linear-system setup and solution for the continuity and momentum systems. For the ExaWind challenge problem, the moving meshes greatly affect overall solver costs because reinitialization of matrices and recomputation of preconditioners are required at every time step. This milestone represents an effort to increase the fidelity of Nalu-Wind at a fixed resolution through the implementation of a tensor-product based, matrix-free high-order scheme. High-order finite element methods have increased local work per datum communicated and have the potential to provide significantly more accurate solutions at a fixed number of degrees of freedom. Prior to this milestone, Nalu-Wind had an arbitrary-order Control Volume Finite Element Method discretization as a solver option, but it required too much memory and was too slow to be of practical use. The work in this milestone addresses these issues by first implementing an implicit, high-order solver that only partially assembles the global system. 
This reduces the memory footprint of the high-order scheme by orders of magnitude for higher polynomial orders. Second, a faster, tensor-product based method for evaluating the action of the left-hand side was implemented. This reduces the amount of computational work required by the scheme and dramatically improves the time-to-solution on example problems. Finally, this milestone is an evaluation of the value of high-order methods in the wind application space. With the enhancements to memory and computational cost, accuracy vs. time-to-solution was evaluated for several resolutions on an under-resolved Taylor-Green vortex test case. Results show that the high-order scheme is cost-competitive with the production low-order schemes in Nalu-Wind, being moderately more expensive than the production edge-based vertex-centered finite volume scheme. The evaluation of accuracy on the test case shows a potential benefit to high order at the highest resolution while not deteriorating accuracy on the lowest tested resolution. More work is needed to show value in the wind application, but positive strides have been made.
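The idea of evaluating the action of a tensor-product operator without assembling it can be illustrated with a minimal sketch. This is not the Nalu-Wind implementation; the function names, the 2D setting, and the small matrices `A` and `B` are assumptions chosen for illustration. The sketch applies the Kronecker product of two 1D operators to a vector using only the 1D factors, and compares the result against the fully assembled operator.

```python
def matmul(a, b):
    """Dense product of two small matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def kron_apply_matrix_free(a, b, x):
    """Return (A kron B) x using only the 1D factors A and B."""
    n, m = len(a[0]), len(b[0])
    # View x as an n-by-m array X with x[k*m + l] = X[k][l].
    x2d = [x[k * m:(k + 1) * m] for k in range(n)]
    # (A kron B) x equals A X B^T, flattened row by row.
    y2d = matmul(matmul(a, x2d), transpose(b))
    return [v for row in y2d for v in row]


def kron_apply_assembled(a, b, x):
    """Reference path: build the full Kronecker matrix, then multiply."""
    rows = []
    for i in range(len(a)):
        for j in range(len(b)):
            rows.append([a[i][k] * b[j][l]
                         for k in range(len(a[0]))
                         for l in range(len(b[0]))])
    return [sum(r[c] * x[c] for c in range(len(x))) for r in rows]


# Hypothetical 1D operators (illustrative values only).
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
B = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
x = [float(i + 1) for i in range(9)]

y_free = kron_apply_matrix_free(A, B, x)
y_full = kron_apply_assembled(A, B, x)
```

For 1D factors of size n, the assembled 2D operator stores n^4 entries while the factored form stores 2n^2, which is the kind of memory saving that grows quickly with polynomial order.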
ASME-JSME-KSME 2019 8th Joint Fluids Engineering Conference, AJKFluids 2019
Power production of the turbines at the Department of Energy/Sandia National Laboratories Scaled Wind Farm Technology (SWiFT) facility, located at Texas Tech University's National Wind Institute Research Center, was measured experimentally and simulated for neutral atmospheric boundary layer operating conditions. Two V27 wind turbines were aligned in series with the dominant wind direction, and the upwind turbine was yawed to investigate the impact of wake steering on the downwind turbine. Two conditions were investigated: the leading turbine operating alone, and both turbines operating in series. The field measurements include meteorological evaluation tower (MET) data and light detection and ranging (lidar) data. Computations were performed by coupling large eddy simulations (LES) in the three-dimensional, transient code Nalu-Wind with engineering actuator line models of the turbines from OpenFAST. The simulations consist of a coarse precursor without the turbines to set up an atmospheric boundary layer inflow, followed by a simulation with refinement near the turbines. Good agreement between simulations and field data is shown. These results demonstrate that Nalu-Wind holds promise for the prediction of wind plant power and loads for a range of yaw conditions.
Wind applications require the ability to simulate rotating blades. To support this use-case, a novel design-order sliding mesh algorithm has been developed and deployed. The hybrid method combines the control volume finite element methodology (CVFEM) with concepts found within a discontinuous Galerkin (DG) finite element method (FEM) to manage a sliding mesh. The method has been demonstrated to be design-order for the tested polynomial basis (P=1 and P=2) and has been deployed to provide production simulation capability for a Vestas V27 (225 kW) wind turbine. Other stationary and canonical rotating flow simulations are also presented. As the majority of wind-energy applications are driving extensive usage of hybrid meshes, a foundational study that outlines near-wall numerical behavior for a variety of element topologies is presented. Results indicate that the proposed nonlinear stabilization operator (NSO) is an effective stabilization methodology to control Gibbs phenomena at large cell Peclet numbers. The study also provides practical mesh resolution guidelines for future analysis efforts. Application-driven performance and algorithmic improvements have been carried out to increase robustness of the scheme on hybrid production wind energy meshes. Specifically, the Kokkos-based Nalu Kernel construct outlined in the FY17/Q4 ExaWind milestone has been transitioned to the hybrid mesh regime. This code base is exercised within a full V27 production run. Simulation timings for parallel search and custom ghosting are presented. As the low-Mach application space requires implicit matrix solves, the cost of matrix reinitialization has been evaluated on a variety of production meshes. Results indicate that at low element counts, i.e., fewer than 100 million elements, matrix graph initialization and preconditioner setup times are small. 
However, as mesh sizes increase, e.g., 500 million elements, simulation time associated with "set-up" costs can increase to nearly 50% of overall simulation time when using the full Tpetra solver stack and nearly 35% when using a mixed Tpetra/Hypre-based solver stack. The report also highlights the project achievement of surpassing the 1 billion element mesh scale for a production V27 hybrid mesh. A detailed timing breakdown is presented that again suggests work to be done in the setup events associated with the linear system. In order to mitigate these initialization costs, several application paths have been explored, all of which are designed to reduce the frequency of matrix reinitialization. Methods such as removing Jacobian entries on the dynamic matrix columns (in concert with increased inner equation iterations), and lagging of Jacobian entries have reduced setup times at the cost of numerical stability. Artificially increasing, or bloating, the matrix stencil to ensure that full Jacobians are included is developed with results suggesting that this methodology is useful in decreasing reinitialization events without loss of matrix contributions. With the above foundational advances in computational capability, the project is well positioned to begin scientific inquiry on a variety of wind-farm physics such as turbine/turbine wake interactions.
This memo summarizes the aerodynamic drag scoping work done for Goodyear in early FY18. The work evaluates the feasibility of using Sierra/Low-Mach (Fuego) for drag predictions of rolling tires, focusing on the effects of tire features such as lettering, sidewall geometry, rim geometry, and interaction with the vehicle body. The work is broken into two parts. Part 1 consisted of investigation of a canonical validation problem (turbulent flow over a cylinder) using existing tools with different meshes and turbulence models. Part 2 involved calculating drag differences over plate geometries with simple features (ridges and grooves) defined by Goodyear, of approximately the size of interest for a tire. The results of Part 1 show the level of noise to be expected in a drag calculation and highlight the sensitivity of absolute predictions to model parameters such as mesh size and turbulence model. There is 20-30% noise in the experimental measurements on the canonical cylinder problem, and a similar level of variation between different meshes and turbulence models. Part 2 shows that there is a notable difference in the predicted drag on the sample plate geometries; however, the computational cost of extending the LES model to a full tire would be significant. This cost could be reduced by implementing more sophisticated wall and turbulence models (e.g., detached eddy simulation, DES) and by focusing the mesh refinement on feature subsets, with the goal of comparing configurations rather than achieving absolute predictivity for the whole tire.
2018 Spring Technical Meeting of the Western States Section of the Combustion Institute, WSSCI 2018
This study addresses predicting the internal thermochemical state in buoyant fire plumes using large-eddy simulations (LES) with a tabular flamelet library for the underlying flame chemistry. Buoyant fire plumes are characterized by moderate turbulent mixing, soot growth and oxidation, and radiation transport. Soot moments, mixture fraction and enthalpy evolve in the LES with soot source terms given by the non-adiabatic flamelet library. Participating media radiation transport is predicted using the discrete ordinates method with source terms also from the flamelet library, and the LES subgrid-scale modeling is based on a one-equation kinetic-energy sub-filter model. This library is generated with flamelet states that include unsteady heat loss through extinction, nominally representing radiative quenching. We describe the performance of this model both in the context of a laminar coflow configuration where extensive measurements are available and in buoyant turbulent fire plumes where measurements are more global.
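Accessing a tabulated flamelet library during a simulation amounts to interpolating stored states at the local values of the table coordinates. The following is a minimal sketch of that lookup, assuming a two-dimensional table indexed by mixture fraction and a normalized heat-loss parameter with bilinear interpolation. The class name `Table2D`, the grid, and the temperature values are hypothetical and are not taken from the library described above.

```python
from bisect import bisect_right


class Table2D:
    """Bilinear lookup on a rectilinear grid; values[i][j] sits at (xs[i], ys[j])."""

    def __init__(self, xs, ys, values):
        self.xs, self.ys, self.values = xs, ys, values

    @staticmethod
    def _locate(grid, q):
        """Return the cell index and local coordinate in [0, 1], clamping q to the grid."""
        q = min(max(q, grid[0]), grid[-1])
        i = min(max(bisect_right(grid, q) - 1, 0), len(grid) - 2)
        t = (q - grid[i]) / (grid[i + 1] - grid[i])
        return i, t

    def lookup(self, x, y):
        """Interpolate the tabulated quantity at (x, y)."""
        i, tx = self._locate(self.xs, x)
        j, ty = self._locate(self.ys, y)
        v = self.values
        lo = (1 - ty) * v[i][j] + ty * v[i][j + 1]
        hi = (1 - ty) * v[i + 1][j] + ty * v[i + 1][j + 1]
        return (1 - tx) * lo + tx * hi


# Hypothetical table: temperature in K versus mixture fraction and heat loss.
mixture_fraction = [0.0, 0.5, 1.0]
heat_loss = [0.0, 1.0]
temperature = [
    [300.0, 300.0],    # pure oxidizer
    [2000.0, 1500.0],  # near-stoichiometric, adiabatic vs. cooled
    [300.0, 300.0],    # pure fuel
]
table = Table2D(mixture_fraction, heat_loss, temperature)
t_mid = table.lookup(0.25, 0.5)
```

Queries outside the tabulated range are clamped to the nearest edge here; a production table would typically carry more coordinates and many more tabulated quantities.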
The former Nalu interior heterogeneous algorithm design, which was originally designed to manage matrix assembly operations over all elemental topology types, has been modified to operate over homogeneous collections of mesh entities. This newly templated kernel design allows for removal of workset variable resize operations that were formerly required at each loop over a Sierra ToolKit (STK) bucket (nominally, 512 entities in size). Extensive usage of the Standard Template Library (STL) std::vector has been removed in favor of intrinsic Kokkos memory views. In this milestone effort, the transition to Kokkos as the underlying infrastructure to support performance and portability on many-core architectures has been deployed for key matrix algorithmic kernels. A unit-test driven design effort has developed a homogeneous entity algorithm that employs a team-based thread parallelism construct. The STK Single Instruction Multiple Data (SIMD) infrastructure is used to interleave data for improved vectorization. The collective algorithm design, which allows for concurrent threading and SIMD management, has been deployed for the core low-Mach element-based algorithm. Several tests to ascertain SIMD performance on Intel KNL and Haswell architectures have been carried out. The performance test matrix includes evaluation of both low- and higher-order methods. The higher-order low-Mach methodology builds on polynomial promotion of the core low-order control volume finite element method (CVFEM). Performance testing of the Kokkos-view/SIMD design indicates low-order matrix assembly kernel speed-up ranging between two and four times depending on mesh loading and node count. Better speedups are observed for higher-order meshes (currently only P=2 has been tested), especially on KNL. The increased workload per element on higher-order meshes benefits from the wide SIMD width on KNL machines. 
Combining multiple threads with SIMD on KNL achieves a 4.6x speedup over the baseline, with assembly timings faster than those observed on the Haswell architecture. The computational workload of higher-order meshes, therefore, seems ideally suited for the many-core architecture and justifies further exploration of higher-order on NGP platforms. A Trilinos/Tpetra-based multi-threaded GMRES preconditioned by symmetric Gauss-Seidel (SGS) represents the core solver infrastructure for the low-Mach advection/diffusion implicit solves. The threaded solver stack has been tested on small problems on NREL's Peregrine system using the newly developed and deployed Kokkos-view/SIMD kernels. Efforts are underway to deploy the Tpetra-based solver stack on the NERSC Cori system to benchmark its performance at scale on KNL machines.
This report documents work performed using ALCC computing resources granted under a proposal submitted in February 2016, with the resource allocation period spanning July 2016 through June 2017. The award allocation was 10.7 million processor-hours at the National Energy Research Scientific Computing Center. The simulations performed were in support of two projects: the Atmosphere to Electrons (A2e) project, supported by the DOE EERE office; and the Exascale Computing Project (ECP), supported by the DOE Office of Science. The project team for both efforts consists of staff scientists and postdocs from Sandia National Laboratories and the National Renewable Energy Laboratory. At the heart of these projects is the open-source computational-fluid-dynamics (CFD) code, Nalu. Nalu solves the low-Mach-number Navier-Stokes equations using an unstructured-grid discretization. Nalu leverages the open-source Trilinos solver library and the Sierra Toolkit (STK) for parallelization and I/O. This report documents baseline computational performance of the Nalu code on problems of direct relevance to the wind plant physics application, namely, Large Eddy Simulation (LES) of an atmospheric boundary layer (ABL) flow and wall-modeled LES of a flow past a static wind turbine rotor blade. Parallel performance of Nalu and its constituent solver routines residing in the Trilinos library has been assessed previously under various campaigns. However, both Nalu and Trilinos have been, and remain, in active development, and resources have not been available previously to rigorously track code performance over time. With the initiation of the ECP, it is important to establish and document baseline code performance on the problems of interest. This will allow the project team to identify and target any deficiencies in performance, as well as highlight any performance bottlenecks as we exercise the code on a greater variety of platforms and at larger scales. 
The current study is rather modest in scale, examining performance on problem sizes of O(100 million) elements and core counts up to 8k cores. This will be expanded as more computational resources become available to the projects.
Proceedings of the Combustion Institute
Turbulent fluctuations of the scalar dissipation rate have a major impact on extinction in non-premixed combustion. Recently, an unsteady extinction criterion has been developed (Hewson, 2013) that predicts extinction dependent on the duration and the magnitude of dissipation rate fluctuations exceeding a critical quenching value; this quantity is referred to as the dissipation impulse. The magnitude of the dissipation impulse corresponding to unsteady extinction is related to the difficulty with which a flamelet is extinguished, based on the steady-state S-curve. In this paper we evaluate this new extinction criterion for more realistic dissipation rates by evolving a stochastic Ornstein-Uhlenbeck process for the dissipation rate. A comparison between unsteady flamelet evolution using this dissipation rate and the extinction criterion exhibits good agreement. The rate of predicted extinction is examined over a range of Damköhler and Reynolds numbers and over a range of the extinction difficulty. The results suggest that the rate of extinction is proportional to the average dissipation rate and the area under the dissipation rate probability density function exceeding the steady-state quenching value. It is also inversely related to the actual probability that this steady-state quenching dissipation rate is observed and the difficulty of extinction associated with the distance between the upper and middle branches of the S-curve.
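A minimal sketch of the kind of calculation described above follows. It rests on two stated assumptions that are not confirmed by the abstract: the Ornstein-Uhlenbeck process is applied to the logarithm of the dissipation rate (which keeps the rate positive), and the dissipation impulse of an excursion is taken as the time integral of the excess of the dissipation rate over the quenching value. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_log_ou(n_steps, dt, tau, mu, sigma, seed=0):
    """Sample a dissipation-rate history whose logarithm follows an OU process.

    Uses the exact one-step update, so the stationary standard deviation
    of the logarithm equals sigma for any dt.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau)
    noise = sigma * math.sqrt(1.0 - decay * decay)
    x = mu
    path = []
    for _ in range(n_steps):
        x = mu + (x - mu) * decay + noise * rng.gauss(0.0, 1.0)
        path.append(math.exp(x))
    return path


def excursion_impulses(chi, chi_q, dt):
    """Integrate (chi - chi_q) over each contiguous excursion above chi_q."""
    impulses, current = [], 0.0
    for c in chi:
        if c > chi_q:
            current += (c - chi_q) * dt
        elif current > 0.0:
            impulses.append(current)
            current = 0.0
    if current > 0.0:
        impulses.append(current)
    return impulses


# Hypothetical parameters: unit mean log-rate time scale, quenching value of 3.
chi_history = simulate_log_ou(n_steps=20000, dt=0.01, tau=1.0, mu=0.0, sigma=1.0)
impulses = excursion_impulses(chi_history, chi_q=3.0, dt=0.01)

# Count excursions whose impulse exceeds an assumed critical impulse.
critical_impulse = 0.5
n_extinctions = sum(1 for imp in impulses if imp > critical_impulse)
```

Dividing `n_extinctions` by the simulated time gives a crude estimate of an extinction rate under these assumptions, which could then be compared across different values of `sigma`, `tau`, and `critical_impulse`.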