Publications

Results 6101–6125 of 9,998

Search results

Extension and evaluation of the multilevel summation method for fast long-range electrostatics calculations

Journal of Chemical Physics

Moore, Stan G.; Crozier, Paul C.

Several extensions and improvements have been made to the multilevel summation method (MSM) of computing long-range electrostatic interactions. These include pressure calculation, an improved error estimator, faster direct part calculation, extension to non-orthogonal (triclinic) systems, and parallelization using the domain decomposition method. MSM also allows fully non-periodic long-range electrostatics calculations, which are not possible using traditional Ewald-based methods. In spite of these significant improvements to the MSM algorithm, the particle-particle particle-mesh (PPPM) method was still found to be faster for the periodic systems we tested on a single processor. However, the fast Fourier transforms (FFTs) that PPPM relies on represent a major scaling bottleneck for the method when running on many cores (because the many-to-many communication pattern of the FFT becomes expensive), and MSM scales better than PPPM when using a large core count for two test problems on Sandia's Redsky machine. This FFT bottleneck can be reduced by running PPPM on only a subset of the total processors. MSM is most competitive for relatively low accuracy calculations. On Sandia's Chama machine, however, PPPM is found to scale better than MSM for all core counts that we tested. These results suggest that PPPM is usually more efficient than MSM for typical problems running on current high performance computers. However, further improvements to the MSM algorithm could increase its competitiveness for calculation of long-range electrostatic interactions. © 2014 AIP Publishing LLC.

Report for the ASC CSSE L2 Milestone (4873) - Demonstration of Local Failure Local Recovery Resilient Programming Model

Heroux, Michael A.; Teranishi, Keita T.

Recovery from process loss during the execution of a distributed memory parallel application is presently achieved by restarting the program, typically from a checkpoint file. Future computer system trends indicate that the size of data to checkpoint, the lack of improvement in parallel file system performance and the increase in process failure rates will lead to situations where checkpoint restart becomes infeasible. In this report we describe and prototype the use of a new application-level resilient computing model, Local Failure Local Recovery (LFLR), that manages persistent storage of local state for each process such that, if a process fails, recovery can be performed locally without requiring access to a global checkpoint file. LFLR provides application developers with an ability to recover locally and continue application execution when a process is lost. This report discusses what features are required from the hardware, OS and runtime layers, and what approaches application developers might use in the design of future codes, including a demonstration of an LFLR-enabled MiniFE code from the Mantevo mini-application suite.
