Publications

Results 14201–14400 of 96,771

Search results

Opportunities and limitations of Quality-of-Service in Message Passing applications on adaptively routed Dragonfly and Fat Tree networks

Proceedings - IEEE International Conference on Cluster Computing, ICCC

Wilke, Jeremiah J.; Kenny, Joseph P.

Avoiding communication bottlenecks remains a critical challenge in high-performance computing (HPC) as systems grow to exascale. Numerous design possibilities exist for avoiding network congestion including topology, adaptive routing, congestion control, and quality-of-service (QoS). While network design often focuses on topological features like diameter, bisection bandwidth, and routing, efficient QoS implementations will be critical for next-generation interconnects. HPC workloads are dominated by tightly-coupled mathematics, making delays in a single message manifest as delays across an entire parallel job. QoS can spread traffic onto different virtual lanes (VLs), lowering the impact of network hotspots by providing priorities or bandwidth guarantees that prevent starvation of critical traffic. Two leading topology candidates, Dragonfly and Fat Tree, are often discussed in terms of routing properties and cost, but the topology can have a major impact on QoS. While Dragonfly has attractive routing flexibility and cost relative to Fat Tree, the extra routing complexity requires several VLs to avoid deadlock. Here we discuss the special challenges of Dragonfly, proposing configurations that use different routing algorithms for different service levels (SLs) to limit VL requirements. We provide simulated results showing how each QoS strategy performs on different classes of application and different workload mixes. Despite Dragonfly's desirable characteristics for adaptive routing, Fat Tree is shown to be an attractive option when QoS is considered.

LDMS Monitoring of EDR InfiniBand Networks

Proceedings - IEEE International Conference on Cluster Computing, ICCC

Allan, Benjamin A.; Aguilar, Michael J.; Schwaller, Benjamin S.; Langer, Steven

We introduce a new HPC system high-speed network fabric production monitoring tool, the ibnet sampler plugin for LDMS version 4. Large-scale testing of this tool is our work in progress. When deployed appropriately, the ibnet sampler plugin can provide extensive counter data, at frequencies up to 1 Hz. This allows the LDMS monitoring system to be useful for tracking the impact of new network features on production systems. We present preliminary results concerning reliability, performance impact, and usability of the sampler.

An Adaptive Framework for Extreme Deformation and Failure in Solids

Mota, Alejandro M.; Plews, Julia A.; Talamini, Brandon T.; Avery, Avery

Recent developments at Sandia in meshfree methods have delivered improved robustness in solid mechanics problems that prove difficult for traditional Lagrangian, mesh-based finite elements. Nevertheless, there remains a limitation in accurately predicting very large material deformations. It seems robust meshfree discretizations and integration schemes are necessary, but not sufficient, to close this capability gap. This state of affairs directly impacts current and future LEPs, whose simulation needs are not well met for extremely large deformation problems. We propose to use a new numerical framework, the Optimal Transportation Meshfree (OTM) method enhanced by meshfree adaptivity, as we believe that a combination of both will provide a novel way to close this capability gap.

Site Environmental Report for Sandia National Laboratories California (2019)

Robinson, Christina R.

Sandia National Laboratories, California (SNL/CA) is a Department of Energy (DOE) facility. The management and operations of the facility are under a contract with the DOE's National Nuclear Security Administration (NNSA). On May 1, 2017, the name of the management and operating contractor changed from Sandia Corporation to National Technology & Engineering Solutions of Sandia, LLC (NTESS). The DOE, NNSA, Sandia Field Office administers the contract and oversees contractor operations at the site. DOE and its management and operating contractor for Sandia are committed to safeguarding environmental protection, compliance, and sustainability and to ensuring the validity and accuracy of the monitoring data presented in this Annual Site Environmental Report. This Site Environmental Report for 2019 was prepared in accordance with DOE Order 231.1B, Environment, Safety and Health Reporting (DOE 2012). The report provides a summary of environmental monitoring information and compliance activities that occurred at SNL/CA during calendar year 2019, unless noted otherwise. General site and environmental program information is also included.

Volt-var curve reactive power control requirements and risks for feeders with distributed roof-top photovoltaic systems

Energies

Jones, Christian B.; Lave, Matthew S.; Reno, Matthew J.; Darbali-Zamora, Rachid; Summers, Adam; Hossain-McKenzie, Shamina S.

The benefits and risks associated with Volt-Var Curve (VVC) control for management of voltages in electric feeders with distributed, roof-top photovoltaic (PV) can be defined using a stochastic hosting capacity analysis methodology. Although past work showed that a PV inverter's reactive power can improve grid voltages for large PV installations, this study adds to the past research by evaluating the control method's impact (both good and bad) when deployed throughout the feeder within small, distributed PV systems. The stochastic hosting capacity simulation effort iterated through hundreds of load and PV generation scenarios and various control types. The simulations also tested the impact of VVCs with tampered settings to understand the potential risks associated with a cyber-attack on all of the PV inverters scattered throughout a feeder. The simulation effort found that the VVC can have an insignificant role in managing the voltage when deployed in distributed roof-top PV inverters. This type of integration strategy will result in little to no harm when subjected to a successful cyber-attack that alters the VVC settings.

Recipe for coating ceramic blades for ion trapping

Stick, Daniel L.; Casias, Adrian L.

The first batches of ion traps patterned and coated were processed per the standard 3-step clean, air fire, and metallization processes. The third or fourth lot using this process resulted in poorly adhering metallization. Up until this point, the standard process had been used to metallize and pattern ceramic ion traps without fail. At about the 4th batch of parts something changed. After the 5th batch, the ceramic ion traps received generally came with some unknown contamination that does not come off in a standard 3-step clean (Lenium Vapor Degreaser, Acetone, IPA) and air fire (860°C for 1 hour), a process that removes the vast majority of contamination for most ceramic metallization. This is highly unusual. Using HF + boiling H2O2 is extreme for cleaning the ceramic ion traps. The contamination was never identified and is difficult to clean effectively. Standard as-fired ceramic should be very easy to clean, as it is fired at temperatures greater than 1400°C and little contamination should survive at these temperatures, so there must be an intermediate step or process that is imparting this contamination. It is likely a polishing compound or a contaminant from previous polishing, and it is not easily distinguished visually until after metallization. The halo marks observed on parts might be fingerprints (less likely) or polishing marks (more likely); metallization typically does not cover or hide damage or contamination but instead accentuates it. Blotchy appearances in the metallization usually indicate an adhesion issue. Because of the fragility of the parts (yield loss due to handling) and the difficulty of identifying the contamination during cleaning, we have taken the conservative approach of HF + H2O2 cleaning for all batches after the contamination and adhesion issues were identified.

PNT Resilience RFI Response

Brashar, Connor B.; Haydon, Tucker C.E.; Luong, Anh

The use of the Global Positioning System (GPS) is a fundamental requirement for most navigation systems today, and this heavy reliance means that denial of GPS service (or extended threats) can pose a significant risk to modern navigation. There is an urgent need for enabling, high-accuracy navigation technologies that can operate without GPS. Ideally, these solutions must be able to initialize in a completely GPS-free environment and continue to navigate even through challenging scenarios. The increasing risk posed to GPS means that trust in this platform is waning, and solutions are required. A future navigator should leverage GPS whenever possible and be capable of identifying and responding to risks while meeting mission accuracy needs. In the absence of GPS, fully alternative navigation (altnav) technologies are required. This report gives an introductory view of altnav for GPS-impaired and contested environments. Various technologies are collected, presented, and evaluated as potential solutions. A wide snapshot of currently available technologies is presented, with a first-order summary of their potential. While this report attempts to be as broad and complete as possible, this is a quickly evolving field.
