Publications

Results 26–33 of 33

Search results

Staghorn: An Automated Large-Scale Distributed System Analysis Platform

Gabert, Kasimir G.; Burns, Ian B.; Elliott, Steven E.; Kallaher, Jenna M.; Vail, Adam R.

Conducting experiments on large-scale distributed computing systems is becoming significantly easier with the assistance of emulation. Researchers can now create a model of a distributed computing environment and then generate a virtual, laboratory copy of the entire system composed of potentially thousands of virtual machines, switches, and software. The use of real software, running at clock rate in full virtual machines, allows experiments to produce meaningful results without necessitating a full understanding of all model components. However, the ability to inspect and modify elements within these models is bound by the limitation that such modifications must compete with the model, either running in or alongside it. This inhibits entire classes of analyses from being conducted upon these models. We developed a mechanism to snapshot an entire emulation-based model as it is running. This allows us to "freeze time" and subsequently fork execution, replay execution, modify arbitrary parts of the model, or deeply explore the model. This snapshot includes capturing packets in transit and other input/output state along with the running virtual machines. We were able to build this system in Linux using Open vSwitch and Kernel Virtual Machines on top of Sandia's emulation platform Firewheel. This primitive opens the door to numerous subsequent analyses on models, including state space exploration, debugging distributed systems, performance optimizations, improved training environments, and improved experiment repeatability.

Complex Systems Models and Their Applications: Towards a New Science of Verification, Validation & Uncertainty Quantification

Tsao, Jeffrey Y.; Trucano, Timothy G.; Kleban, S.D.; Naugle, Asmeret B.; Verzi, Stephen J.; Swiler, Laura P.; Johnson, Curtis M.; Smith, Mark A.; Flanagan, Tatiana P.; Vugrin, Eric D.; Gabert, Kasimir G.; Lave, Matthew S.; Chen, Wei; Delaurentis, Daniel; Hubler, Alfred; Oberkampf, Bill

This report contains the written footprint of a Sandia-hosted workshop held in Albuquerque, New Mexico, June 22-23, 2016 on “Complex Systems Models and Their Applications: Towards a New Science of Verification, Validation and Uncertainty Quantification,” as well as of pre-work that fed into the workshop. The workshop’s intent was to explore and begin articulating research opportunities at the intersection between two important Sandia communities: the complex systems (CS) modeling community, and the verification, validation and uncertainty quantification (VVUQ) community. The overarching research opportunity (and challenge) that we ultimately hope to address is: how can we quantify the credibility of knowledge gained from complex systems models, knowledge that is often incomplete and interim, but will nonetheless be used, sometimes in real-time, by decision makers?

Exploration of cloud computing late start LDRD #149630 : Raincoat. v. 2.1

Edgett, Patrick G.; Gabert, Kasimir G.; Echeverria, Victor T.; Metral, Michael D.; Leger, Michelle A.; Thai, Tan Q.

This report contains documentation from an interoperability study conducted under the Late Start LDRD 149630, Exploration of Cloud Computing. A small late-start LDRD from last year resulted in a study (Raincoat) on using Virtual Private Networks (VPNs) to enhance security in a hybrid cloud environment. Raincoat initially explored the use of OpenVPN on IPv4 and demonstrated that it is possible to secure the communication channel between two small 'test' clouds (a few nodes each) at New Mexico Tech and Sandia. We extended the Raincoat study to add IPSec support via Vyatta routers, to interface with a public cloud (Amazon Elastic Compute Cloud (EC2)), and to be significantly more scalable than the previous iteration. The study contributed to our understanding of interoperability in a hybrid cloud.
