The purpose of this LDRD is to develop technology that allows warfighters to issue high-level commands to their unmanned assets, freeing them to command a group of assets or devote the bulk of their attention elsewhere. To this end, a brain-emulating cognition and control architecture (BECCA) was developed, incorporating novel and uniquely capable feature-creation and reinforcement learning algorithms. BECCA was demonstrated on both a mobile manipulator platform and a seven-degree-of-freedom serial-link robot arm. Existing military ground robots are almost universally teleoperated and occupy the complete attention of an operator; they may remove a soldier from harm's way, but they do not necessarily reduce manpower requirements. Current research efforts to solve the problem of autonomous operation in an unstructured, dynamic environment fall short of the desired performance. To increase the effectiveness of unmanned vehicle (UV) operators, we proposed to develop robots that can be 'directed' rather than remote-controlled: robots that are instructed and trained by human operators rather than driven. The technical approach is modeled closely on psychological and neuroscientific models of human learning. Two Sandia-developed models are used in this effort: the Sandia Cognitive Framework (SCF), a cognitive-psychology-based model of human cognitive processes, and BECCA, a psychophysics-based model of learning, motor control, and conceptualization. Together, these models span the functional space from perceptuo-motor abilities to high-level motivational and attentional processes.
This report documents our first-year effort to address the use of many-core processors for high-performance cyber protection. As demands grow for higher bandwidth (beyond 1 Gbit/s) on network connections, the need for faster and more efficient cyber security solutions grows with them. Fortunately, the development of many-core network processors has seen increased interest in recent years. Prior working experience with many-core processors led us to investigate their effectiveness for cyber protection tools, with particular emphasis on high-performance firewalls. Although advanced algorithms for smarter cyber protection of high-speed network traffic are being developed, these analysis techniques require significantly more computational capability than static techniques. Moreover, many locations where cyber protections are deployed have limited power, space, and cooling resources. This makes traditionally large computing systems impractical for the front-end systems that process large network streams; hence the motivation for this study, which could potentially yield a highly reconfigurable and rapidly scalable solution.
In [3] we proposed a new Control Volume Finite Element Method with multi-dimensional, edge-based Scharfetter-Gummel upwinding (CVFEM-MDEU). This report follows up with a detailed computational study of the method. The study compares CVFEM-MDEU with other CVFEM and FEM formulations on a set of standard scalar advection-diffusion test problems in two dimensions. The first two CVFEM formulations are derived from CVFEM-MDEU by simplifying the computation of the flux integrals on the sides of the control volumes; the third is the nodal CVFEM [2] without upwinding, and the fourth is the streamline upwind version of CVFEM [10]. The finite element formulations in our study are the standard Galerkin, SUPG, and artificial diffusion methods. All studies employ logically Cartesian partitions of the unit square into quadrilateral elements; both uniform and non-uniform grids are considered. Our results demonstrate that CVFEM-MDEU and its simplified versions perform equally well on rectangular or nearly rectangular grids. However, the performance of the simplified versions degrades significantly on non-affine grids, whereas CVFEM-MDEU remains stable and accurate over a wide range of mesh Peclet numbers and non-affine grids. Compared to the FEM formulations, CVFEM-MDEU appears to be slightly more dissipative than SUPG, but exhibits much smaller local overshoots and undershoots.
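For readers unfamiliar with the scheme, the one-dimensional Scharfetter-Gummel flux that the edge-based MDEU construction generalizes can be written as follows; this is a background sketch in our own notation, not reproduced from [3], where D is the diffusion coefficient, v the advection velocity along the edge, h the edge length, and u_i, u_j the nodal values:

\[
F_{ij} \;=\; \frac{D}{h}\Big[\,B(-\mathrm{Pe})\,u_i \;-\; B(\mathrm{Pe})\,u_j\,\Big],
\qquad
B(x) \;=\; \frac{x}{e^{x}-1},
\qquad
\mathrm{Pe} \;=\; \frac{v\,h}{D}.
\]

The Bernoulli weight B blends the edge flux smoothly between central differencing (Pe → 0) and full upwinding (|Pe| → ∞), which is consistent with the stability over a wide range of mesh Peclet numbers reported above.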
Recognition of the importance of power in the field of High Performance Computing, whether as an obstacle, an expense, or a design consideration, has never been greater or more pervasive. While research has been conducted on many related aspects, there is a stark absence of work focused on large-scale High Performance Computing, due in part to the lack of measurement capability currently available on small or large platforms. Typically, research relies on coarse measurement methods, such as inserting a power meter between the power source and the platform, or on fine-grained measurements using custom-instrumented boards (with obvious limitations in scale). To collect the measurements necessary to analyze real scientific computing applications at large scale, an in-situ measurement capability must exist on a large-scale, capability-class platform. In response to this challenge, we exploit the unique power measurement capabilities of the Cray XT architecture to gain an understanding of power use and the effects of tuning. We apply these capabilities at the operating system level by deterministically halting cores when they are idle. At the application level, we characterize the power requirements of a range of important DOE/NNSA production scientific computing applications running at large scale (thousands of nodes), while simultaneously collecting current and voltage measurements on the hosting nodes. We examine the effects of both CPU and network bandwidth tuning and demonstrate energy savings opportunities of up to 39% with little or no impact on run-time performance. Capturing scale effects in our experimental results was key. Our results provide strong evidence that next-generation large-scale platforms should not only approach CPU frequency scaling differently, but could also benefit from the capability to tune other platform components, such as the network, to achieve energy-efficient performance.
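As an illustration of the kind of post-processing such in-situ measurements enable, the sketch below integrates sampled current and voltage into per-node energy and compares a baseline run against a tuned run. It is a minimal Python sketch under an assumed uniform sampling interval; the function names and sample values are hypothetical and are not taken from the report or from the Cray XT tooling.

# Minimal sketch: estimate per-node energy from in-situ current/voltage samples
# and compare a baseline configuration against a tuned one. All names and
# numbers are illustrative assumptions, not the report's data or interfaces.

def node_energy_joules(volts, amps, dt_seconds):
    """Integrate instantaneous power (V * I) over uniformly spaced samples."""
    assert len(volts) == len(amps)
    return sum(v * i for v, i in zip(volts, amps)) * dt_seconds

def energy_savings_percent(baseline_j, tuned_j):
    """Relative energy reduction of the tuned configuration vs. the baseline."""
    return 100.0 * (baseline_j - tuned_j) / baseline_j

# Example with made-up 1 Hz samples on a single node:
baseline = node_energy_joules([12.1, 12.0, 12.1], [18.0, 18.2, 18.1], dt_seconds=1.0)
tuned    = node_energy_joules([12.1, 12.0, 12.1], [11.0, 11.1, 11.0], dt_seconds=1.0)
print(f"energy savings: {energy_savings_percent(baseline, tuned):.1f}%")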
Despite its seemingly nonsensical cost, we show through modeling and simulation that redundant computation merits full consideration as a resilience strategy for next-generation systems. It has been shown that, without revolutionary breakthroughs in failure rates, part counts, or stable-storage bandwidths, the utility of exascale systems will be crushed by the overhead of traditional checkpoint/restart mechanisms. Alternate resilience strategies must be considered, and redundancy is a proven approach in many domains. We develop a distribution-independent model for job interrupts on systems of arbitrary redundancy, adapt Daly's model for total application runtime, and find that his estimate for the optimal checkpoint interval remains valid for redundant systems. We then identify the conditions under which redundancy is more cost effective than non-redundancy. The analysis is carried out in the context of the number-one-ranked supercomputers of the last decade, showing that thorough consideration of redundant computation is timely, if not overdue.
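For reference, Daly's widely cited estimate of the optimal checkpoint interval, which the study finds remains valid when applied to a redundant system, can be stated in its simplified first-order form as follows; this is background in our own notation (δ, M, τ_opt), not the report's full derivation, and assumes the checkpoint commit time is small relative to the mean time to interrupt:

\[
\tau_{\mathrm{opt}} \;\approx\; \sqrt{2\,\delta\,M}\;-\;\delta,
\qquad \delta \ll M,
\]

where δ is the time to write a checkpoint and M is the application's mean time to interrupt. Redundancy enters through M: process replication increases the effective mean time to interrupt, which lengthens the optimal checkpoint interval and thereby reduces checkpoint/restart overhead.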