Challenges in Data Mining and Optimization
Abstract not provided.
Abstract not provided.
Abstract not provided.
Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if it is impossible to deduce any information by examining the bit representation of the data structure that is not already available through the API. We show how to build a history-independent cache-oblivious B-tree and a history-independent external-memory skip list. One of the main contributions is a data structure we build along the way: a history-independent packed-memory array (PMA). The PMA supports efficient range queries, one of the most important operations for answering database queries. Our HI PMA matches the asymptotic bounds of prior non-HI packed-memory arrays and sparse tables. Specifically, a PMA maintains a dynamic set of elements in sorted order in a linear-sized array. Inserts and deletes take an amortized O(log^2 N) element moves with high probability. Simple experiments with our implementation of HI PMAs corroborate our theoretical analysis. Comparisons to regular PMAs give preliminary indications that the practical cost of adding history independence is not too large. Our HI cache-oblivious B-tree bounds match those of prior non-HI cache-oblivious B-trees. Searches take O(log_B N) I/Os; inserts and deletes take O((log^2 N)/B + log_B N) amortized I/Os with high probability; and range queries returning k elements take O(log_B N + k/B) I/Os. Our HI external-memory skip list achieves optimal bounds with high probability, analogous to in-memory skip lists: O(log_B N) I/Os for point queries and amortized O(log_B N) I/Os for inserts/deletes. Range queries returning k elements run in O(log_B N + k/B) I/Os. In contrast, the best possible high-probability bound for inserting into the folklore B-skip list, which promotes elements with probability 1/B, is just Θ(log N) I/Os. This is no better than the bound one gets from running an in-memory skip list in external memory.
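To make the packed-memory-array idea concrete, here is a minimal, illustrative sketch (not the paper's construction): keep the keys in sorted order in an array with gaps, and re-spread or grow the array when it gets too dense. The class name, density threshold, and rebalance strategy are invented for this example, and the sketch is not history independent; the paper's HI PMA additionally randomizes the layout so that the bit representation depends only on the current set of keys.

```python
# A deliberately simplified packed-memory-array (PMA) sketch.
# NOT history independent and NOT the paper's construction; it only
# illustrates the basic idea: sorted keys in an array with gaps,
# re-spread (or grown) when the array gets too dense.
import bisect

class SimplePMA:                         # name invented for this example
    def __init__(self, capacity=8):
        self.slots = [None] * capacity   # array with gaps
        self.n = 0                       # number of stored keys

    def _items(self):
        return [x for x in self.slots if x is not None]

    @staticmethod
    def _spread(items, capacity):
        """Lay out `items` (sorted) evenly across `capacity` slots."""
        slots = [None] * capacity
        for i, x in enumerate(items):
            slots[(i * capacity) // len(items)] = x
        return slots

    def insert(self, key):
        items = self._items()
        bisect.insort(items, key)        # logical sorted insert
        self.n += 1
        capacity = len(self.slots)
        if self.n > (3 * capacity) // 4: # illustrative density threshold
            capacity *= 2                # grow, then re-spread everything
        # A real PMA re-spreads only the smallest enclosing window whose
        # density is out of range; that is where the amortized
        # O(log^2 N) element-move bound comes from.
        self.slots = self._spread(items, capacity)

    def range_query(self, lo, hi):
        """One left-to-right scan; gaps are simply skipped."""
        return [x for x in self.slots if x is not None and lo <= x <= hi]

pma = SimplePMA()
for k in [42, 7, 19, 3, 88, 55]:
    pma.insert(k)
print(pma.range_query(10, 60))           # -> [19, 42, 55]
```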
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
ACM International Conference Proceeding Series
In recent work we quantified the anticipated performance boost when a sorting algorithm is modified to leverage user-addressable "near-memory," which we call scratchpad. This architectural feature is expected in the Intel Knights Landing processors that will be used in DOE's next large-scale supercomputer. This paper expands our analytical study of the scratchpad to consider k-means clustering, a classical data-analysis technique that is ubiquitous in the literature and in practice. We present new theoretical results using the model introduced in [13], which measures memory transfers and assumes that computations are memory-bound. Our theoretical results indicate that scratchpad-aware versions of k-means clustering can expect performance boosts for high-dimensional instances with relatively few cluster centers. These constraints may limit the practical impact of scratchpad for k-means acceleration, so we discuss their origins and practical implications. We corroborate our theory with experimental runs on a system instrumented to mimic one with scratchpad memory. We also contribute a semi-formalization of the computational properties that are necessary and sufficient to predict a performance boost from scratchpad-aware variants of algorithms. We have observed and studied these properties in the context of sorting, and now clustering. We conclude with some thoughts on the application of these properties to new areas. Specifically, we believe that dense linear algebra has similar properties to k-means, while sparse linear algebra and FFT computations are more similar to sorting. The sparse operations are more common in scientific computing, so we expect scratchpad to have significant impact in that area.
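As a point of reference for the access pattern the analysis reasons about, here is a plain Lloyd's-iteration sketch of k-means (illustrative only, not the paper's scratchpad-aware implementation; all names are assumptions for the example). Each pass streams every point once, while the small k x d table of centers is reused for every point, which is why few centers and high dimension is the favorable regime for a scratchpad.

```python
# A plain Lloyd's-iteration sketch of k-means (illustrative only; not
# the paper's scratchpad-aware code).  Note the access pattern: each
# iteration streams all N points once, while the small k x d array of
# centers is reused for every point -- that is the working set a
# scratchpad could hold.
import numpy as np

def lloyd_iteration(points, centers):
    """One k-means step: assign points to nearest centers, then move
    each center to the mean of its assigned points."""
    # Squared distance from every point to every center: shape (N, k).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    new_centers = centers.copy()
    for j in range(centers.shape[0]):
        members = points[assign == j]
        if len(members):
            new_centers[j] = members.mean(axis=0)
    return new_centers, assign

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))                    # high-dimensional points
C = X[rng.choice(len(X), size=4, replace=False)]   # few centers (k = 4)
for _ in range(10):
    C, labels = lloyd_iteration(X, C)
```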
Statistical Analysis and Data Mining
Geospatial semantic graphs provide a robust foundation for representing and analyzing remote sensor data. In particular, they support a variety of pattern search operations that capture the spatial and temporal relationships among the objects and events in the data. However, in the presence of large data corpora, even a carefully constructed search query may return a large number of unintended matches. This work considers the problem of calculating a quality score for each match to the query, given that the underlying data are uncertain. We present a preliminary evaluation of three methods for determining both match quality scores and associated uncertainty bounds, illustrated in the context of an example based on overhead imagery data.
Abstract not provided.
This project evaluates the effectiveness of moving target defense (MTD) techniques using a new game we have designed, called PLADD, inspired by the game FlipIt [28]. PLADD extends FlipIt by incorporating what we believe are key MTD concepts. We have analyzed PLADD and proven the existence of a defender strategy that pushes a rational attacker out of the game, demonstrated how limited the strategies available to an attacker are in PLADD, and derived analytic expressions for the expected utility of the game’s players in multiple game variants. We have created an algorithm for finding a defender’s optimal PLADD strategy. We show that in the special case of achieving deterrence in PLADD, MTD is not always cost effective and that its optimal deployment may shift abruptly from not using MTD at all to using it as aggressively as possible. We believe our effort provides basic, fundamental insights into the use of MTD, but conclude that a truly practical analysis requires model selection and calibration based on real scenarios and empirical data. We propose several avenues for further inquiry, including (1) agents with adaptive capabilities more reflective of real world adversaries, (2) the presence of multiple, heterogeneous adversaries, (3) computational game theory-based approaches such as coevolution to allow scaling to the real world beyond the limitations of analytical analysis and classical game theory, (4) mapping the game to real-world scenarios, (5) taking player risk into account when designing a strategy (in addition to expected payoff), (6) improving our understanding of the dynamic nature of MTD-inspired games by using a martingale representation, defensive forecasting, and techniques from signal processing, and (7) using adversarial games to develop inherently resilient cyber systems.
This report summarizes preliminary research into uncertainty quantification for pattern analytics within the context of the Pattern Analytics to Support High-Performance Exploitation and Reasoning (PANTHER) project. The primary focus of PANTHER was to make large quantities of remote sensing data searchable by analysts. The work described in this report adds nuance to both the initial data preparation steps and the search process. Search queries are transformed from "does the specified pattern exist in the data?" to "how certain is the system that the returned results match the query?" We show example results for both data processing and search, and discuss a number of possible improvements for each.
Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015
We present a new distributed model for graph computations motivated by limited information sharing. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic-size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have results in both models for s-t connectivity, one of the simplest graph problems that requires global information in the worst case. In the limited-sharing model, our results exploit social network structure. Standard communication complexity gives polynomial lower bounds on s-t connectivity for general graphs. However, if the graph for each data center has a giant component and these giant components intersect, then we can overcome this lower bound, computing s-t connectivity while exchanging O(log^2 n) bits for a constant number of data centers. We can also test the assumption that the giant components overlap using O(log^2 n) bits provided the (unknown) overlap is sufficiently large. The second result is in the low-trust model. We give a secure multi-party computation (MPC) algorithm that 1) does not make cryptographic assumptions when there are 3 or more entities, and 2) is efficient, especially when compared to the usual garbled-circuit approach. The entities learn only the yes/no answer. No party learns anything about the others' graphs, not even node names. This algorithm does not require any special graph structure. This secure MPC result for s-t connectivity is one of the first that involves a few parties computing on large inputs, instead of many parties computing on a few local values.
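The following toy two-party sketch illustrates only the limited-sharing intuition under the abstract's assumptions (each graph has a giant component and the giant components overlap). It is not the paper's protocol: the function names, sample size, and use of networkx are assumptions for the example; it exchanges raw node identifiers rather than O(log^2 n) carefully chosen bits; and it says nothing about the low-trust MPC variant.

```python
# Toy illustration of the limited-sharing idea (see the caveats above).
import random
import networkx as nx

def giant_component(G):
    """Largest connected component, as a set of nodes."""
    return max(nx.connected_components(G), key=len)

def st_connected_union(G1, G2, s, t, sample_size=32):
    """Returns True in the 'easy' case the abstract describes: the giant
    components overlap and s and t each reach a giant component locally.
    Returns None when this naive test is inconclusive (it never proves
    disconnection)."""
    gc1, gc2 = giant_component(G1), giant_component(G2)

    # Each side reveals only a small random sample of its giant
    # component; a nonempty intersection of the samples is evidence
    # that the giant components overlap.
    s1 = set(random.sample(sorted(gc1), min(sample_size, len(gc1))))
    s2 = set(random.sample(sorted(gc2), min(sample_size, len(gc2))))
    if not (s1 & s2):
        return None

    def reaches_a_giant(v):
        in1 = v in G1 and nx.has_path(G1, v, next(iter(gc1)))
        in2 = v in G2 and nx.has_path(G2, v, next(iter(gc2)))
        return in1 or in2

    # Overlapping giant components merge in the union graph, so if s and
    # t each reach one of them, s and t are connected overall.
    return True if (reaches_a_giant(s) and reaches_a_giant(t)) else None
```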
This report summarizes the work performed under the project Next-Generation Algorithms for Assessing Infrastructure Vulnerability and Optimizing System Resilience. The goal of the project was to improve mathematical programming-based optimization technology for infrastructure protection. In general, the owner of a network wishes to design a network that can perform well when certain transportation channels are inhibited (e.g., destroyed) by an adversary. These are typically bi-level problems where the owner designs a system, an adversary optimally attacks it, and then the owner can recover by optimally using the remaining network. This project funded three years of Deon Burchett's graduate research. Deon's graduate advisor, Professor Jean-Philippe Richard, and his Sandia advisors, Richard Chen and Cynthia Phillips, supported Deon on other funds or volunteer time. This report is, therefore, essentially a replication of the Ph.D. dissertation it funded [12] in a format required for project documentation. The thesis included some general polyhedral research. This is the study of the structure of the feasible region of mathematical programs, such as integer programs. For example, an integer program optimizes a linear objective function subject to linear constraints and (nonlinear) integrality constraints on the variables. The feasible region without the integrality constraints is a convex polyhedron. Careful study of additional valid constraints can significantly improve computational performance. Here is the abstract from the dissertation: We perform a polyhedral study of a multi-commodity generalization of variable upper bound flow models. In particular, we establish some relations between facets of single- and multi-commodity models. We then introduce a new family of inequalities, which generalizes traditional flow cover inequalities to the multi-commodity context. We present encouraging numerical results. We also consider the directed edge-failure resilient network design problem (DRNDP). This problem entails the design of a directed multi-commodity flow network that is capable of fulfilling a specified percentage of demands in the event that any G arcs are destroyed, where G is a constant parameter. We present a formulation of DRNDP and solve it in a branch-column-cut framework. We present computational results.
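For readers unfamiliar with the terminology, a generic integer program (purely illustrative, not a model from the dissertation) has the form below; dropping the integrality requirement leaves a linear program whose feasible region is a convex polyhedron, and valid inequalities such as flow covers tighten that relaxation.

```latex
\begin{align*}
\min_{x}\ \ & c^{\top} x \\
\text{s.t.}\ \ & A x \le b, \\
               & x \ge 0,\ \ x \in \mathbb{Z}^{n}
               \quad\text{(omit $x \in \mathbb{Z}^{n}$ for the LP relaxation)}
\end{align*}
```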
We present new algorithms for a distributed model for graph computations motivated by limited information sharing, which we first discussed in [20]. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic-size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have algorithms for s-t connectivity in both models. We also give an algorithm in the low-communication model for finding a planted clique. This is an anomaly-detection problem: finding a subgraph that is larger and denser than expected. For both of the low-communication algorithms, we exploit structural properties of social networks to prove performance bounds better than what is possible for general graphs. For s-t connectivity, we use known properties. For planted clique, we propose a new property: a bounded number of triangles per node. This property is based upon evidence from the social science literature. We found that classic examples of social networks do not have the bounded-triangles property. This is because many social networks contain elements that are non-human, such as accounts for a business, or other automated accounts. We describe some initial attempts to distinguish human nodes from automated nodes in social networks based only on topological properties.
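As an illustration of how the proposed bounded-triangles property could be checked on a concrete graph, here is a small sketch; the function names and the threshold value are invented for the example and are not taken from the paper.

```python
# Small sketch for checking the proposed bounded-triangles property.
# The threshold and names are illustrative, not values from the paper.
from itertools import combinations

def triangles_per_node(adj):
    """adj: dict node -> set of neighbors (simple undirected graph).
    Returns node -> number of triangles containing that node."""
    count = {v: 0 for v in adj}
    for v, nbrs in adj.items():
        # Every triangle through v appears exactly once as a pair of
        # v's neighbors that are themselves adjacent.
        for u, w in combinations(nbrs, 2):
            if w in adj[u]:
                count[v] += 1
    return count

def nodes_violating_bound(adj, threshold=100):
    """Nodes with more triangles than the (illustrative) bound --
    candidates for non-human/automated accounts per the discussion."""
    return [v for v, c in triangles_per_node(adj).items() if c > threshold]

# Toy example: in a 4-clique every node lies in 3 triangles.
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
print(triangles_per_node(adj))   # {0: 3, 1: 3, 2: 3, 3: 3}
```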
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Internet Mathematics
Listing all triangles is a fundamental graph operation. Triangles can have important interpretations in real-world graphs, especially social and other interaction networks. Despite the lack of provably efficient (linear, or slightly superlinear) worst-case algorithms for this problem, practitioners run simple, efficient heuristics to find all triangles in graphs with millions of vertices. How are these heuristics exploiting the structure of these special graphs to provide major speedups in running time? We study one of the most prevalent algorithms used by practitioners. A trivial algorithm enumerates all paths of length 2 and checks whether each such path closes into a triangle. A good heuristic is to enumerate only those paths of length 2 in which the middle vertex has the lowest degree. It is easily implemented and is empirically known to give remarkable speedups over the trivial algorithm. We study the behavior of this algorithm over graphs with heavy-tailed degree distributions, a defining feature of real-world graphs. The erased configuration model (ECM) efficiently generates a graph with asymptotically (almost) any desired degree sequence. We show that the expected running time of this algorithm over the distribution of graphs created by the ECM is controlled by the ℓ_{4/3}-norm of the degree sequence. Norms of the degree sequence are a measure of the heaviness of the tail, and it is precisely this feature that allows nontrivial speedups of simple triangle enumeration algorithms. As a corollary of our main theorem, we prove expected linear-time performance for degree sequences following a power law with exponent α ≥ 7/3, and nontrivial speedup whenever α ∈ (2, 3).
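A short sketch of the heuristic as it is stated in the abstract: use a vertex as the middle of a length-2 path only when it has the (weakly) lowest degree among the three vertices, then check whether the path closes into a triangle. The tie-breaking rule and function names here are illustrative choices, not taken from the paper.

```python
# Sketch of the low-degree-middle-vertex heuristic described above.
# Tie-breaking by node id is an illustrative choice.
from itertools import combinations

def list_triangles(adj):
    """adj: dict node -> set of neighbors (simple undirected graph)."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    rank = lambda v: (deg[v], v)      # order by degree, then by id
    triangles = set()
    for v, nbrs in adj.items():
        # Use v as the middle of a length-2 path only when it is the
        # lowest-ranked of the three vertices; this is what yields the
        # speedup on heavy-tailed degree distributions.
        higher = [u for u in nbrs if rank(u) > rank(v)]
        for u, w in combinations(higher, 2):
            if w in adj[u]:           # u-w edge closes the triangle
                triangles.add(tuple(sorted((v, u, w))))
    return triangles

# Toy example: edges 0-1, 0-2, 0-3, 1-2, 2-3.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
print(sorted(list_triangles(adj)))    # [(0, 1, 2), (0, 2, 3)]
```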
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
We study several natural instances of the geometric hitting set problem for input consisting of sets of line segments (and rays, lines) having a small number of distinct slopes. These problems model path monitoring (e.g., on road networks) using the fewest sensors (the "hitting points"). We give approximation algorithms for cases including (i) lines of 3 slopes in the plane, (ii) vertical lines and horizontal segments, (iii) pairs of horizontal/vertical segments. We give hardness and hardness-of-approximation results for these problems. We prove that the hitting set problem for vertical lines and horizontal rays is polynomially solvable.
Abstract not provided.