Publications Search

Time Series Discord Detection in Medical Data using a Parallel Relational Database

Woodbridge, Diane M.; Foulk, James W.; Wilson, Andrew T.; Goldstein, Richard

Recent advances in sensor technology have made continuous real-time health monitoring available in both hospital and non-hospital settings. Because high-frequency medical sensors generate very large volumes of data, storing and processing continuous medical data is an emerging big data area. Detecting anomalies in real time is especially important for detecting and preventing patient emergencies. A time series discord is the subsequence that differs most from the rest of the subsequences in a time series, meaning that it shows abnormal or unusual data trends. In this study, we implemented two versions of a time series discord detection algorithm on a high-performance parallel database management system (DBMS) and applied them to 240 Hz waveform data collected from 9,723 patients. The initial brute-force version takes each possible subsequence and calculates its distance to the nearest non-self match to find the largest discords in the time series. The heuristic version uses a combination of an array and a trie structure to order the time series data and improve time efficiency. The results showed efficient data loading, decoding and discord searches over a large amount of data, benefiting from the discord detection algorithm and the architectural characteristics of the parallel DBMS, including data compression, data pipelining and task scheduling.
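The brute-force search described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' parallel-DBMS implementation. The function names, the Euclidean distance measure and the rule that a non-self match must not overlap the candidate window are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_discord(series, m):
    """Return (start_index, distance) of the top discord of length m.

    For every window, find the distance to its nearest non-self match
    (assumed here to be any window that does not overlap it), then keep
    the window whose nearest match is farthest away.
    """
    n = len(series) - m + 1  # number of candidate windows
    best_idx, best_dist = -1, -1.0
    for i in range(n):
        nearest = math.inf
        for j in range(n):
            if abs(i - j) >= m:  # skip overlapping (self) matches
                d = euclidean(series[i:i + m], series[j:j + m])
                nearest = min(nearest, d)
        if nearest != math.inf and nearest > best_dist:
            best_idx, best_dist = i, nearest
    return best_idx, best_dist


# Hypothetical toy signal: a periodic pattern with one injected spike.
signal = [0, 1, 0, -1] * 10
signal[18] = 5
idx, dist = brute_force_discord(signal, m=4)
```

On this toy signal the returned window covers the injected spike. The nested loops make the cost quadratic in the number of windows, which is the motivation for the heuristic ordering mentioned in the abstract.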

TYPE Conference Poster YEAR 2015

DOI OSTI

Facilitation of Forensic Analysis Using a Narrative Template

Wilson, Andrew T.; Forsythe, James C.; Silva, Austin R.; Hopkins, Shelby E.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Trajectory analysis via a geometric feature space approach

Statistical Analysis and Data Mining

Foulk, James W.; Wilson, Andrew T.

This study aimed to organize a body of trajectories in order to identify, search for and classify both common and uncommon behaviors among objects such as aircraft and ships. Existing comparison functions such as the Fréchet distance are computationally expensive and yield counterintuitive results in some cases. We propose an approach using feature vectors whose components represent succinctly the salient information in trajectories. These features incorporate basic information such as the total distance traveled and the distance between start/stop points as well as geometric features related to the properties of the convex hull, trajectory curvature and general distance geometry. Additionally, these features can generally be mapped easily to behaviors of interest to humans who are searching large databases. Most of these geometric features are invariant under rigid transformation. We demonstrate the use of different subsets of these features to identify trajectories similar to an exemplar, cluster a database of several hundred thousand trajectories and identify outliers.
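A few of the features named in the abstract (total distance traveled, distance between start and stop points, and a convex hull property) can be sketched in Python as follows. This is an illustrative sketch only, not the paper's feature set or code: it assumes planar (x, y) coordinates rather than geographic ones, uses hull area as the hull property, and all function names are hypothetical.

```python
import math


def path_length(points):
    """Total distance traveled along the trajectory."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def end_to_end_distance(points):
    """Straight-line distance between the start and stop points."""
    return math.dist(points[0], points[-1])


def convex_hull(points):
    """Convex hull via Andrew's monotone chain, counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def feature_vector(points):
    """(path length, end-to-end distance, convex hull area)."""
    return (
        path_length(points),
        end_to_end_distance(points),
        polygon_area(convex_hull(points)),
    )


# Hypothetical trajectory: three sides of a unit square.
track = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = feature_vector(track)
shifted = feature_vector([(x + 3.0, y + 4.0) for x, y in track])
```

Translating the trajectory leaves all three values unchanged, which illustrates the invariance under rigid transformation that the abstract notes for most of its geometric features.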

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

Finding Needles in Airborne Haystacks

Wilson, Andrew T.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

PANTHER. Trajectory Analysis

Foulk, James W.; Wilson, Andrew T.; Valicka, Christopher G.; Kegelmeyer, William P.; Shead, Timothy M.; Czuchlewski, Kristina R.; Newton, Benjamin D.

We want to organize a body of trajectories in order to identify, search for, classify and predict behavior among objects such as aircraft and ships. Existing comparison functions such as the Fréchet distance are computationally expensive and yield counterintuitive results in some cases. We propose an approach using feature vectors whose components represent succinctly the salient information in trajectories. These features incorporate basic information such as total distance traveled and distance between start/stop points as well as geometric features related to the properties of the convex hull, trajectory curvature and general distance geometry. Additionally, these features can generally be mapped easily to behaviors of interest to humans that are searching large databases. Most of these geometric features are invariant under rigid transformation. We demonstrate the use of different subsets of these features to identify trajectories similar to an exemplar, cluster a database of several hundred thousand trajectories, predict destination and apply unsupervised machine learning algorithms.

TYPE SAND Report YEAR 2015

DOI OSTI

Time Series Discord Detection in Medical Data using a Parallel Relational Database

Woodbridge, Diane M.; Foulk, James W.; Wilson, Andrew T.; Goldstein, Richard

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Characterizing and Detecting Aircraft Identity and Diversion

Kegelmeyer, William P.; Shead, Timothy M.; Foulk, James W.; Wilson, Andrew T.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Trackable: A Toolset for Trajectory Analysis

Foulk, James W.; Wilson, Andrew T.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Facilitation of Forensic Analysis Using a Narrative Template

Forsythe, James C.; Hopkins, Shelby; Silva, Austin R.; Wilson, Andrew T.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Nested Narratives Final Report

Wilson, Andrew T.; Pattengale, Nicholas D.; Forsythe, James C.; Carvey, Bradley J.

In cybersecurity forensics and incident response, the story of what has happened is the most important artifact yet the one least supported by tools and techniques. Existing tools focus on gathering and manipulating low-level data to allow an analyst to investigate exactly what happened on a host system or a network. Higher-level analysis is usually left to whatever ad hoc tools and techniques an individual may have developed. We discuss visual representations of narrative in the context of cybersecurity incidents with an eye toward multi-scale illustration of actions and actors. We envision that this representation could smoothly encompass individual packets on a wire at the lowest level and nation-state-level actors at the highest. We present progress to date, discuss the impact of technical risk on this project and highlight opportunities for future work.

TYPE SAND Report YEAR 2015

DOI OSTI

From Points to Holding Patterns: Large-Scale Analysis of Trajectory Data

Wilson, Andrew T.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Trajectory Analysis and Data Mining

Valicka, Christopher G.; Foulk, James W.; Wilson, Andrew T.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Highways in the Sky

Wilson, Andrew T.; Rintoul, Mark D.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Trajectory analysis via a geometric feature space approach

Rintoul, Mark D.; Wilson, Andrew T.; Brost, Randolph

Abstract not provided.

TYPE Conference YEAR 2014

DOI OSTI

US Civilian Air Traffic April 4 2013

Wilson, Andrew T.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Harnessing the Full Potential of Geospatial Data

Brost, Randolph; Chow, James G.; Rintoul, Mark D.; Mcnamara, Laura A.; Stracuzzi, David J.; Wilson, Andrew T.; Moya, Mary M.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Investigating the integration of supercomputers and data-warehouse appliances

Oldfield, Ron; Ulmer, Craig; Wilson, Andrew T.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Can we identify spear phishing targets before the email is sent?

Suppona, Roger A.; Wilson, Andrew T.; Doak, Justin E.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Scheduling Nuclear Weapon Stockpile Transformation

Mcnamara, Laura A.; Phillips, Cynthia A.; Wilson, Andrew T.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Identifying Dynamic Patterns in Network Traffic to Predict and Mitigate Cyberattacks

Doak, Justin E.; Wilson, Andrew T.; Suppona, Roger A.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Evaluating parallel relational databases for medical data analysis

Wilson, Andrew T.; Rintoul, Mark D.

Hospitals have always generated and consumed large amounts of data concerning patients, treatment and outcomes. As computers and networks have permeated the hospital environment it has become feasible to collect and organize all of this data. This raises naturally the question of how to deal with the resulting mountain of information. In this report we detail a proof-of-concept test using two commercially available parallel database systems to analyze a set of real, de-identified medical records. We examine database scalability as data sizes increase as well as responsiveness under load from multiple users.

TYPE SAND Report YEAR 2012

DOI OSTI

TopicView: Visually Comparing Topic Models of Text Collections

Wilson, Andrew T.; Dunlavy, Daniel M.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI DOI

TopicView Poster Preview

Wilson, Andrew T.; Dunlavy, Daniel M.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Tracking topic birth and death in LDA

Wilson, Andrew T.; Robinson, David G.

Most topic modeling algorithms that address the evolution of documents over time use the same number of topics at all times. This obscures the common occurrence in the data where new subjects arise and old ones diminish or disappear entirely. We propose an algorithm to model the birth and death of topics within an LDA-like framework. The user selects an initial number of topics, after which new topics are created and retired without further supervision. Our approach also accommodates many of the acceleration and parallelization schemes developed in recent years for standard LDA. In recent years, topic modeling algorithms such as latent semantic analysis (LSA)[17], latent Dirichlet allocation (LDA)[10] and their descendants have offered a powerful way to explore and interrogate corpora far too large for any human to grasp without assistance. Using such algorithms we are able to search for similar documents, model and track the volume of topics over time, search for correlated topics or model them with a hierarchy. Most of these algorithms are intended for use with static corpora where the number of documents and the size of the vocabulary are known in advance. Moreover, almost all current topic modeling algorithms fix the number of topics as one of the input parameters and keep it fixed across the entire corpus. While this is appropriate for static corpora, it becomes a serious handicap when analyzing time-varying data sets where topics come and go as a matter of course. This is doubly true for online algorithms that may not have the option of revising earlier results in light of new data. To be sure, these algorithms will account for changing data one way or another, but without the ability to adapt to structural changes such as entirely new topics they may do so in counterintuitive ways.

TYPE SAND Report YEAR 2011

DOI OSTI

TopicView: Understanding Document Relationships Using Latent Dirichlet Allocation Models

Wilson, Andrew T.; Dunlavy, Daniel M.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI
