Detecting Anomalous Event Timings in Cybersecurity Logs
Abstract not provided.
IS&T International Symposium on Electronic Imaging Science and Technology
Computational modeling frequently generates sets of related simulation runs, known as ensembles. These simulations often output 3D surface mesh data, where the geometry and variable values of the mesh change with each time step. Comparing these ensembles depends on comparing not only geometric properties, but also associated field data. In this paper, we propose a new metric for comparing mesh geometry combined with field data variables. Our measure is a generalization of the well-known Metro algorithm used in mesh simplification. The Metro algorithm can compare two meshes but does not consider field variables. Our metric evaluates a single variable in combination with the mesh geometry. Combining our metric with multidimensional scaling, we visualize a low-dimensional representation of all the time steps from a set of example ensembles to demonstrate the effectiveness of this approach.
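As a rough illustration of comparing geometry together with a single field variable, the sketch below computes a symmetric, Hausdorff-style distance between two sets of surface samples, with the field value appended as an extra scaled coordinate. This is not the metric from the paper: sample-to-sample distances stand in for the sample-to-surface distances used by Metro, and the function names and the `field_weight` parameter are assumptions made for this example.

```python
import math

def one_sided_distance(points_a, values_a, points_b, values_b, field_weight=1.0):
    """Largest distance from any sample of A to its closest sample of B.

    Each sample is treated as a 4D point (x, y, z, field_weight * value),
    so geometry and the field variable contribute to one distance.
    """
    worst = 0.0
    for p, v in zip(points_a, values_a):
        closest = min(
            math.sqrt(math.dist(p, q) ** 2 + (field_weight * (v - w)) ** 2)
            for q, w in zip(points_b, values_b)
        )
        worst = max(worst, closest)
    return worst

def symmetric_distance(points_a, values_a, points_b, values_b, field_weight=1.0):
    """Symmetrize by taking the larger of the two one-sided distances."""
    return max(
        one_sided_distance(points_a, values_a, points_b, values_b, field_weight),
        one_sided_distance(points_b, values_b, points_a, values_a, field_weight),
    )

# Two hypothetical meshes with identical geometry but different field values.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
geometry_only = symmetric_distance(pts, [0.0, 0.0], pts, [0.0, 3.0], field_weight=0.0)
with_field = symmetric_distance(pts, [0.0, 0.0], pts, [0.0, 3.0], field_weight=1.0)
```

Setting `field_weight` to zero reduces the measure to a purely geometric comparison, which makes the effect of the field variable easy to isolate.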
Computer Methods in Applied Mechanics and Engineering
The phase-field method is a popular modeling technique used to describe the dynamics of microstructures and their physical properties at the mesoscale. However, because in these simulations the microstructure is described by a system of continuous variables evolving both in space and time, phase-field models are computationally expensive. They require refined spatio-temporal discretization and a parallel computing approach to achieve a useful degree of accuracy. As an alternative, we present and discuss an accelerated phase-field approach which uses a recurrent neural network (RNN) to learn the microstructure evolution in latent space. We perform a comprehensive analysis of different dimensionality-reduction methods and types of recurrent units in RNNs. Specifically, we compare statistical functions combined with linear and nonlinear embedding techniques to represent the microstructure evolution in latent space. We also evaluate several RNN models that implement a gating mechanism, including the long short-term memory (LSTM) unit and the gated recurrent unit (GRU) as the microstructure-learning engine. We analyze the different combinations of these methods on the spinodal decomposition of a two-phase system. Our comparison reveals that describing the microstructure evolution in latent space using an autocorrelation-based principal component analysis (PCA) method is the most efficient. We find that the LSTM and GRU RNN implementations provide comparable accuracy with respect to the high-fidelity phase-field predictions, but with a considerable computational speedup relative to the full simulation. This study not only enhances our understanding of the performance of dimensionality reduction on the microstructure evolution, but it also provides insights on strategies for accelerating phase-field modeling via machine learning techniques.
The Dial-A-Cluster (DAC) model allows interactive visualization of multivariate time series data. A multivariate time series dataset consists of an ensemble of data points, where each data point consists of a set of time series curves. The example of a DAC dataset used in this guide is a collection of 100 cities in the United States, where each city collects a year's worth of weather data, including daily temperature, humidity, and wind speed measurements.
This report describes the results of a seven-day effort to help subject matter experts address a problem related to COVID-19. In the course of this effort, we analyzed the 29K documents provided as part of the White House's call to action. This involved applying a variety of natural language processing techniques and compression-based analytics, in combination with visualization techniques and assessment by subject matter experts, to pursue answers to a specific question. In this paper, we describe the algorithms, the software, the study performed, and the availability of the software developed during the effort.
IS&T International Symposium on Electronic Imaging Science and Technology
We present VideoSwarm, a system for visualizing video ensembles generated by numerical simulations. VideoSwarm is a web application, where linked views of the ensemble each represent the data using a different level of abstraction. VideoSwarm uses multidimensional scaling to reveal relationships between a set of simulations relative to a single moment in time, and to show the evolution of video similarities over a span of time. VideoSwarm is a plug-in for Slycat, a web-based visualization framework which provides a web-server, database, and Python infrastructure. The Slycat framework provides support for managing multiple users, maintains access control, and requires only a Slycat supported commodity browser (such as Firefox, Chrome, or Safari).
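To make the role of multidimensional scaling concrete, here is a minimal sketch of classical (Torgerson) MDS, which places items in a low-dimensional space so that their pairwise distances approximate a given distance matrix. The abstract does not say which MDS variant VideoSwarm uses, so the choice of classical MDS, the power-iteration eigen-solver, and the name `classical_mds` are all assumptions of this example.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Embed n items in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering turns squared distances into a Gram (inner product) matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Deterministic start vector, normalized to unit length.
        v = [math.sin(i + 1.0 + d) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        # Power iteration converges to the dominant remaining eigenvector.
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        if eigval > 1e-12:
            for i in range(n):
                coords[i][d] = math.sqrt(eigval) * v[i]
            # Deflate so the next pass finds the next eigenvector.
            for i in range(n):
                for j in range(n):
                    b[i][j] -= eigval * v[i] * v[j]
    return coords

# Three hypothetical simulations whose pairwise distances fit on a line.
distances = [[0.0, 1.0, 3.0],
             [1.0, 0.0, 2.0],
             [3.0, 2.0, 0.0]]
embedding = classical_mds(distances, dims=2)
```

Recomputing the embedding from the distance matrix of each time step is one way to obtain a sequence of layouts that shows how similarities evolve.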
Proceedings of the 28th International Meshing Roundtable, IMR 2019
We describe new machine-learning-based methods to defeature CAD models for tetrahedral meshing. Using machine learning predictions of mesh quality for geometric features of a CAD model prior to meshing, we can identify potential problem areas and improve meshing outcomes by presenting a prioritized list of suggested geometric operations to users. Our machine learning models are trained using a combination of geometric and topological features from the CAD model and local quality metrics for ground truth. We demonstrate a proof-of-concept implementation of the resulting workflow using Sandia's Cubit Geometry and Meshing Toolkit.
Slycat™ is a web-based system for performing data analysis and visualization of potentially large quantities of remote, high-dimensional data. Slycat™ specializes in working with ensemble data. An ensemble is a group of related data sets, which typically consists of a set of simulation runs exploring the same problem space. An ensemble can be thought of as a set of samples within a multivariate domain, where each sample is a vector whose value defines a point in high-dimensional space. To understand and describe the underlying problem being modeled in the simulations, ensemble analysis looks for shared behaviors and common features across the group of runs. Additionally, ensemble analysis tries to quantify differences found in any members that deviate from the rest of the group. The Slycat™ system integrates data management, scalable analysis, and visualization. Results are viewed remotely on a user’s desktop via commodity web clients using a multi-tiered hierarchy of computation and data storage, as shown in Figure 1. Our goal is to operate on data as close to the source as possible, thereby reducing time and storage costs associated with data movement. Consequently, we are working to develop parallel analysis capabilities that operate on High Performance Computing (HPC) platforms, to explore approaches for reducing data size, and to implement strategies for staging computation across the Slycat™ hierarchy. Within Slycat™, data and visual analysis are organized around projects, which are shared by a project team. Project members are explicitly added, each with a designated set of permissions. Although users sign in to access Slycat™, individual accounts are not maintained. Instead, authentication is used to determine project access. Within projects, Slycat™ models capture analysis results and enable data exploration through various visual representations.
Although for scientists each simulation run is a model of real-world phenomena given certain conditions, we use the term model to refer to our modeling of the ensemble data, not the physics. Different model types often provide complementary perspectives on data features when analyzing the same data set. Each model visualizes data at several levels of abstraction, allowing the user to range from viewing the ensemble holistically to accessing numeric parameter values for a single run. Bookmarks provide a mechanism for sharing results, enabling interesting model states to be labeled and saved.
Molecular Informatics
We seek to optimize ionic liquids (ILs) for application to redox flow batteries. As part of this effort, we have developed a computational method for suggesting ILs with high conductivity and low viscosity. Since ILs consist of cation-anion pairs, we consider a method for treating ILs as pairs using product descriptors for QSPRs, a concept borrowed from the prediction of protein-protein interactions in bioinformatics. We demonstrate the method by predicting electrical conductivity, viscosity, and melting point on a dataset taken from the ILThermo database on June 18th, 2014. The dataset consists of 4,329 measurements taken from 165 ILs made up of 72 cations and 34 anions. We benchmark our QSPRs on the known values in the dataset, then extend our predictions to screen all 2,448 possible cation-anion pairs in the dataset.
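The pairing idea can be sketched as follows, assuming that a product descriptor is the flattened outer product of a cation descriptor vector and an anion descriptor vector. The abstract does not spell out the exact construction, so this form, the function names, and the placeholder descriptor values are assumptions. The example also confirms that 72 cations and 34 anions yield 72 × 34 = 2,448 candidate pairs.

```python
from itertools import product

def product_descriptor(cation_desc, anion_desc):
    """Flattened outer product: every cation feature times every anion feature."""
    return [c * a for c in cation_desc for a in anion_desc]

def enumerate_pairs(cations, anions):
    """Build a product descriptor for every cation-anion combination.

    `cations` and `anions` map an ion name to its descriptor vector.
    """
    return {
        (c_name, a_name): product_descriptor(c_vec, a_vec)
        for (c_name, c_vec), (a_name, a_vec) in product(cations.items(), anions.items())
    }

# Placeholder ions with made-up two-feature descriptors (not real chemistry).
cations = {f"cation_{i}": [1.0, float(i)] for i in range(72)}
anions = {f"anion_{j}": [1.0, float(j)] for j in range(34)}

pairs = enumerate_pairs(cations, anions)
num_pairs = len(pairs)
example = product_descriptor([1, 2], [3, 4, 5])
```

A model trained on descriptors for the 165 measured ILs could then be applied to every entry of `pairs` to screen combinations that have not been measured.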