Publications

Results 1–25 of 99,299

Search results

BinSimDB: Benchmark Dataset Construction for Fine-Grained Binary Code Similarity Analysis

Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST

Zuo, Fei; Tompkins, Cody; Zeng, Qiang; Luo, Lannan; Choe, Yung R.; Rhee, Junghwan

Binary Code Similarity Analysis (BCSA) has a wide spectrum of applications, including plagiarism detection, vulnerability discovery, and malware analysis, thus drawing significant attention from the security community. However, conventional techniques often struggle to balance accuracy and scalability. To overcome these problems, a surge of deep learning-based work has recently been proposed. Unfortunately, many researchers still find it extremely difficult to conduct relevant studies or extend existing approaches. First, prior work typically relies on proprietary benchmarks without making the entire dataset publicly accessible. Consequently, large-scale, well-labeled datasets for binary code similarity analysis remain scarce. Moreover, previous work has primarily focused on comparison at the function level, rather than exploring finer granularities. Therefore, we argue that the lack of a fine-grained dataset for BCSA leaves a critical gap in current research. To address these challenges, we construct a benchmark dataset for fine-grained binary code similarity analysis called BinSimDB, which contains equivalent pairs of smaller binary code snippets, such as basic blocks. Specifically, we propose the BMerge and BPair algorithms to bridge the discrepancies between two binary code snippets caused by different optimization levels or platforms. Furthermore, we empirically study the properties of our dataset and evaluate its effectiveness for BCSA research. The experimental results demonstrate that BinSimDB significantly improves the performance of binary code similarity comparison.

Image masks of global ship tracks for NASA MODIS data products

Scientific Data

Warburton, Pierce; Shuler, Kurtis; Patel, Lekha

Ship tracks, long thin artificial cloud features formed from the pollutants in ship exhaust, are satellite-observable examples of aerosol-cloud interactions (ACI) that can lead to increased cloud albedo and thus increased solar reflectivity, phenomena of interest in solar radiation management. In addition to ship tracks being of interest to meteorologists and policy makers, their observed cloud perturbations provide benchmark evidence of ACI that remain poorly captured by climate models. To broadly analyze the effects of ship tracks, high-resolution satellite imagery data highlighting their presence are required. To support this, we provide a hand-labelled dataset to serve as a benchmark for a variety of subsequent analyses. Established from a previous dataset that identified ship track presence using NASA’s MODIS Aqua satellite imager, our first-of-its-kind dataset comprises image masks capturing full ship track regions, including their contours, emission points and dispersive patterns. In total, 300 images, or around 2,500 masked ship tracks, observed under varying conditions are provided, and may facilitate training of machine learning algorithms to automate extraction.

Analysis of the Trusted Inertial Terrain-Aided Navigation Measurement Function

Navigation, Journal of the Institute of Navigation

Haydon, Tucker; Huang, Andy; Humphreys, Todd E.

The trusted inertial terrain-aided navigation (TITAN) algorithm leverages an airborne vertical synthetic aperture radar to measure the range to the closest ground points along several prescribed iso-Doppler contours. These TITAN minimum-range, prescribed-Doppler measurements are the result of a constrained nonlinear optimization problem whose optimization function and constraints both depend on the radar position and velocity. Owing to the complexity of this measurement definition, analysis of the TITAN algorithm is lacking in prior work. This publication offers such an analysis, making the following three contributions: (1) an analytical solution to the TITAN constrained optimization measurement problem, (2) a derivation of the TITAN measurement function Jacobian, and (3) a derivation of the Cramér–Rao lower bound on the estimated position and velocity error covariance. These three contributions are verified via Monte Carlo simulations over synthetic terrain, which further reveal two remarkable properties of the TITAN algorithm: (1) the along-track positioning errors tend to be smaller than the cross-track positioning errors, and (2) the cross-track positioning errors are independent of the terrain roughness.

A Multivariate Space‐Time Dynamic Model for Characterizing the Atmospheric Impacts Following the Mt. Pinatubo Eruption

Environmetrics

Garrett, Robert C.; Shand, Lyndsay; Huerta, Jose G.

The June 1991 Mt. Pinatubo eruption resulted in a massive increase of sulfate aerosols in the atmosphere, absorbing radiation and leading to global changes in surface and stratospheric temperatures. A volcanic eruption of this magnitude serves as a natural analog for stratospheric aerosol injection, a proposed solar radiation modification method to combat a warming climate. The impacts of such an event are multifaceted and region-specific. Our goal is to characterize the multivariate and dynamic nature of the atmospheric impacts following the Mt. Pinatubo eruption. We developed a multivariate space-time dynamic linear model to understand the full extent of the spatially- and temporally-varying impacts. Specifically, spatial variation is modeled using a flexible set of basis functions for which the basis coefficients are allowed to vary in time through a vector autoregressive (VAR) structure. This novel model is cast in a Dynamic Linear Model (DLM) framework and estimated via a customized MCMC approach. We demonstrate how the model quantifies the relationships between key atmospheric parameters prior to and following the Mt. Pinatubo eruption with reanalysis data from MERRA-2 and highlight when such a model is advantageous over univariate models.

Quantum materials for nanosensing and fault-tolerant quantum computing

Cuozzo, Joseph J.

New concepts of symmetry related to topological order emerged from the discovery of the fractional quantum Hall effect and high-temperature superconductivity in strongly correlated electron systems. This led to the study of quantum materials-- materials exhibiting emergent quantum phenomena with no classical analogues. While these materials have engendered exciting basic materials science and physics, realizing novel devices is a key challenge in the field. The goal of this proposal is to harnes

Tunable reciprocal and nonreciprocal contributions to 1D Coulomb drag

Nature Communications

Zheng, Mingyang; Makaju, Rebika; Gazizulin, Rasul; Addamane, Sadhvikas J.; Laroche, Dominique

Coulomb drag is a powerful tool to study interactions in coupled low-dimensional systems. Historically, Coulomb drag has been attributed to a frictional force arising from momentum transfer whose direction is dictated by the current flow. In the absence of electron-electron correlations, treating the Coulomb drag circuit as a rectifier of noise fluctuations yields similar conclusions about the reciprocal nature of Coulomb drag. In contrast, recent findings in one-dimensional systems have identified a nonreciprocal contribution to Coulomb drag that is independent of the current flow direction. In this work, we present Coulomb drag measurements between vertically coupled GaAs/AlGaAs quantum wires separated vertically by a hard barrier only 15 nm wide, where both reciprocal and nonreciprocal contributions to the drag signal are observed simultaneously, and whose relative magnitudes are temperature and gate tunable. Our study opens up the possibility of studying the physical mechanisms behind the onset of both Coulomb drag contributions simultaneously in a single device, ultimately leading to a better understanding of Luttinger liquids in multi-channel wires and paving the way for the creation of energy harvesting devices.

Unsupervised Clustering of Microseismic Events and Focal Mechanism Analysis at the CO2 Injection Site in Decatur, Illinois

Journal of Geophysical Research: Machine Learning and Computation

Willis, Rachel M.; Yoon, Hongkyu; Williams-Stroud, Sherilyn; Frailey, Scott M.; Silva, Josimar A.; Juanes, Ruben

Characterization of induced microseismicity at a carbon dioxide (CO2) storage site is critical for preserving reservoir integrity and mitigating seismic hazards. We apply a multilevel machine learning (ML) approach that combines the nonnegative matrix factorization and hidden Markov model to extract spectral representations of microseismic events and cluster them to identify seismic patterns at the Illinois Basin-Decatur Project. Unlike traditional waveform correlation methods, this approach leverages spectral characteristics of first arrivals to improve event classification and detect previously undetected planes of weakness. By integrating ML-based clustering with focal mechanism analysis, we resolve small-scale fault structures that are below the detection limits of conventional seismic imaging. Our findings reveal temporal bursts of microseismicity associated with brittle failure, providing insights into the spatio-temporal evolution of fault reactivation during CO2 injection. This approach enhances seismic monitoring capabilities at CO2 injection sites by improving fault characterization beyond the resolution of standard geophysical surveys.

Detecting outbreaks using a spatial latent field

PLOS ONE

Ray, Jaideep; Bridgman, Wyatt

In this paper, we present a method for estimating the infection-rate of a disease as a spatial-temporal field. Our data comprises time-series case-counts of symptomatic patients in various areal units of a region. We extend an epidemiological model, originally designed for a single areal unit, to accommodate multiple units. The field estimation is framed within a Bayesian context, utilizing a parameterized Gaussian random field as a spatial prior. We apply an adaptive Markov chain Monte Carlo method to sample the posterior distribution of the model parameters conditioned on COVID-19 case-count data from three adjacent counties in New Mexico, USA. Our results suggest that the correlation between epidemiological dynamics in neighboring regions helps regularize estimations in areas with high variance (i.e., poor quality) data. Using the calibrated epidemic model, we forecast the infection-rate over each areal unit and develop a simple anomaly detector to signal new epidemic waves. Our findings show that an anomaly detector based on estimated infection-rates outperforms a conventional algorithm that relies solely on case-counts.

One-shot gas detection with transformer paired neural networks in Mako collected longwave infrared hyperspectral imagery

Journal of Applied Remote Sensing

Benham, Kevin; Deneke, Elihu

To date, careful data treatment workflows and statistical detectors are used to perform hyperspectral image (HSI) detection of any gas contained in a spectral library, which is often expanded with physics models to incorporate different spectral characteristics. In general, surrounding evidence or known gas-release parameters are used to provide confidence in or confirm detection capability, respectively. This makes quantifying detection performance difficult as it is nearly impossible to develop an absolute ground truth for gas target pixel presence in collected HSI. Consequently, developing and comparing new detection methods, especially machine learning (ML)-based methods, is susceptible to subjectivity in derived detection map quality. In this work, we demonstrate the first use of transformer-based paired neural networks (PNNs) for one-shot gas target detection for multiple gases while providing quantitative classification and detection metrics for their use on labeled data. Terabytes of training data are generated from a database of long-wave infrared HSI obtained from historical Mako sensor campaigns over Los Angeles. By incorporating labels, singular signature representations, and a model development pipeline, we can tune and select PNNs to detect multiple gas targets that are not seen in training on a quantitative basis. We additionally assess our test set detections using interpretability techniques widely employed with ML-based predictors, but less common with detection methods relying on learned latent spaces.

Consistency of fatigue crack growth behavior of pipeline and low-alloy pressure vessel steels in gaseous hydrogen

International Journal of Hydrogen Energy

Ronevich, Joseph; Agnani, Milan; San Marchi, Chris

This study investigates the fatigue crack growth rate (FCGR) behavior of pipeline and low-alloy pressure vessel steels in high-pressure gaseous hydrogen. Despite a broad range of yield strengths and microstructures ranging from ferrite/pearlite and acicular ferrite to bainite and martensite, the FCGR in gaseous hydrogen remained consistent (falling within a factor of 2–3). Steels with higher fractions of pearlite, typical of older vintage pipeline steels, exhibited modestly lower crack growth rates in gaseous hydrogen compared to steels with lower fractions of pearlite. Crack growth rates in these materials exhibit a systematic dependence on stress ratio and partial pressure of hydrogen, as captured in the recently published fatigue design curves in ASME B31 code case 220 for pipeline steels and ASME BPVC code case 2938 for pressure-vessel steels.

A novel peridynamics-based approach to predict pharmaceutical tablet robustness

Powder Technology

Garner, Sean; Silling, Stewart; Ketterhagen, William; Strong, John

The pharmaceutical drug product development process can be greatly accelerated through the use of modeling and simulation techniques to predict the manufacturability and performance of a given formulation. The anticipation and possible mitigation of tablet damage due to manufacturing stresses represents a specific area of interest in the pharmaceutical industry for predicting formulation and tableting performance. While the finite element method (FEM) has been extensively used for predicting the mechanical behavior of powder material in compaction processes, a shortcoming of the approach is the inherent difficulty of predicting discontinuities (e.g., damage or cracking) within a tablet, as FEM is a continuum-based approach. In this work, we propose a novel method utilizing peridynamics (PD), a numerical method that can capture discontinuities such as tablet fracture, to predict the evolution of damage and breakage in pharmaceutical tablets. The approach links (1) the finite element method – to elucidate the behavior of powders during die compaction – with (2) the peridynamics modeling technique – to model the discontinuous nature of damage and predict tablet breakage during the critical stages of unloading and ejection from the compression die. This short communication presents a proof of concept including a workflow to calibrate the linked FEM-PD simulation models. It demonstrates promising results from a preliminary experimental validation of the approach. Following further development, this approach could be used to guide the optimization of compression processes through targeted changes to formulation material properties, compression process conditions, and/or tooling geometries to deliver improved process efficiency and tablet robustness.

Bayesian learning with Gaussian processes for low-dimensional representations of time-dependent nonlinear systems

Physica D: Nonlinear Phenomena

Mcquarrie, Shane A.; Chaudhuri, Anirban; Guo, Mengwu

This work presents a data-driven method for learning low-dimensional time-dependent physics-based surrogate models whose predictions are endowed with uncertainty estimates. We use the operator inference approach to model reduction that poses the problem of learning low-dimensional model terms as a regression of state space data and corresponding time derivatives by minimizing the residual of reduced system equations. Standard operator inference models perform well with accurate training data that are dense in time, but producing stable and accurate models when the state data are noisy and/or sparse in time remains a challenge. Another challenge is the lack of uncertainty estimation for the predictions from the operator inference models. Our approach addresses these challenges by incorporating Gaussian process surrogates into the operator inference framework to (1) probabilistically describe uncertainties in the state predictions and (2) procure analytical time derivative estimates with quantified uncertainties. The formulation leads to a generalized least-squares regression and, ultimately, reduced-order models that are described probabilistically with a closed-form expression for the posterior distribution of the operators. The resulting probabilistic surrogate model propagates uncertainties from the observed state data to reduced-order predictions. We demonstrate the method is effective for constructing low-dimensional models of two nonlinear partial differential equations representing a compressible flow and a nonlinear diffusion–reaction process, as well as for estimating the parameters of a low-dimensional system of nonlinear ordinary differential equations representing compartmental models in epidemiology.

Timing based clustering of childhood BMI trajectories reveals differential maturational patterns; Study in the Northern Finland Birth Cohorts 1966 and 1986: Pediatrics

International Journal of Obesity

Tucker, J.D.; Heiskala, Anni; Choudhary, Priyanka; Nedelec, Rozenn; Ronkainen, Justiina; Sarala, Olli; Jarvelin, Marjo R.; Sillanpaa, Mikko J.; Sebert, Sylvain

Background/Objectives: Children’s biological age does not always correspond to their chronological age. In the case of BMI trajectories, this can appear as phase variation, which can be seen as shift, stretch, or shrinking between trajectories. With maturation thought of as a process moving towards the final state - adult BMI, we assessed whether children can be divided into latent groups reflecting similar maturational age of BMI. The groups were characterised by early factors and time-related features of the trajectories. Subjects/Methods: We used data from two general population birth cohort studies, Northern Finland Birth Cohorts 1966 and 1986 (NFBC1966 and NFBC1986). Height (n = 6329) and weight (n = 6568) measurements were interpolated at 34 shared time points using B-splines, and BMI values were calculated from 3 months to 16 years. Pairwise phase distances of 2999 females and 3163 males were used as a similarity measure in k-medoids clustering. Results: We identified three clusters of trajectories in females and males (Type 1: females, n = 1566, males, n = 1669; Type 2: females, n = 1028, males, n = 973; Type 3: females, n = 405, males, n = 521). Similar distinct timing patterns were identified in males and females. The clusters did not differ by sex or by the early growth determinants studied. Conclusions: Trajectory cluster Type 1 reflected the shape of what is typically illustrated as the childhood BMI trajectory in the literature. However, the other two have not been identified previously. The Type 2 pattern was more common in the NFBC1966, suggesting a generational shift in BMI maturational patterns.

Performance of 3 cm³ ion trap vacuum package sealed for 10 years

Applied Physics Letters

Thrasher, Daniel A.; Schwindt, Peter D.; Jau, Yuan-Yu

Miniature atomic clocks based on the interrogation of the ground state hyperfine splitting of buffer gas cooled ions confined in radio frequency Paul traps have shown great promise as high precision prototype clocks. We report on the performance of two miniature ion trap vacuum packages after being sealed for as much as 10 years. We find the lifetime of the ions within the trap has increased over time for both traps and can be as long as 50 days. We form two clocks using the two traps and compare them against each other to demonstrate a short-term relative frequency instability of $5\times10^{-13}\,\tau^{-1/2}$, integrating down to $1\times10^{-14}$ after 2 ks of integration. The trapped ion lifetime and clock instability demonstrated by these miniature devices, despite only being passively pumped for many years, represent a critical advance toward their proliferation in the clock community.
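
As a quick consistency check on the two instability figures quoted in this abstract, the short-term coefficient can be extrapolated to the stated integration time. This is only a sketch: the symbol $\sigma_y$ is introduced here for illustration, and it assumes that $\tau$ is expressed in seconds and that the $\tau^{-1/2}$ scaling holds all the way out to 2 ks.

```latex
\sigma_y(\tau) \approx 5\times10^{-13}\,\tau^{-1/2}
\quad\Longrightarrow\quad
\sigma_y(2000\ \mathrm{s}) \approx \frac{5\times10^{-13}}{\sqrt{2000}}
\approx \frac{5\times10^{-13}}{44.7}
\approx 1.1\times10^{-14}
```

Under those assumptions the extrapolated value agrees with the quoted level of about $1\times10^{-14}$ after 2 ks.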

Pathogenesis of Chapare Virus in Cynomolgus Macaques

EMI: Animal & Environment

Johnson, Dylan M.; Geisbert, Thomas W.

Chapare virus (CHAPV) is an emerging New World arenavirus that is the causative agent of Chapare hemorrhagic fever (CHHF), responsible for recent outbreaks with alarmingly high case fatality rates in Bolivia near the Brazilian border. Here, we describe a nonhuman primate (NHP) model of CHHF infection, which represents an essential tool to understand this emerging biological threat agent. Cynomolgus macaques challenged intravenously with CHAPV develop clinical disease, which recapitulates several key features of human CHHF. All subjects lost weight and had clinical scores following CHAPV challenge. Notably, one of four NHPs developed lethal disease with viral hepatitis and hemorrhagic features. Clinical chemistry and hematology revealed leukopenia, anemia, thrombocytopenia, and increased transaminase levels. In all four subjects, viremia was detectable for the first week following challenge, and viral RNA was detectable in serum and many tissues, persisting 35 days post-challenge. Several medical countermeasures (MCM) have efficacy against CHAPV infection in vitro, but the current model for MCM testing and approval of new drugs is reliant on the availability of animal models. This work lays the foundation for future CHHF MCM development.

Guiding Principles for Geochemical/Thermodynamic Model Development and Validation in Nuclear Waste Disposal: A Close Examination of Recent Thermodynamic Models for H⁺–Nd³⁺–NO₃⁻(–Oxalate) Systems

Energies

Xiong, Yongliang; Wang, Yifeng

Development of a defensible source-term model (STM), usually a thermodynamic model for radionuclide solubility calculations, is critical to a performance assessment (PA) of a geologic repository for nuclear waste disposal. Such a model is generally subjected to rigorous regulatory scrutiny. In this article, we highlight key guiding principles for STM development and validation in nuclear waste management. We illustrate these principles by closely examining three recently developed thermodynamic models with the Pitzer formalism for aqueous H⁺–Nd³⁺–NO₃⁻(–oxalate) systems in reverse alphabetical order of the authors: the XW model developed by Xiong and Wang, the OWC model developed by Oakes et al., and the GLC model developed by Guignot et al., among which the XW model deals with trace activity coefficients for Nd(III), while the OWC and GLC models are for concentrated Nd(NO₃)₃ electrolyte solutions. The principles highlighted include the following: (1) Principle 1. Validation against independent experimental data: A model should be validated against experimental data or field observations that have not been used in the original model parameterization. We tested the XW model against multiple independent experimental data sets including electromotive force (EMF), solubility, water vapor, and water activity measurements. The results show that the XW model is accurate and valid for its intended use for predicting trace activity coefficients and therefore Nd solubility in repository environments. (2) Principle 2. Testing for relevant and sensitive variables: Solution pH is such a variable for an STM and is easily acquirable. All three models are checked for their ability to predict pH conditions in Nd(NO₃)₃ electrolyte solutions. The OWC model fails to provide a reasonable estimate for solution pH conditions, thus casting serious doubt on its validity for a source-term calculation. In contrast, both the XW and GLC models predict close-to-neutral pH values, in agreement with experimental measurements. (3) Principle 3. Honoring physical constraints: Upon close examination, it is found that the Nd(III)-NO₃ association schema in the OWC model suffers from two shortcomings. Firstly, its second stepwise stability constant for Nd(NO₃)₂⁺ (log K₂) is much higher than the first stepwise stability constant for NdNO₃²⁺ (log K₁), thus violating the general rule of (log K₂ − log K₁) < 0. Secondly, the OWC model predicts abnormally high activity coefficients for Nd(NO₃)₂⁺ (up to ~900) as the concentration increases. (4) Principle 4. Minimizing degrees of freedom for model fitting: The OWC model with nine fitted parameters is compared with the GLC model with five fitted parameters, as both models apply to the concentrated region for Nd(NO₃)₃ electrolyte solutions. The latter appears superior to the former because the latter can fit osmotic coefficient data equally well with fewer model parameters. The work presented here thus illustrates the salient points of geochemical model development, selection, and validation in nuclear waste management.
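
The physical constraint cited under Principle 3 can be restated in terms of the stability constants themselves rather than their logarithms. The lines below are only an algebraic restatement of the rule as quoted in the abstract, using the abstract's own symbols $K_1$ and $K_2$ for the first and second stepwise stability constants; nothing is assumed beyond the standard properties of the logarithm.

```latex
\log K_2 - \log K_1 = \log\!\left(\frac{K_2}{K_1}\right) < 0
\quad\Longleftrightarrow\quad
\frac{K_2}{K_1} < 1
\quad\Longleftrightarrow\quad
K_2 < K_1
```

Read this way, a model whose second stepwise constant exceeds its first, as the abstract reports for the OWC model, fails the rule directly.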

Wear of Wave Energy Converters Mooring Lines Belts

Journal of Offshore Mechanics and Arctic Engineering

Abdellatef, Mohammed; Clark, Josiah; Kojimoto, Nigel; Gunawan, Budi

Using a belt as a replacement for a rope in rotary power take-off (PTO) systems has become more common for wave energy converters, improving cyclic bend over sheave performance with a smaller bending thickness for belts. However, the service life prediction of PTOs is a major concern in design, because belt performance under harsh underwater environments has been little studied. In this work, the effect of fleet and twist angles on wear life is investigated both experimentally and numerically. Two three-dimensional equivalent static finite element models are constructed to evaluate the complex stress state of polyurethane-steel belts around steel drums. The first captures the response of the experimental investigation performed on the wear life, and the second predicts the wear life of an existing functional PTO. The results show a significant effect of fleet and twist angles on stress concentrations and estimated service life.

Smart Meter Data: A Gateway for Reducing Solar Soft Costs with Model-Free Hosting Capacity Maps

Reno, Matthew J.; Azzolini, Joseph A.

Public-facing solar hosting capacity (HC) maps, which show the maximum amount of solar energy that can be installed at a location without adverse effects, have proven to be a key driver of solar soft cost reductions through a variety of pathways (e.g., streamlining interconnection, siting, and customer acquisition processes). However, current methods for generating HC maps require detailed grid models and time-consuming simulations that limit both their accuracy and scalability; today, only a handful out of almost 2,000 utilities provide these maps. This project developed and validated data-driven algorithms for calculating solar HC using data from advanced metering infrastructure (AMI) without the need for detailed grid models or simulations. The algorithms were validated on utility datasets and incorporated as an application into NRECA’s Open Modeling Framework (OMF.coop) for use by the more than 260 co-ops and vendors throughout the US. The OMF is free and open-source for everyone.

An Approach to Realize Generalized Optimal Motion Primitives Using Physics Informed Neural Networks

ASME Letters in Dynamic Systems and Control

Slightam, Jonathon E.; Steyer, Andrew J.; Beaver, Logan E.; Young, Carol C.

Autonomous manipulation is a challenging problem in field robotics due to uncertainty in object properties, constraints, and coupling phenomena with robot control systems. Humans learn motion primitives over time to effectively interact with the environment. We postulate that autonomous manipulation can be enabled by basic sets of motion primitives as well, but that this does not necessitate mimicking human motion primitives. This work presents an approach to generalized optimal motion primitives using physics-informed neural networks. Our simulated and experimental results demonstrate that optimality is notionally maintained, where the mean maximum observed final position percent error was 0.564% and the average mean error for all the trajectories was 1.53%. These results indicate that notional generalization is attained using a physics-informed neural network approach that enables near optimal real-time adaptation of primitive motion profiles.
