We present a classical approximation algorithm for the MAX-2-Local Hamiltonian problem. This is a maximization version of the QMA-complete 2-Local Hamiltonian problem in quantum computing, with the additional assumption that each local term is positive semidefinite. The MAX-2-Local Hamiltonian problem generalizes NP-hard constraint satisfaction problems, and our results may be viewed as generalizations of approximation approaches for the MAX-2-CSP problem. We work in the product state space and extend the framework of Goemans and Williamson for approximating MAX-2-CSPs. The key difference is that in the product state setting, a solution consists of a set of normalized 3-dimensional vectors rather than Boolean values, and we leverage approximation results for rank-constrained Grothendieck inequalities. For MAX-2-Local Hamiltonian we achieve an approximation ratio of 0.328. This is the first example of an approximation algorithm beating the random quantum assignment ratio of 0.25 by a constant factor.
Deep learning networks have become a vital tool for image and data processing tasks in deployed and edge applications. Resource constraints, particularly low power budgets, have motivated methods and devices for efficient inference at the edge. Two promising methods are reduced-precision communication networks (e.g., binary-activation spiking neural networks) and weight pruning. In this paper, we provide a preliminary exploration of combining these two methods, specifically in-training weight pruning of Whetstone networks, to achieve deep networks with both sparse weights and binary activations.
While dragonflies are well known for their high success rates when hunting prey, how the underlying neural circuitry generates the interception trajectories dragonflies use to hunt remains an open question. I present a model of dragonfly prey interception that uses a neural network to calculate motor commands. The model uses the motor outputs of the neural network to internally generate a forward model of the prey-image translation caused by the dragonfly's own turning, which can then serve as a feedback guidance signal, producing trajectories whose final approaches closely resemble proportional navigation. The neural network is biologically plausible and can therefore be compared against in vivo neural responses in the dragonfly, yet is parsimonious enough that the algorithm can be implemented without specialized hardware.
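The final-approach behavior can be illustrated with a minimal 2-D proportional-navigation pursuit simulation. This is a generic guidance-law sketch, not the paper's neural-network model; all state values and parameters below are illustrative assumptions:

```python
import math

def simulate(nav_gain=3.0, dt=0.01, steps=2000):
    """Pure proportional navigation: the pursuer's turn is proportional
    to the rotation of the line of sight (LOS) to a constant-velocity
    target.  Returns the closest approach distance achieved."""
    px, py, heading, speed = 0.0, 0.0, 0.0, 2.0   # pursuer state
    tx, ty, tvx, tvy = 10.0, 5.0, -1.0, 0.0       # target state
    los_prev = math.atan2(ty - py, tx - px)
    min_dist = math.hypot(tx - px, ty - py)
    for _ in range(steps):
        los = math.atan2(ty - py, tx - px)
        heading += nav_gain * (los - los_prev)    # turn ∝ LOS rotation
        los_prev = los
        px += speed * math.cos(heading) * dt
        py += speed * math.sin(heading) * dt
        tx += tvx * dt
        ty += tvy * dt
        min_dist = min(min_dist, math.hypot(tx - px, ty - py))
    return min_dist
```

With a 2:1 speed advantage over a non-maneuvering target, the guidance law drives the line-of-sight rotation toward zero and the pursuer closes to a near-zero miss distance.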
The massively parallel, spiking neural networks of neuromorphic processors can enable computationally powerful formulations. While recent interest has focused primarily on machine learning tasks, the space of appropriate applications is wide and continually expanding. Here, we leverage the parallel and event-driven structure to solve a steady-state heat equation using a random walk method. The random walk can be executed fully within a spiking neural network using stochastic neuron behavior, and we provide results from both IBM TrueNorth and Intel Loihi implementations. Additionally, we position this algorithm as a potential scalable benchmark for neuromorphic systems.
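The random-walk formulation is easy to prototype off-hardware. The plain-Python sketch below uses illustrative names and grid sizes (the paper's implementations map the walks onto TrueNorth/Loihi spiking neurons); it estimates the steady-state temperature at a grid point as the expected boundary value seen by a random walk:

```python
import random

random.seed(0)  # reproducible sketch

def walk_to_boundary(x, y, n, boundary):
    """Unbiased nearest-neighbor walk on an n x n grid; returns the
    fixed boundary temperature at the first boundary site reached."""
    while 0 < x < n - 1 and 0 < y < n - 1:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
    return boundary(x, y)

def temperature(x, y, n, boundary, walks=2000):
    """Monte Carlo solution of the discrete Laplace (steady-state heat)
    equation: u(x, y) equals the expected boundary value hit by a
    random walk started at (x, y)."""
    return sum(walk_to_boundary(x, y, n, boundary) for _ in range(walks)) / walks

# One hot wall (x = 0) held at 1.0, the other three walls at 0.0.
hot_left = lambda x, y: 1.0 if x == 0 else 0.0
u_center = temperature(4, 4, 9, hot_left)  # center of a 9 x 9 grid
```

By symmetry the exact center value is 0.25 (one of four walls is hot), and the estimate converges toward it as the number of walks grows; each walk is independent, which is what makes the method embarrassingly parallel on event-driven hardware.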
DiPietro, Kelsey L.; Paquin-Lefebvre, Frederic; Xu, Bin; Lindsay, Alan E.; Jilkine, Alexandra
Reaction-diffusion systems have been widely used to study spatio-temporal phenomena in cell biology, such as cell polarization. Coupled bulk-surface models naturally include compartmentalization of cytosolic and membrane-bound polarity molecules. Here we study the distribution of the polarity protein Cdc42 in a mass-conserved membrane-bulk model, and explore the effects of diffusion and spatial dimensionality on spatio-temporal pattern formation. We first analyze a one-dimensional (1-D) model for Cdc42 oscillations in fission yeast, consisting of two diffusion equations in the bulk domain coupled to nonlinear ODEs for binding kinetics at each end of the cell. In 1-D, our analysis reveals the existence of symmetric and asymmetric steady states, as well as anti-phase relaxation oscillations typical of slow-fast systems. We then extend our analysis to a two-dimensional (2-D) model with circular bulk geometry, for which species can either diffuse inside the cell or become bound to the membrane and undergo a nonlinear reaction-diffusion process. We also consider a nonlocal system of PDEs approximating the dynamics of the 2-D membrane-bulk model in the limit of fast bulk diffusion. In all three model variants we find that mass conservation selects perturbations of spatial modes that simply redistribute mass. In 1-D, only anti-phase oscillations between the two ends of the cell can occur, and in-phase oscillations are excluded. In higher dimensions, no radially symmetric oscillations are observed. Instead, the only instabilities are symmetry-breaking, either corresponding to stationary Turing instabilities, leading to the formation of stationary patterns, or to oscillatory Turing instabilities, leading to traveling and standing waves. Codimension-two Bogdanov–Takens bifurcations occur when the two distinct instabilities coincide, causing traveling waves to slow down and to eventually become stationary patterns. 
Our work clarifies the effect of geometry and dimensionality on behaviors observed in mass-conserved cell polarity models.
Neural network (NN) inference is an essential part of modern systems and is found at the heart of numerous applications ranging from image recognition to natural language processing. In situ NN accelerators can efficiently perform NN inference using resistive crossbars, which makes them a promising solution to the data movement challenges faced by conventional architectures. Although such accelerators demonstrate significant potential for dense NNs, they often do not benefit from sparse NNs, which contain relatively few non-zero weights. Processing sparse NNs on in situ accelerators wastes energy charging entire crossbars in which most elements are zero. To address this limitation, this letter proposes Granular Matrix Reordering (GMR): a preprocessing technique that enables energy-efficient computation of sparse NNs on in situ accelerators. GMR reorders the rows and columns of sparse weight matrices to maximize the crossbars' utilization and minimize the total number of crossbars that need to be charged. The reordering process does not rely on sparsity patterns and incurs no accuracy loss. Overall, GMR achieves an average of 28 percent and up to 34 percent reduction in energy consumption over seven pruned NNs across four different pruning methods and network architectures.
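The core intuition, reordering rows and columns so that zero weights cluster into crossbar tiles that never need charging, can be shown on a toy matrix. The greedy sort below is an illustrative stand-in only, not the paper's GMR procedure:

```python
import numpy as np

def tiles_charged(w, tile):
    """Count tile x tile crossbar blocks containing at least one
    non-zero weight and therefore needing to be charged."""
    count = 0
    for i in range(0, w.shape[0], tile):
        for j in range(0, w.shape[1], tile):
            if np.any(w[i:i + tile, j:j + tile]):
                count += 1
    return count

def reorder(w):
    """Toy stand-in for GMR: sort rows and columns by non-zero count,
    so sparse rows/columns cluster into tiles that stay uncharged.
    (The paper's actual reordering is more sophisticated.)"""
    rows = np.argsort(np.count_nonzero(w, axis=1))
    cols = np.argsort(np.count_nonzero(w, axis=0))
    return w[rows][:, cols]

w = np.array([
    [5, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 7, 0],
    [0, 0, 0, 0],
], dtype=float)
before = tiles_charged(w, 2)          # two 2x2 tiles touch a non-zero
after = tiles_charged(reorder(w), 2)  # clustering leaves only one
```

Reordering is a pure permutation of rows and columns, so the weights themselves (and hence the NN's outputs, after permuting inputs/outputs to match) are unchanged; only the mapping onto crossbars improves.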
Atomic precision advanced manufacturing (APAM) enables the creation of donor devices in an atomically thin layer doped beyond the solid solubility limit, enabling unique device physics. This presents an opportunity to use APAM as a pathfinding platform to investigate digital electronics at the atomic limit. Scaling to smaller transistors is increasingly difficult and expensive, necessitating the investigation of alternative fabrication paths that extend to the atomic scale. APAM donor devices can be created using a scanning tunneling microscope (STM). However, these devices are not currently compatible with industry standard fabrication processes. There exists a tradeoff between low thermal budget (LT) processes to limit dopant diffusion and high thermal budget (HT) processes to grow defect-free layers of epitaxial Si and gate oxide. To this end, we have developed an LT epitaxial Si cap and LT deposited Al2O3 gate oxide integrated with an atomically precise single-electron transistor (SET) that we use as an electrometer to characterize the quality of the gate stack. The surface-gated SET exhibits the expected Coulomb blockade behavior. However, the gate’s leverage over the SET is limited by defects in the layers above the SET, including interfaces between the Si and oxide, and structural and chemical defects in the Si cap. We propose a more sophisticated gate stack and process flow that is predicted to improve performance in future atomic precision devices.
In finite element simulations, not all of the data are of equal importance. In fact, the primary purpose of a numerical study is often to accurately assess only one or two engineering output quantities that can be expressed as functionals. Adjoint-based error estimation provides a means to approximate the discretization error in functional quantities and mesh adaptation provides the ability to control this discretization error by locally modifying the finite element mesh. In the past, adjoint-based error estimation has only been accessible to expert practitioners in the field of solid mechanics. In this work, we present an approach to automate the process of adjoint-based error estimation and mesh adaptation on parallel machines. This process is intended to lower the barrier of entry to adjoint-based error estimation and mesh adaptation for solid mechanics practitioners. We demonstrate that this approach is effective for example problems in Poisson’s equation, nonlinear elasticity, and thermomechanical elastoplasticity.
Computer Methods in Applied Mechanics and Engineering
Fleeter, Casey M.; Geraci, Gianluca G.; Schiavazzi, Daniele E.; Kahn, Andrew M.; Marsden, Alison L.
Standard approaches for uncertainty quantification in cardiovascular modeling pose challenges due to the large number of uncertain inputs and the significant computational cost of realistic three-dimensional simulations. We propose an efficient uncertainty quantification framework utilizing a multilevel multifidelity Monte Carlo (MLMF) estimator to improve the accuracy of hemodynamic quantities of interest while maintaining reasonable computational cost. This is achieved by leveraging three cardiovascular model fidelities, each with varying spatial resolution, to rigorously quantify the variability in hemodynamic outputs. We employ two low-fidelity models (zero- and one-dimensional) to construct several different estimators, and we investigate and compare the efficiency of estimators built from combinations of these two low-fidelity model alternatives and our high-fidelity three-dimensional models. We demonstrate this framework on healthy and diseased models of aortic and coronary anatomy, including uncertainties in material property and boundary condition parameters. To show that the MLMF paradigm accelerates the convergence of the estimators for this application, we compare our approach to single-fidelity Monte Carlo estimators and to a multilevel Monte Carlo approach based only on three-dimensional simulations but leveraging multiple spatial resolutions. The MLMF estimators yield significant reductions, on the order of 10 to 100 times, in total computational cost. We also examine the differing properties of the MLMF estimators in healthy versus diseased models, as well as global versus local quantities of interest. As expected, global quantities such as outlet pressure and flow show larger reductions than local quantities, such as those relating to wall shear stress, since the latter rely more heavily on the highest-fidelity model evaluations.
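The estimator idea can be sketched with a single low-fidelity control variate. The toy analytic models and unit correction weight below are illustrative assumptions standing in for the 0-D/1-D/3-D hemodynamic solvers; the MLMF framework additionally optimizes sample allocations and correlation weights across fidelities:

```python
import random
import statistics as st

random.seed(0)

# Hypothetical stand-ins: an "expensive" high-fidelity model and a
# cheap, strongly correlated low-fidelity surrogate of the same output.
def high_fidelity(x):
    return x ** 2 + 0.1 * x

def low_fidelity(x):
    return x ** 2

inputs = [random.gauss(0.0, 1.0) for _ in range(10_000)]

n_hf = 100  # we can only afford a few high-fidelity runs
q_hf = [high_fidelity(x) for x in inputs[:n_hf]]
q_lf_paired = [low_fidelity(x) for x in inputs[:n_hf]]   # same inputs
q_lf_all = [low_fidelity(x) for x in inputs]             # cheap, many

# Multifidelity control-variate estimator (correction weight = 1):
# the HF sample mean is corrected by how far the paired LF mean sits
# from a far better-resolved LF mean.
mf_estimate = st.mean(q_hf) + st.mean(q_lf_all) - st.mean(q_lf_paired)
mc_estimate = st.mean(q_hf)  # plain MC with the same HF budget
```

Because the low-fidelity model absorbs most of the variance, the corrected estimator has much lower variance than plain Monte Carlo at the same high-fidelity cost (the true mean here is E[x² + 0.1x] = 1 for standard normal x).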
Similarly, healthy models show larger reductions than diseased models. In all cases, our workflow coupling Dakota's MLMF estimators with the SimVascular cardiovascular modeling framework makes uncertainty quantification feasible for constrained computational budgets.
Given an input stream S of size N, a φ-heavy hitter is an item that occurs at least φN times in S. The problem of finding heavy hitters is extensively studied in the database literature. We study a real-time heavy-hitters variant in which an element must be reported shortly after we see its T = φN-th occurrence (at which point it becomes a heavy hitter). We call this the Timely Event Detection (TED) problem. The TED problem models the needs of many real-world monitoring systems, which demand accurate (i.e., no false negatives) and timely reporting of all events from large, high-speed streams, with a low reporting threshold (high sensitivity). Like the classic heavy-hitters problem, solving the TED problem without false positives requires large space (Ω(N) words). Thus, in-RAM heavy-hitters algorithms typically sacrifice accuracy (i.e., allow false positives), sensitivity, or timeliness (i.e., use multiple passes). We show how to adapt heavy-hitters algorithms to external memory to solve the TED problem on large high-speed streams while guaranteeing accuracy, sensitivity, and timeliness. Our data structures are limited only by I/O bandwidth (not latency) and support a tunable trade-off between reporting delay and I/O overhead. With a small bounded reporting delay, our algorithms incur only a logarithmic I/O overhead. We implement and validate our data structures empirically using the Firehose streaming benchmark. Multi-threaded versions of our structures can scale to process 11M observations per second before becoming CPU bound. In comparison, a naive adaptation of the standard heavy-hitters algorithm to external memory would be limited by the storage device's random I/O throughput, i.e., ∼100K observations per second.
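Stripped of the memory constraint that motivates the work, the reporting requirement itself is simple to state. The in-RAM sketch below keeps exact counts (O(#distinct items) space) and reports an item at the instant of its T-th arrival; the hard part the paper addresses is preserving these no-false-negative, low-delay guarantees when the counts must live in external memory:

```python
from collections import defaultdict

def detect_events(stream, threshold):
    """Exact, timely heavy-hitter reporting: emit (item, index) the
    moment an item's count reaches threshold T (for a phi-heavy hitter,
    T = ceil(phi * N)).  No false positives or negatives, and each
    report is issued with zero delay after the T-th occurrence."""
    counts = defaultdict(int)
    reports = []
    for i, item in enumerate(stream):
        counts[item] += 1
        if counts[item] == threshold:    # exactly the T-th occurrence
            reports.append((item, i))
    return reports

events = detect_events("abracadabra", 4)  # report items seen >= 4 times
```

On this stream only "a" reaches four occurrences, and it is reported exactly at its fourth arrival rather than at end of stream, which is the timeliness property the TED problem demands.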
In this work, a stabilized continuous Galerkin (CG) method for magnetohydrodynamics (MHD) is presented. Ideal, compressible inviscid MHD equations are discretized in space on unstructured meshes using piecewise linear or bilinear finite element bases to get a semi-discrete scheme. Stabilization is then introduced to the semi-discrete method in a strategy that follows the algebraic flux correction paradigm. This involves adding some artificial diffusion to the high order, semi-discrete method and mass lumping in the time derivative term. The result is a low order method that provides local extremum diminishing properties for hyperbolic systems. The difference between the low order method and the high order method is scaled element-wise using a limiter and added to the low order scheme. The limiter is solution dependent and computed via an iterative linearity preserving nodal variation limiting strategy. The stabilization also involves an optional consistent background high order dissipation that reduces phase errors. The resulting stabilized scheme is a semi-discrete method that can be applied to inviscid shock MHD problems and may even be extended to resistive and viscous MHD problems. To satisfy the divergence free constraint of the MHD equations, we add parabolic divergence cleaning to the system. Various time integration methods can be used to discretize the scheme in time. We demonstrate the robustness of the scheme by solving several shock MHD problems.
We study a trajectory analysis problem we call the Trajectory Capture Problem (TCP), in which, for a given input set T of trajectories in the plane and an integer k ≥ 2, we seek to compute a set of k points ("portals") to maximize the total weight of all subtrajectories of T between pairs of portals. This problem naturally arises in trajectory analysis and summarization. We show that the TCP is NP-hard (even in very special cases) and give some first approximation results. Our main focus is on attacking the TCP with practical algorithm-engineering approaches, including integer linear programming (to solve instances to provable optimality) and local search methods. We study the integrality gap arising from such approaches. We analyze our methods on different classes of data, including benchmark instances that we generate. Our goal is to understand the best performing heuristics, based on both solution time and solution quality. We demonstrate that we are able to compute provably optimal solutions for real-world instances. 2012 ACM Subject Classification: Theory of computation → Design and analysis of algorithms.
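For intuition, tiny instances of this kind of portal-selection problem can be solved by exhaustive search over k-subsets. The pairwise-weight abstraction, names, and data below are illustrative assumptions, not the paper's formulation; its ILP and local-search methods exist precisely because this brute force is exponential in k:

```python
from itertools import combinations

def best_portals(candidates, weight, k):
    """Exhaustive baseline: try every k-subset of candidate portal
    locations and keep the one maximizing the total captured weight,
    where weight(p, q) is the weight of subtrajectories running
    between portals p and q."""
    best, best_w = None, -1.0
    for subset in combinations(candidates, k):
        w_total = sum(weight(p, q) for p, q in combinations(subset, 2))
        if w_total > best_w:
            best, best_w = subset, w_total
    return best, best_w

# Toy instance: 4 candidate portals with pairwise captured weights.
weights = {("a", "b"): 5, ("a", "c"): 1, ("a", "d"): 2,
           ("b", "c"): 4, ("b", "d"): 0, ("c", "d"): 3}
w = lambda p, q: weights.get((p, q), weights.get((q, p), 0))
portals, total = best_portals("abcd", w, 2)
```

Here the pair {a, b} captures weight 5, the best of the six possible pairs; a local-search method would instead start from some pair and swap portals while the captured weight improves.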
We present SUSPECT, an open source toolkit that symbolically analyzes mixed-integer nonlinear optimization problems formulated using the Python algebraic modeling library Pyomo. We present the data structures and algorithms used to implement SUSPECT. SUSPECT works on a directed acyclic graph representation of the optimization problem to perform: bounds tightening, bound propagation, monotonicity detection, and convexity detection. We show how the tree-walking rules in SUSPECT balance the need for lightweight computation with effective special structure detection. SUSPECT can be used as a standalone tool or as a Python library to be integrated in other tools or solvers. We highlight the easy extensibility of SUSPECT with several recent convexity detection tricks from the literature. We also report experimental results on the MINLPLib 2 dataset.