The ethical, legal, and social issues (ELSI) surrounding Artificial Intelligence (AI) can have as great of an impact on the technologies’ success as technical issues such as safety, reliability, and security. Addressing these risks can counter potential program failures, legal and ethical battles, constraints to scientific research, and product vulnerabilities. This paper presents a surety engineering framework and process that can be applied to AI to identify and address technical, ethical, legal and societal risks. Extending sound engineering practices to incorporate a method to “engineer” ELSI can offer the scientific rigor required to significantly reduce the risk of AI vulnerabilities. Modeling the specification, design, evaluation and quality/risk indicators for AI provides a foundation for a risk-informed decision process that can benefit researchers and stakeholders alike as they use it to critically examine both substantial and intangible risks.
Emerging memory devices, such as resistive crossbars, have the capacity to store large amounts of data in a single array. Acquiring the data stored in large-capacity crossbars in a sequential fashion can become a bottleneck. We present practical methods, based on sparse sampling, to quickly acquire sparse data stored on emerging memory devices that support the basic summation kernel, reducing the acquisition time from linear to sub-linear. The experimental results show that at least an order of magnitude improvement in acquisition time can be achieved when the data are sparse. Finally, in addition, we show that the energy cost associated with our approach is competitive to that of the sequential method.
A forensics investigation after a breach often uncovers network and host indicators of compromise (IOCs) that can be deployed to sensors to allow early detection of the adversary in the future. Over time, the adversary will change tactics, techniques, and procedures (TTPs), which will also change the data generated. If the IOCs are not kept up-to-date with the adversary's new TTPs, the adversary will no longer be detected once all of the IOCs become invalid. Tracking the Known (TTK) is the problem of keeping IOCs, in this case regular expression (regexes), up-to-date with a dynamic adversary. Our framework solves the TTK problem in an automated, cyclic fashion to bracket a previously discovered adversary. This tracking is accomplished through a data-driven approach of self-adapting a given model based on its own detection capabilities.In our initial experiments, we found that the true positive rate (TPR) of the adaptive solution degrades much less significantly over time than the naïve solution, suggesting that self-updating the model allows the continued detection of positives (i.e., adversaries). The cost for this performance is in the false positive rate (FPR), which increases over time for the adaptive solution, but remains constant for the naïve solution. However, the difference in overall detection performance, as measured by the area under the curve (AUC), between the two methods is negligible. This result suggests that self-updating the model over time should be done in practice to continue to detect known, evolving adversaries.
Neural-inspired spike-based computing machines often claim to achieve considerable advantages in terms of energy and time efficiency by using spikes for computation and communication. However, fundamental questions about spike-based computation remain unanswered. For instance, how much advantage do spike-based approaches have over conventionalmethods, and underwhat circumstances does spike-based computing provide a comparative advantage? Simply implementing existing algorithms using spikes as the medium of computation and communication is not guaranteed to yield an advantage. Here, we demonstrate that spike-based communication and computation within algorithms can increase throughput, and they can decrease energy cost in some cases. We present several spiking algorithms, including sorting a set of numbers in ascending/descending order, as well as finding the maximum or minimum ormedian of a set of numbers.We also provide an example application: a spiking median-filtering approach for image processing providing a low-energy, parallel implementation. The algorithms and analyses presented here demonstrate that spiking algorithms can provide performance advantages and offer efficient computation of fundamental operations useful in more complex algorithms.
File fragment classification is an important step in the task of file carving in digital forensics. In file carving, files must be reconstructed based on their content as a result of their fragmented storage on disk or in memory. Existing methods for classification of file fragments typically use hand-engineered features, such as byte histograms or entropy measures. In this paper, we propose an approach using sparse coding that enables automated feature extraction. Sparse coding, or sparse dictionary learning, is an unsupervised learning algorithm, and is capable of extracting features based simply on how well those features can be used to reconstruct the original data. With respect to file fragments, we learn sparse dictionaries for n-grams, continuous sequences of bytes, of different sizes. These dictionaries may then be used to estimate n-gram frequencies for a given file fragment, but for significantly larger n-gram sizes than are typically found in existing methods which suffer from combinatorial explosion. To demonstrate the capability of our sparse coding approach, we used the resulting features to train standard classifiers, such as support vector machines over multiple file types. Experimentally, we achieved significantly better classification results with respect to existing methods, especially when the features were used in supplement to existing hand-engineered features.
Malware detection and remediation is an on-going task for computer security and IT professionals. Here, we examine the use of neural algorithms to detect malware using the system calls generated by executables-alleviating attempts at obfuscation as the behavior is monitored. We examine several deep learning techniques, and liquid state machines baselined against a random forest. The experiments examine the effects of concept drift to understand how well the algorithms generalize to novel malware samples by testing them on data that was collected after the training data. The results suggest that each of the examined machine learning algorithms is a viable solution to detect malware-achieving between 90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the performance evaluation on an operational network may not match the performance achieved in training. Namely, the CAA may be about the same, but the values for precision and recall over the malware can change significantly. We structure experiments to highlight these caveats and offer insights into expected performance in operational environments. In addition, we use the induced models to better understand what differentiates malware samples from goodware, which can further be used as a forensics tool to provide directions for investigation and remediation.
This paper formulates general computation as a feedback-control problem, which allows the agent to autonomously overcome some limitations of standard procedural language programming: resilience to errors and early program termination. Our formulation considers computation to be trajectory generation in the program's variable space. The computing then becomes a sequential decision making problem, solved with reinforcement learning (RL), and analyzed with Lyapunov stability theory to assess the agent's resilience and progression to the goal. We do this through a case study on a quintessential computer science problem, array sorting. Evaluations show that our RL sorting agent makes steady progress to an asymptotically stable goal, is resilient to faulty components, and performs less array manipulations than traditional Quicksort and Bubble sort.
Unlike general purpose computer architectures that are comprised of complex processor cores and sequential computation, the brain is innately parallel and contains highly complex connections between computational units (neurons). Key to the architecture of the brain is a functionality enabled by the combined effect of spiking communication and sparse connectivity with unique variable efficacies and temporal latencies. Utilizing these neuroscience principles, we have developed the Spiking Temporal Processing Unit (STPU) architecture which is well-suited for areas such as pattern recognition and natural language processing. In this paper, we formally describe the STPU, implement the STPU on a field programmable gate array, and show measured performance data.
Analog resistive memories promise to reduce the energy of neural networks by orders of magnitude. However, the write variability and write nonlinearity of current devices prevent neural networks from training to high accuracy. We present a novel periodic carry method that uses a positional number system to overcome this while maintaining the benefit of parallel analog matrix operations. We demonstrate how noisy, nonlinear TaOx devices that could only train to 80% accuracy on MNIST, can now reach 97% accuracy, only 1% away from an ideal numeric accuracy of 98%. On a file type dataset, the TaOx devices achieve ideal numeric accuracy. In addition, low noise, linear Li1-xCoO2 devices train to ideal numeric accuracies using periodic carry on both datasets.
In 2016, Lewis Rhodes Labs, (LRL), shipped the first commercially viable Neuromorphic Processing Unit, (NPU), branded as a Neuromorphic Data Microscope (NDM). This product leverages architectural mechanisms derived from the sensory cortex of the human brain to efficiently implement pattern matching. LRL and Sandia National Labs have optimized this product for streaming analytics, and demonstrated a 1,000x power per operation reduction in an FPGA format. When reduced to an ASIC, the efficiency will improve to 1,000,000x. Additionally, the neuromorphic nature of the device gives it powerful computational attributes that are counterintuitive to those schooled in traditional von Neumann architectures. The Neuromorphic Data Microscope is the first of a broad class of brain-inspired, time domain processors that will profoundly alter the functionality and economics of data processing.
Considerable effort is currently being spent designing neuromorphic hardware for addressing challenging problems in a variety of pattern-matching applications. These neuromorphic systems offer low power architectures with intrinsically parallel and simple spiking neuron processing elements. Unfortunately, these new hardware architectures have been largely developed without a clear justification for using spiking neurons to compute quantities for problems of interest. Specifically, the use of spiking for encoding information in time has not been explored theoretically with complexity analysis to examine the operating conditions under which neuromorphic computing provides a computational advantage (time, space, power, etc.) In this paper, we present and formally analyze the use of temporal coding in a neural-inspired algorithm for optimization-based computation in neural spiking architectures.
Neural machine learning methods, such as deep neural networks (DNN), have achieved remarkable success in a number of complex data processing tasks. These methods have arguably had their strongest impact on tasks such as image and audio processing - data processing domains in which humans have long held clear advantages over conventional algorithms. In contrast to biological neural systems, which are capable of learning continuously, deep artificial networks have a limited ability for incorporating new information in an already trained network. As a result, methods for continuous learning are potentially highly impactful in enabling the application of deep networks to dynamic data sets. Here, inspired by the process of adult neurogenesis in the hippocampus, we explore the potential for adding new neurons to deep layers of artificial neural networks in order to facilitate their acquisition of novel information while preserving previously trained data representations. Our results on the MNIST handwritten digit dataset and the NIST SD 19 dataset, which includes lower and upper case letters and digits, demonstrate that neurogenesis is well suited for addressing the stability-plasticity dilemma that has long challenged adaptive machine learning algorithms.