Publications

Results 101–114 of 114


A high-performance GPU-based forward-projection model for computed tomography applications

Proceedings of SPIE - The International Society for Optical Engineering

Perez, Ismael P.; Jimenez, Edward S.; Thompson, Kyle R.

This work describes a high-performance approach to radiograph (i.e., X-ray image for this work) simulation for arbitrary objects. The generation of radiographs is more generally known as the forward-projection imaging model. The formation of radiographs is very computationally expensive and is not typically approached for large-scale applications such as industrial radiography. The approach described in this work revolves around a single GPU-based implementation that performs the attenuation calculation in a massively parallel environment. Additionally, further performance gains are realized by exploiting GPU-specific hardware. Early results show that using a single GPU can increase computational performance by three orders of magnitude for volumes of 1000³ voxels and images with 1000² pixels.
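The per-pixel computation described here reduces to a discretized Beer-Lambert line integral: sample the attenuation volume along the ray from the source to a detector pixel, sum the attenuation-coefficient samples weighted by path length, and exponentiate. The following Python/NumPy snippet is a minimal single-ray sketch of that idea; the function name, nearest-neighbor sampling, and toy phantom are illustrative assumptions, not the authors' GPU implementation, which evaluates many such rays in parallel.

```python
# Minimal single-ray forward-projection sketch (Beer-Lambert attenuation),
# illustrating the per-pixel computation that the paper parallelizes on a GPU.
# Names and the sampling scheme are illustrative assumptions, not the authors' code.
import numpy as np

def forward_project_pixel(volume, voxel_size, src, det_pixel, n_samples=512, i0=1.0):
    """Simulate one radiograph pixel: integrate attenuation along the ray
    from the X-ray source position `src` to the detector pixel `det_pixel`."""
    src = np.asarray(src, dtype=float)
    det = np.asarray(det_pixel, dtype=float)
    ray = det - src
    length = np.linalg.norm(ray)
    step = length / n_samples                      # path length per sample
    ts = (np.arange(n_samples) + 0.5) / n_samples  # sample midpoints in [0, 1]
    points = src + ts[:, None] * ray               # sample positions in world space

    # Nearest-neighbor lookup of the attenuation coefficient at each sample point.
    idx = np.round(points / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(volume.shape)), axis=1)
    mu_sum = volume[tuple(idx[inside].T)].sum() * step  # line integral of mu

    return i0 * np.exp(-mu_sum)  # Beer-Lambert: transmitted intensity

# Example: a 64^3 water-like phantom, one ray through its center.
vol = np.full((64, 64, 64), 0.02)  # assumed attenuation coefficient per mm
print(forward_project_pixel(vol, voxel_size=1.0, src=(32, 32, -200), det_pixel=(32, 32, 264)))
```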


Exploring mediated reality to approximate X-ray attenuation coefficients from radiographs

Proceedings of SPIE - The International Society for Optical Engineering

Jimenez, Edward S.; Orr, Laurel J.; Morgan, Megan L.; Thompson, Kyle R.

Estimation of the x-ray attenuation properties of an object with respect to the energy emitted from the source is a challenging task for traditional Bremsstrahlung sources. This exploratory work attempts to estimate the x-ray attenuation profile for the energy range of a given Bremsstrahlung profile. Previous work has shown that calculating a single effective attenuation value for a polychromatic source is not accurate due to the non-linearities associated with the image formation process. Instead, we completely characterize the imaging system virtually and utilize an iterative search method/constrained optimization technique to approximate the attenuation profile of the object of interest. This work presents preliminary results from various approaches that were investigated. The early results illustrate the challenges associated with these techniques and the potential for obtaining an accurate estimate of the attenuation profile for objects composed of homogeneous materials.
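As a rough illustration of the iterative search / constrained optimization idea, the toy Python sketch below fits a binned attenuation profile mu(E) to simulated polychromatic transmission measurements under a non-negativity constraint. The spectrum, thicknesses, noise level, and solver are assumptions for illustration only and do not reproduce the authors' full virtual characterization of the imaging system; the difficulty of recovering the true profile from such data reflects the challenges noted above.

```python
# Illustrative sketch of the constrained-optimization idea: recover a binned
# attenuation profile mu(E) from simulated polychromatic transmission data.
# The spectrum, thicknesses, and solver choice are assumptions for illustration.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
energies = np.linspace(20, 150, 14)                # keV bins (assumed)
spectrum = np.exp(-(energies - 60.0) ** 2 / 800.0) # toy Bremsstrahlung-like weights
spectrum /= spectrum.sum()
thicknesses = np.linspace(1.0, 50.0, 25)           # mm of a homogeneous material

def transmission(mu, t):
    """Polychromatic transmission: weighted sum of per-energy Beer-Lambert terms."""
    return (spectrum[None, :] * np.exp(-np.outer(t, mu))).sum(axis=1)

mu_true = 0.5 * np.exp(-energies / 40.0)           # ground-truth profile (toy)
measured = transmission(mu_true, thicknesses)
measured *= 1.0 + 0.005 * rng.standard_normal(measured.shape)  # measurement noise

# Constrained fit: attenuation coefficients must be non-negative.
result = least_squares(
    lambda mu: transmission(mu, thicknesses) - measured,
    x0=np.full_like(energies, 0.1),
    bounds=(0.0, np.inf),
)
print(np.round(result.x, 3))
```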


High performance graphics processor based computed tomography reconstruction algorithms for nuclear and other large scale applications

Jimenez, Edward S.; Orr, Laurel J.; Thompson, Kyle R.

The goal of this work is to develop a fast computed tomography (CT) reconstruction algorithm based on graphics processing units (GPU) that achieves significant improvement over traditional central processing unit (CPU) based implementations. The main challenge in developing a CT algorithm that is capable of handling very large datasets is parallelizing the algorithm in such a way that data transfer does not hinder performance of the reconstruction algorithm. General-purpose computing on graphics processing units (GPGPU) is a new technology that the Science and Technology (S&T) community is starting to adopt in many fields where CPU-based computing is the norm. GPGPU programming requires a new approach to algorithm development that utilizes massively multi-threaded environments. Multi-threaded algorithms in general are difficult to optimize, since performance bottlenecks arise, such as memory latencies, that are non-existent in single-threaded algorithms. If an efficient GPU-based CT reconstruction algorithm can be developed, computational times could be improved by a factor of 20. Additionally, cost benefits will be realized as commodity graphics hardware could potentially replace expensive supercomputers and high-end workstations. This project will take advantage of the CUDA programming environment and attempt to parallelize the task in such a way that multiple slices of the reconstruction volume are computed simultaneously. This work will also make use of asynchronous memory transfers, GPU texture memory, and (when possible) pinned host memory so that the memory-transfer bottleneck inherent to GPGPU is amortized. Additionally, this work will take advantage of GPU-specific hardware (e.g., fast texture memory, pixel pipelines, hardware interpolators, and the varied memory hierarchy) that will allow for additional performance improvements.
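The memory-transfer amortization strategy described above amounts to overlapping data staging with computation. The Python sketch below imitates that double-buffering pattern with a background staging thread prefetching the next block of projection data while the current block of slices is processed; it is a conceptual stand-in, under assumed toy sizes and placeholder functions, for the CUDA streams, pinned host memory, and asynchronous transfers the project intends to use.

```python
# Conceptual sketch (not the project's CUDA code) of the pipelining idea:
# overlap data staging with computation so the transfer cost is amortized.
# A background thread stages the next block of projection data while the
# current block of volume slices is being reconstructed.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

N_SLICE_BLOCKS = 8       # reconstruction volume split into blocks of slices (assumed)
PROJ_SHAPE = (180, 256)  # projections per block x detector row (toy sizes)

def stage_projections(block_id):
    """Stand-in for an asynchronous host-to-device transfer of projection data."""
    rng = np.random.default_rng(block_id)
    return rng.random(PROJ_SHAPE).astype(np.float32)

def reconstruct_block(block_id, projections):
    """Stand-in for the per-block reconstruction kernel (e.g. backprojection)."""
    return projections.mean(axis=0)  # placeholder for the real slice computation

with ThreadPoolExecutor(max_workers=1) as staging:
    pending = staging.submit(stage_projections, 0)   # prefetch the first block
    for block in range(N_SLICE_BLOCKS):
        projections = pending.result()               # wait for staged data
        if block + 1 < N_SLICE_BLOCKS:               # prefetch next block while computing
            pending = staging.submit(stage_projections, block + 1)
        slices = reconstruct_block(block, projections)
        print(f"block {block}: reconstructed slice block with mean {slices.mean():.4f}")
```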


A high-performance and energy-efficient CT reconstruction algorithm for multi-terabyte datasets

IEEE Nuclear Science Symposium Conference Record

Jimenez, Edward S.; Orr, Laurel J.; Thompson, Kyle R.

There has been much work done in implementing various GPU-based computed tomography reconstruction algorithms for medical applications, showing tremendous improvement in computational performance. While many of these reconstruction algorithms could also be applied to industrial-scale datasets, the performance gains may be modest to non-existent due to a combination of algorithmic, hardware, or scalability limitations. Previously presented work showed an irregular, dynamic approach to GPU reconstruction-kernel execution for industrial-scale reconstructions that dramatically improved voxel processing throughput. However, the improved kernel execution magnified other system bottlenecks, such as host memory bandwidth and storage read/write bandwidth, thus hindering performance gains. This paper presents a multi-GPU-based reconstruction algorithm capable of efficiently reconstructing large volumes (between 64 gigavoxels and 1 teravoxel), not only faster than traditional CPU- and GPU-based reconstruction algorithms but also with significantly lower energy consumption. The reconstruction algorithm exploits the irregular kernel approach from previous work, a modularized MIMD-like environment, heterogeneous parallelism, and macro- and micro-scale dynamic task allocation. The result is a portable and flexible reconstruction algorithm capable of executing on a wide range of architectures, including mobile computers, workstations, supercomputers, and modestly sized heterogeneous or homogeneous clusters with any number of graphics processors. © 2013 IEEE.
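The macro-scale dynamic task allocation mentioned above can be pictured as a shared work queue of sub-volume chunks from which each device pulls a new chunk as soon as it finishes its current one, so faster or less-loaded devices naturally take on more work. The Python sketch below mimics that pattern with threads standing in for GPUs of different speeds; the chunk count, speeds, and placeholder kernel are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of macro-scale dynamic task allocation: sub-volume chunks sit
# in a shared work queue, and each worker (standing in for one GPU) pulls the
# next chunk as soon as it finishes its current one. Counts and speeds are
# illustrative assumptions.
import queue
import threading
import time

work = queue.Queue()
for chunk_id in range(16):          # 16 sub-volume chunks (assumed)
    work.put(chunk_id)

def worker(device_id, speed):
    """Pull chunks until the queue is empty; `speed` mimics heterogeneous devices."""
    done = []
    while True:
        try:
            chunk = work.get_nowait()
        except queue.Empty:
            break
        time.sleep(0.01 / speed)    # placeholder for the reconstruction kernel
        done.append(chunk)
    print(f"device {device_id} (speed {speed}) reconstructed chunks {done}")

threads = [threading.Thread(target=worker, args=(i, s)) for i, s in enumerate([1.0, 2.0, 4.0])]
for t in threads:
    t.start()
for t in threads:
    t.join()
```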


High-performance computing applied to semantic databases

Jimenez, Edward S.; Goodman, Eric G.

To date, the application of high-performance computing resources to Semantic Web data has largely focused on commodity hardware and distributed memory platforms. In this paper we make the case that more specialized hardware can offer superior scaling and close to an order-of-magnitude improvement in performance. In particular we examine the Cray XMT. Its key characteristics, a large global shared memory and processors with a memory-latency-tolerant design, offer an environment conducive to programming for the Semantic Web and have engendered results that far surpass the current state of the art. We examine three fundamental pieces requisite for a fully functioning semantic database: dictionary encoding, RDFS inference, and query processing. We show scaling up to 512 processors (the largest configuration we had available) and the ability to process 20 billion triples completely in-memory.
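Of the three pieces listed, dictionary encoding is the most self-contained to illustrate: each distinct RDF term (URI or literal) is assigned a dense integer ID so that triples can be stored and joined as fixed-width integers rather than strings. The Python snippet below is a toy illustration of that idea, not the Cray XMT implementation; the data and helper names are assumptions.

```python
# Toy sketch of dictionary encoding for RDF triples: map each distinct term
# (URI or literal) to a dense integer ID and keep the reverse mapping for
# decoding query results. Data and names are illustrative, not the XMT code.
triples = [
    ("<http://example.org/alice>", "<http://xmlns.com/foaf/0.1/knows>", "<http://example.org/bob>"),
    ("<http://example.org/bob>",   "<http://xmlns.com/foaf/0.1/name>",  '"Bob"'),
]

term_to_id = {}
id_to_term = []

def encode(term):
    """Assign each distinct term a dense integer ID in first-seen order."""
    if term not in term_to_id:
        term_to_id[term] = len(id_to_term)
        id_to_term.append(term)
    return term_to_id[term]

encoded = [tuple(encode(t) for t in triple) for triple in triples]
print(encoded)                      # e.g. [(0, 1, 2), (2, 3, 4)]
decoded = [tuple(id_to_term[i] for i in triple) for triple in encoded]
assert decoded == triples
```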
