Leveraging GPUs in Industrial Computed Tomography Reconstruction
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Proceedings of SPIE the International Society for Optical Engineering
Estimation of the x-ray attenuation properties of an object with respect to the energy emitted from the source is a challenging task for traditional Bremsstrahlung sources. This exploratory work attempts to estimate the x-ray attenuation profile for the energy range of a given Bremsstrahlung profile. Previous work has shown that calculating a single effective attenuation value for a polychromatic source is not accurate due to the non-linearities associated with the image formation process. Instead, we completely characterize the imaging system virtually and utilize an iterative search method/constrained optimization technique to approximate the attenuation profile of the object of interest. This work presents preliminary results from various approaches that were investigated. The early results illustrate the challenges associated with these techniques and the potential for obtaining an accurate estimate of the attenuation profile for objects composed of homogeneous materials.
Proceedings of SPIE - The International Society for Optical Engineering
This paper will investigate energy-efficiency for various real-world industrial computed-tomography reconstruction algorithms, both CPU- and GPU-based implementations. This work shows that the energy required for a given reconstruction is based on performance and problem size. There are many ways to describe performance and energy efficiency, thus this work will investigate multiple metrics including performance-per-watt, energy-delay product, and energy consumption. This work found that irregular GPU-based approaches1 realized tremendous savings in energy consumption when compared to CPU implementations while also significantly improving the performanceper- watt and energy-delay product metrics. Additional energy savings and other metric improvement was realized on the GPU-based reconstructions by improving storage I/O by implementing a parallel MIMD-like modularization of the compute and I/O tasks.
Proceedings of SPIE - The International Society for Optical Engineering
Estimation of the x-ray attenuation properties of an object with respect to the energy emitted from the source is a challenging task for traditional Bremsstrahlung sources. This exploratory work attempts to estimate the x-ray attenuation profile for the energy range of a given Bremsstrahlung profile. Previous work has shown that calculating a single effective attenuation value for a polychromatic source is not accurate due to the non-linearities associated with the image formation process. Instead, we completely characterize the imaging system virtually and utilize an iterative search method/constrained optimization technique to approximate the attenuation profile of the object of interest. This work presents preliminary results from various approaches that were investigated. The early results illustrate the challenges associated with these techniques and the potential for obtaining an accurate estimate of the attenuation profile for objects composed of homogeneous materials.
Abstract not provided.
Abstract not provided.
The goal of this work is to develop a fast computed tomography (CT) reconstruction algorithm based on graphics processing units (GPU) that achieves significant improvement over traditional central processing unit (CPU) based implementations. The main challenge in developing a CT algorithm that is capable of handling very large datasets is parallelizing the algorithm in such a way that data transfer does not hinder performance of the reconstruction algorithm. General Purpose Graphics Processing (GPGPU) is a new technology that the Science and Technology (S&T) community is starting to adopt in many fields where CPU-based computing is the norm. GPGPU programming requires a new approach to algorithm development that utilizes massively multi-threaded environments. Multi-threaded algorithms in general are difficult to optimize since performance bottlenecks occur that are non-existent in single-threaded algorithms such as memory latencies. If an efficient GPU-based CT reconstruction algorithm can be developed; computational times could be improved by a factor of 20. Additionally, cost benefits will be realized as commodity graphics hardware could potentially replace expensive supercomputers and high-end workstations. This project will take advantage of the CUDA programming environment and attempt to parallelize the task in such a way that multiple slices of the reconstruction volume are computed simultaneously. This work will also take advantage of the GPU memory by utilizing asynchronous memory transfers, GPU texture memory, and (when possible) pinned host memory so that the memory transfer bottleneck inherent to GPGPU is amortized. Additionally, this work will take advantage of GPU-specific hardware (i.e. fast texture memory, pixel-pipelines, hardware interpolators, and varying memory hierarchy) that will allow for additional performance improvements.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
IEEE Nuclear Science Symposium Conference Record
There has been much work done in implementing various GPU-based Computed Tomography reconstruction algorithms for medical applications showing tremendous improvement in computational performance. While many of these reconstruction algorithms could also be applied to industrial-scale datasets, the performance gains may be modest to non-existent due to a combination of algorithmic, hardware, or scalability limitations. Previous work presented showed an irregular dynamic approach to GPU-Reconstruction kernel execution for industrial-scale reconstructions that dramatically improved voxel processing throughput. However, the improved kernel execution magnified other system bottlenecks such as host memory bandwidth and storage read/write bandwidth, thus hindering performance gains. This paper presents a multi-GPU-based reconstruction algorithm capable of efficiently reconstructing large volumes (between 64 gigavoxels and 1 teravoxel volumes) not only faster than traditional CPU- and GPU-based reconstruction algorithms but also while consuming significantly less energy. The reconstruction algorithm exploits the irregular kernel approach from previous work as well as a modularized MIMD-like environment, heterogeneous parallelism, as well as macro- and micro-scale dynamic task allocation. The result is a portable and flexible reconstruction algorithm capable of executing on a wide range of architectures including mobile computers, workstations, supercomputers, and modestly-sized hetero or homogeneous clusters with any number of graphics processors. © 2013 IEEE.
Abstract not provided.