2009 IEEE International Conference on
Systems, Man, and Cybernetics |
![]() |
Abstract
Due to the advents of multi-core CPU and GPU, various parallel processing techniques have been widely applied to many application fields including computer vision. This paper presents a parallel processing technique for real-time feature extraction in object recognition by autonomous mobile robots, which utilizes both CPU and GPU by combining OpenMP, SSE (Streaming SIMD Extension) and CUDA programming. Firstly, the algorithms and codes for feature extraction are optimized and implemented in parallel processing. After the parallel algorithms are assured to maintain the same level of performance, the process for extracting key points and obtaining dominant orientation with respect to the key points is parallelized. Following the extraction is the construction of a parallel descriptor via SSE instructions. Finally, the GPU version of SIFT is also implemented using CUDA. The experiments have shown that the CPU version of SIFT is almost five times faster than the original SIFT while maintaining robust performance. Further, the GPU-Parallel descriptor achieves acceleration up to five times higher than the CPU-Parallel descriptor at a cost of a bit lower performance.