Panasonic R&D Center Singapore has harnessed great leaders as much as it has great talent. These key figures lead major research and development projects at the top of their individual fields. They command authority and have notable achievements in the industry, and we are proud that they are a core part of the team.
Combining high resolution color and depth images for dense 3D reconstruction
Pongsak Lasang, Shengmei Shen and Wuttipong Kumwilaisak
Proc. IEEE International Conference on Consumer Electronics – Berlin, September 2014, pp. 331-334
In this paper, we present an effective method to reconstruct a dense 3D model of an object or a scene by combining high resolution color and depth images. Conventionally, multiple views of color images can be used for reconstructing a 3D model of the captured scene. Although the conventional method is accurate in the textured regions of an object, it lacks density and leaves many holes in the texture-less regions. A depth camera, by contrast, can capture 3D distance information even in homogeneous regions. However, it has low resolution and cannot provide an accurate result for a detailed object. We thus propose a combined method that uses both high resolution color and depth images to obtain a high-quality, accurate and dense 3D model. Compared to the conventional methods, our proposed method produces much denser and more complete 3D results.
Cher Keng Heng, Samantha Yue Ying Lim, Zhiheng Niu, Bo Li
The Visual Object Tracking VOT2013 Challenge in conjunction with ICCV 2013, 2013.12
PLT runs a classifier at a fixed single scale for each test image to determine the top-scoring bounding box, which is then the result of object detection. The classifier uses a binary feature vector constructed from color, grayscale and gradient information. To select a small set of discriminative features, an online sparse structural SVM [20] is used. Since the object can be non-rigid and the bounding box may be noisy, not all pixels in the bounding box belong to the object. Hence, a probabilistic object-background segmentation mask is created from color histograms and used to weight the features during SVM training. The resulting weighted and convex problem can be solved in three steps: (i) compute the probability that a pixel belongs to the object by using its color; (ii) solve the original non-sparse structural SVM; and (iii) shrink the solution [21], i.e. discard the features with the smallest values. Since the feature vector is binary, the linear classifier can be implemented as a lookup table for speed.
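The lookup-table remark at the end of the abstract can be illustrated with a minimal sketch. It assumes the binary features are packed eight to a byte and that each byte indexes a precomputed table of summed weights; the function names, weights and feature values are hypothetical and are not taken from the PLT implementation.

```python
def build_lookup_tables(weights):
    """For every group of 8 binary features, precompute the summed weight
    of each of the 256 possible bit patterns."""
    tables = []
    for start in range(0, len(weights), 8):
        chunk = weights[start:start + 8]
        table = []
        for pattern in range(256):
            table.append(sum(w for bit, w in enumerate(chunk) if (pattern >> bit) & 1))
        tables.append(table)
    return tables


def pack_bits(bits):
    """Pack a 0/1 feature vector into bytes, least significant bit first."""
    packed = []
    for start in range(0, len(bits), 8):
        byte = 0
        for bit, value in enumerate(bits[start:start + 8]):
            if value:
                byte |= 1 << bit
        packed.append(byte)
    return packed


def lut_score(tables, packed):
    """Linear score w.x computed as one table lookup per byte of features."""
    return sum(table[byte] for table, byte in zip(tables, packed))


weights = [1, -2, 3, 0, 5, -1, 2, 4, 7, -3]   # hypothetical learned weights
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]          # hypothetical binary features
direct = sum(w * b for w, b in zip(weights, bits))
print(lut_score(build_lookup_tables(weights), pack_bits(bits)) == direct)
```

The tables are built once per classifier update, so scoring a window costs one lookup and one addition per byte instead of one multiply-add per feature.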
Image-Processing Technologies for Service Robot “HOSPI-Rimo”
Tatsuo Sakai, Anselm Lim Yi Xiong
Panasonic Technical Journal Vol. 58 No. 4 Jan 2013
To enhance the quality of a service robot that substitutes for human work, two capabilities are required: easy observation of the surroundings during remote operation, and environment recognition, especially human detection, during autonomous movement. This report explains the image-processing technologies for these purposes and their application.
A Stereo-based Pedestrian Detection System for Smart Intersection
Yunyun Cao, Sugiri Pranata, Makoto Yasugi, Zhiheng Niu, Hirofumi Nishimura
19th ITS World Congress, Vienna, Austria, 22-26 October, 2012
This paper introduces a stereo-based pedestrian detection system for smart intersection applications. A detector with a sliding-window approach, which is promising for real-time use, is employed. In order to reduce scanning time, stereo cameras are utilized. Firstly, the system generates a disparity image from calibrated left and right images. Then, regions of interest (ROI) to search for pedestrians are obtained by clustering the disparity information. Finally, a raster scan is applied to all of the ROI and their scaled copies. For distance measurement, the authors developed a fast and accurate algorithm called Inverted Phase Filter (IPF), which measures disparity at sub-pixel level. Furthermore, for pedestrian classification, a novel feature, the Staggered multi-scale LBP (Local Binary Pattern) histogram, is proposed. Evaluation results show that the proposed feature outperforms benchmarks like HOG and CoHOG on the Panasonic night-time dataset and the Daimler Chrysler dataset.
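To show why sub-pixel disparity matters for distance measurement, here is a small sketch using the standard rectified-stereo relation Z = f * B / d. This is not the IPF algorithm, which the abstract does not describe; the focal length and baseline values are hypothetical.

```python
def distance_from_disparity(disparity_px, focal_px, baseline_m):
    """Distance in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


FOCAL_PX = 800.0     # hypothetical focal length in pixels
BASELINE_M = 0.25    # hypothetical camera baseline in metres

# A half-pixel change in disparity shifts the distance estimate noticeably
# at long range, which is why sub-pixel accuracy is useful.
for disparity in (10.0, 10.5, 11.0):
    print(disparity, distance_from_disparity(disparity, FOCAL_PX, BASELINE_M))
```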
Staggered Multi-scale LBP for Pedestrian Detection
Yunyun Cao, Sugiri Pranata, Makoto Yasugi, Zhiheng Niu, Hirofumi Nishimura
International Conference on Image Processing (ICIP) 2012
Pedestrian detection remains a popular and challenging problem due to large variation in appearance. A robust feature extraction method is highly desired for accurate pedestrian detection. In this paper, we first propose a staggered multi-scale LBP histogram. In order to exploit grayscale difference information in more directions, three scales with radii of 1, 3, and 5 pixels are utilized, and different scales are staggered. The Staggered Multi-scale LBP histogram is composed of three 256-bin histograms, each of which corresponds to one of the three scales. Secondly, the dimensionality of the LBP histogram is reduced using a boosting learning method. Experimental results show that the proposed feature outperforms benchmarks such as Uniform-LBP, HOG and CoHOG on the INRIA, Daimler Chrysler and our Panasonic night-time datasets.
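The structure of the feature (three 256-bin histograms at radii 1, 3 and 5) can be sketched as follows. The abstract does not say exactly how the scales are staggered, so this sketch assumes each scale rotates its 8 sampling directions by a further 15 degrees; that offset, the nearest-pixel sampling and all function names are illustrative assumptions, and the boosting-based dimensionality reduction is omitted.

```python
import math

RADII = (1, 3, 5)          # radii stated in the abstract
STAGGER = math.pi / 12     # assumed 15-degree rotation between scales


def ring_offsets(radius, start_angle):
    """Eight (row, col) offsets on a ring, rounded to the nearest pixel."""
    offsets = []
    for k in range(8):
        theta = start_angle + k * math.pi / 4
        offsets.append((round(-radius * math.sin(theta)),
                        round(radius * math.cos(theta))))
    return offsets


def lbp_histogram(image, radius, start_angle):
    """256-bin histogram of 8-bit LBP codes at one scale."""
    rows, cols = len(image), len(image[0])
    offsets = ring_offsets(radius, start_angle)
    hist = [0] * 256
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    return hist


def multi_scale_lbp(image):
    """Concatenate the three per-scale histograms into a 768-bin feature."""
    feature = []
    for index, radius in enumerate(RADII):
        feature.extend(lbp_histogram(image, radius, index * STAGGER))
    return feature
```

Rotating the sampling ring per scale means the three histograms together compare the centre pixel against up to 24 distinct directions rather than the same 8 at every radius.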
Boosted Translation-tolerable Classifiers for Fast Object Detection
Wei Zheng, Luhong Liang, Hong Chang, Cher Keng Heng, Shiguang Shan, Xilin Chen
Image and Vision Computing, Volume 30, Issue 8, August 2012, Pages 480-491
Different classifiers show different sensitivities to translation variance. Translation-insensitive classifiers can accelerate the detection process by searching over a coarse grid while still guaranteeing the recall rate. In this paper, we define the concept of a Translation-Tolerable Region (TTR) for a classifier: a region in which all detection windows yield consistent (stable) classifier outputs. We use the classifier’s Maximal Translation-Tolerable Region (MTTR) to measure its sensitivity to translation variance. For object detection, we propose an algorithm for training the discriminative classifiers as well as learning the associated MTTRs. The discriminative classifiers are assembled into a cascaded classifier in descending order of their MTTR sizes. To speed up the detection process, we propose a Granularity-Adaptively-Tunable (GAT) search strategy according to the classifiers’ MTTRs. Furthermore, we prove that the recall rate is Probably Approximately Admissible (PAA) in the GAT search, which means that the proposed approach can theoretically guarantee the accuracy while accelerating the detection process. Based on the boosting framework with Histograms of Oriented Gradients (HOG) features, we evaluate the proposed approach on public datasets containing both rigid and non-rigid object classes. The experimental results show that our approach achieves strong results at high speed.
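The idea of ordering cascade stages by descending tolerance and searching progressively finer grids can be shown with a toy one-dimensional sketch. This is not the paper's GAT algorithm: it assumes each stage's tolerance region is simply a cell as wide as its grid step, and the stage classifiers are hypothetical stand-ins.

```python
def coarse_to_fine_search(width, stages):
    """stages: list of (accepts, step), ordered from largest step to smallest.
    Returns the surviving positions and the number of classifier evaluations."""
    evaluations = 0
    first_accepts, step = stages[0]
    candidates = []
    for x in range(0, width, step):
        evaluations += 1
        if first_accepts(x):
            candidates.append(x)
    for accepts, finer_step in stages[1:]:
        refined = []
        for x in candidates:
            # Only the cell covered by an accepted coarse position is refined.
            for nx in range(x, min(x + step, width), finer_step):
                evaluations += 1
                if accepts(nx):
                    refined.append(nx)
        candidates = refined
        step = finer_step
    return candidates, evaluations


def covers(target, step):
    """Hypothetical stage classifier: fires when target lies in [x, x + step)."""
    return lambda x: x <= target < x + step


target = 13
stages = [(covers(target, 8), 8), (covers(target, 2), 2), (covers(target, 1), 1)]
print(coarse_to_fine_search(32, stages))   # -> ([13], 10)
```

In this toy case 10 evaluations replace the 32 that an exhaustive one-pixel scan would need, because later stages only look inside cells that earlier, more tolerant stages accepted.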
Shrink boost for selecting multi-LBP histogram features in object detection
Cher Keng Heng, Sumio Yokomitsu, Yuichi Matsumoto, Hajime Tamura
Conference on Computer Vision and Pattern Recognition (CVPR) 2012, 2012.06
Selecting features from sparse, high-dimensional feature sets using conventional greedy boosting gives classifiers that generalize poorly. We propose a novel “shrink boost” method to address this problem. It solves a sparse regularization problem with two iterative steps. First, a “boosting” step uses weighted training samples to learn a full high-dimensional classifier on all features. This avoids overfitting to a few features and improves generalization. Next, a “shrinkage” step shrinks the least discriminative classifier dimensions to zero to remove redundant features. In our object detection system, we use “shrink boost” to select sparse features from histograms of local binary patterns (LBP) over multiple quantizations and image channels, and to learn a classifier of additive lookup tables (LUT). Our evaluation shows that our classifier generalizes much better than those from greedy boosting and from SVM methods, even with a limited number of training samples. On public datasets for human detection and pedestrian detection, we achieve better performance than the state of the art. On our more challenging bird detection dataset, we show promising results.
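The “shrinkage” step alone can be sketched as below. The sketch assumes that discriminative power is measured by the magnitude of each classifier dimension and that a fixed number of dimensions is zeroed per iteration; both are illustrative assumptions, and the “boosting” step that re-learns the full classifier is not shown.

```python
def shrink(weights, n_drop):
    """Set the n_drop smallest-magnitude dimensions to zero."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:n_drop])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


weights = [0.5, -0.1, 2.0, 0.05]   # hypothetical classifier dimensions
print(shrink(weights, 2))          # -> [0.5, 0.0, 2.0, 0.0]
```

Alternating this step with re-learning on all remaining features is what distinguishes the approach from greedy boosting, which commits to one feature at a time.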
Context Modeling for Facial Landmark Detection based on Non-Adjacent Rectangle (NAR) Haar-like Feature
Xiaowei Zhao, Xiujuan Chai, Zhiheng Niu, Cher Keng Heng, Shiguang Shan
Image and Vision Computing, Volume 30, Issue 3, March 2012, Pages 136-146
Automatically locating facial landmarks in images is an important task in computer vision. This paper proposes a novel context modeling method for facial landmark detection, which integrates context constraints with a local texture model in the cascaded AdaBoost framework. The motivation of our method lies in the basic psychological observation that humans use not only local texture information but also global context information to locate facial landmarks in faces. Therefore, in our solution, a novel type of feature, called the Non-Adjacent Rectangle (NAR) Haar-like feature, is proposed to characterize the co-occurrence between facial landmarks and their surroundings, i.e., the context information, in terms of low-level features. For the locating task, traditional Haar-like features (characterizing local texture information) and NAR Haar-like features (characterizing context constraints in a global sense) are combined to form more powerful representations. Through Real AdaBoost learning, the most discriminative feature set is selected automatically and used for facial landmark detection. To verify the effectiveness of the proposed method, we evaluate our facial landmark detection algorithm on the BioID and Cohn-Kanade face databases. Experimental results show that the NAR Haar-like feature models the context effectively and that our proposed algorithm outperforms the published state-of-the-art methods. In addition, the generalization capability of the NAR Haar-like feature is further validated by extending it to the face detection task on the FDDB face database.
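A minimal sketch of a two-rectangle feature whose rectangles need not touch, computed with an integral image, is given below. It assumes the feature is the plain difference of the two rectangle sums; the rectangle layout and the function names are illustrative and are not taken from the paper.

```python
def integral_image(image):
    """ii[r][c] holds the sum of image[0:r][0:c]; one extra row and column of zeros."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels inside a rectangle using four integral-image lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def nar_feature(ii, rect_a, rect_b):
    """Difference of two rectangle sums; the rectangles may be far apart."""
    return rect_sum(ii, *rect_a) - rect_sum(ii, *rect_b)


image = [[r * 5 + c for c in range(5)] for r in range(3)]   # toy 3x5 image
ii = integral_image(image)
# rect_a covers columns 0-1, rect_b covers columns 3-4; column 2 separates them.
print(nar_feature(ii, (0, 0, 2, 2), (1, 3, 2, 2)))          # -> -32
```

Because each rectangle sum costs four lookups regardless of size or position, separating the rectangles adds no cost over a traditional adjacent Haar-like feature.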
Directional Adaptive Hole Filling for New View Synthesis
Pongsak Lasang, The Kiet Lu, Shengmei Shen
Proc. IEEE International Conference on Consumer Electronics (ICCE’12), January 2012, pp. 610-611.
In this paper, a new hole filling method based on the direction of the background image texture is presented for new image view synthesis. The strong texture gradient of a background pixel is traced along its direction to obtain the texture orientation. Then a texture direction map is computed for the hole pixels, based on the texture orientation. Finally, the hole pixels are filled with background pixels along the direction given by the texture direction map. This produces natural texture in the hole regions, while reducing blur and preventing distortion in the foreground objects. Thus, a high quality new image view can be achieved, even with large baseline synthesis. When the images are used for 3D viewing, the 3D effect is enhanced.
Detecting Pedestrians Using An Advanced Local Binary Pattern Histogram
Yunyun Cao, Hirofumi Nishimura, Sugiri Pranata
18th ITS World Congress, Orlando, USA, 16-20 October, 2011
Fast and simple to implement, the local binary pattern (LBP) has shown its superiority in texture classification and face recognition, but it is weak in pedestrian recognition. An enhanced LBP feature, the Weighted LBP Histogram, is proposed for robust pedestrian detection. In the Weighted LBP Histogram, each bin value is calculated by accumulating the weights of the pixels whose LBP codes belong to that bin, where the weight is defined as the Sum of Absolute Differences (SAD) between the centre pixel and its surrounding pixels. The experimental results show its effectiveness in alleviating noise and enhancing the signal-to-noise ratio. Using the proposed LBP feature, our pedestrian detection system achieves higher performance than the benchmark method HOG on several benchmark datasets.
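The weighting rule described in the abstract can be sketched as follows. The sketch assumes a basic 8-neighbour LBP at radius 1 with a "neighbour >= centre" threshold and a fixed bit order; those details, and the function name, are assumptions rather than the paper's exact configuration.

```python
# Eight neighbours at radius 1, clockwise from top-left (bit order is an assumption).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def weighted_lbp_histogram(image):
    """256-bin histogram where each pixel votes with its SAD instead of 1."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            sad = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                neighbour = image[r + dr][c + dc]
                if neighbour >= centre:
                    code |= 1 << bit
                sad += abs(neighbour - centre)   # weight: sum of absolute differences
            hist[code] += sad
    return hist


image = [[10, 10, 10],
         [10, 0, 10],
         [10, 10, 10]]
print(weighted_lbp_histogram(image)[255])   # -> 80
```

A pixel in a flat, low-contrast region has a SAD near zero and so contributes almost nothing, which is one way to read the abstract's claim about alleviating noise.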