Person re-identification with fusion of hand-crafted and deep pose-based body region features
Person re-identification (re-ID) aims to accurately retrieve a person from a large-scale database of images captured across multiple cameras. Existing works learn deep representations using a large training subset of unique persons. However, identifying unseen persons is critical for a good re-ID algorithm. Moreover, the misalignment between person crops due to detection errors or pose variations leads to poor feature matching. In this work, we present a fusion of handcrafted features and a deep feature representation learned using multiple body parts to complement the global body features, achieving high performance on unseen test images. Pose information is used to detect body regions, which are passed through Convolutional Neural Networks (CNNs) to guide feature learning. Finally, a metric learning step enables robust distance matching in a discriminative subspace. Experimental results on four popular re-ID benchmark datasets, namely VIPeR, DukeMTMC-reID, Market-1501, and CUHK03, show that the proposed method achieves state-of-the-art performance in image-based person re-identification.
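To make the pipeline concrete, here is a minimal PyTorch sketch of the fuse-then-match idea the abstract describes. All dimensions, function names, and the XQDA/KISSME-style metric are illustrative assumptions, not the paper's implementation.

```python
# Sketch: concatenate normalized global, pose-guided part, and handcrafted
# features, then compare with a learned Mahalanobis-style metric.
import torch
import torch.nn.functional as F

def fused_descriptor(global_feat, part_feats, handcrafted_feat):
    """Concatenate L2-normalized global, per-part, and handcrafted features."""
    parts = [F.normalize(p, dim=-1) for p in part_feats]
    return torch.cat([F.normalize(global_feat, dim=-1),
                      *parts,
                      F.normalize(handcrafted_feat, dim=-1)], dim=-1)

def metric_distance(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y), with M learned on a
    discriminative subspace (e.g., XQDA/KISSME-style metric learning)."""
    d = x - y
    return d @ M @ d

# Toy usage with small random stand-ins (real feature dims are much larger).
g = torch.randn(256)                          # global CNN feature
parts = [torch.randn(64) for _ in range(4)]   # pose-detected body regions
h = torch.randn(128)                          # handcrafted descriptor
q = fused_descriptor(g, parts, h)
r = fused_descriptor(torch.randn(256), [torch.randn(64) for _ in range(4)],
                     torch.randn(128))
M = torch.eye(q.numel())                      # placeholder learned metric
print(metric_distance(q, r, M).item())
```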
Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis
Jian Zhao, Lin Xiong, Jayashree Karlekar, Jianshu Li, Fang Zhao, Zhecan Wang, Sugiri Pranata, Shengmei Shen, Jiashi Feng
Conference on Neural Information Processing Systems (NIPS) 2017, 2017.12
Synthesizing realistic profile faces is promising for more efficiently training deep pose-invariant models for large-scale unconstrained face recognition, by populating samples with extreme poses and avoiding tedious annotations. However, learning from synthetic faces may not achieve the desired performance due to the discrepancy between the distributions of synthetic and real face images. To narrow this gap, we propose a Dual-Agent Generative Adversarial Network (DA-GAN) model, which can improve the realism of a face simulator's output using unlabeled real faces while preserving identity information during the realism refinement. The dual agents are specifically designed to distinguish real from fake images and to discriminate identities simultaneously. In particular, we employ an off-the-shelf 3D face model as a simulator to generate profile face images with varying poses. DA-GAN leverages a fully convolutional network as the generator to produce high-resolution images and an auto-encoder as the discriminator with the dual agents. Besides the novel architecture, we make several key modifications to the standard GAN to preserve pose and texture, preserve identity, and stabilize the training process: (i) a pose perception loss; (ii) an identity perception loss; (iii) an adversarial loss with a boundary equilibrium regularization term. Experimental results show that DA-GAN not only produces compelling perceptual results but also significantly outperforms the state of the art on the large-scale and challenging NIST IJB-A unconstrained face recognition benchmark. In addition, the proposed DA-GAN is promising as a new approach to solving generic transfer learning problems more effectively.
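As a rough illustration of how the three modified losses could combine into a generator objective, here is a hedged PyTorch sketch. The loss weights, the network stubs, and the omission of BEGAN's equilibrium variable k_t on the discriminator side are all simplifying assumptions, not the paper's exact formulation.

```python
# Sketch of a DA-GAN-style composite generator loss: BEGAN-style adversarial
# term plus pose- and identity-perception terms.
import torch
import torch.nn as nn
import torch.nn.functional as F

def generator_loss(disc, pose_net, id_net, refined, simulated,
                   lambda_pp=10.0, lambda_ip=0.02):
    # (iii) adversarial term: an auto-encoder discriminator scores samples by
    # reconstruction error, so the generator minimizes it on refined faces.
    l_adv = F.l1_loss(disc(refined), refined)
    # (i) pose-perception: landmarks of refined and simulated faces agree,
    # preserving pose and texture during refinement.
    l_pp = F.l1_loss(pose_net(refined), pose_net(simulated))
    # (ii) identity-perception: deep identity features stay close under a
    # fixed face-recognition network.
    l_ip = F.mse_loss(id_net(refined), id_net(simulated))
    return l_adv + lambda_pp * l_pp + lambda_ip * l_ip

# Toy usage with identity mappings standing in for the real networks.
x_sim = torch.randn(1, 3, 64, 64)   # simulator output (3D-model rendering)
x_ref = torch.randn(1, 3, 64, 64)   # generator-refined face
stub = nn.Identity()
print(generator_loss(stub, stub, stub, x_ref, x_sim))
```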
Audio-Visual Emotion Recognition using Deep Transfer Learning and Multiple Temporal Models
Xi Ouyang, Shigenori Kawaai, Gue Hua Ester Goh, Shengmei Shen, Wan Ding, Huaiping Ming, Dong-Yan Huang
ACM International Conference on Multimodal Interaction (ICMI) 2017, 2017.11
This paper presents the techniques used in our contribution to the Emotion Recognition in the Wild 2017 video-based sub-challenge. The purpose of the sub-challenge is to classify the six basic emotions (angry, sad, happy, surprise, fear, and disgust) plus neutral. Our proposed solution utilizes three state-of-the-art techniques to overcome the challenges of in-the-wild emotion recognition. Deep network transfer learning is used for feature extraction. Spatial-temporal model fusion makes full use of the complementarity of different networks. Semi-automatic reinforcement learning optimizes the fusion strategy based on dynamic external feedback given by the challenge organizers. The overall accuracy of the proposed approach on the challenge test dataset is 57.2%, which is better than the challenge baseline of 40.47%.
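A minimal sketch of the score-level fusion step described above, assuming each spatial-temporal model outputs a softmax probability vector per clip. The model count and fusion weights are illustrative, since the paper tunes the strategy with feedback from the challenge server.

```python
# Sketch: weighted fusion of class probabilities from complementary models.
import numpy as np

EMOTIONS = ["angry", "sad", "happy", "surprise", "fear", "disgust", "neutral"]

def fuse_scores(model_scores, weights):
    """Weighted average of per-model class-probability vectors."""
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()                  # normalize the fusion weights
    stacked = np.stack(model_scores)          # (n_models, n_classes)
    return weights @ stacked                  # fused (n_classes,) vector

# Toy usage: three models' softmax outputs for one video clip.
rng = np.random.default_rng(0)
scores = [rng.dirichlet(np.ones(len(EMOTIONS))) for _ in range(3)]
fused = fuse_scores(scores, weights=[0.5, 0.3, 0.2])
print(EMOTIONS[int(np.argmax(fused))])
```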
Intention-Net: Integrated Planning and Deep Learning for Autonomous Navigation
Wei Gao, David Hsu, Wee Sun Lee, Shengmei Shen, Karthikk Subramanian
Conference on Robot Learning (CoRL) 2017, 2017.11
How can a delivery robot navigate reliably to a destination in a new office building, with minimal prior information? To tackle this challenge, this paper introduces a two-level hierarchical method that integrates model-free deep learning and model-based path planning. At the low level, a neural-network motion controller, called the intention-net, is trained end-to-end to provide robust local navigation. The intention-net maps images from a single monocular camera and given "intentions" directly to robot control. At the high level, a path planner uses a crude map, e.g., a 2-D floor plan, to compute a path from the robot's current location to the goal. The planned path provides intentions to the intention-net. Preliminary experiments suggest that the learned motion controller is robust against perceptual uncertainty and that, integrated with a path planner, it generalizes effectively to new environments and goals.
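The two-level loop might look roughly like the sketch below, where every function is a stand-in stub: the planner, the intention encoding, and the controller outputs are all assumptions for illustration, not the paper's code.

```python
# Sketch of the hierarchy: crude-map planner -> intention -> learned controller.
import numpy as np

def plan_path(floor_plan, start, goal):
    """Stand-in for a classical planner (e.g., A*) over a 2-D floor plan."""
    return [start, goal]  # trivial straight-line "path" for illustration

def intention_from_path(path, pose):
    """Reduce the upcoming path segment to a discrete intention; this rule
    is arbitrary and exists only to show the interface."""
    dx, _ = np.subtract(path[-1], pose[:2])
    return "turn_left" if dx * pose[2] < 0 else "go_forward"

def intention_net(image, intention):
    """Stand-in for the learned controller mapping (image, intention) to
    (linear velocity, angular velocity)."""
    return (0.5, 0.1 if intention == "turn_left" else 0.0)

pose = np.array([0.0, 0.0, 1.0])                 # x, y, heading
path = plan_path(None, (0, 0), (5, 3))           # high level: plan on crude map
v, w = intention_net(np.zeros((64, 64, 3)),      # low level: image + intention
                     intention_from_path(path, pose))
print(v, w)
```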
Neural Person Search Machines
Hao Liu, Jiashi Feng, Zequn Jie, Jayashree Karlekar, Bo Zhao, Meibin Qi, Jianguo Jiang, Shuicheng Yan
International Conference on Computer Vision (ICCV) 2017, 2017.10
In this work, we investigate the problem of person search in the wild. Instead of comparing the query against all candidate regions generated in a query-blind manner, we propose to recursively shrink the search area from the whole image until achieving precise localization of the target person, fully exploiting information from the query and contextual cues in every recursive search step. We develop the Neural Person Search Machines (NPSM) to implement this recursive localization for person search. Benefiting from its neural search mechanism, NPSM is able to selectively shrink its focus from a loose region to a tighter one containing the target automatically. In this process, NPSM employs an internal primitive memory component to memorize the query representation, which modulates the attention and augments its robustness to other distracting regions. Evaluations on two benchmark datasets, the CUHK-SYSU Person Search dataset and the PRW dataset, demonstrate that our method outperforms the current state of the art under both the mAP and top-1 evaluation protocols.
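To convey the recursive shrinking mechanism, here is a toy sketch in which a hand-crafted scoring function plays the role of NPSM's learned attention module, and the query memory is reduced to a scalar statistic purely for illustration.

```python
# Sketch: recursively shrink a search region toward the target, conditioned
# on a memorized query representation.
import numpy as np

def shrink_region(image, region, query_memory, step_ratio=0.7):
    """Stand-in attention step: pick the corner sub-window of `region` whose
    mean intensity best matches the query statistic, shrinking by step_ratio.
    NPSM learns this step with a neural module instead."""
    x, y, w, h = region
    nw, nh = int(w * step_ratio), int(h * step_ratio)
    best, best_score = region, -np.inf
    for dx in (0, w - nw):
        for dy in (0, h - nh):
            crop = image[y + dy:y + dy + nh, x + dx:x + dx + nw]
            score = -abs(crop.mean() - query_memory)
            if score > best_score:
                best, best_score = (x + dx, y + dy, nw, nh), score
    return best

img = np.random.rand(240, 320)
query_memory = 0.6              # memorized query representation (toy scalar)
region = (0, 0, 320, 240)       # start the search from the whole image
for _ in range(5):              # recursively shrink toward the target
    region = shrink_region(img, region, query_memory)
print(region)
```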
Know You at One Glance: A Compact Vector Representation for Low-Shot Learning
Yu Cheng*, Jian Zhao*, Zhecan Wang, Yan Xu, Jayashree Karlekar, Shengmei Shen, Jiashi Feng
Workshop on MS-Celeb-1M Challenge with ICCV 2017, 2017.10
Low-shot face recognition is a very challenging yet important problem in computer vision. The feature representation of the gallery face sample is one key component of this problem. To this end, we propose an Enforced Softmax optimization approach built upon Convolutional Neural Networks (CNNs) to produce an effective and compact vector representation. The learned feature representation helps overcome the underlying multi-modality variations and retains the primary key features as close to the mean face of the identity as possible in the high-dimensional feature space, thus making the gallery basis more robust under various conditions and improving the overall performance for low-shot learning. In particular, we sequentially leverage optimal dropout, selective attenuation, ℓ2 normalization, and model-level optimization to enhance the standard Softmax objective function and produce a more compact vectorized representation for low-shot learning. Comprehensive evaluations on the MNIST, Labeled Faces in the Wild (LFW), and the challenging MS-Celeb-1M Low-Shot Learning Face Recognition benchmark datasets clearly demonstrate the superiority of our proposed method over state-of-the-art approaches. By further introducing a heuristic voting strategy for robust multi-view combination, our proposed method won the Top-1 place in the MS-Celeb-1M Low-Shot Learning Challenge.
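A minimal sketch of the ℓ2-normalization component of the enhanced Softmax head, together with the mean-face gallery vector it is meant to make robust. The scale, dropout rate, and layer sizes are assumptions, and the paper's selective attenuation and model-level optimization steps are omitted here.

```python
# Sketch: l2-normalized, re-scaled features feeding a softmax classifier,
# encouraging compact per-identity clusters on the unit hypersphere.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedSoftmaxHead(nn.Module):
    def __init__(self, feat_dim=256, n_ids=1000, scale=24.0, p_drop=0.4):
        super().__init__()
        self.drop = nn.Dropout(p_drop)       # stand-in for "optimal dropout"
        self.fc = nn.Linear(feat_dim, n_ids, bias=False)
        self.scale = scale

    def forward(self, feat):
        feat = self.drop(feat)
        feat = F.normalize(feat, dim=1) * self.scale  # l2-normalize, re-scale
        return self.fc(feat)                          # logits for CE loss

# Toy usage: train-time loss, plus a gallery vector as the normalized mean
# of an identity's (few) normalized embeddings.
head = NormalizedSoftmaxHead()
loss = F.cross_entropy(head(torch.randn(8, 256)),
                       torch.randint(0, 1000, (8,)))
emb = F.normalize(torch.randn(5, 256), dim=1)    # 5 shots of one identity
gallery_vec = F.normalize(emb.mean(dim=0), dim=0)
print(loss.item(), gallery_vec.norm().item())
```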
High Performance Large Scale Face Recognition with Multi-Cognition Softmax and Feature Retrieval
Yan Xu*, Yu Cheng*, Jian Zhao, Zhecan Wang, Lin Xiong, Jayashree Karlekar, Hajime Tamura, Tomoyuki Kagaya, Shengmei Shen, Sugiri Pranata, Jiashi Feng, Junliang Xing
Workshop on MS-Celeb-1M Challenge with ICCV 2017, 2017.10
In this paper, we introduce our solution to Challenge-1 of the MS-Celeb-1M challenges, which aims to recognize one million celebrities. To solve this large-scale face recognition problem, a Multi-Cognition Softmax Model (MCSM) is proposed to distribute training data to several cognition units via a data shuffling strategy. Here we introduce a cognition unit as a group of independent softmax models, designed to increase diversity over a single softmax model and boost the performance of the model ensemble. Meanwhile, a template-based Feature Retrieval (FR) module is adopted to improve the performance of MCSM through a specific voting scheme. Moreover, a one-shot learning method is applied to an extra 600K collected identities, since each of these identities has only one image. Finally, test images with low scores from MCSM and FR are assigned new, higher-scoring labels by merging in the one-shot learning results. Extensive experiments on the MS-Celeb-1M test set demonstrate the superiority of the proposed method. Our solution ranked first in both settings of the final evaluation and outperforms other teams by a large margin.
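The merging rule could be sketched as below, where an averaged ensemble prediction defers to the feature-retrieval label when its confidence is low. The threshold and scoring are illustrative assumptions, not the paper's exact voting scheme.

```python
# Sketch: ensemble softmax models vote; low-confidence predictions fall back
# to a template-based feature-retrieval (nearest gallery template) label.
import numpy as np

def mcsm_predict(softmax_scores, retrieval_label, retrieval_score, thresh=0.5):
    """softmax_scores: (n_models, n_classes) probabilities from the ensemble."""
    avg = np.mean(softmax_scores, axis=0)        # average the cognition units
    label, conf = int(np.argmax(avg)), float(np.max(avg))
    if conf < thresh and retrieval_score > conf:
        return retrieval_label                   # defer to feature retrieval
    return label

# Toy usage: 4 models over 10 identities, plus a confident retrieval result.
rng = np.random.default_rng(1)
scores = rng.dirichlet(np.ones(10), size=4)
print(mcsm_predict(scores, retrieval_label=7, retrieval_score=0.9))
```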