Sirius Chen

Ph.D. Student
Computer Science Department
University of Maryland, College Park
5940 Westchester Park Dr., College Park, MD 20740

Hi! Welcome to my personal homepage. I am a Ph.D. student at the University of Maryland under the supervision of Professor Larry S. Davis. Before coming here, I was a research assistant at Academia Sinica, Taiwan, working with Professor Chu-Song Chen and Professor Winston Hsu. I received my master's and bachelor's degrees in computer science from the CSIE department at National Taiwan University. My research interests include computer vision, machine learning, multimedia content analysis, and large-scale image/video retrieval.

Research Projects


Detection of Metadata Tampering through Discrepancy between Image Content and Metadata using Multi-task Deep Learning

Bor-Chun Chen, Pallabi Ghosh, Vlad I. Morariu and Larry S. Davis, CVPRW 2017 [PDF]

We use a multi-task learning framework to predict meteorological information from an image and use it to detect discrepancies between metadata and visual content. Our multi-task learning model provides up to 15% relative improvement compared to a traditional CNN (ResNet) on this task.


Video to Text Summary: Joint Video Summarization and Captioning with Recurrent Neural Networks

Bor-Chun Chen, Yan-Ying Chen and Francine Chen, BMVC 2017 [PDF]

We propose a general neural network configuration that jointly considers two supervisory signals (i.e., an image-based video summary and text-based video captions) in the training phase and generates both a video summary and corresponding captions for a given video in the test phase. We think these two tasks are complementary, and experiments show that our model achieves better performance on both tasks.


Business-Aware Visual Concept Discovery from Social Media for Multimodal Business Venue Recognition

Bor-Chun Chen, Yan-Ying Chen, Francine Chen and Dhiraj Joshi, AAAI 2016 [PDF]

We developed a novel framework for multimodal business venue recognition. We first mine from data a set of visual concepts that are relevant to venue recognition. We then use these concepts to train our CNN (BA-CNN) and use it to recognize business venues. Our model achieves a 78.5% recognition rate on our test set.


Cross-Age Reference Coding for Age-Invariant Face Recognition and Retrieval

Bor-Chun Chen, Chu-Song Chen, and Winston H. Hsu, ECCV 2014 [PDF]
Bor-Chun Chen, Chu-Song Chen, and Winston H. Hsu, IEEE Transactions on Multimedia, 2015 [PDF]

We propose a novel coding framework called Cross-Age Reference Coding (CARC). By leveraging a large-scale image dataset freely available on the Internet as a reference set, CARC is able to encode the low-level features of a face image with an age-invariant reference space. To thoroughly evaluate our work, we introduce a new large-scale dataset for face recognition and retrieval across age called the Cross-Age Celebrity Dataset (CACD). The dataset contains more than 160,000 images of 2,000 celebrities with ages ranging from 16 to 62.

[Project]

Discovering the City by Mining Diverse and Multimodal Data Streams

Yin-Hsi Kuo, Yan-Ying Chen, Bor-Chun Chen, Wen-Yu Lee, Chun-Che Wu, Chia-Hung Lin, Yu-Lin Hou, Wen-Feng Cheng, Yi-Chih Tsai, Chung-Yen Hung, Liang-Chi Hsieh, Winston Hsu, ACM MM Grand Challenge 2014 [PDF]

We address the IBM Challenge - NYC360 by mining multimodal data streams from different social media. I worked on the food recognition part of this project. We use weakly labeled images from Instagram to train a large convolutional neural network to recognize different foods and find popular restaurants in New York City. We further use sentiment analysis to find out users' opinions about the restaurants for real-time recommendations. Our work won the ACM Multimedia Grand Challenge Multimodal Award, 2014.


Scalable Face Track Retrieval in Video Archives using Bag-of-Faces Sparse Representation

Bor-Chun Chen, Yan-Ying Chen, Yin-Hsi Kuo, Thanh Duc Ngo, Duy-Dinh Le, Shin’ichi Satoh, and Winston H. Hsu, IEEE TCSVT, 2014 [PDF]

To organize large-scale face tracks, which contain sequences of (detected) consecutive faces in videos, we propose an efficient method to retrieve human face tracks using bag-of-faces sparse representation. Using the proposed method, a face track is encoded as a single bag-of-faces sparse representation, which allows efficient indexing methods to handle large-scale data. To further account for the possible variations in face tracks, we generalize our method to find multiple sparse representations, in an unsupervised manner, to represent a bag of faces and balance the trade-off between performance and retrieval time.

[Dataset]

Scalable Face Image Retrieval using Attribute-Enhanced Sparse Codewords

Bor-Chun Chen, Yan-Ying Chen, Yin-Hsi Kuo, and Winston H. Hsu, IEEE Transactions on Multimedia, 2013 [PDF]

We aim to use automatically detected human attributes, which contain semantic cues of the face photos, to improve content-based face retrieval by constructing semantic codewords for efficient large-scale face retrieval. By leveraging human attributes in a scalable and systematic framework, we propose two orthogonal methods, named attribute-enhanced sparse coding and attribute-embedded inverted indexing, to improve face retrieval in the offline and online stages.


Where is Who: Large-Scale Photo Retrieval by Facial Attributes and Canvas Layout

Yu-Heng Lei, Yan-Ying Chen, Bor-Chun Chen, Lime Iida, and Winston H Hsu, ACM SIGIR 2012 [PDF]
Yu-Heng Lei, Yan-Ying Chen, Lime Iida, Bor-Chun Chen, Hsiao-Hang Su, and Winston H Hsu, ACM Multimedia Grand Challenge, 2011 [PDF]

We propose a novel way to search for face images according to the facial attributes and face similarity of the target persons. To better match the face layout the user has in mind, our system allows the user to graphically specify the face positions and sizes on a query canvas, where each attribute or identity is defined as an icon for easier representation.

[Demo]

Semi-Supervised Face Image Retrieval using Sparse Coding with Identity Constraint

Bor-Chun Chen, Yin-Hsi Kuo, Yan-Ying Chen, Kuan-Yu Chu, and Winston Hsu, ACM Multimedia 2011 [PDF]

We construct semantic codewords for face images using sparse coding with low-level features (i.e., LBP) and partially available label information.

[Demo]

About


Education

Ph.D. Student
Teaching Assistant for:
Graduate Coursework:
Master of Science
M.S. in Computer Science, 2012
Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Advisor: Prof. Winston H. Hsu
GPA 4.2/4.3
Bachelor of Science
B.S. in Computer Science, 2010
Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Minor in Economics
Overall GPA 3.9/4.0, Major GPA 4.0/4.0

Honors and Awards

  • Multimodal Prize in ACM Multimedia Grand Challenge (2014)
  • Dean’s Fellowship, University of Maryland (2014)
  • Best Master Thesis Award of Taiwanese Association for Artificial Intelligence (Nov. 2012)
  • Garmin Scholarship Award (Jan. 2012)
  • 1st Prize in ACM Multimedia Grand Challenge (Nov. 2011)
  • 2nd Place in Cloud Application Contest by Chunghwa Telecom, Taiwan (Nov. 2011)
  • Presidential Award, National Taiwan University (Fall, 2006; Spring, 2008)
  • Presidential Award, National Taiwan Normal University (Fall, 2005; Spring, 2006)
  • Second Place in Trigonometry – State Award, State of Kansas Scholarship Contest (2005)
  • Third Place in Calculus – State Award, State of Kansas Scholarship Contest (2005)
  • Outstanding Award, AYUSA International (Sep. 2004 - Jun. 2005)
  • Third Place in Astronomy Contest, International Generation Club (2003)

Personal Interests

Board games, Traveling, Astronomy, Magic, Juggling

Traveling: I have been to 21 states in the U.S. and 15 countries around the world.