My name is Kevin Chih-Yao Ma (馬志堯). I am a Staff Research Scientist working on generative foundation models in Meta's Gen AI org. In the past, I focused primarily on data-efficient learning, including semi-supervised learning, self-supervised learning, and federated learning. During my PhD, I conducted research on large-scale video classification, fine-grained human action recognition, relational reasoning for video understanding, visually grounded image/video captioning, and vision-and-language navigation agents.
Lead IC in Meta's Movie Gen.
Built a Llama-like DiT, scaled the model with parallelization, and co-led post-training.
Core contributor to Emu, which powers Emu Video/Edit and Imagine.
with Peter Vajda and Zijian He (GenAI Media Foundation team)
Generative Models, Federated Learning, Semi-Supervised Learning
with Peter Vajda and Zijian He (Mobile Vision & GenAI Media Foundation team)
Data-efficient learning
with Peter Vajda and Zijian He (Mobile Vision team)
Self-Supervised Learning
with Marcus Rohrbach (FAIR), Yannis Kalantidis (AML), Kan Chen (Mobile Vision), and Peter Vajda (Mobile Vision)
Vision-and-Language Navigation
with Caiming Xiong and Richard Socher
Relational reasoning for human action recognition and video captioning
with Asim Kadav
Electrical and Computer Engineering
with Ghassan AlRegib (advisor) and Zsolt Kira
Electrical and Computer Engineering
with Hsueh-Ming Hang
Electrical and Computer Engineering
Aug. 2006 - May 2011 Please see my |
Meta's Movie Gen team [Webpage] / [arXiv] / [MovieGenBench (GitHub)] / [bibtex] |
Xiaoliang Dai*, Ji Hou*, Chih-Yao Ma*, Sam Tsai*, Jialiang Wang*, Rui Wang*, Peizhao Zhang*, Simon Vandenhende, Xiaofang Wang, Abhimanyu Dubey, Matthew Yu, Abhishek Kadian, Filip Radenovic, Dhruv Mahajan, Kunpeng Li, Yue Zhao, Vladan Petrovic, Mitesh Kumar Singh, Simran Motwani, Yi Wen, Yiwen Song, Roshan Sumbaly+, Vignesh Ramanathan+, Zijian He+, Peter Vajda+, Devi Parikh+ (*: Core contributors: equal contribution, alphabetical order.) (+: Equal last authors.) [arXiv] / [bibtex] |
Sangwoo Mo, Jong-Chyi Su, Chih-Yao Ma, Mido Assran, Ishan Misra, Licheng Yu, Sean Bell International Conference on Learning Representations (ICLR), 2023 [arXiv] / [GitHub] / [bibtex] |
Junjiao Tian, Zecheng He, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Zsolt Kira Computer Vision and Pattern Recognition (CVPR), 2023 [arXiv] / [GitHub] / [bibtex] |
Chia-Wen Kuo, Chih-Yao Ma, Judy Hoffman, Zsolt Kira Winter Conference on Applications of Computer Vision (WACV), 2022 [arXiv] / [Project] / [bibtex] |
Yen-Cheng Liu, Chih-Yao Ma, Junjiao Tian, Zijian He, Zsolt Kira Conference on Neural Information Processing Systems (NeurIPS), 2022 [arXiv] / [Project] / [GitHub] (coming soon) / [bibtex] |
Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira European Conference on Computer Vision (ECCV), 2022 (Oral) [arXiv] / [Project] / [GitHub] (coming soon) / [bibtex] |
Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda Computer Vision and Pattern Recognition (CVPR), 2022 [PDF] / [GitHub] / [Project] / [bibtex] |
Yen-Cheng Liu, Chih-Yao Ma, Zsolt Kira Computer Vision and Pattern Recognition (CVPR), 2022 [arXiv] / [PDF] / [GitHub] / [Project] / [bibtex] |
Muhammad Zubair Irshad, Chih-Yao Ma, Zsolt Kira IEEE International Conference on Robotics and Automation (ICRA), 2021 [arXiv] / [GitHub] / [Project] / [bibtex] |
Yen-Cheng Liu, Chih-Yao Ma, Zijian He, Chia-Wen Kuo, Kan Chen, Peizhao Zhang, Bichen Wu, Zsolt Kira, Peter Vajda International Conference on Learning Representations (ICLR), 2021 [arXiv] / [GitHub] / [Project] / [OpenReview] / [bibtex] |
Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira European Conference on Computer Vision (ECCV), 2020 [arXiv] / [GitHub] / [Project] / [ML@GT] / [bibtex] |
Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira European Conference on Computer Vision (ECCV), 2020 [arXiv] / [Project] / [GitHub] / [bibtex] |
Yen-Cheng Liu, Junjiao Tian, Chih-Yao Ma, Nathaniel Glaser, Chia-Wen Kuo, Zsolt Kira International Conference on Robotics and Automation (ICRA), 2020 [arXiv] / [GitHub] / [Project] / [bibtex] |
Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira Technical Report, 2019 [arXiv] / [Project] / [bibtex] |
Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, Caiming Xiong, Zsolt Kira Computer Vision and Pattern Recognition (CVPR), 2019 (Oral) [arXiv] / [GitHub] / [Project] / [Poster] / [bibtex] |
Zuxuan Wu, Caiming Xiong, Chih-Yao Ma, Richard Socher, Larry S Davis Computer Vision and Pattern Recognition (CVPR), 2019 |
Chih-Yao Ma, Jiasen Lu, Zuxuan Wu, Ghassan AlRegib, Zsolt Kira, Richard Socher, Caiming Xiong International Conference on Learning Representations (ICLR), 2019 (Top 7% of reviews) |
Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf Computer Vision and Pattern Recognition (CVPR), 2018 |
Chih-Yao Ma*, Min-Hung Chen*, Zsolt Kira, and Ghassan AlRegib Signal Processing: Image Communication, 2018 (*: equal contribution) |
Chih-Yao Ma, Asim Kadav, Iain Melvin, Zsolt Kira, Ghassan AlRegib, Hans Peter Graf Neural Information Processing Systems (NeurIPS) Workshop on Visually-Grounded Interaction and Language, 2017 |
Chih-Yao Ma and Hsueh-Ming Hang Journal of Vision, 2015 |
© 2024 Kevin Chih-Yao Ma