About Me

I’m currently a postdoctoral researcher at KAUST, advised by Prof. Jürgen Schmidhuber. I received my PhD from the Institute of Automation, Chinese Academy of Sciences (CASIA), advised by Prof. Changsheng Xu. My research focuses mainly on multimodality and general intelligence.

I’m always open to collaboration and visiting opportunities. If you are interested, please contact me by email.

Education

2022.9 - Present : Ph.D. in Pattern Recognition and Intelligent System, Institute of Automation, Chinese Academy of Sciences (CASIA), advised by Prof. Changsheng Xu.
2019.9 - 2022.6 : M.S. in Computer Science, Institute of Automation, Chinese Academy of Sciences (CASIA), advised by Prof. Weiming Dong.
2015.9 - 2019.6 : B.E. in Aerospace Engineering, Beijing Institute of Technology, advised by Prof. Zixuan Liang.

2025.10 - Present : Postdoctoral researcher at King Abdullah University of Science and Technology (KAUST)
2022.5 - 2023.3 : Research internship at Tencent Youtu Lab, supported by Tencent Rhino-Bird Research Elite Program
2021.3 - 2021.9 : Research internship at Tencent Youtu Lab

Libra: Building Decoupled Vision System on Large Language Models
Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu
ICML2024: the Forty-first International Conference on Machine Learning
[arXiv] [Code]
Multi-modal Queried Object Detection in the Wild
Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu
NeurIPS2023: the Thirty-seventh Annual Conference on Neural Information Processing Systems
[arXiv] [Code] [Poster] [Slides]
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Yifan Xu, Zhijie Zhang, Mengdan Zhang, Kekai Sheng, et al.
AAAI2022: the 36th AAAI Conference on Artificial Intelligence
[arXiv] [Code] [Poster] [Slides] [Video]
Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips
Man Yao, JiaKui Hu, Tianxiang Hu, Yifan Xu, Zhaokun Zhou, Yonghong Tian, Bo Xu, Guoqi Li
ICLR2024: the Twelfth International Conference on Learning Representations
[arXiv] [Code]

Transformers in Computational Visual Media: A Survey
Yifan Xu, Huapeng Wei, Mingxuan Lin, Yingying Deng, Kekai Sheng, Mengdan Zhang, et al.
CVMJ: Computational Visual Media
[Paper]
Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection
Yifan Xu, Mengdan Zhang, Xiaoshan Yang, Changsheng Xu
TIP: IEEE Transactions on Image Processing
[arXiv]
Towards Corruption-Agnostic Robust Domain Adaptation
Yifan Xu, Kekai Sheng, Weiming Dong, Baoyuan Wu, Changsheng Xu, Bao-Gang Hu
TOMM: ACM Transactions on Multimedia Computing, Communications, and Applications
[arXiv]

Outstanding Graduate of Beijing
(北京市优秀毕业生)
National Scholarship of China
(博士研究生国家奖学金)
2023 First Prize of the Pandeng Scholarship
(攀登一等奖学金)
2022 Second Prize of the Pandeng Scholarship
(攀登二等奖学金)
2023, 2023 Tencent Rhino-Bird Research Elite
(腾讯犀牛鸟精英人才)
2022, 2023 Merit Student of University of Chinese Academy of Sciences
(中国科学院大学三好学生)
2021 - 2024 Academic Scholarship of University of Chinese Academy of Sciences
(中国科学院大学学业奖学金)
2017 Space Application Scholarship
(太空应用奖学金)
2017 Second Prize in China Undergraduate Mathematical Contest in Modeling

2023.12.10 Invited talk “Multi-modal Queried Object Detection in the Wild” at NeurIPS2023
2023.3.24 Invited talk “Some Thoughts on Vision-Language Models” at Multimedia Computing Lab, CASIA
2022.11.24 Invited talk “The Mathematical Theorem of Diffusion” at Tencent Youtu Lab
2022.9.15 Invited talk “Advances in Weakly-supervised Object Detection” at Tencent Youtu Lab
2022.2.24 Invited talk “Slow-Fast Token Evolution for Dynamic Vision Transformer” at AAAI2022