Overview
Background
I am Xin Yu, a Senior Lecturer at the University of Queensland. I am a recipient of the Australian Research Council Discovery Early Career Researcher Award (DECRA) for 2023-2025 and an awardee of the Google Research Scholar Program in 2021. I am also a Google Visiting Faculty. Previously, I was a research fellow at the Australian National University (ANU). I received my PhD degree from the Australian National University under the supervision of Prof. Richard Hartley, Prof. Fatih Porikli and Dr. Basura Fernando. I also received a PhD degree from Tsinghua University, supervised by Prof. Li Zhang. I am interested in Computer Vision and Machine Learning topics.
My research covers a range of computer vision and machine learning tasks, especially efficient low-level image processing, image retrieval and localization, action recognition, 3D pose estimation, visual navigation, and sign language recognition and translation.
Availability
Dr Xin Yu is available for supervision.
Research impacts
One of my research papers received the "Best Paper Honorable Mention" award at the premier computer vision conference WACV 2020, and another was nominated for the Best Paper Award at CVPR 2020.
I received the Outstanding Reviewer Award at ECCV 2020, CVPR 2021 and ICCV 2021. CVPR, ICCV and ECCV are world-leading computer vision and machine learning conferences. My research interests include deep learning techniques, image processing, and computer vision tasks. I am a program committee member of top-tier computer vision and machine learning conferences, such as CVPR, ICCV, ECCV, ICML, ICLR and NeurIPS, and a reviewer for prestigious journals, such as TPAMI, IJCV and TIP.
I am happy to supervise self-motivated PhD and MPhil students. If you are an undergraduate student and would like to undertake your honours project with me, please send me an email.
Works
Search Professor Xin Yu’s works on UQ eSpace
2025
Journal Article
Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network
Jiang, Wenhao, Zhao, Dong, Wang, Chen, Yu, Xin, Arun, Pattathal V., Asano, Yuta, Xiang, Pei and Zhou, Huixin (2025). Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network. Knowledge-Based Systems, 330 (Part B) 114595, 1-16. doi: 10.1016/j.knosys.2025.114595
2025
Conference Publication
Robust audio-visual segmentation via audio-guided visual convergent alignment
Liu, Chen, Li, Peike, Yang, Liying, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Robust audio-visual segmentation via audio-guided visual convergent alignment. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.02693
2025
Conference Publication
EasyCraft: a robust and efficient framework for automatic avatar crafting
Wang, Suzhen, Chen, Weijie, Zhang, Wei, Zhao, Minda, Li, Lincheng, Zhang, Rongsheng, Hu, Zhipeng and Yu, Xin (2025). EasyCraft: a robust and efficient framework for automatic avatar crafting. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.00524
2025
Conference Publication
Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics
Liu, Chen, Yang, Liying, Li, Peike, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.00298
2025
Conference Publication
Blind bitstream-corrupted video recovery via metadata-guided diffusion model
Wang, Shuyun, Zhang, Hu, Shen, Xin, Wang, Dadong and Yu, Xin (2025). Blind bitstream-corrupted video recovery via metadata-guided diffusion model. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.02139
2025
Conference Publication
M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings
Xu, Qingzheng, Cao, Ru, Shen, Xin, Du, Heming, Wang, Sen and Yu, Xin (2025). M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, United States, 10-17 June 2025. Washington, DC, United States: IEEE Computer Society. doi: 10.1109/cvpr52734.2025.01147
2025
Conference Publication
Cross-view isolated sign language recognition challenge: design, results and future research
Shen, Xin, Du, Heming, Xu, Miao, Liu, Miaomiao and Yu, Xin (2025). Cross-view isolated sign language recognition challenge: design, results and future research. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3701716.3717522
2025
Conference Publication
MDAM 3: a misinformation detection and analysis framework for multitype multimodal media
Xu, Qingzheng, Du, Heming, Łukasik, Szymon, Zhu, Tianqing, Wang, Sen and Yu, Xin (2025). MDAM 3: a misinformation detection and analysis framework for multitype multimodal media. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3696410.3714498
2025
Conference Publication
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding
Cao, Zhuo, Zhang, Bingqing, Du, Heming, Yu, Xin, Li, Xue and Wang, Sen (2025). FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. 2025 Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ USA, 28 February-4 March 2025. Los Alamitos, CA USA: IEEE. doi: 10.1109/wacv61041.2025.00894
2025
Conference Publication
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm
Zhang, Bingqing, Cao, Zhuo, Du, Heming, Yu, Xin, Li, Xue, Liu, Jiajun and Wang, Sen (2025). TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm. IEEE. doi: 10.1109/wacv61041.2025.00485
2025
Journal Article
DreamCar: leveraging car-specific prior for in-the-wild 3D car reconstruction
Du, Xiaobiao, Sun, Haiyang, Lu, Ming, Zhu, Tianqing and Yu, Xin (2025). DreamCar: leveraging car-specific prior for in-the-wild 3D car reconstruction. IEEE Robotics and Automation Letters, 10 (2), 1840-1847. doi: 10.1109/lra.2024.3523231
2025
Journal Article
ICE: Interactive 3D Game Character Facial Editing via Dialogue
Wu, Haoqian, Zhao, Minda, Hu, Zhipeng, Fan, Changjie, Li, Lincheng, Chen, Weijie, Zhao, Rui and Yu, Xin (2025). ICE: Interactive 3D Game Character Facial Editing via Dialogue. IEEE Transactions on Multimedia, PP (99), 1-14. doi: 10.1109/tmm.2025.3557611
2025
Conference Publication
Vision-based abnormal action dataset for recognising body motion disorders
Ying, Jiaying, Shen, Xin and Yu, Xin (2025). Vision-based abnormal action dataset for recognising body motion disorders. 37th Australasian Joint Conference on Artificial Intelligence, AI 2024, Melbourne, VIC, Australia, 25 - 29 November 2024. Singapore, Singapore: Springer Nature Singapore. doi: 10.1007/978-981-96-0351-0_33
2025
Conference Publication
Compound expression recognition via curriculum learning
Liu, Chen, Qiu, Feng, Zhang, Wei, Li, Lincheng, Wang, Dadong and Yu, Xin (2025). Compound expression recognition via curriculum learning. ECCV 2024 Workshops, Milan, Italy, 29 September - 4 October 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-3-031-91581-9_20
2025
Conference Publication
Who is Being Impersonated? Deepfake audio detection and impersonated identification via extraction of ID-specific features
Guo, Tianchen, Du, Heming, Huo, Huan, Liu, Bo and Yu, Xin (2025). Who is Being Impersonated? Deepfake audio detection and impersonated identification via extraction of ID-specific features. 24th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2024), Macau, China, 29-31 October 2024. Singapore: Springer. doi: 10.1007/978-981-96-1548-3_21
2025
Journal Article
TalkCLIP: talking head generation with text-guided expressive speaking styles
Ma, Yifeng, Wang, Suzhen, Ding, Yu, Ma, Bowen, Lv, Tangjie, Fan, Changjie, Hu, Zhipeng, Deng, Zhidong and Yu, Xin (2025). TalkCLIP: talking head generation with text-guided expressive speaking styles. IEEE Transactions on Multimedia, 27, 6335-6346. doi: 10.1109/tmm.2025.3581808
2025
Conference Publication
Affective behaviour analysis via progressive learning
Liu, Chen, Zhang, Wei, Qiu, Feng, Li, Lincheng, Wang, Dadong and Yu, Xin (2025). Affective behaviour analysis via progressive learning. ECCV 2024 Workshops, Milan, Italy, 29 September - 4 October 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-3-031-91581-9_26
2025
Conference Publication
Transferable attacks for semantic segmentation
He, Mengqi, Zhang, Jing and Yu, Xin (2025). Transferable attacks for semantic segmentation. 35th Australasian Database Conference, Gold Coast, QLD, Australia, 16-18 December 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-981-96-1242-0_28
2024
Conference Publication
CPT-VR: Improving Surface Rendering via Closest Point Transform with View-Reflection Appearance
Hu, Zhipeng, Zhang, Yongqiang, Liu, Chen, Li, Lincheng, Peng, Sida, Zhou, Xiaowei, Fan, Changjie and Yu, Xin (2024). CPT-VR: Improving Surface Rendering via Closest Point Transform with View-Reflection Appearance. 18th European Conference on Computer Vision, ECCV 2024, Milan, Italy, 29 September - 4 October 2024. Cham, Switzerland: Springer. doi: 10.1007/978-3-031-73464-9_14
2024
Conference Publication
FreeAvatar: robust 3D facial animation transfer by learning an expression foundation model
Qiu, Feng, Zhang, Wei, Liu, Chen, An, Rudong, Li, Lincheng, Ding, Yu, Fan, Changjie, Hu, Zhipeng and Yu, Xin (2024). FreeAvatar: robust 3D facial animation transfer by learning an expression foundation model. SA '24: SIGGRAPH Asia 2024, Tokyo, Japan, 3-6 December 2024. New York, NY, United States: ACM. doi: 10.1145/3680528.3687669
Supervision
Supervision history
Current supervision
-
Doctor Philosophy
Multimodal foundation model design and analysis
Principal Advisor
Other advisors: Dr Miao Xu, Dr Heming Du
-
Doctor Philosophy
Human Posture Recognition Applied to Physical Activity
Principal Advisor
Other advisors: Professor Sean Tweedy
-
Doctor Philosophy
Understanding Human Intention and Performance
Principal Advisor
Other advisors: Associate Professor Sen Wang
-
Doctor Philosophy
Understanding Human Movements and Sport Performance Analysis
Principal Advisor
Other advisors: Dr Miao Xu
-
Doctor Philosophy
Effective Visual Data Compression
Principal Advisor
Other advisors: Associate Professor Sen Wang, Dr Heming Du
-
Doctor Philosophy
Automatic Retinal Health Monitoring through Multi-modal Medical Imaging
Principal Advisor
Other advisors: Associate Professor Mahsa Baktashmotlagh
-
Doctor Philosophy
Combating evolving deceptive fake visual information through deepfake detection
Principal Advisor
Other advisors: Dr Miao Xu
-
Doctor Philosophy
Two way Auslan Translation
Principal Advisor
Other advisors: Associate Professor Mahsa Baktashmotlagh, Dr Heming Du
-
Doctor Philosophy
Integrating Deep Learning and Remote Sensing for Precision Agriculture in Staple Crops
Principal Advisor
Other advisors: Dr Miao Xu
-
Doctor Philosophy
Understanding Human Intention and Performance
Principal Advisor
Other advisors: Dr Heming Du, Dr Miao Xu
-
Doctor Philosophy
Human Understanding in Sports
Principal Advisor
Other advisors: Associate Professor Sen Wang, Dr Heming Du
-
Doctor Philosophy
Pose Estimation for Human with Disabilities
Principal Advisor
Other advisors: Professor Brian Lovell
-
Doctor Philosophy
Two way Auslan Translation
Principal Advisor
Other advisors: Professor Helen Huang, Dr Heming Du
-
Doctor Philosophy
Compressed Video Restoration
Principal Advisor
Other advisors: Dr Miao Xu, Dr Heming Du
-
Doctor Philosophy
Object-Centric Audio-Visual Alignment for Sounding Source Segmentation
Principal Advisor
Other advisors: Associate Professor Sen Wang
-
Doctor Philosophy
Remote Sensing Analysis in computer vision
Associate Advisor
Other advisors: Professor Helen Huang
-
Doctor Philosophy
Enhancing Robustness and Generalizability in Computational Models
Associate Advisor
Other advisors: Associate Professor Mahsa Baktashmotlagh
-
Doctor Philosophy
Data driven approaches for smart farming
Associate Advisor
Other advisors: Professor Helen Huang
-
Doctor Philosophy
Towards knowledge discovery from imperfect and evolving data
Associate Advisor
Other advisors: Dr Miao Xu