|
2026 Journal Article Cluster-aware prompt ensemble learning for few-shot vision-language model adaptation. Chen, Zhi, Yu, Xin, Tao, Xiaohui, Li, Yan and Huang, Zi (2026). Cluster-aware prompt ensemble learning for few-shot vision-language model adaptation. Pattern Recognition, 172 (C) 112596. doi: 10.1016/j.patcog.2025.112596 |
|
2026 Journal Article Compression-Oriented Video Super-Resolution. Wang, Shuyun, Liu, Yanbin, Lu, Ming, Wu, Zhuojie, Tian, Senmao, Guo, Yandong and Yu, Xin (2026). Compression-Oriented Video Super-Resolution. IEEE Transactions on Image Processing, PP (99), 1-1. doi: 10.1109/tip.2026.3682128 |
|
2026 Journal Article Distributed Zero-Shot Learning for Visual Recognition. Chen, Zhi, Luo, Yadan, Huang, Zi, Li, Jingjing, Wang, Sen and Yu, Xin (2026). Distributed Zero-Shot Learning for Visual Recognition. IEEE Transactions on Multimedia, 1-12. doi: 10.1109/TMM.2026.3673561 |
|
2025 Journal Article Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network. Jiang, Wenhao, Zhao, Dong, Wang, Chen, Yu, Xin, Arun, Pattathal V., Asano, Yuta, Xiang, Pei and Zhou, Huixin (2025). Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network. Knowledge-Based Systems, 330 (Part B) 114595, 1-16. doi: 10.1016/j.knosys.2025.114595 |
|
2025 Journal Article Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation. Cao, Xiaofeng, Xu, Mingwei, Yu, Xin, Yao, Jiangchao, Ye, Wei, Huang, Shengjun, Zhang, Minling, Tsang, Ivor, Ong, Yew-Soon, Kwok, James T. and Shen, Heng Tao (2025). Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation. ACM Computing Surveys, 58 (6) 3773075, 1-47. doi: 10.1145/3773075 |
|
2025 Conference Publication Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics. Liu, Chen, Yang, Liying, Li, Peike, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.00298 |
|
2025 Conference Publication Blind bitstream-corrupted video recovery via metadata-guided diffusion model. Wang, Shuyun, Zhang, Hu, Shen, Xin, Wang, Dadong and Yu, Xin (2025). Blind bitstream-corrupted video recovery via metadata-guided diffusion model. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.02139 |
|
2025 Conference Publication Robust audio-visual segmentation via audio-guided visual convergent alignment. Liu, Chen, Li, Peike, Yang, Liying, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Robust audio-visual segmentation via audio-guided visual convergent alignment. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.02693 |
|
2025 Conference Publication EasyCraft: a robust and efficient framework for automatic avatar crafting. Wang, Suzhen, Chen, Weijie, Zhang, Wei, Zhao, Minda, Li, Lincheng, Zhang, Rongsheng, Hu, Zhipeng and Yu, Xin (2025). EasyCraft: a robust and efficient framework for automatic avatar crafting. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.00524 |
|
2025 Conference Publication M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings. Xu, Qingzheng, Cao, Ru, Shen, Xin, Du, Heming, Wang, Sen and Yu, Xin (2025). M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, United States, 10 - 17 June 2025. Washington, DC, United States: IEEE Computer Society. doi: 10.1109/cvpr52734.2025.01147 |
|
2025 Conference Publication Cross-view isolated sign language recognition challenge: design, results and future research. Shen, Xin, Du, Heming, Xu, Miao, Liu, Miaomiao and Yu, Xin (2025). Cross-view isolated sign language recognition challenge: design, results and future research. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3701716.3717522 |
|
2025 Conference Publication MDAM 3: a misinformation detection and analysis framework for multitype multimodal media. Xu, Qingzheng, Du, Heming, Łukasik, Szymon, Zhu, Tianqing, Wang, Sen and Yu, Xin (2025). MDAM 3: a misinformation detection and analysis framework for multitype multimodal media. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3696410.3714498 |
|
2025 Journal Article ICE: interactive 3D game character facial editing via dialogue. Wu, Haoqian, Zhao, Minda, Hu, Zhipeng, Fan, Changjie, Li, Lincheng, Chen, Weijie, Zhao, Rui and Yu, Xin (2025). ICE: interactive 3D game character facial editing via dialogue. IEEE Transactions on Multimedia, 27, 3210-4223. doi: 10.1109/tmm.2025.3557611 |
|
2025 Conference Publication TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm. Zhang, Bingqing, Cao, Zhuo, Du, Heming, Yu, Xin, Li, Xue, Liu, Jiajun and Wang, Sen (2025). TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm. 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ United States, 26 February - 6 March 2025. Piscataway, NJ United States: IEEE. doi: 10.1109/wacv61041.2025.00485 |
|
2025 Conference Publication FlashVTG: feature layering and adaptive score handling network for video temporal grounding. Cao, Zhuo, Zhang, Bingqing, Du, Heming, Yu, Xin, Li, Xue and Wang, Sen (2025). FlashVTG: feature layering and adaptive score handling network for video temporal grounding. 2025 Winter Conference on Applications of Computer Vision-WACV, Tucson, AZ, United States, 28 February-4 March 2025. Piscataway, NJ, United States: Institute of Electrical and Electronics Engineers. doi: 10.1109/wacv61041.2025.00894 |
|
2025 Journal Article DreamCar: leveraging car-specific prior for in-the-wild 3D car reconstruction. Du, Xiaobiao, Sun, Haiyang, Lu, Ming, Zhu, Tianqing and Yu, Xin (2025). DreamCar: leveraging car-specific prior for in-the-wild 3D car reconstruction. IEEE Robotics and Automation Letters, 10 (2), 1840-1847. doi: 10.1109/lra.2024.3523231 |
|
2025 Journal Article TalkCLIP: talking head generation with text-guided expressive speaking styles. Ma, Yifeng, Wang, Suzhen, Ding, Yu, Ma, Bowen, Lv, Tangjie, Fan, Changjie, Hu, Zhipeng, Deng, Zhidong and Yu, Xin (2025). TalkCLIP: talking head generation with text-guided expressive speaking styles. IEEE Transactions on Multimedia, 27, 6335-6346. doi: 10.1109/tmm.2025.3581808 |
|
2025 Conference Publication Affective behaviour analysis via progressive learning. Liu, Chen, Zhang, Wei, Qiu, Feng, Li, Lincheng, Wang, Dadong and Yu, Xin (2025). Affective behaviour analysis via progressive learning. ECCV 2024 Workshops, Milan, Italy, 29 September - 4 October 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-3-031-91581-9_26 |
|
2025 Conference Publication QGait: Toward Accurate Quantization for Gait Recognition. Tian, Senmao, Gao, Haoyu, Hong, Gangyi, Wang, Shuyun, Wang, Jingjie, Yu, Xin and Zhang, Shunli (2025). QGait: Toward Accurate Quantization for Gait Recognition. Institute of Electrical and Electronics Engineers Inc. doi: 10.1109/IJCB65343.2025.11411264 |
|
2025 Conference Publication Transferable attacks for semantic segmentation. He, Mengqi, Zhang, Jing and Yu, Xin (2025). Transferable attacks for semantic segmentation. 35th Australasian Database Conference, Gold Coast, QLD, Australia, 16-18 December 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-981-96-1242-0_28