|
2026 Conference Publication Mebm: Exploring the Synergy of Mixture of Experts in Background Matting. Wang, Yiru, Lu, Ming, Tian, Senmao, Yu, Xin and Zhang, Shunli (2026). Mebm: Exploring the Synergy of Mixture of Experts in Background Matting. IEEE. doi: 10.1109/icassp55912.2026.11464270 |
|
2026 Conference Publication Content-Aware Model Slimming for Image Super-Resolution with Large Input. Tian, Senmao, Hong, Gangyi, Wang, Shuyun, Yu, Xin and Zhang, Shunli (2026). Content-Aware Model Slimming for Image Super-Resolution with Large Input. IEEE. doi: 10.1109/icassp55912.2026.11461944 |
|
2026 Journal Article DFBSNet: Dual frequency-domain branch fusion and selection network for hyperspectral anomaly detection. Yao, Yiming, Wang, Qing, Zhao, Dong, You, Mingtao, Xiang, Pei, Asano, Yuta, Yu, Xin, Wang, Chao, Zhou, Huixin and Ren, Jinchang (2026). DFBSNet: Dual frequency-domain branch fusion and selection network for hyperspectral anomaly detection. Pattern Recognition, 180 113967, 113967. doi: 10.1016/j.patcog.2026.113967 |
|
2026 Journal Article Cluster-aware prompt ensemble learning for few-shot vision-language model adaptation. Chen, Zhi, Yu, Xin, Tao, Xiaohui, Li, Yan and Huang, Zi (2026). Cluster-aware prompt ensemble learning for few-shot vision-language model adaptation. Pattern Recognition, 172 (C) 112596. doi: 10.1016/j.patcog.2025.112596 |
|
2026 Journal Article Distributed Zero-Shot Learning for Visual Recognition. Chen, Zhi, Luo, Yadan, Huang, Zi, Li, Jingjing, Wang, Sen and Yu, Xin (2026). Distributed Zero-Shot Learning for Visual Recognition. IEEE Transactions on Multimedia, PP (99), 1-12. doi: 10.1109/TMM.2026.3673561 |
|
2026 Book Chapter High-Resolution and Multimodal Optogenetic fMRI of Brain Dynamics. He, Yi, Yuan, Jianyu, Liang, Mingyao, Xie, Zeping and Yu, Xin (2026). High-Resolution and Multimodal Optogenetic fMRI of Brain Dynamics. Neuromethods. (pp. 3-18) New York, NY: Springer US. doi: 10.1007/978-1-0716-5178-0_1 |
|
2026 Journal Article Compression-Oriented Video Super-Resolution. Wang, Shuyun, Liu, Yanbin, Lu, Ming, Wu, Zhuojie, Tian, Senmao, Guo, Yandong and Yu, Xin (2026). Compression-Oriented Video Super-Resolution. IEEE Transactions on Image Processing, PP (99), 1-1. doi: 10.1109/tip.2026.3682128 |
|
2026 Journal Article Safe and Reliable Diffusion Models via Subspace Projection. Chen, Huiqiang, Zhu, Tianqing, Wang, Linlin, Yu, Xin, Gao, Longxiang and Zhou, Wanlei (2026). Safe and Reliable Diffusion Models via Subspace Projection. IEEE Transactions on Dependable and Secure Computing, PP (99), 1-14. doi: 10.1109/TDSC.2026.3692493 |
|
2025 Journal Article Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network. Jiang, Wenhao, Zhao, Dong, Wang, Chen, Yu, Xin, Arun, Pattathal V., Asano, Yuta, Xiang, Pei and Zhou, Huixin (2025). Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network. Knowledge-Based Systems, 330 (Part B) 114595, 1-16. doi: 10.1016/j.knosys.2025.114595 |
|
2025 Journal Article Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation. Cao, Xiaofeng, Xu, Mingwei, Yu, Xin, Yao, Jiangchao, Ye, Wei, Huang, Shengjun, Zhang, Minling, Tsang, Ivor, Ong, Yew-Soon, Kwok, James T. and Shen, Heng Tao (2025). Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation. ACM Computing Surveys, 58 (6) 3773075, 1-47. doi: 10.1145/3773075 |
|
2025 Conference Publication Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement. Shen, Xin, Wang, Xinyu, Shen, Lei, Zhang, Kaihao and Yu, Xin (2025). Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement. IEEE. doi: 10.1109/iccv51701.2025.01920 |
|
2025 Conference Publication LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild. Ying, Jiaying, Du, Heming, Zhang, Kaihao, Li, Lincheng and Yu, Xin (2025). LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild. IEEE. doi: 10.1109/iccv51701.2025.00920 |
|
2025 Conference Publication Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics. Liu, Chen, Yang, Liying, Li, Peike, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.00298 |
|
2025 Conference Publication Blind bitstream-corrupted video recovery via metadata-guided diffusion model. Wang, Shuyun, Zhang, Hu, Shen, Xin, Wang, Dadong and Yu, Xin (2025). Blind bitstream-corrupted video recovery via metadata-guided diffusion model. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.02139 |
|
2025 Conference Publication Robust audio-visual segmentation via audio-guided visual convergent alignment. Liu, Chen, Li, Peike, Yang, Liying, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Robust audio-visual segmentation via audio-guided visual convergent alignment. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.02693 |
|
2025 Conference Publication EasyCraft: a robust and efficient framework for automatic avatar crafting. Wang, Suzhen, Chen, Weijie, Zhang, Wei, Zhao, Minda, Li, Lincheng, Zhang, Rongsheng, Hu, Zhipeng and Yu, Xin (2025). EasyCraft: a robust and efficient framework for automatic avatar crafting. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.00524 |
|
2025 Conference Publication M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings. Xu, Qingzheng, Cao, Ru, Shen, Xin, Du, Heming, Wang, Sen and Yu, Xin (2025). M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, United States, 10-17 June 2025. Washington, DC, United States: IEEE Computer Society. doi: 10.1109/cvpr52734.2025.01147 |
|
2025 Conference Publication Cross-view isolated sign language recognition challenge: design, results and future research. Shen, Xin, Du, Heming, Xu, Miao, Liu, Miaomiao and Yu, Xin (2025). Cross-view isolated sign language recognition challenge: design, results and future research. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3701716.3717522 |
|
2025 Conference Publication MDAM 3: a misinformation detection and analysis framework for multitype multimodal media. Xu, Qingzheng, Du, Heming, Łukasik, Szymon, Zhu, Tianqing, Wang, Sen and Yu, Xin (2025). MDAM 3: a misinformation detection and analysis framework for multitype multimodal media. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3696410.3714498 |
|
2025 Journal Article ICE: interactive 3D game character facial editing via dialogue. Wu, Haoqian, Zhao, Minda, Hu, Zhipeng, Fan, Changjie, Li, Lincheng, Chen, Weijie, Zhao, Rui and Yu, Xin (2025). ICE: interactive 3D game character facial editing via dialogue. IEEE Transactions on Multimedia, 27, 3210-4223. doi: 10.1109/tmm.2025.3557611 |