
2026

Conference Publication

Mebm: Exploring the Synergy of Mixture of Experts in Background Matting

Wang, Yiru, Lu, Ming, Tian, Senmao, Yu, Xin and Zhang, Shunli (2026). Mebm: Exploring the Synergy of Mixture of Experts in Background Matting. IEEE. doi: 10.1109/icassp55912.2026.11464270


2026

Conference Publication

Content-Aware Model Slimming for Image Super-Resolution with Large Input

Tian, Senmao, Hong, Gangyi, Wang, Shuyun, Yu, Xin and Zhang, Shunli (2026). Content-Aware Model Slimming for Image Super-Resolution with Large Input. IEEE. doi: 10.1109/icassp55912.2026.11461944


2026

Journal Article

DFBSNet: Dual frequency-domain branch fusion and selection network for hyperspectral anomaly detection

Yao, Yiming, Wang, Qing, Zhao, Dong, You, Mingtao, Xiang, Pei, Asano, Yuta, Yu, Xin, Wang, Chao, Zhou, Huixin and Ren, Jinchang (2026). DFBSNet: Dual frequency-domain branch fusion and selection network for hyperspectral anomaly detection. Pattern Recognition, 180, 113967. doi: 10.1016/j.patcog.2026.113967


2026

Journal Article

Cluster-aware prompt ensemble learning for few-shot vision-language model adaptation

Chen, Zhi, Yu, Xin, Tao, Xiaohui, Li, Yan and Huang, Zi (2026). Cluster-aware prompt ensemble learning for few-shot vision-language model adaptation. Pattern Recognition, 172 (C) 112596. doi: 10.1016/j.patcog.2025.112596


2026

Journal Article

Distributed Zero-Shot Learning for Visual Recognition

Chen, Zhi, Luo, Yadan, Huang, Zi, Li, Jingjing, Wang, Sen and Yu, Xin (2026). Distributed Zero-Shot Learning for Visual Recognition. IEEE Transactions on Multimedia, PP (99), 1-12. doi: 10.1109/TMM.2026.3673561


2026

Book Chapter

High-Resolution and Multimodal Optogenetic fMRI of Brain Dynamics

He, Yi, Yuan, Jianyu, Liang, Mingyao, Xie, Zeping and Yu, Xin (2026). High-Resolution and Multimodal Optogenetic fMRI of Brain Dynamics. Neuromethods. (pp. 3-18) New York, NY: Springer US. doi: 10.1007/978-1-0716-5178-0_1


2026

Journal Article

Compression-Oriented Video Super-Resolution

Wang, Shuyun, Liu, Yanbin, Lu, Ming, Wu, Zhuojie, Tian, Senmao, Guo, Yandong and Yu, Xin (2026). Compression-Oriented Video Super-Resolution. IEEE Transactions on Image Processing, PP (99), 1-1. doi: 10.1109/tip.2026.3682128


2026

Journal Article

Safe and Reliable Diffusion Models via Subspace Projection

Chen, Huiqiang, Zhu, Tianqing, Wang, Linlin, Yu, Xin, Gao, Longxiang and Zhou, Wanlei (2026). Safe and Reliable Diffusion Models via Subspace Projection. IEEE Transactions on Dependable and Secure Computing, PP (99), 1-14. doi: 10.1109/TDSC.2026.3692493


2025

Journal Article

Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network

Jiang, Wenhao, Zhao, Dong, Wang, Chen, Yu, Xin, Arun, Pattathal V., Asano, Yuta, Xiang, Pei and Zhou, Huixin (2025). Hyperspectral video object tracking with cross-modal spectral complementary and memory prompt network. Knowledge-Based Systems, 330 (Part B) 114595, 1-16. doi: 10.1016/j.knosys.2025.114595


2025

Journal Article

Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation

Cao, Xiaofeng, Xu, Mingwei, Yu, Xin, Yao, Jiangchao, Ye, Wei, Huang, Shengjun, Zhang, Minling, Tsang, Ivor, Ong, Yew-Soon, Kwok, James T. and Shen, Heng Tao (2025). Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation. ACM Computing Surveys, 58 (6) 3773075, 1-47. doi: 10.1145/3773075


2025

Conference Publication

Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement

Shen, Xin, Wang, Xinyu, Shen, Lei, Zhang, Kaihao and Yu, Xin (2025). Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement. IEEE. doi: 10.1109/iccv51701.2025.01920


2025

Conference Publication

LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild

Ying, Jiaying, Du, Heming, Zhang, Kaihao, Li, Lincheng and Yu, Xin (2025). LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild. IEEE. doi: 10.1109/iccv51701.2025.00920


2025

Conference Publication

Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics

Liu, Chen, Yang, Liying, Li, Peike, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.00298


2025

Conference Publication

Blind bitstream-corrupted video recovery via metadata-guided diffusion model

Wang, Shuyun, Zhang, Hu, Shen, Xin, Wang, Dadong and Yu, Xin (2025). Blind bitstream-corrupted video recovery via metadata-guided diffusion model. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.02139


2025

Conference Publication

Robust audio-visual segmentation via audio-guided visual convergent alignment

Liu, Chen, Li, Peike, Yang, Liying, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Robust audio-visual segmentation via audio-guided visual convergent alignment. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.02693


2025

Conference Publication

EasyCraft: a robust and efficient framework for automatic avatar crafting

Wang, Suzhen, Chen, Weijie, Zhang, Wei, Zhao, Minda, Li, Lincheng, Zhang, Rongsheng, Hu, Zhipeng and Yu, Xin (2025). EasyCraft: a robust and efficient framework for automatic avatar crafting. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.00524


2025

Conference Publication

M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings

Xu, Qingzheng, Cao, Ru, Shen, Xin, Du, Heming, Wang, Sen and Yu, Xin (2025). M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, United States, 10-17 June 2025. Washington, DC, United States: IEEE Computer Society. doi: 10.1109/cvpr52734.2025.01147


2025

Conference Publication

Cross-view isolated sign language recognition challenge: design, results and future research

Shen, Xin, Du, Heming, Xu, Miao, Liu, Miaomiao and Yu, Xin (2025). Cross-view isolated sign language recognition challenge: design, results and future research. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3701716.3717522


2025

Conference Publication

MDAM 3: a misinformation detection and analysis framework for multitype multimodal media

Xu, Qingzheng, Du, Heming, Łukasik, Szymon, Zhu, Tianqing, Wang, Sen and Yu, Xin (2025). MDAM 3: a misinformation detection and analysis framework for multitype multimodal media. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3696410.3714498


2025

Journal Article

ICE: interactive 3D game character facial editing via dialogue

Wu, Haoqian, Zhao, Minda, Hu, Zhipeng, Fan, Changjie, Li, Lincheng, Chen, Weijie, Zhao, Rui and Yu, Xin (2025). ICE: interactive 3D game character facial editing via dialogue. IEEE Transactions on Multimedia, 27, 3210-4223. doi: 10.1109/tmm.2025.3557611
