Expert publications - About - The University of Queensland

All (187) Journal Article (64) Edited Outputs (1) Conference Publication (120) Book Chapter (2)

2026

Conference Publication

SynthIR: The First Workshop on Synthetic Content in Information Retrieval Ecosystems

Liu, Ping, Zheng, Zhedong, Culpepper, Shane and Yu, Xin (2026). SynthIR: The First Workshop on Synthetic Content in Information Retrieval Ecosystems. New York, NY, USA: ACM. doi: 10.1145/3805712.3808655

2026

Conference Publication

Mebm: Exploring the Synergy of Mixture of Experts in Background Matting

Wang, Yiru, Lu, Ming, Tian, Senmao, Yu, Xin and Zhang, Shunli (2026). Mebm: Exploring the Synergy of Mixture of Experts in Background Matting. IEEE. doi: 10.1109/icassp55912.2026.11464270

2026

Conference Publication

Content-Aware Model Slimming for Image Super-Resolution with Large Input

Tian, Senmao, Hong, Gangyi, Wang, Shuyun, Yu, Xin and Zhang, Shunli (2026). Content-Aware Model Slimming for Image Super-Resolution with Large Input. IEEE. doi: 10.1109/icassp55912.2026.11461944

2026

Conference Publication

Augment to segment: tackling pixel-level imbalance in wheat disease and pest segmentation

Wei, Tianqi, Yu, Xin, Chen, Zhi, Chapman, Scott and Huang, Zi (2026). Augment to segment: tackling pixel-level imbalance in wheat disease and pest segmentation. 36th Australasian Database Conference, Sydney, NSW Australia and Bali, Indonesia, 4-6 December 2025. Singapore: Springer Singapore. doi: 10.1007/978-981-95-6196-4_3

2026

Conference Publication

Dynamic orchestration of multi-agent system for real-world multi-image agricultural VQA

Ke, Yan, Yu, Xin, Du, Heming, Chapman, Scott and Huang, Helen (2026). Dynamic orchestration of multi-agent system for real-world multi-image agricultural VQA. 36th Australasian Database Conference, Sydney, NSW Australia and Bali, Indonesia, 4-6 December 2025. Singapore: Springer Singapore. doi: 10.1007/978-981-95-6196-4_11

2025

Conference Publication

Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement

Shen, Xin, Wang, Xinyu, Shen, Lei, Zhang, Kaihao and Yu, Xin (2025). Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement. IEEE. doi: 10.1109/iccv51701.2025.01920

2025

Conference Publication

3DRealCar: An In-the-Wild RGB-D Car Dataset with 360-Degree Views

Du, Xiaobiao, Wang, Yida, Sun, Haiyang, Wu, Zhuojie, Sheng, Hongwei, Wang, Shuyun, Ying, Jiaying, Lu, Ming, Zhu, Tianqing, Zhan, Kun and Yu, Xin (2025). 3DRealCar: An In-the-Wild RGB-D Car Dataset with 360-Degree Views. IEEE. doi: 10.1109/iccv51701.2025.02458

2025

Conference Publication

LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild

Ying, Jiaying, Du, Heming, Zhang, Kaihao, Li, Lincheng and Yu, Xin (2025). LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild. IEEE. doi: 10.1109/iccv51701.2025.00920

2025

Conference Publication

EasyCraft: a robust and efficient framework for automatic avatar crafting

Wang, Suzhen, Chen, Weijie, Zhang, Wei, Zhao, Minda, Li, Lincheng, Zhang, Rongsheng, Hu, Zhipeng and Yu, Xin (2025). EasyCraft: a robust and efficient framework for automatic avatar crafting. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.00524

2025

Conference Publication

Blind bitstream-corrupted video recovery via metadata-guided diffusion model

Wang, Shuyun, Zhang, Hu, Shen, Xin, Wang, Dadong and Yu, Xin (2025). Blind bitstream-corrupted video recovery via metadata-guided diffusion model. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. New York, NY USA: IEEE Computer Society. doi: 10.1109/CVPR52734.2025.02139

2025

Conference Publication

Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics

Liu, Chen, Yang, Liying, Li, Peike, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Dynamic derivation and elimination: audio visual segmentation with enhanced audio semantics. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.00298

2025

Conference Publication

Robust audio-visual segmentation via audio-guided visual convergent alignment

Liu, Chen, Li, Peike, Yang, Liying, Wang, Dadong, Li, Lincheng and Yu, Xin (2025). Robust audio-visual segmentation via audio-guided visual convergent alignment. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN USA, 10-17 June 2025. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52734.2025.02693

2025

Conference Publication

M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings

Xu, Qingzheng, Cao, Ru, Shen, Xin, Du, Heming, Wang, Sen and Yu, Xin (2025). M3GYM: a large-scale multimodal multi-view multi-person pose dataset for fitness activity understanding in real-world settings. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, United States, 10-17 June 2025. Washington, DC, United States: IEEE Computer Society. doi: 10.1109/cvpr52734.2025.01147

2025

Conference Publication

Cross-view isolated sign language recognition challenge: design, results and future research

Shen, Xin, Du, Heming, Xu, Miao, Liu, Miaomiao and Yu, Xin (2025). Cross-view isolated sign language recognition challenge: design, results and future research. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3701716.3717522

2025

Conference Publication

MDAM 3: a misinformation detection and analysis framework for multitype multimodal media

Xu, Qingzheng, Du, Heming, Łukasik, Szymon, Zhu, Tianqing, Wang, Sen and Yu, Xin (2025). MDAM 3: a misinformation detection and analysis framework for multitype multimodal media. WWW '25: The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3696410.3714498

2025

Conference Publication

FlashVTG: feature layering and adaptive score handling network for video temporal grounding

Cao, Zhuo, Zhang, Bingqing, Du, Heming, Yu, Xin, Li, Xue and Wang, Sen (2025). FlashVTG: feature layering and adaptive score handling network for video temporal grounding. 2025 Winter Conference on Applications of Computer Vision-WACV, Tucson, AZ, United States, 28 February-4 March 2025. Piscataway, NJ, United States: Institute of Electrical and Electronics Engineers. doi: 10.1109/wacv61041.2025.00894

2025

Conference Publication

TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm

Zhang, Bingqing, Cao, Zhuo, Du, Heming, Yu, Xin, Li, Xue, Liu, Jiajun and Wang, Sen (2025). TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm. 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ United States, 26 February - 6 March 2025. Piscataway, NJ United States: IEEE. doi: 10.1109/wacv61041.2025.00485

2025

Conference Publication

QGait: Toward Accurate Quantization for Gait Recognition

Tian, Senmao, Gao, Haoyu, Hong, Gangyi, Wang, Shuyun, Wang, Jingjie, Yu, Xin and Zhang, Shunli (2025). QGait: Toward Accurate Quantization for Gait Recognition. Institute of Electrical and Electronics Engineers Inc. doi: 10.1109/IJCB65343.2025.11411264

2025

Conference Publication

Who is Being Impersonated? Deepfake audio detection and impersonated identification via extraction of ID-specific features

Guo, Tianchen, Du, Heming, Huo, Huan, Liu, Bo and Yu, Xin (2025). Who is Being Impersonated? Deepfake audio detection and impersonated identification via extraction of ID-specific features. 24th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2024), Macau, China, 29-31 October 2024. Singapore: Springer. doi: 10.1007/978-981-96-1548-3_21

2025

Conference Publication

Transferable attacks for semantic segmentation

He, Mengqi, Zhang, Jing and Yu, Xin (2025). Transferable attacks for semantic segmentation. 35th Australasian Database Conference, Gold Coast, QLD, Australia, 16-18 December 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-981-96-1242-0_28
