2025

Conference Publication

Exploring visual vulnerabilities via multi-loss adversarial search for jailbreaking vision-language models

Hao, Shuyang, Hooi, Bryan, Liu, Jun, Chang, Kai-Wei, Huang, Zi and Cai, Yujun (2025). Exploring visual vulnerabilities via multi-loss adversarial search for jailbreaking vision-language models. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TX, United States, 10-17 June 2025. Washington, DC, United States: IEEE Computer Society. doi: 10.1109/cvpr52734.2025.01852

2025

Conference Publication

CON-RECALL: Detecting Pre-training Data in LLMs via Contrastive Decoding

Wang, Cheng, Wang, Yiwei, Hooi, Bryan, Cai, Yujun, Peng, Nanyun and Chang, Kai-Wei (2025). CON-RECALL: Detecting Pre-training Data in LLMs via Contrastive Decoding. 31st International Conference on Computational Linguistics, Abu Dhabi, United Arab Emirates, 19-24 January 2025. Stroudsburg, PA, United States: Association for Computational Linguistics (ACL).

2025

Conference Publication

Energy-Calibrated VAE with Test Time Free Lunch

Luo, Yihong, Qiu, Siya, Tao, Xingjian, Cai, Yujun and Tang, Jing (2025). Energy-Calibrated VAE with Test Time Free Lunch. 18th European Conference on Computer Vision (ECCV), Milan, Italy, 29 September-4 October 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-3-031-73013-9_19

2025

Conference Publication

LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion

Li, Muchen, Christen, Sammy, Wan, Chengde, Cai, Yujun, Liao, Renjie, Sigal, Leonid and Ma, Shugao (2025). LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion. IEEE Computer Society. doi: 10.1109/CVPR52734.2025.01623

2025

Journal Article

SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo with Depth Restoration and Occlusion Constraint

Yuan, Zhenlong, Yang, Zhidong, Cai, Yujun, Wu, Kuangxin, Liu, Mufan, Zhang, Dapeng, Jiang, Hao, Li, Zhaoxin and Wang, Zhaoqi (2025). SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo with Depth Restoration and Occlusion Constraint. IEEE Transactions on Circuits and Systems for Video Technology, 1-1. doi: 10.1109/TCSVT.2025.3574473

2024

Conference Publication

STMG: a machine learning microgesture recognition system for supporting thumb-based VR/AR input

Kin, Kenrick, Wan, Chengde, Koh, Ken, Marin, Andrei, Camgöz, Necati Cihan, Zhang, Yubo, Cai, Yujun, Kovalev, Fedor, Ben-Zacharia, Moshe, Hoople, Shannon, Nunes-Ueno, Marcos, Sanchez-Rodriguez, Mariel, Bhargava, Ayush, Wang, Robert, Sauser, Eric and Ma, Shugao (2024). STMG: a machine learning microgesture recognition system for supporting thumb-based VR/AR input. CHI '24: CHI Conference on Human Factors in Computing Systems, Honolulu, HI USA, 11-16 May 2024. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3613904.3642702

2024

Conference Publication

Social diffusion: long-term multiple human motion anticipation

Tanke, Julian, Zhang, Linguang, Zhao, Amy, Tang, Chengcheng, Cai, Yujun, Wang, Lezi, Wu, Po-Chen, Gall, Juergen and Keskin, Cem (2024). Social diffusion: long-term multiple human motion anticipation. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1-6 October 2023. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/ICCV51070.2023.00880

2024

Conference Publication

LLMs are good action recognizers

Qu, Haoxuan, Cai, Yujun and Liu, Jun (2024). LLMs are good action recognizers. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, United States, 16-22 June 2024. Washington, DC, United States: IEEE Computer Society. doi: 10.1109/CVPR52733.2024.01741

2024

Conference Publication

DisC-GS: discontinuity-aware Gaussian splatting

Qu, Haoxuan, Li, Zhuoling, Rahmani, Hossein, Cai, Yujun and Liu, Jun (2024). DisC-GS: discontinuity-aware Gaussian splatting. NeurIPS 2024, Vancouver, BC, Canada, 10-15 December 2024. Maryland Heights, MO, United States: Morgan Kaufmann Publishers.

2024

Conference Publication

emg2pose: A Large and Diverse Benchmark for Surface Electromyographic Hand Pose Estimation

Salter, Sasha, Warren, Richard, Schlager, Collin, Spurr, Adrian, Han, Shangchen, Bhasin, Rohin, Cai, Yujun, Walkington, Peter, Bolarinwa, Anuoluwapo, Wang, Robert, Danielson, Nathan, Merel, Josh, Pnevmatikakis, Eftychios and Marshall, Jesse (2024). emg2pose: A Large and Diverse Benchmark for Surface Electromyographic Hand Pose Estimation. Neural information processing systems foundation.

2024

Conference Publication

6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation

Xu, Li, Qu, Haoxuan, Cai, Yujun and Liu, Jun (2024). 6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation. IEEE Computer Society. doi: 10.1109/CVPR52733.2024.00924

2023

Conference Publication

LMC: large model collaboration with cross-assessment for training-free open-set object recognition

Qu, Haoxuan, Hui, Xiaofei, Cai, Yujun and Liu, Jun (2023). LMC: large model collaboration with cross-assessment for training-free open-set object recognition. NIPS'23: 37th International Conference on Neural Information Processing Systems, New Orleans, LA USA, 10-16 December 2023. Maryland Heights, MO USA: Morgan Kaufmann Publishers. doi: 10.5555/3666122.3668138

2023

Conference Publication

Primacy effect of ChatGPT

Wang, Yiwei, Cai, Yujun, Chen, Muhao, Liang, Yuxuan and Hooi, Bryan (2023). Primacy effect of ChatGPT. 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, Singapore, 6-10 December 2023. Kerrville, TX USA: Association for Computational Linguistics. doi: 10.18653/v1/2023.emnlp-main.8

2023

Conference Publication

How fragile is relation extraction under entity replacements?

Wang, Yiwei, Hooi, Bryan, Wang, Fei, Cai, Yujun, Liang, Yuxuan, Zhou, Wenxuan, Tang, Jing, Duan, Manjuan and Chen, Muhao (2023). How fragile is relation extraction under entity replacements?. 27th Conference on Computational Natural Language Learning (CoNLL), Singapore, Singapore, 6-7 December 2023. Kerrville, TX USA: Association for Computational Linguistics. doi: 10.18653/v1/2023.conll-1.27

2023

Journal Article

DeepEMD: differentiable Earth Mover's Distance for few-shot learning

Zhang, Chi, Cai, Yujun, Lin, Guosheng and Shen, Chunhua (2023). DeepEMD: differentiable Earth Mover's Distance for few-shot learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45 (5), 5632-5648. doi: 10.1109/TPAMI.2022.3217373

2023

Conference Publication

A characteristic function-based method for bottom-up human pose estimation

Qu, Haoxuan, Cai, Yujun, Foo, Lin Geng, Kumar, Ajay and Liu, Jun (2023). A characteristic function-based method for bottom-up human pose estimation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 17-24 June 2023. Washington, DC USA: IEEE Computer Society. doi: 10.1109/CVPR52729.2023.01250

2022

Conference Publication

UmeTrack: unified multi-view end-to-end hand tracking for VR

Han, Shangchen, Wu, Po-Chen, Zhang, Yubo, Liu, Beibei, Zhang, Linguang, Wang, Zheng, Si, Weiguang, Zhang, Peizhao, Cai, Yujun, Hodan, Tomas, Cabezas, Randi, Tran, Luan, Akbay, Muzaffer, Yu, Tsz-Ho, Keskin, Cem and Wang, Robert (2022). UmeTrack: unified multi-view end-to-end hand tracking for VR. SA '22: SIGGRAPH Asia 2022, Daegu, Republic of Korea. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3550469.3555378

2022

Conference Publication

Time-aware neighbor sampling on temporal graphs

Wang, Yiwei, Cai, Yujun, Liang, Yuxuan, Ding, Henghui, Wang, Changhu and Hooi, Bryan (2022). Time-aware neighbor sampling on temporal graphs. 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 18-23 July 2022. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/IJCNN55064.2022.9892942

2022

Conference Publication

Should we rely on entity mentions for relation extraction? Debiasing relation extraction with counterfactual analysis

Wang, Yiwei, Chen, Muhao, Zhou, Wenxuan, Cai, Yujun, Liang, Yuxuan, Liu, Dayiheng, Yang, Baosong, Liu, Juncheng and Hooi, Bryan (2022). Should we rely on entity mentions for relation extraction? Debiasing relation extraction with counterfactual analysis. 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Seattle, WA USA, 10-15 July 2022. Kerrville, TX USA: Association for Computational Linguistics (ACL). doi: 10.18653/v1/2022.naacl-main.224

2022

Conference Publication

A unified 3D human motion synthesis model via conditional variational auto-encoder

Cai, Yujun, Wang, Yiwei, Zhu, Yiheng, Cham, Tat-Jen, Cai, Jianfei, Yuan, Junsong, Liu, Jun, Zheng, Chuanxia, Yan, Sijie, Ding, Henghui, Shen, Xiaohui, Liu, Ding and Thalmann, Nadia Magnenat (2022). A unified 3D human motion synthesis model via conditional variational auto-encoder. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 10-17 October 2021. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/ICCV48922.2021.01144
