Expert publications - About - The University of Queensland

All (31) Journal Article (6) Conference Publication (25)

2025

Conference Publication

Multimodal Deepfake Generation and Detection: Challenges, Methods, and Future Directions

Dhall, Abhinav, Cai, Zhixi and Ghosh, Shreya (2025). Multimodal Deepfake Generation and Detection: Challenges, Methods, and Future Directions. New York, NY, USA: ACM. doi: 10.1145/3747327.3762826

2025

Conference Publication

GEMS: Group Emotion Profiling Through Multimodal Situational Understanding

Kataria, Anubhav, Madan, Surbhi, Ghosh, Shreya, Gedeon, Tom and Dhall, Abhinav (2025). GEMS: Group Emotion Profiling Through Multimodal Situational Understanding. IEEE. doi: 10.1109/mlsp62443.2025.11204342

2025

Conference Publication

7th ABAW competition: multi-task learning and compound expression recognition

Kollias, Dimitrios, Zafeiriou, Stefanos, Kotsia, Irene, Dhall, Abhinav, Ghosh, Shreya, Shao, Chunchang and Hu, Guanyu (2025). 7th ABAW competition: multi-task learning and compound expression recognition. Computer Vision – ECCV 2024 Workshops, Milan, Italy, 29 September-4 October 2024. Cham, Switzerland: Springer Cham. doi: 10.1007/978-3-031-91581-9_3

2025

Conference Publication

MIP-GAF: a MLLM-annotated benchmark for Most Important Person localization and group context understanding

Madan, S., Ghosh, S., Sookha, L. R., Ganaie, M. A., Subramanian, R., Dhall, A. and Gedeon, T. (2025). MIP-GAF: a MLLM-annotated benchmark for Most Important Person localization and group context understanding. 2025 Winter Conference on Applications of Computer Vision-WACV, Tucson, AZ USA, 28 February-4 March 2025. Los Alamitos, CA USA: IEEE Computer Society. doi: 10.1109/wacv61041.2025.00150

2024

Conference Publication

AV-Deepfake1M: a large-scale LLM-driven audio-visual deepfake dataset

Cai, Zhixi, Ghosh, Shreya, Adatia, Aman Pankaj, Hayat, Munawar, Dhall, Abhinav, Gedeon, Tom and Stefanov, Kalin (2024). AV-Deepfake1M: a large-scale LLM-driven audio-visual deepfake dataset. MM '24: The 32nd ACM International Conference on Multimedia, Melbourne, VIC Australia, 28 October-1 November 2024. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3664647.3680795

2024

Conference Publication

1M-Deepfakes Detection Challenge

Cai, Zhixi, Dhall, Abhinav, Ghosh, Shreya, Hayat, Munawar, Kollias, Dimitrios, Stefanov, Kalin and Tariq, Usman (2024). 1M-Deepfakes Detection Challenge. The 32nd ACM International Conference on Multimedia, Melbourne, VIC Australia, 28 October-1 November 2024. New York, NY USA: Association for Computing Machinery, Inc. doi: 10.1145/3664647.3689145

2024

Conference Publication

MRAC Track 1: 2nd Workshop on Multimodal, Generative and Responsible Affective Computing

Ghosh, Shreya, Cai, Zhixi, Dhall, Abhinav, Kollias, Dimitrios, Goecke, Roland and Gedeon, Tom (2024). MRAC Track 1: 2nd Workshop on Multimodal, Generative and Responsible Affective Computing. The 32nd ACM International Conference on Multimedia, Melbourne, VIC Australia, 28 October-1 November 2024. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3689092.3690042

2024

Conference Publication

MRAC '24 Chairs' Welcome

Tao, Jianhua, Ghosh, Shreya, Lian, Zheng, Cai, Zhixi, Schuller, Björn W., Dhall, Abhinav, Zhao, Guoying, Kollias, Dimitrios, Cambria, Erik, Goecke, Roland and Gedeon, Tom (2024). MRAC '24 Chairs' Welcome. MM '24: The 32nd ACM International Conference on Multimedia, Melbourne, VIC Australia, 28 October-1 November 2024. New York, NY United States: Association for Computing Machinery.

2024

Conference Publication

Emolysis: a multimodal open-source group emotion analysis and visualization toolkit

Ghosh, Shreya, Cai, Zhixi, Gupta, Parul, Sharma, Garima, Dhall, Abhinav, Hayat, Munawar and Gedeon, Tom (2024). Emolysis: a multimodal open-source group emotion analysis and visualization toolkit. 12th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos-ACIIW, Glasgow, Scotland, United Kingdom, 15 September 2024. Los Alamitos, CA USA: IEEE Computer Society. doi: 10.1109/aciiw63320.2024.00023

2024

Conference Publication

Attention-Based Multi-layer Perceptron to Categorize Affective Videos from Viewer's Physiological Signals

Shaiok, Lazib Sharar, Hoque, Ishtiaqul, Hasan, Md Rakibul, Ghosh, Shreya, Gedeon, Tom and Hossain, Md Zakir (2024). Attention-Based Multi-layer Perceptron to Categorize Affective Videos from Viewer's Physiological Signals. 16th Asian Conference on Intelligent Information and Database Systems (ACIIDS), Ras Al Khaimah, United Arab Emirates, 15-18 April 2024. Heidelberg, Germany: Springer. doi: 10.1007/978-981-97-5934-7_3

2023

Conference Publication

EfficienTransNet: an automated chest X-ray report generation paradigm

Mondal, Chayan, Pham, Duc-Son, Gupta, Ashu, Ghosh, Shreya, Tan, Tele and Gedeon, Tom (2023). EfficienTransNet: an automated chest X-ray report generation paradigm. The 31st ACM International Conference on Multimedia, Ottawa, Canada, 29 October 2023. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3607865.3616174

2023

Conference Publication

GraphITTI: Attributed Graph-based Dominance Ranking in Social Interaction Videos

Sharma, Garima, Ghosh, Shreya, Dhall, Abhinav, Hayat, Munawar, Cai, Jianfei and Gedeon, Tom (2023). GraphITTI: Attributed Graph-based Dominance Ranking in Social Interaction Videos. ICMI '23 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction, Paris, France, 9-13 October 2023. New York, NY United States: Association for Computing Machinery. doi: 10.1145/3610661.3616184

2023

Conference Publication

MARLIN: Masked Autoencoder for facial video Representation LearnINg

Cai, Zhixi, Ghosh, Shreya, Stefanov, Kalin, Dhall, Abhinav, Cai, Jianfei, Rezatofighi, Hamid, Haffari, Reza and Hayat, Munawar (2023). MARLIN: Masked Autoencoder for facial video Representation LearnINg. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC Canada, 17-24 June 2023. Piscataway, NJ United States: Institute of Electrical and Electronics Engineers. doi: 10.1109/cvpr52729.2023.00150

2023

Conference Publication

'Labelling the gaps': a weakly supervised automatic eye gaze estimation

Ghosh, Shreya, Dhall, Abhinav, Hayat, Munawar and Knibbe, Jarrod (2023). 'Labelling the gaps': a weakly supervised automatic eye gaze estimation. 16th Asian Conference on Computer Vision (ACCV), Macao, People's Republic of China, 4-8 December 2022. Cham, Switzerland: Springer Cham. doi: 10.1007/978-3-031-26316-3_44

2022

Conference Publication

AV-GAZE: a study on the effectiveness of audio guided visual attention estimation for non-profilic faces

Ghosh, Shreya, Dhall, Abhinav, Hayat, Munawar and Knibbe, Jarrod (2022). AV-GAZE: a study on the effectiveness of audio guided visual attention estimation for non-profilic faces. 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16-19 October 2022. Piscataway, NJ, United States: Institute of Electrical and Electronics Engineers. doi: 10.1109/icip46576.2022.9897360

2022

Conference Publication

MTGLS: Multi-Task Gaze Estimation with Limited Supervision

Ghosh, Shreya, Hayat, Munawar, Dhall, Abhinav and Knibbe, Jarrod (2022). MTGLS: Multi-Task Gaze Estimation with Limited Supervision. 22nd IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI United States, 4-8 January 2022. Piscataway, NJ, United States: Institute of Electrical and Electronics Engineers. doi: 10.1109/WACV51458.2022.00123

2021

Conference Publication

Speak2Label: using domain knowledge for creating a large scale driver gaze zone estimation dataset

Ghosh, Shreya, Dhall, Abhinav, Sharma, Garima, Gupta, Sarthak and Sebe, Nicu (2021). Speak2Label: using domain knowledge for creating a large scale driver gaze zone estimation dataset. 18th IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 11-17 October 2021. Los Alamitos, CA USA: IEEE Computer Society. doi: 10.1109/iccvw54120.2021.00324

2020

Conference Publication

LSTM-DNN based Approach for Pain Intensity and Protective Behaviour Prediction

Li, Yi, Ghosh, Shreya, Joshi, Jyoti and Oviatt, Sharon (2020). LSTM-DNN based Approach for Pain Intensity and Protective Behaviour Prediction. 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG), Buenos Aires, Argentina, 16-20 November 2020. Piscataway, NJ United States: Institute of Electrical and Electronics Engineers. doi: 10.1109/fg47880.2020.00061

2019

Conference Publication

Automatic group level affect and cohesion prediction in videos

Sharma, Garima, Ghosh, Shreya and Dhall, Abhinav (2019). Automatic group level affect and cohesion prediction in videos. 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), Cambridge, United Kingdom, 3-6 September 2019. Piscataway, NJ USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/aciiw.2019.8925231

2019

Conference Publication

Unsupervised learning of eye gaze representation from the web

Dubey, Neeru, Ghosh, Shreya and Dhall, Abhinav (2019). Unsupervised learning of eye gaze representation from the web. International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14-19 July 2019. New York, NY USA: Institute of Electrical and Electronics Engineers. doi: 10.1109/ijcnn.2019.8851961
