Shuai Wang

Email: shuai.wang2@uq.edu.au

Background

Shuai Wang is a Research Fellow at ielab, The University of Queensland, working on AI-powered search. He builds systems that find information and answer questions using large language models and retrieval-augmented generation (RAG), and he focuses on making those systems faster and cheaper to run. His broader research contributions span federated search optimization and improving model efficiency in information retrieval (IR) and RAG applications. He has served on program committees for SIGIR, ECIR, ICTIR, and TOIS. Much of his work comes back to one practical question: how do we get good search and reliable answers without depending on expensive, closed commercial AI?

Shuai has published 25+ papers at venues such as SIGIR, WSDM, ECIR, and EMNLP. He coordinates and teaches INFS7410 (Information Retrieval and Web Search) at UQ.

Shuai completed his PhD on automating medical systematic reviews using neural retrieval systems and generative models (thesis: AI-driven Automated Systematic Reviews). His doctoral work encompassed automatic MeSH term suggestion, screening prioritization, seed-driven retrieval methods, and automatic Boolean query formulation.

Education

PhD, The University of Queensland (2021–2025)
Master's degree, The University of Queensland (2020–2021)
Bachelor's degree, The University of Western Australia (2017–2019)

Availability

Dr Shuai Wang is available for supervision.

Qualifications

Masters (Coursework) of Software Engineering, The University of Queensland
Doctor of Philosophy of Information Retrieval and Web Search, The University of Queensland

Research interests

Efficient, Effective and Adaptive Retrieval

Search systems should return the right results without wasting computation. This area studies dense and sparse retrieval that stays accurate while controlling cost, and adaptive models that scale their effort to the difficulty of each query. Topics include multi-representation embeddings, model efficiency, and the trade-offs between effectiveness, speed, and memory in large-scale search.

Efficient Retrieval-Augmented Generation (RAG)

RAG lets language models answer questions using retrieved evidence, but it is costly to run at scale. This area focuses on making RAG cheaper and faster through context compression, reduced redundant computation, and better memory use, so that reliable, grounded question answering can run on local hardware rather than depending only on large commercial APIs.

Effective Search in the Agent Era

Search looks very different when the user is an AI agent rather than a person, so the way retrieval works needs rethinking. Agents issue many queries, chain steps together, and act on what they find, which calls for search designed specifically for them. This area asks how to build retrieval that is effective, efficient, and reliable for agentic systems.

AI for Systematic Reviews and Clinical Evidence

Finding and screening medical evidence is slow and labour-intensive. This area applies retrieval and language models to evidence synthesis, including query formulation, term suggestion, and screening automation, helping clinicians and researchers work faster while keeping the high recall and transparency that high-stakes reviews demand.

Works

27 works between 2021 and 2026

All (27): Journal Article (2), Conference Publication (24), Book Chapter (1)

2026

Book Chapter

Starbucks: Improved Training for 2D Matryoshka Embeddings

Zhuang, Shengyao, Wang, Shuai, Zheng, Fabio, Koopman, Bevan and Zuccon, Guido (2026). Starbucks: Improved Training for 2D Matryoshka Embeddings. Lecture Notes in Computer Science. (pp. 67-82) Cham: Springer Nature Switzerland. doi: 10.1007/978-3-032-21289-4_5

2026

Conference Publication

AutoBool: Reinforcement-Learned LLM for Effective Automatic Systematic Reviews Boolean Query Generation

Wang, Shuai, Scells, Harrisen, Koopman, Bevan and Zuccon, Guido (2026). AutoBool: Reinforcement-Learned LLM for Effective Automatic Systematic Reviews Boolean Query Generation. Stroudsburg, PA, USA: Association for Computational Linguistics. doi: 10.18653/v1/2026.eacl-long.68

2026

Conference Publication

Evalugator—Rapid, Agile Development and Evaluation of Retrieval Augmented Generation Systems Without Labels

Koopman, Bevan, Li, Hang, Wang, Shuai and Zuccon, Guido (2026). Evalugator—Rapid, Agile Development and Evaluation of Retrieval Augmented Generation Systems Without Labels. Springer Science and Business Media Deutschland GmbH. doi: 10.1007/978-3-032-21321-1_8

2025

Conference Publication

2D Matryoshka Training for Information Retrieval

Wang, Shuai, Zhuang, Shengyao, Koopman, Bevan and Zuccon, Guido (2025). 2D Matryoshka Training for Information Retrieval. 48th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR 2025), Padua, Italy, 13-18 July 2025. New York, NY, United States: Association for Computing Machinery. doi: 10.1145/3726302.3730330

2025

Conference Publication

Reassessing Large Language Model Boolean query generation for systematic reviews

Wang, Shuai, Scells, Harrisen, Koopman, Bevan and Zuccon, Guido (2025). Reassessing Large Language Model Boolean query generation for systematic reviews. 48th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR 2025), Padua, Italy, 13-18 July 2025. New York, NY, United States: Association for Computing Machinery. doi: 10.1145/3726302.3730329

2025

Conference Publication

Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition

Yao, Zheng, Wang, Shuai and Zuccon, Guido (2025). Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition. 48th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR 2025), Padua, Italy, 13-18 July 2025. New York, NY, United States: Association for Computing Machinery. doi: 10.1145/3726302.3730332

2025

Conference Publication

ReSLLM: large language models are strong resource selectors for federated search

Wang, Shuai, Zhuang, Shengyao, Koopman, Bevan and Zuccon, Guido (2025). ReSLLM: large language models are strong resource selectors for federated search. The ACM Web Conference 2025, Sydney, NSW Australia, 28 April-2 May 2025. New York, NY United States: Association for Computing Machinery. doi: 10.1145/3701716.3715595

2025

Conference Publication

An investigation of prompt variations for zero-shot LLM-based rankers

Sun, Shuoqi, Zhuang, Shengyao, Wang, Shuai and Zuccon, Guido (2025). An investigation of prompt variations for zero-shot LLM-based rankers. 47th European Conference on Information Retrieval, Lucca, Italy, 6-10 April 2025. Cham, Switzerland: Springer Cham. doi: 10.1007/978-3-031-88711-6_12

2025

Conference Publication

Corpus subsampling: estimating the effectiveness of neural retrieval models on large corpora

Fröbe, Maik, Parry, Andrew, Scells, Harrisen, Wang, Shuai, Zhuang, Shengyao, Zuccon, Guido, Potthast, Martin and Hagen, Matthias (2025). Corpus subsampling: estimating the effectiveness of neural retrieval models on large corpora. 47th European Conference on Information Retrieval, Lucca, Italy, 6-10 April 2025. Cham, Switzerland: Springer Cham. doi: 10.1007/978-3-031-88708-6_29

2025

Conference Publication

Context embeddings for efficient answer generation in retrieval-augmented generation

Rau, David, Wang, Shuai, Déjean, Hervé, Clinchant, Stéphane and Kamps, Jaap (2025). Context embeddings for efficient answer generation in retrieval-augmented generation. 18th International Conference on Web Search and Data Mining-WSDM, Hannover, Germany, 10-14 March 2025. New York, NY USA: Association for Computing Machinery. doi: 10.1145/3701551.3703527

2024

Conference Publication

Large language models based stemming for information retrieval: promises, pitfalls and failures

Wang, Shuai, Zhuang, Shengyao and Zuccon, Guido (2024). Large language models based stemming for information retrieval: promises, pitfalls and failures. SIGIR '24: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, United States, 14-18 July 2024. New York, NY, United States: ACM. doi: 10.1145/3626772.3657949

2024

Conference Publication

FeB4RAG: evaluating federated search in the context of retrieval augmented generation

Wang, Shuai, Khramtsova, Ekaterina, Zhuang, Shengyao and Zuccon, Guido (2024). FeB4RAG: evaluating federated search in the context of retrieval augmented generation. SIGIR '24: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, United States, 14-18 July 2024. New York, NY, United States: ACM. doi: 10.1145/3626772.3657853

2024

Conference Publication

Evaluating generative ad hoc information retrieval

Gienapp, Lukas, Scells, Harrisen, Deckers, Niklas, Bevendorff, Janek, Wang, Shuai, Kiesel, Johannes, Syed, Shahbaz, Fröbe, Maik, Zuccon, Guido, Stein, Benno, Hagen, Matthias and Potthast, Martin (2024). Evaluating generative ad hoc information retrieval. SIGIR '24: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, United States, 14-18 July 2024. New York, NY, United States: ACM. doi: 10.1145/3626772.3657849

2024

Journal Article

Report on the Collab-a-Thon at ECIR 2024

MacAvaney, Sean, Roegiest, Adam, Lipani, Aldo, Parry, Andrew, Engelmann, Björn, Kreutz, Christin Katharina, Meng, Chuan, Frayling, Erlend, Yang, Eugene, Schlatt, Ferdinand, Faggioli, Guglielmo, Scells, Harrisen, Atanassova, Iana, Friese, Jana, Bevendorff, Janek, Sanz-Cruzado, Javier, Trippas, Johanne, Pathak, Kanaad, Dhole, Kaustubh, Azzopardi, Leif, Fröbe, Maik, Bertin, Marc, Prasad, Nishchal, Zerhoudi, Saber, Wang, Shuai, Chatterjee, Shubham, Jaenich, Thomas, Kruschwitz, Udo, Wang, Xi and Long, Zijun (2024). Report on the Collab-a-Thon at ECIR 2024. ACM SIGIR Forum, 58 (1), 1-11. doi: 10.1145/3687273.3687287

2024

Conference Publication

Zero-shot generative large language models for systematic review screening automation

Wang, Shuai, Scells, Harrisen, Zhuang, Shengyao, Potthast, Martin, Koopman, Bevan and Zuccon, Guido (2024). Zero-shot generative large language models for systematic review screening automation. 46th European Conference on Information Retrieval, ECIR 2024, Glasgow, United Kingdom, 24-28 March 2024. Cham, Switzerland: Springer Nature Switzerland. doi: 10.1007/978-3-031-56027-9_25

2024

Conference Publication

BERGEN: a benchmarking library for retrieval-augmented generation

Rau, David, Déjean, Hervé, Chirkova, Nadezhda, Formal, Thibault, Wang, Shuai, Clinchant, Stéphane and Nikoulina, Vassilina (2024). BERGEN: a benchmarking library for retrieval-augmented generation. 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, United States, 12 - 16 November 2024. Stroudsburg, PA, United States: Association for Computational Linguistics. doi: 10.18653/v1/2024.findings-emnlp.449

2023

Conference Publication

Generating natural language queries for more effective systematic review screening prioritisation

Wang, Shuai, Scells, Harrisen, Koopman, Bevan, Potthast, Martin and Zuccon, Guido (2023). Generating natural language queries for more effective systematic review screening prioritisation. SIGIR-AP 2023 - Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, Beijing, China, 26 - 28 November 2023. New York, NY United States: Association for Computing Machinery. doi: 10.1145/3624918.3625322

2023

Conference Publication

Can ChatGPT write a good Boolean query for systematic review literature search?

Wang, Shuai, Scells, Harrisen, Koopman, Bevan and Zuccon, Guido (2023). Can ChatGPT write a good Boolean query for systematic review literature search? 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23 - 27 July 2023. New York, NY United States: Association for Computing Machinery. doi: 10.1145/3539618.3591703

2023

Conference Publication

Balanced topic aware sampling for effective dense retriever: a reproducibility study

Wang, Shuai and Zuccon, Guido (2023). Balanced topic aware sampling for effective dense retriever: a reproducibility study. 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23 - 27 July 2023. New York, NY United States: Association for Computing Machinery. doi: 10.1145/3539618.3591915

2023

Conference Publication

MeSH suggester: a library and system for MeSH term suggestion for systematic review Boolean query construction

Wang, Shuai, Li, Hang and Zuccon, Guido (2023). MeSH suggester: a library and system for MeSH term suggestion for systematic review Boolean query construction. Sixteenth ACM International Conference on Web Search and Data Mining, Singapore, Singapore, 27 February - 3 March 2023. New York, NY, United States: ACM. doi: 10.1145/3539597.3573025

Past funding

2023 - 2024

From Search to Synthesis: Automating the Systematic Review Creation Process

Universities Australia - Germany Joint Research Co-operation Scheme

Enquiries

For media enquiries about Dr Shuai Wang's areas of expertise, story ideas and help finding experts, contact our Media team:

communications@uq.edu.au
