Dohyeon Lee

I am a Ph.D. student in the LDI Lab at Seoul National University.

I am fortunate to be advised by Prof. Seung-won Hwang.

My research focuses on agentic AI for information retrieval systems.

Research Area

An Agentic Retrieval System is a retrieval framework in which agents actively interact with their environment to adapt, make decisions, and improve performance over time. The notion of agentic behavior emphasizes that agents are not static components but autonomous actors capable of exploration and adaptation. The environment can take many forms—including data, tools, memory, and other agents—which shape how the agent perceives, learns, and acts within the retrieval ecosystem.

How Agentic Retrieval Systems Interact with Data

Agents are exposed to new domain-specific corpora and tasked with adapting their retrieval strategies accordingly. Through exploration, they identify informative patterns or signals—such as domain-specific term distributions or statistical cues like inverse document frequency (IDF)—which help refine queries and relevance assessments. This interaction enables adaptive retrieval in unfamiliar or evolving data environments.
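As a rough illustration of the IDF cue mentioned above, the sketch below computes a smoothed IDF over a tiny tokenized corpus and keeps the query terms that are rarest in it. The function names, the smoothing formula, and the toy corpus are assumptions made for this example, not the method of any paper listed on this page.

```python
import math
from collections import Counter


def inverse_document_frequency(corpus):
    """Return a smoothed IDF score for every term in a tokenized corpus."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))  # count each term once per document
    # Assumed smoothing: log((1 + N) / (1 + df)) + 1
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def top_idf_terms(query_tokens, idf, k=2):
    """Keep the k query terms that are rarest (highest IDF) in the corpus."""
    known = [t for t in query_tokens if t in idf]
    return sorted(known, key=lambda t: idf[t], reverse=True)[:k]


# Hypothetical domain corpus, already tokenized.
corpus = [
    ["protein", "folding", "model"],
    ["protein", "binding", "site"],
    ["language", "model", "retrieval"],
]
idf = inverse_document_frequency(corpus)
focus_terms = top_idf_terms(["protein", "folding", "model"], idf, k=1)
print(focus_terms)
```

Here "folding" appears in only one document, so it receives a higher IDF than "protein" or "model" and is the term an agent might emphasize when refining the query.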

How Agentic Retrieval Systems Interact with Other Agents

In multi-agent retrieval settings, each agent may specialize in a particular domain, retrieval strategy, or reasoning style. Effective collaboration depends on defining clear roles and encouraging complementary behaviors that enhance overall system performance. A key research focus is on quantifying and optimizing synergy among agents, such as minimizing redundancy or maximizing coverage across retrieved information.
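One simple way to make redundancy and coverage measurable is sketched below. It assumes that each agent returns a set of document IDs and that a set of relevant IDs is available; the metric definitions (mean pairwise Jaccard overlap for redundancy, recall of the union for coverage) and the agent names are illustrative choices, not the definitions used in any specific publication.

```python
from itertools import combinations


def coverage(agent_results, relevant):
    """Fraction of relevant documents found by at least one agent."""
    retrieved = set().union(*agent_results.values())
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def redundancy(agent_results):
    """Mean pairwise Jaccard overlap between the agents' result sets."""
    pairs = list(combinations(agent_results.values(), 2))
    if not pairs:
        return 0.0

    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical agents and document IDs.
agent_results = {
    "sparse_agent": {"d1", "d2", "d3"},
    "dense_agent": {"d3", "d4"},
}
relevant = {"d1", "d4", "d5"}

print(coverage(agent_results, relevant))
print(redundancy(agent_results))
```

Under these definitions, a well-coordinated team would push coverage up while keeping redundancy low.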

How Agentic Retrieval Systems Interact with Tools

Agents interact with a variety of retrieval tools—such as query reformulators, rerankers, or multiple retrieval backends—and learn how to use them effectively. This includes not only deciding which tools to invoke but also determining what input to provide and how to interpret the output for downstream decision-making. In this view, agents serve as policy models that orchestrate tool usage based on context and task requirements.
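The policy view can be sketched as a loop in which a policy maps the current state to the next tool to invoke. In the minimal example below, the policy is a hand-written rule set standing in for a learned model, and the three tools are toy implementations; every function name and heuristic here is an assumption for illustration only.

```python
def reformulate(query):
    """Toy query reformulator: lowercase and drop a few stopwords."""
    stop = {"the", "a", "of", "what", "is"}
    return " ".join(w for w in query.lower().split() if w not in stop)


def retrieve(query, corpus):
    """Toy retriever: documents sharing at least one term with the query."""
    terms = set(query.split())
    return [d for d in corpus if terms & set(d.lower().split())]


def rerank(query, docs):
    """Toy reranker: sort by number of overlapping terms, highest first."""
    terms = set(query.split())
    return sorted(docs, key=lambda d: len(terms & set(d.lower().split())), reverse=True)


def policy(state):
    """Rule-based stand-in for a learned policy: state -> next tool name."""
    if not state["reformulated"]:
        return "reformulate"
    if state["docs"] is None:
        return "retrieve"
    if len(state["docs"]) > 1 and not state["reranked"]:
        return "rerank"
    return "stop"


def run_agent(query, corpus):
    """Invoke tools until the policy stops; return the documents and the tool trace."""
    state = {"query": query, "docs": None, "reformulated": False, "reranked": False}
    trace = []
    while (action := policy(state)) != "stop":
        trace.append(action)
        if action == "reformulate":
            state["query"] = reformulate(state["query"])
            state["reformulated"] = True
        elif action == "retrieve":
            state["docs"] = retrieve(state["query"], corpus)
        else:  # "rerank"
            state["docs"] = rerank(state["query"], state["docs"])
            state["reranked"] = True
    return state["docs"], trace


corpus = [
    "Dense retrieval with isotropy",
    "Hybrid retrieval of sparse and dense signals",
    "Text-based games",
]
docs, trace = run_agent("What is hybrid dense retrieval", corpus)
print(trace)
print(docs)
```

The policy decides which tool to call, the state carries the input each tool receives, and the tool output is written back into the state for the next decision.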

How Agentic Retrieval Systems Interact with Memory

Agents benefit from storing and recalling useful patterns, strategies, or outcomes from past retrieval experiences. This memory may take the form of episodic traces, cached retrieval states, or learned priors that inform future decisions. A central challenge lies in determining what to remember, how to represent it, and when to leverage it to improve ongoing retrieval behavior.
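The three questions above (what to remember, how to represent it, when to use it) can be made concrete with a small episodic memory. The sketch below assumes that only high-reward episodes are stored, that queries are represented as term sets, and that a past strategy is reused only when the new query is similar enough; the class name, thresholds, and strategy labels are hypothetical.

```python
class EpisodicMemory:
    """Stores (query terms, strategy, reward) episodes with bounded capacity."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.episodes = []

    def write(self, query, strategy, reward, min_reward=0.5):
        """What to remember: keep only episodes whose reward clears a threshold."""
        if reward < min_reward:
            return False
        # How to represent it: a bag of lowercased query terms.
        self.episodes.append((set(query.lower().split()), strategy, reward))
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)  # evict the oldest episode
        return True

    def recall(self, query, min_similarity=0.3):
        """When to use it: return a strategy only if a similar past query exists."""
        terms = set(query.lower().split())
        best, best_sim = None, min_similarity
        for past_terms, strategy, _ in self.episodes:
            union = terms | past_terms
            sim = len(terms & past_terms) / len(union) if union else 0.0
            if sim >= best_sim:
                best, best_sim = strategy, sim
        return best


memory = EpisodicMemory()
memory.write("protein folding model", "sparse_first", reward=0.9)
memory.write("cooking recipes", "dense_first", reward=0.2)  # below threshold, not stored

print(memory.recall("protein folding prediction"))
print(memory.recall("weather today"))
```

The first lookup shares two of four distinct terms with a stored episode, so its strategy is reused; the second has no similar episode and falls back to no recommendation.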

Publications

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Dohyeon Lee, Yeonseok Jeong, Seung-won Hwang

arXiv 2025

Paper Code

ECoRAG: Evidentiality-guided Compression for Long Context RAG

Yeonseok Jeong, Jinsu Kim, Dohyeon Lee, Seung-won Hwang

Findings of ACL 2025

Paper Code

tRAG: Term-level Retrieval-Augmented Generation for Zero-shot Retrieval

Dohyeon Lee, Jongyoon Kim, Jihyuk Kim, Seung-won Hwang, and Joonsuk Park

NAACL 2025

Paper Slide Poster

Query-focused Referentiability Learning for Zero-shot Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang

NAACL 2025

Paper Slide Poster

DADA: Distribution-Aware Domain Adaptation of PLMs for Information Retrieval

Dohyeon Lee*, Jongyoon Kim*, Seung-won Hwang, and Joonsuk Park

Findings of ACL 2024

Paper Slide Poster Code

HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang

NAACL 2024

Paper Slide Poster

Script-mix: Mixing Scripts for Low-resource Language Parsing

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang

NAACL 2024

Paper Slide Poster

Chaining Event Spans for Temporal Relation Grounding

Jongho Kim, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang

EACL 2024

Paper

Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding

YeonJoon Jung, Jaeseong Lee, Seungtaek Choi, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang

EMNLP 2024

Paper

On Complementarity Objectives for Hybrid Retrieval

Dohyeon Lee, Seung-won Hwang, Kyungjae Lee, Seungtaek Choi, and Sunghyun Park

ACL 2023

Paper Slide Poster

C2LIR: Continual Cross-lingual Transfer for Low-Resource Information Retrieval

Jaeseong Lee*, Dohyeon Lee*, Jongho Kim, and Seung-won Hwang

ECIR 2023 (short)

Paper

Script, Language, and Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang

AAAI 2023

Paper

PLM-based World Models for Text-based Games

Minsoo Kim, Yeonjoon Jung, Dohyeon Lee, and Seung-won Hwang

EMNLP 2022

Paper

Robustifying Multi-hop QA through Pseudo-Evidentiality Training

Kyungjae Lee, Seung-won Hwang, Sang-eun Han, and Dohyeon Lee

ACL/IJCNLP 2021

Paper

SCOPA - Soft Code-Switching and Pairwise Alignment for Zero-Shot Cross-lingual Transfer

Dohyeon Lee*, Jaeseong Lee*, Gyewon Lee, Byung-Gon Chun, and Seung-won Hwang

CIKM 2021 (short)

Paper

Education

Seoul National University - Ph.D. in Computer Science and Engineering

2021 - (In progress)

Advisor: Prof. Seung-won Hwang

Doctoral dissertation: (TBD)

Yonsei University - M.S. in Computer Science

2019 - 2021

Advisor: Prof. Seung-won Hwang

Master's thesis: Orthogonal Disentanglement of Semantic and Symbolic Representation for Query-Document Matching

Yonsei University - B.S. in Computer Science

2015 - 2019

GPA: 4.06 / 4.5

Awards & Scholarships

BK21+ Outstanding Research Fellowship

2023

Korean Government Scholarship Program

Computer Science Department Scholarship

2019 - 2021

at Yonsei University (graduate school)

Computer Science Department Scholarship

2017

at Yonsei University (undergraduate)

Experience

NAVER Corp. - Research Intern

2020.3 - 2020.6

Research title: Sparse-Dense Hybrid Retrieval

Mentor: Dr. Sunghyun Park

Contact

waylight3@snu.ac.kr

(+82) 010-2001-4214