Dohyeon Lee - Academic Profile

Research Areas

Research Statement

My research focuses on information retrieval, retrieval-augmented generation, LLM-based agents, and agentic memory. I am particularly interested in building robust information-seeking agents that can adapt to unseen domains through feedback-driven alignment and long-term experience accumulation.

My work studies how retrieval systems can use corpus-level statistics, retrieval evidence, complementary sparse-dense signals, retrieval-state feedback, and memory mechanisms to improve data generation, query refinement, hybrid retrieval, reasoning-time decision-making, and long-horizon agent behavior.

I also study how memories should be generated from past trajectories, retrieved for future tasks, updated as new evidence arrives, and managed efficiently under practical inference constraints. My broader goal is to develop self-improving retrieval agents that remain reliable, efficient, and grounded in external evidence while accumulating useful experience over time.

Keywords: Information Retrieval, Retrieval-Augmented Generation, LLM Agents, Agentic Retrieval, Agentic Memory, Long-term Memory, Domain Adaptation, Zero-shot Retrieval, Dense and Sparse Retrieval, Hybrid Retrieval, Query Refinement, Test-time Reasoning, Memory-augmented Agents.

Robust Information-Seeking Agents

I study agents that use retrieval, retrieval-state feedback, and test-time reasoning to search for information, refine queries, and make grounded decisions across unfamiliar domains.

Hybrid Retrieval and RAG

I work on retrieval systems that combine corpus-level statistics, retrieval evidence, and complementary sparse-dense signals to improve domain adaptation, zero-shot retrieval, data generation, and retrieval-augmented generation.

Agentic Memory

I study how long-term memories can be generated from past trajectories, retrieved for future tasks, updated with new evidence, and managed efficiently under practical inference constraints.

Publications

Beyond Markovian Forgetfulness: Episodic Memory for Reasoning-Intensive Retrieval

Dohyeon Lee, Yeonseok Jeong, and Seung-won Hwang.

ACL 2026.

D3: Dynamic Docid Decoding for Multi-Intent Generative Retrieval

Jaeyoung Kim*, Dohyeon Lee*, Soona Hong*, and Seung-won Hwang.

EACL 2026, Industry Track.

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Dohyeon Lee, Yeonseok Jeong, and Seung-won Hwang.

Findings of EMNLP 2025.

ECoRAG: Evidentiality-guided Compression for Long Context RAG

Yeonseok Jeong, Jinsu Kim, Dohyeon Lee, and Seung-won Hwang.

Findings of ACL 2025.

Query-focused Referentiability Learning for Zero-shot Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang.

NAACL 2025.

tRAG: Term-level Retrieval-Augmented Generation for Zero-shot Retrieval

Dohyeon Lee, Jongyoon Kim, Jihyuk Kim, Seung-won Hwang, and Joonsuk Park.

NAACL 2025.

Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding

YeonJoon Jung, Jaeseong Lee, Seungtaek Choi, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang.

EMNLP 2024.

DADA: Distribution-Aware Domain Adaptation of PLMs for Information Retrieval

Dohyeon Lee*, Jongyoon Kim*, Seung-won Hwang, and Joonsuk Park.

Findings of ACL 2024.

HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang.

NAACL 2024.

Script-mix: Mixing Scripts for Low-resource Language Parsing

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang.

NAACL 2024.

Chaining Event Spans for Temporal Relation Grounding

Jongho Kim, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang.

EACL 2024.

On Complementarity Objectives for Hybrid Retrieval

Dohyeon Lee, Seung-won Hwang, Kyungjae Lee, Seungtaek Choi, and Sunghyun Park.

ACL 2023.

C2LIR: Continual Cross-lingual Transfer for Low-Resource Information Retrieval

Jaeseong Lee*, Dohyeon Lee*, Jongho Kim, and Seung-won Hwang.

ECIR 2023, short paper.

Script, Language, Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang.

AAAI 2023.

PLM-based World Models for Text-based Games

Minsoo Kim, Yeonjoon Jung, Dohyeon Lee, and Seung-won Hwang.

EMNLP 2022.

Robustifying Multi-hop QA through Pseudo-Evidentiality Training

Kyungjae Lee, Seung-won Hwang, Sang-eun Han, and Dohyeon Lee.

ACL-IJCNLP 2021.

SCOPA: Soft Code-Switching, Pairwise Alignment for Zero-Shot Cross-lingual Transfer

Dohyeon Lee*, Jaeseong Lee*, Gyewon Lee, Byung-Gon Chun, and Seung-won Hwang.

CIKM 2021, short paper.

Education

Seoul National University - Ph.D. in Computer Science and Engineering

2021 - 2026.

Advisor: Prof. Seung-won Hwang.

Doctoral dissertation: Feedback-Driven Alignment Framework for Robust Retrieval Agents in Unseen Domains.

Yonsei University - M.S. in Computer Science

2019 - 2021.

Advisor: Prof. Seung-won Hwang.

Master's thesis: Orthogonal Disentanglement of Semantic and Symbolic Representation for Query-Document Matching.

Yonsei University - B.S. in Computer Science

2015 - 2019.

GPA: 4.06 / 4.5.

Awards and Scholarships

BK21+ Outstanding Research Fellowship

2023. Korean Government Scholarship Program.

Computer Science Department Scholarship

2019 - 2021. Yonsei University, graduate school.

Computer Science Department Scholarship

2017. Yonsei University, undergraduate.

Experience

KAIST - Postdoctoral Researcher

March 2026 - present.

Research on information retrieval, retrieval-augmented generation, LLM-based agents, and agentic memory.

NAVER Corp. - Research Intern

March 2020 - June 2020.

Research title: Sparse-Dense Hybrid Retrieval.

Mentor: Sunghyun Park, Ph.D.

Contact

Email: waylight3@snu.ac.kr