Dohyeon Lee

I have been a postdoctoral researcher at KAIST (Korea Advanced Institute of Science and Technology) since March 2026.

I received my Ph.D. in Computer Science and Engineering from Seoul National University in 2026, advised by Prof. Seung-won Hwang.

My research focuses on information retrieval, retrieval-augmented generation, LLM-based agents, and agentic memory.

Research Areas

Research Statement

My research focuses on information retrieval, retrieval-augmented generation, LLM-based agents, and agentic memory. I am particularly interested in building robust information-seeking agents that can adapt to unseen domains through feedback-driven alignment and long-term experience accumulation.

My work studies how retrieval systems can use corpus-level statistics, retrieval evidence, complementary sparse-dense signals, retrieval-state feedback, and memory mechanisms to improve data generation, query refinement, hybrid retrieval, reasoning-time decision making, and long-horizon agent behavior.

I also study how memories should be generated from past trajectories, retrieved for future tasks, updated as new evidence arrives, and managed efficiently under practical inference constraints. My broader goal is to develop self-improving retrieval agents that remain reliable, efficient, and grounded in external evidence while accumulating useful experience over time.

Robust Information-Seeking Agents

I study agents that use retrieval, retrieval-state feedback, and test-time reasoning to search for information, refine queries, and make grounded decisions across unfamiliar domains.

Hybrid Retrieval and RAG

I work on retrieval systems that combine corpus-level statistics, retrieval evidence, and complementary sparse-dense signals to improve domain adaptation, zero-shot retrieval, data generation, and retrieval-augmented generation.

Agentic Memory

I study how long-term memories can be generated from past trajectories, retrieved for future tasks, updated with new evidence, and managed efficiently under practical inference constraints.

Publications

Beyond Markovian Forgetfulness: Episodic Memory for Reasoning-Intensive Retrieval

Dohyeon Lee, Yeonseok Jeong, and Seung-won Hwang.

D3: Dynamic Docid Decoding for Multi-Intent Generative Retrieval

Jaeyoung Kim*, Dohyeon Lee*, Soona Hong*, and Seung-won Hwang.

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Dohyeon Lee, Yeonseok Jeong, and Seung-won Hwang.

ECoRAG: Evidentiality-guided Compression for Long Context RAG

Yeonseok Jeong, Jinsu Kim, Dohyeon Lee, and Seung-won Hwang.

Query-focused Referentiability Learning for Zero-shot Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang.

tRAG: Term-level Retrieval-Augmented Generation for Zero-shot Retrieval

Dohyeon Lee, Jongyoon Kim, Jihyuk Kim, Seung-won Hwang, and Joonsuk Park.

Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding

YeonJoon Jung, Jaeseong Lee, Seungtaek Choi, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang.

DADA: Distribution-Aware Domain Adaptation of PLMs for Information Retrieval

Dohyeon Lee*, Jongyoon Kim*, Seung-won Hwang, and Joonsuk Park.

HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang.

Script-mix: Mixing Scripts for Low-resource Language Parsing

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang.

Chaining Event Spans for Temporal Relation Grounding

Jongho Kim, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang.

On Complementarity Objectives for Hybrid Retrieval

Dohyeon Lee, Seung-won Hwang, Kyungjae Lee, Seungtaek Choi, and Sunghyun Park.

C2LIR: Continual Cross-lingual Transfer for Low-Resource Information Retrieval

Jaeseong Lee*, Dohyeon Lee*, Jongho Kim, and Seung-won Hwang.

Script, Language, Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang.

PLM-based World Models for Text-based Games

Minsoo Kim, Yeonjoon Jung, Dohyeon Lee, and Seung-won Hwang.

Robustifying Multi-hop QA through Pseudo-Evidentiality Training

Kyungjae Lee, Seung-won Hwang, Sang-eun Han, and Dohyeon Lee.

SCOPA: Soft Code-Switching, Pairwise Alignment for Zero-Shot Cross-lingual Transfer

Dohyeon Lee*, Jaeseong Lee*, Gyewon Lee, Byung-Gon Chun, and Seung-won Hwang.

Education

Seoul National University - Ph.D. in Computer Science and Engineering

Advisor: Prof. Seung-won Hwang.

Doctoral dissertation: Feedback-Driven Alignment Framework for Robust Retrieval Agents in Unseen Domains.

Yonsei University - M.S. in Computer Science

Advisor: Prof. Seung-won Hwang.

Master's thesis: Orthogonal Disentanglement of Semantic and Symbolic Representation for Query-Document Matching.

Yonsei University - B.S. in Computer Science

GPA: 4.06 / 4.5.

Awards and Scholarships

BK21+ Outstanding Research Fellowship

Computer Science Department Scholarship

Computer Science Department Scholarship

Experience

KAIST - Postdoctoral Researcher

Research on information retrieval, retrieval-augmented generation, LLM-based agents, and agentic memory.

NAVER Corp. - Research Intern

Research title: Sparse-Dense Hybrid Retrieval.

Mentor: Sunghyun Park, Ph.D.

Contact

Email: waylight3@snu.ac.kr
Profiles: GitHub, LinkedIn, Google Scholar, ACL Anthology, Semantic Scholar, ORCID