Dohyeon Lee - Designed Academic Profile

Contact

Email waylight3@snu.ac.kr
GitHub
LinkedIn
Google Scholar
ACL Anthology
Semantic Scholar
ORCID

Research Areas

Retrieval, agents, and memory

Research Statement

My research focuses on information retrieval, retrieval-augmented generation, LLM-based agents, and agentic memory. I am particularly interested in building robust information-seeking agents that can adapt to unseen domains through feedback-driven alignment and long-term experience accumulation.

My broader goal is to develop self-improving retrieval agents that remain reliable, efficient, and grounded in external evidence while accumulating useful experience over time.

Information Retrieval, Retrieval-Augmented Generation, LLM Agents, Agentic Memory, Domain Adaptation, Test-time Reasoning

Robust Information-Seeking Agents

I study agents that use retrieval, retrieval-state feedback, and reasoning-time decision making to search, refine queries, and stay grounded in external evidence across unseen domains.

Agentic Retrieval, Query Refinement, Grounded Reasoning

Hybrid Retrieval and RAG

I build retrieval systems that use corpus-level statistics, retrieval evidence, and complementary dense-sparse signals to improve data generation, hybrid retrieval, and retrieval-augmented generation.

Dense and Sparse Retrieval, Hybrid Retrieval, RAG

Feedback-Driven Adaptation

I am interested in feedback-driven alignment for retrieval agents that adapt to new domains through retrieval-state feedback, evidence use, and zero-shot or test-time reasoning signals.

Domain Adaptation, Zero-shot Retrieval, Retrieval Feedback

Agentic Memory

I study how memories should be generated from past trajectories, retrieved for future tasks, updated as evidence changes, and managed efficiently under practical inference constraints.

Long-term Memory, Memory-augmented Agents, Experience Accumulation

Publications

Selected peer-reviewed work

ACL 2026

Beyond Markovian Forgetfulness: Episodic Memory for Reasoning-Intensive Retrieval

Dohyeon Lee, Yeonseok Jeong, and Seung-won Hwang.

EACL 2026

D3: Dynamic Docid Decoding for Multi-Intent Generative Retrieval

Jaeyoung Kim*, Dohyeon Lee*, Soona Hong*, and Seung-won Hwang.

Industry Track.

Paper

Findings of EMNLP 2025

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Dohyeon Lee, Yeonseok Jeong, and Seung-won Hwang.

Paper

Findings of ACL 2025

ECoRAG: Evidentiality-guided Compression for Long Context RAG

Yeonseok Jeong, Jinsu Kim, Dohyeon Lee, and Seung-won Hwang.

Paper

NAACL 2025

Query-focused Referentiability Learning for Zero-shot Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang.

Paper

NAACL 2025

tRAG: Term-level Retrieval-Augmented Generation for Zero-shot Retrieval

Dohyeon Lee, Jongyoon Kim, Jihyuk Kim, Seung-won Hwang, and Joonsuk Park.

Paper

EMNLP 2024

Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding

YeonJoon Jung, Jaeseong Lee, Seungtaek Choi, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang.

Paper

Findings of ACL 2024

DADA: Distribution-Aware Domain Adaptation of PLMs for Information Retrieval

Dohyeon Lee*, Jongyoon Kim*, Seung-won Hwang, and Joonsuk Park.

Paper, Slides, Poster

NAACL 2024

HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang.

Paper, Slides, Poster

NAACL 2024

Script-mix: Mixing Scripts for Low-resource Language Parsing

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang.

Paper, Slides, Poster

EACL 2024

Chaining Event Spans for Temporal Relation Grounding

Jongho Kim, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang.

Paper

ACL 2023

On Complementarity Objectives for Hybrid Retrieval

Dohyeon Lee, Seung-won Hwang, Kyungjae Lee, Seungtaek Choi, and Sunghyun Park.

Paper, Slides, Poster

ECIR 2023

C2LIR: Continual Cross-lingual Transfer for Low-Resource Information Retrieval

Jaeseong Lee*, Dohyeon Lee*, Jongho Kim, and Seung-won Hwang.

Short paper.

Paper

AAAI 2023

Script, Language, Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang.

Paper

EMNLP 2022

PLM-based World Models for Text-based Games

Minsoo Kim, Yeonjoon Jung, Dohyeon Lee, and Seung-won Hwang.

Paper

ACL/IJCNLP 2021

Robustifying Multi-hop QA through Pseudo-Evidentiality Training

Kyungjae Lee, Seung-won Hwang, Sang-eun Han, and Dohyeon Lee.

Paper

CIKM 2021

SCOPA: Soft Code-Switching, Pairwise Alignment for Zero-Shot Cross-lingual Transfer

Dohyeon Lee*, Jaeseong Lee*, Gyewon Lee, Byung-Gon Chun, and Seung-won Hwang.

Short paper.

Paper

Education

Seoul National University - Ph.D. in Computer Science and Engineering

2021 - 2026

Advisor: Prof. Seung-won Hwang.

Doctoral dissertation: Feedback-Driven Alignment Framework for Robust Retrieval Agents in Unseen Domains.

Yonsei University - M.S. in Computer Science

2019 - 2021

Advisor: Prof. Seung-won Hwang.

Master's thesis: Orthogonal Disentanglement of Semantic and Symbolic Representation for Query-Document Matching.

Yonsei University - B.S. in Computer Science

2015 - 2019

GPA: 4.06 / 4.5.

Awards and Scholarships

BK21+ Outstanding Research Fellowship

2023. Korean Government Scholarship Program.

Computer Science Department Scholarship

2019 - 2021. Yonsei University, graduate school.

Computer Science Department Scholarship

2017. Yonsei University, undergraduate.

Experience

KAIST - Postdoctoral Researcher

March 2026 - present

Research on information retrieval, retrieval-augmented generation, LLM-based agents, and agentic memory.

NAVER Corp. - Research Intern

March 2020 - June 2020

Research title: Sparse-Dense Hybrid Retrieval. Mentor: Sunghyun Park, Ph.D.
