Dohyeon Lee

I am a Ph.D. student at the LDI Lab of Seoul National University.

I am fortunate to be advised by Prof. Seung-won Hwang.

My research focuses on information retrieval and hybrid retrieval approaches.

Research Area

Representation-level Hybrid Retrieval

This approach aims to combine the precise lexical matching capability inherent to sparse models with the rich semantic representation strength of dense models.

Information Retrieval, Hybrid Retrieval, Sparse Models, Dense Models, Lexical Representation, Semantic Representation

Distribution-level Hybrid Retrieval

This approach aims to integrate both corpus-wide inverse document frequency (IDF) distributions and document-level term distributions into retrieval training.

Information Retrieval, Hybrid Retrieval, Domain Adaptation, Zero-shot Retrieval, Inverse Document Frequency, Term Distribution
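As a rough illustration of the two statistics named above, the following sketch computes a corpus-wide IDF table and a document-level term distribution for a toy corpus. It is not the method from any of the listed papers: the function names, the smoothed IDF formula, and the whitespace tokenization are all assumptions made for this example.

```python
import math
from collections import Counter


def idf_table(corpus):
    """Corpus-wide smoothed IDF (assumed formula): log((N + 1) / (df + 1)) + 1."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        # Count each term once per document.
        doc_freq.update(set(doc))
    return {
        term: math.log((n_docs + 1) / (df + 1)) + 1.0
        for term, df in doc_freq.items()
    }


def term_distribution(doc):
    """Document-level term distribution: relative frequency of each term."""
    counts = Counter(doc)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


# Toy corpus, tokenized by whitespace (an assumption for this sketch).
corpus = [
    "sparse retrieval matches exact terms".split(),
    "dense retrieval matches meaning".split(),
    "domain adaptation for retrieval".split(),
]

idf = idf_table(corpus)
dist = term_distribution(corpus[0])
```

A term that appears in every document, such as "retrieval" here, receives the lowest IDF, while a term confined to one document receives the highest.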

Term-level Hybrid Retrieval

This approach aims to augment pseudo-queries with terms unseen during training, or absent even from the target corpus. Expanding the effective vocabulary coverage in this way reduces the mismatch between pseudo-queries and real-world queries.

Information Retrieval, Hybrid Retrieval, Domain Adaptation, Zero-shot Retrieval, Pseudo-query Generation, Query Expansion/Refinement, Vocabulary Expansion, Out-of-vocabulary Terms

Agent-level Hybrid Retrieval

This approach consists of three closely coordinated agents: a retriever, a rewriter, and a reranker. Each agent specializes in one aspect of the retrieval process, and the agents coordinate through iterative interactions and state-sharing mechanisms to progressively improve retrieval effectiveness and relevance.

Information Retrieval, Hybrid Retrieval, Domain Adaptation, Zero-shot Retrieval, Query Expansion/Refinement, Multi-agent Systems, Cooperative Retrieval
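The control flow described above can be sketched as a toy loop in which a retriever, a reranker, and a rewriter take turns over a shared state object. Everything here is hypothetical and for illustration only: `SharedState`, the term-overlap scoring, and the rewriting rule are assumptions, not the actual agents or their coordination protocol.

```python
from dataclasses import dataclass, field


@dataclass
class SharedState:
    """State shared by all agents (hypothetical structure)."""
    query: str
    history: list = field(default_factory=list)
    ranked: list = field(default_factory=list)


def _terms(text):
    return set(text.lower().split())


def retrieve(state, corpus):
    """Retriever: keep documents sharing at least one term with the query."""
    return [doc for doc in corpus if _terms(state.query) & _terms(doc)]


def rerank(state, candidates):
    """Reranker: order candidates by term overlap with the query."""
    query_terms = _terms(state.query)
    return sorted(
        candidates,
        key=lambda doc: len(query_terms & _terms(doc)),
        reverse=True,
    )


def rewrite(state):
    """Rewriter: append the first unseen term from the top-ranked document."""
    if not state.ranked:
        return state.query
    query_terms = state.query.lower().split()
    unseen = [t for t in state.ranked[0].lower().split() if t not in query_terms]
    if not unseen:
        return state.query
    return f"{state.query} {unseen[0]}"


def run(query, corpus, rounds=2):
    """Iterate retrieve -> rerank -> rewrite, sharing state between rounds."""
    state = SharedState(query=query)
    for _ in range(rounds):
        state.history.append(state.query)
        state.ranked = rerank(state, retrieve(state, corpus))
        state.query = rewrite(state)
    return state


corpus = [
    "hybrid retrieval combines sparse and dense models",
    "dense models encode meaning",
    "cooking recipes for pasta",
]
state = run("hybrid retrieval", corpus)
```

Keeping the query, its history, and the current ranking in one object lets each agent read what the others produced in the previous round without passing results around explicitly.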

Publications

tRAG: Term-level Retrieval-Augmented Generation for Zero-shot Retrieval

Dohyeon Lee, Jongyoon Kim, Jihyuk Kim, Seung-won Hwang, and Joonsuk Park

NAACL 2025

Query-focused Referentiability Learning for Zero-shot Retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang

NAACL 2025

DADA: Distribution-Aware Domain Adaptation of PLMs for Information Retrieval

Dohyeon Lee*, Jongyoon Kim*, Seung-won Hwang, and Joonsuk Park

Findings of ACL 2024

HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense retrieval

Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang

NAACL 2024

Script-mix: Mixing Scripts for Low-resource Language Parsing

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang

NAACL 2024

Chaining Event Spans for Temporal Relation Grounding

Jongho Kim, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang

EACL 2024

Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding

YeonJoon Jung, Jaeseong Lee, Seungtaek Choi, Dohyeon Lee, Minsoo Kim, and Seung-won Hwang

EMNLP 2024

On Complementarity Objectives for Hybrid Retrieval

Dohyeon Lee, Seung-won Hwang, Kyungjae Lee, Seungtaek Choi, and Sunghyun Park

ACL 2023

C2LIR: Continual Cross-lingual Transfer for Low-Resource Information Retrieval

Jaeseong Lee*, Dohyeon Lee*, Jongho Kim, and Seung-won Hwang

ECIR 2023 (short)

Script, Language, and Labels: Overcoming Three Discrepancies for Low-Resource Language Specialization

Jaeseong Lee, Dohyeon Lee, and Seung-won Hwang

AAAI 2023

PLM-based World Models for Text-based Games

Minsoo Kim, Yeonjoon Jung, Dohyeon Lee, and Seung-won Hwang

EMNLP 2022

Robustifying Multi-hop QA through Pseudo-Evidentiality Training

Kyungjae Lee, Seung-won Hwang, Sang-eun Han, and Dohyeon Lee

ACL/IJCNLP 2021

SCOPA - Soft Code-Switching and Pairwise Alignment for Zero-Shot Cross-lingual Transfer

Dohyeon Lee*, Jaeseong Lee*, Gyewon Lee, Byung-Gon Chun, and Seung-won Hwang

CIKM 2021 (short)

Education

Seoul National University - Ph.D. in Computer Science and Engineering

2021 - (In progress)

Advisor: Prof. Seung-won Hwang

Doctoral dissertation: (TBD)

Yonsei University - M.S. in Computer Science

2019 - 2021

Advisor: Prof. Seung-won Hwang

Master's thesis: Orthogonal Disentanglement of Semantic and Symbolic Representation for Query-Document Matching

Yonsei University - B.S. in Computer Science

2015 - 2019

GPA: 4.06 / 4.5

Awards & Scholarships

BK21+ Outstanding Research Fellowship

2023

Korean Government Scholarship Program

Computer Science Department Scholarship

2019 - 2021

at Yonsei University (graduate school)

Computer Science Department Scholarship

2017

at Yonsei University (undergraduate)

Experience

NAVER Corp. - Research Intern

2020.3 - 2020.6

Research title: Sparse-Dense Hybrid Retrieval

Mentor: Dr. Sunghyun Park

Skills

Programming Languages

Python: 10+ years
C/C++: 5+ years
C#: 5+ years
Java: 3+ years

Deep Learning Frameworks

PyTorch: 7+ years
Hugging Face: 5+ years
Ray: 5+ years
Pyserini: 3+ years

Languages

Korean: Native
English: TOEFL 85 / TEPS 321
Japanese: Second foreign language

Contact

waylight3@snu.ac.kr
(+82) 010-2001-4214