CV

Curriculum vitae.

Contact Information

Name: Jonggeun Lee
Professional Title: M.S. Student in Data Science
Email: jonggeun.lee@snu.ac.kr

Professional Summary

M.S. student at Seoul National University working on multi-modal foundation models, voice assistants, tool-augmented agents, and post-training & reinforcement learning.

Experience

  • 2024 - 2024

    Hwasung, Korea

    Software Engineering Intern, ML Brain SW Development Team
    Samsung Electronics
    • Researched Retrieval-Augmented Generation (RAG) for an internal chatbot system
    • Fine-tuned the retriever and re-ranker using contrastive learning
    • Improved internal document retrieval performance from 20% to 71% Hit@1 for user queries
  • 2023 - 2024

    Daejeon, Korea

    Undergraduate Research Intern
    KAIST, Data Science and Artificial Intelligence Lab
    • Advisor: Prof. Chanyoung Park
    • Reviewed the literature and implemented representative papers in recommender systems
    • Explored LLM-based recommendation approaches (LLM4Rec, TALLRec)
  • 2023 - 2023

    Seoul, Korea

    Research Intern
    LG AI Research, EXAONE Lab
    • Advisor: Dr. Hyeongu Yun
    • Built a multi-modal (vision + text) data extraction pipeline to curate large-scale pretraining data for the EXAONE foundation model
    • Combined vision-based layout detection models with rule-based parsing to extract structured text, tables, and images from PDF documents
    • Produced 73GB+ of pretraining-grade data, validating scalability toward web-scale corpora
  • 2023 - 2023

    Seoul, Korea

    Software Engineering Intern
    Kounosoft
    • Advisor: Dr. Woongmyung Kim
    • Constructed an Arduino-related Q&A dataset for an education platform
    • Performed supervised fine-tuning of the KoGPT2 model on the custom dataset
    • Developed a complete chat interface and system using Vue 3 and FastAPI

Education

  • 2024 - present

    Seoul, Korea

    M.S.
    Seoul National University
    Data Science
    • GPA: 4.08/4.3
    • Advisor: Prof. Yohan Jo
    • Research interests: Multi-modal Foundation Models, Voice Assistants, Tool-augmented Agents, Post-training & Reinforcement Learning
  • 2019 - 2024

    Seoul, Korea

    B.S.
    Korea University
    Industrial Management Engineering
    • GPA: 4.08/4.5 (Major: 4.14/4.5)
    • Dean’s List in College of Engineering (GPA: 4.5/4.5; 2023 Fall)
    • Graduated with Great Honors, one semester early

Publications

  • 2026
    Don't Adapt Small Language Models for Tools; Adapt Tool Schemas to the Models
    ACL 2026 Main (acceptance rate: 19%)

    Proposed PA-Tool, a training-free method that adapts tool schemas to align with models’ pretrained knowledge, improving tool-use performance by up to 17% and reducing schema misalignment errors by 80%.

  • 2026
    SpeakerSleuth: Can LALMs Judge Speaker Consistency across Multi-turn Dialogues?
    ACL 2026 Main (acceptance rate: 19%)

    Introduced a benchmark evaluating whether Large Audio-Language Models (LALMs) can reliably judge speaker consistency across multi-turn conversations, revealing a significant bias toward textual cues over acoustic cues.

  • 2026
    SpokenUS: A Spoken User Simulator for Task-Oriented Dialogue
    Under review; Machine Learning for Audio Workshop @ ICML 2026

    Developed a spoken user simulator that jointly generates text and speech tokens, modeling realistic spoken behaviors (cross-turn slots, barge-in, disfluency, emotion-aware speech) for task-oriented dialogue systems.

  • 2026
    SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents
    ICLR 2026 Oral (acceptance rate: 1.13%)

    Developed a time-accelerated smart home simulation environment with 600 benchmark episodes, revealing that even top models struggle with temporal scheduling and state verification.

  • 2026
    Quantifying Data Contamination in Psychometric Evaluations of LLMs
    EACL 2026 Findings (acceptance rate: 36.2%)

    Proposed a framework to systematically measure data contamination in psychometric evaluations of LLMs, providing evidence of strong contamination in popular inventories.

  • 2025
    Tool-Augmented Agents: Evolution from Autonomy to Interaction
    Korean Institute of Information Scientists and Engineers, Vol. 43, No. 11, pp. 14-25

    A comprehensive survey of the evolution of tool-augmented agents, focusing on the shift from autonomous capabilities to interactive, human-centered paradigms.

Projects

  • Knowledge Graph Construction from Messenger Conversations

    Industry-academia research project with Samsung Electronics on dynamically extracting user information from multi-session messenger conversations and constructing knowledge graphs for hyper-personalization.

    • Fine-tuned the Llama3-8B-Instruct model for knowledge graph extraction from messenger dialogues
    • Achieved a 29.4% higher F1 score than GPT-4o on the extraction benchmark