References

[1] From KMMLU-Redux to Pro: A Professional Korean Benchmark Suite for LLM Evaluation, EMNLP 2025 Findings, https://arxiv.org/pdf/2507.08924

[2] KoBALT: Korean Benchmark for Advanced Linguistic Tasks, arXiv preprint, https://arxiv.org/pdf/2505.16125

[3] Cross-lingual QA: A Key to Unlocking In-context Cross-lingual Performance, ICML 2024 ICL Workshop, https://arxiv.org/pdf/2305.15233