Global Economic Times

South Korean AI Models Flunk College Entrance Math Exams, Lagging Far Behind Global Leaders

Yim Kwangsoo Correspondent / Updated : 2025-12-15 07:01:13

(C) Seeking Alpha


SEOUL— A recent performance comparison of South Korea's leading large language models (LLMs), often dubbed "National AI" contenders, revealed a significant gap in mathematical problem-solving ability compared to their international counterparts. The domestic models largely failed to achieve passing grades on standardized mathematics tests, including the highly challenging Suneung (College Scholastic Ability Test).

A research team led by Professor Kim Jong-rak of Sogang University's Department of Mathematics conducted the assessment. They tested five major South Korean LLMs (Upstage’s Solar Pro-2, LG AI Research’s Exaone 4.0.1, Naver’s HCX-007, SK Telecom’s A.X 4.0 (72B), and NCSOFT’s lightweight model Llama Varco 8B Instruct) against five frontier international models: GPT-5.1, Gemini 3 Pro Preview, Claude Opus 4.5, Grok 4.1 Fast, and DeepSeek V3.2.

Rigorous Testing Methodology

The researchers administered a total of 50 mathematics problems across two categories:

  • Suneung (CSAT) Math (20 Problems): The 20 most difficult questions were selected from the common subjects and the Probability and Statistics, Calculus, and Geometry sections of the highly competitive South Korean CSAT.
  • Essay-Type/Advanced Math (30 Problems): This set comprised questions from the entrance exams of 10 domestic universities, 10 questions from the Indian university entrance examination, and 10 questions from the mathematics section of the graduate school entrance exam for the University of Tokyo's Faculty of Engineering.

In the first test, which combined the 20 Suneung and 30 essay-type problems, the performance disparity was stark. The international models scored between 76 and 92 points. In sharp contrast, the South Korean models struggled. Solar Pro-2 was the only domestic model to reach 58 points, while the others scored in the 20s or below. NCSOFT's Llama Varco 8B Instruct recorded the lowest score, a mere 2 points.

The research team noted that the results remained discouraging even though the domestic models had been set up to use Python as a tool, rather than relying on inference alone, to improve problem-solving accuracy.
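To make the 100-point scale concrete, the sketch below converts a count of correct answers into a score. It assumes, hypothetically, that all 50 problems carry equal weight; the article does not state the team's actual scoring rule, and the names `TOTAL_PROBLEMS` and `score_out_of_100` are illustrative only.

```python
# Assumption: every problem is worth the same number of points.
# The research team's real scoring rule is not described in the article.
TOTAL_PROBLEMS = 20 + 30  # 20 Suneung problems + 30 essay-type problems


def score_out_of_100(num_correct: int, total: int = TOTAL_PROBLEMS) -> float:
    """Convert a number of correct answers into a score on a 100-point scale."""
    if not 0 <= num_correct <= total:
        raise ValueError("num_correct must be between 0 and total")
    return 100 * num_correct / total
```

Under this equal-weight assumption, a score of 58 points would correspond to 29 correct answers out of 50, and a score of 2 points to a single correct answer.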

Second Test: EntropyMath Dataset Confirms Lag

The researchers ran a second test using EntropyMath, a proprietary dataset they developed that contains 100 questions ranging in difficulty from university level to professor-level research. Ten questions selected from this set were presented to the 10 AI models.

The results mirrored the first test: International models achieved scores between 82.8 and 90 points, whereas the domestic models were significantly lower, ranging from 7.1 to 53.3 points.

In a third test, each model was given three attempts per problem to reach a correct answer, and the international models again dominated. Grok 4.1 Fast achieved a perfect score, and the rest of the overseas models scored 90 points. The best-performing domestic model, Solar Pro-2, scored 70 points, followed by Exaone at 60 points. The other domestic contenders, HCX-007, A.X 4.0, and Llama Varco 8B Instruct, recorded 40, 30, and 20 points, respectively.
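The three-attempt format can be illustrated with a short sketch. It assumes a problem counts as solved if any of the allowed attempts is correct and that each problem carries equal weight; the article does not spell out the team's exact rule, and the function names and sample data below are hypothetical.

```python
from typing import Sequence


def solved_within_attempts(attempts: Sequence[bool], max_attempts: int = 3) -> bool:
    """Return True if any of the first `max_attempts` attempts was correct."""
    return any(attempts[:max_attempts])


def best_of_k_score(results: Sequence[Sequence[bool]], max_attempts: int = 3) -> float:
    """Score on a 100-point scale: the share of problems solved within the limit."""
    if not results:
        raise ValueError("results must contain at least one problem")
    solved = sum(solved_within_attempts(a, max_attempts) for a in results)
    return 100 * solved / len(results)


# Made-up outcomes for 10 problems (True = correct attempt); not real data.
example_results = [
    [True],                 # solved on the first attempt
    [False, True],          # solved on the second attempt
    [False, False, True],   # solved on the third attempt
    [True],
    [True],
    [False, True],
    [True],
    [False, False, False],  # never solved
    [False, False, False],
    [False, False, False],
]
example_score = best_of_k_score(example_results)
```

With 10 problems scored this way, every possible result is a multiple of 10, which is consistent with the 100, 90, 70, 60, 40, 30, and 20 point figures reported for this round.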

Call for Improvement and Future Plans

"There was a lot of inquiry about why there was no evaluation of the five domestic sovereign AI models on Suneung problems, so our team conducted this test," Professor Kim explained. "It confirmed that the level of domestic models is significantly behind that of the overseas frontier models."

The research team acknowledged that the domestic models tested were existing public versions, and it plans to re-evaluate them once each team officially releases its updated, dedicated "National AI" version.

Professor Kim also announced the launch of a dedicated mathematics leaderboard based on the EntropyMath dataset, with the goal of expanding it to an international standard. He added that the team will improve their proprietary problem-generation algorithms and pipelines to create specialized datasets for domains beyond mathematics, including science, manufacturing, and culture, to contribute to the performance enhancement of domain-specific AI models.

The study was jointly supported by Sogang University's Institute of Mathematical Sciences and Data Science (IMDS) and Deep Fountain.

[Copyright (c) Global Economic Times. All Rights Reserved.]

