Hon Hai Research Institute Launches Traditional Chinese LLM With Reasoning Capabilities
10 March 2025, Taipei, Taiwan – Hon Hai Research Institute announced today the launch of the first Traditional Chinese Large Language Model (LLM) with reasoning capabilities, setting another milestone in the development of Taiwan's AI technology with a more efficient, lower-cost model training method completed in just four weeks.
The institute, which is backed by Hon Hai Technology Group (“Foxconn”) (TWSE:2317), the world's largest electronics manufacturer and leading technological solutions provider, said the LLM – code-named FoxBrain – will be open-sourced and shared publicly in the future. It was originally designed for applications in the Group's internal systems, covering functions such as data analysis, decision support, document collaboration, mathematics, reasoning and problem solving, and code generation.
FoxBrain not only demonstrates powerful comprehension and reasoning capabilities but is also optimized for Taiwanese users' language style, showing excellent performance in mathematical and logical reasoning tests.
"In recent months, the
deepening of reasoning capabilities and the efficient use of GPUs have
gradually become the mainstream development in the field of AI. Our FoxBrain
model adopted a very efficient training strategy, focusing on optimizing the training
process rather than blindly accumulating computing power,” said Dr. Yung-Hui
Li, Director of the Artificial Intelligence Research Center at Hon Hai Research
Institute. ”Through carefully designed training methods and resource
optimization, we have successfully built a local AI model with powerful
reasoning capabilities."
The FoxBrain training process was powered by 120 NVIDIA H100 GPUs, scaled with NVIDIA Quantum-2 InfiniBand networking, and completed in about four weeks. Compared with reasoning models recently launched in the market, this more efficient, lower-cost training method sets a new milestone for the development of Taiwan's AI technology.
FoxBrain is based on the Meta Llama 3.1 architecture with 70B parameters. In most categories of the TMMLU+ test dataset, it outperforms Llama-3-Taiwan-70B of the same scale, excelling particularly in mathematics and logical reasoning (for FoxBrain's TMMLU+ benchmark results, see Fig. 1). The following are the technical specifications and training strategies for FoxBrain:
● Established data augmentation methods and quality assessment for 24 topic categories through proprietary technology, generating 98B tokens of high-quality pre-training data for Traditional Chinese
● Context window length: 128K tokens
● Utilized 120 NVIDIA H100 GPUs for training, with a total computational cost of 2,688 GPU days (see the quick consistency check after this list)
● Employed a multi-node parallel training architecture to ensure high performance and stability
● Used a unique Adaptive Reasoning Reflection technique to train the model in autonomous reasoning
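As a quick consistency check (a back-of-the-envelope calculation of ours, not a figure from the announcement), the stated 2,688 GPU days spread across 120 GPUs works out to roughly three weeks of wall-clock time, in line with the stated four-week schedule:

# Back-of-the-envelope check of the published training figures (Python).
gpu_count = 120     # NVIDIA H100 GPUs, from the spec list above
gpu_days = 2688     # total computational cost, from the spec list above

wall_clock_days = gpu_days / gpu_count   # 2688 / 120 = 22.4 days
wall_clock_weeks = wall_clock_days / 7   # about 3.2 weeks

print(f"{wall_clock_days:.1f} days, or about {wall_clock_weeks:.1f} weeks")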
Fig. 1: TMMLU+ benchmark results of FoxBrain, Meta-Llama-3.1-70B and Taiwan-Llama-70B
In test results, FoxBrain showed comprehensive improvements in mathematics over the base Meta Llama 3.1 model. It achieved significant progress in mathematical tests compared with Taiwan Llama, currently the best Traditional Chinese large model, and surpassed Meta's current models of the same class in mathematical reasoning ability. While a slight gap remains with DeepSeek's distilled model, its performance is already very close to world-leading standards.
FoxBrain's development – from data collection, cleaning and augmentation, through Continual Pre-Training, Supervised Fine-Tuning, RLAIF, and Adaptive Reasoning Reflection – was accomplished step by step through independent research, ultimately achieving performance approaching that of world-class AI models despite limited computational resources. This large language model research demonstrates that Taiwan's technology talent can compete with international counterparts in the AI model field.
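To make the sequence of stages concrete, the sketch below outlines such a staged pipeline in Python. It is purely illustrative: every function name is a hypothetical placeholder for substantial engineering work, not FoxBrain's actual code.

from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]   # each stage transforms a shared "workspace"

def run_pipeline(workspace: Dict, stages: List[Stage]) -> Dict:
    # Apply each development stage in order, mirroring the announced sequence.
    for stage in stages:
        workspace = stage(workspace)
    return workspace

# Hypothetical placeholder stages:
def collect(ws):              ws["data"] = ["raw Traditional Chinese corpora"]; return ws
def clean_and_augment(ws):    ws["data"].append("augmented, quality-checked samples"); return ws
def continual_pretrain(ws):   ws["model"] = "Llama-3.1-70B + CPT"; return ws
def supervised_finetune(ws):  ws["model"] += " + SFT"; return ws
def rlaif(ws):                ws["model"] += " + RLAIF"; return ws
def reasoning_reflection(ws): ws["model"] += " + Adaptive Reasoning Reflection"; return ws

result = run_pipeline({}, [collect, clean_and_augment, continual_pretrain,
                           supervised_finetune, rlaif, reasoning_reflection])
print(result["model"])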
Although FoxBrain was originally designed for internal group applications, the Group will continue to collaborate with technology partners in the future to expand FoxBrain's applications, share it as open source, and promote AI in manufacturing, supply chain management, and intelligent decision-making.
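Once the model is open-sourced, usage could look like the following minimal sketch with the Hugging Face transformers library; note that no release format or repository name has been announced, so the model id below is purely hypothetical.

from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "hon-hai/FoxBrain-70B"  # hypothetical id; no official repository has been announced

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Traditional Chinese prompt: "Please explain the Pythagorean theorem step by step."
prompt = "請一步一步解釋畢氏定理。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))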
During model training, NVIDIA provided support through the Taipei-1 Supercomputer and technical consultation, enabling Hon Hai Research Institute to successfully complete the model pre-training with NVIDIA NeMo. FoxBrain will also become an important engine driving the upgrade of Foxconn's three major platforms: Smart Manufacturing, Smart EV, and Smart City.
The results of FoxBrain are scheduled to be shared for the first time at the NVIDIA GTC 2025 session talk “From Open Source to Frontier AI: Build, Customize, and Extend Foundation Models” on March 20.
About Hon Hai Research Institute
The institute has five research centers, each with an average of 40 high-technology R&D professionals, all of whom are focused on the research and development of new technologies, the strengthening of Foxconn's technology and product innovation pipeline, support for the Group's transformation from "brawn" to "brains", and the enhancement of the competitiveness of Foxconn's "3+3" strategy.