Tokyo, February 12, 2025 – Rakuten Group, Inc. has announced the release of both Rakuten AI 2.0, the company’s first Japanese large language model (LLM) based on a Mixture of Experts (MoE)*1 architecture, and Rakuten AI 2.0 mini, the company’s first small language model (SLM). Both models were unveiled in December 2024, and after further fine-tuning, Rakuten has released Rakuten AI 2.0 foundation*2 and instruct models*3 along with Rakuten AI 2.0 mini foundation and instruct models to empower companies and professionals developing AI applications.
Rakuten AI 2.0 is an 8x7B MoE model based on the Rakuten AI 7B model released in March 2024. The MoE model comprises eight 7-billion-parameter sub-models, each acting as a separate expert. Each token is routed to the two most relevant experts, as selected by the router. The experts and the router are continually trained together on vast amounts of high-quality Japanese and English language data.
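To illustrate the routing scheme described above, the sketch below shows a generic top-2 Mixture of Experts layer in PyTorch. It is a minimal conceptual illustration only, not Rakuten's implementation; the class and parameter names are hypothetical.

```python
# Minimal sketch of top-2 MoE routing (illustrative only, not Rakuten's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    def __init__(self, hidden_size: int, num_experts: int = 8):
        super().__init__()
        # One feed-forward sub-model per expert; eight experts, as in an 8x7B MoE.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden_size, 4 * hidden_size),
                           nn.GELU(),
                           nn.Linear(4 * hidden_size, hidden_size))
             for _ in range(num_experts)]
        )
        # The router scores each token against every expert.
        self.router = nn.Linear(hidden_size, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_size)
        logits = self.router(x)                        # (tokens, num_experts)
        weights, indices = logits.topk(2, dim=-1)      # pick the two most relevant experts
        weights = F.softmax(weights, dim=-1)           # mixing weights for the two outputs
        out = torch.zeros_like(x)
        for slot in range(2):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e           # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```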
Rakuten AI 2.0 mini is a 1.5-billion-parameter dense model, trained from scratch on a mix of English and Japanese data and developed to enable cost-effective deployment on edge devices for focused use cases. The instruct versions of both models were created through instruction fine-tuning and preference optimization of the respective foundation models.
All the models are released under the Apache 2.0 license*4 and are available from the official Rakuten Group Hugging Face repository*5. All models can be used commercially for various text generation tasks such as summarizing content, answering questions, general text understanding and building dialogue systems. In addition, the models can be used as a base for building other models.
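As a usage illustration, the sketch below shows how one of the released instruct models could be loaded for text generation with the Hugging Face transformers library. The model id shown is an assumption for illustration; check the official Rakuten repository on Hugging Face for the exact names.

```python
# Illustrative sketch only; the model id is assumed, verify it on
# https://huggingface.co/Rakuten before use. Requires transformers and accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rakuten/RakutenAI-2.0-mini-instruct"  # assumed id for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto",
                                             device_map="auto")

messages = [{"role": "user", "content": "Summarize the benefits of MoE models in Japanese."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```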
"I am incredibly proud of how our team has combined data, engineering and science to deliver Rakuten AI 2.0," commented Ting Cai, Chief AI & Data Officer of Rakuten Group. "Our new AI models deliver powerful, cost-effective solutions that empower businesses to make intelligent tradeoffs that accelerate time to value and unlock new possibilities. By sharing open models, we aim to accelerate AI development in Japan. We’re encouraging every business in Japan to build, experiment and grow with AI, and hope to foster a collaborative community that drives progress for all."
Innovative technique for LLM human preference optimization
During the Rakuten AI 2.0 foundation model fine-tuning process, the development team leveraged the innovative SimPO*6 (Simple Preference Optimization with a Reference-Free Reward) technique for preference optimization. Compared with traditional RLHF (Reinforcement Learning from Human Feedback) or the simplified DPO (Direct Preference Optimization), SimPO combines the benefits of simplicity, stability and efficiency, making it a cost-efficient and practical alternative for fine-tuning AI models to align with human preferences.
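For reference, the sketch below expresses the SimPO objective as described in the paper cited in footnote 6: the implicit reward is the length-normalized log probability of a response, and the loss rewards a margin between preferred and dispreferred responses. This is a conceptual sketch, not Rakuten's training code; the function name and hyperparameter values are illustrative.

```python
# Sketch of the SimPO loss (https://arxiv.org/abs/2405.14734); illustrative only.
import torch
import torch.nn.functional as F

def simpo_loss(chosen_logps: torch.Tensor,      # summed log-probs of preferred responses
               rejected_logps: torch.Tensor,    # summed log-probs of dispreferred responses
               chosen_lengths: torch.Tensor,    # token counts of preferred responses
               rejected_lengths: torch.Tensor,  # token counts of dispreferred responses
               beta: float = 2.0,               # reward scale (illustrative value)
               gamma: float = 0.5) -> torch.Tensor:  # target reward margin (illustrative)
    # Implicit reward: length-normalized average log probability, no reference model needed.
    reward_chosen = beta * chosen_logps / chosen_lengths
    reward_rejected = beta * rejected_logps / rejected_lengths
    # Encourage the preferred response's reward to exceed the dispreferred one by gamma.
    return -F.logsigmoid(reward_chosen - reward_rejected - gamma).mean()
```

Because the reward is computed directly from the policy model's own output probabilities, no separate reference model needs to be kept in memory during training, which is the main source of SimPO's efficiency advantage over DPO.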
Best-in-class Japanese performance
After further fine-tuning the foundation models for conversational and instruction-following abilities, Rakuten evaluated the models using average scores from Japanese MT-Bench*7. Japanese MT-Bench is currently the standard benchmark for fine-tuned instruct models and specifically measures conversational and instruction-following ability.
Comparative scores for the Rakuten AI 2.0 and Rakuten AI 2.0 mini instruct models, alongside other top-performing models from Japanese companies and academic institutions, are given in the tables below*8:
Rakuten AI 2.0 (LLM)

| Instruct Model Name | Size | Active Parameters | Japanese Score |
| --- | --- | --- | --- |
| Llm-jp/llm-jp-3-13b-instruct | 13B | 13B | 5.68 |
| Elyza/ELYZA-japanese-Llama-2-13b-instruct | 13B | 13B | 4.09 |
| Rakuten/Rakuten AI 2.0-8x7B-instruct | 8x7B (47B)*9 | 13B | 7.08 |
| Karakuri-ai/karakuri-lm-8x7b-instruct-v0.1 | 8x7B (47B) | 13B | 5.92 |
| Weblab-GENIAC/Tanuki-8x8B-dpo-v1.0 | 47B | 13B | 6.96 |
| Cyberagent/calm3-22b-chat | 22B | 22B | 6.93 |
Rakuten AI 2.0 mini (SLM)

| Instruct Model Name | Size | Japanese Score |
| --- | --- | --- |
| Rakuten/Rakuten AI 2.0-mini-instruct | 1.5B | 4.91 |
| Llm-jp/llm-jp-3-1.8b-instruct | 1.8B | 4.70 |
| Llm-jp/llm-jp-3-3.7b-instruct | 3.7B | 4.98 |
| SakanaAI/EvoLLM-JP-A-v1-7B | 7B | 3.80 |
| SakanaAI/EvoLLM-JP-v1-7B | 7B | 4.58 |
Rakuten AI 2.0-instruct is the top performer among open models from other Japanese companies and academic institutions with a similar number of active parameters. Rakuten AI 2.0-mini-instruct is the top performer among open models of similar size.
Rakuten is continuously pushing the boundaries of innovation to develop best-in-class LLMs for R&D and deliver best-in-class AI services to its customers. By developing models in-house, Rakuten can build up its knowledge and expertise, and create models optimized to support the Rakuten Ecosystem. By making the models open, Rakuten aims to contribute to the open-source community and accelerate the development of local AI applications and Japanese language LLMs.
As new breakthroughs in AI trigger transformations across industries, Rakuten’s AI-nization initiative aims to implement AI in every aspect of its business to drive further growth. Rakuten is committed to making AI a force for good that augments humanity, drives productivity and fosters prosperity.
*1 Mixture of Experts is an AI model architecture in which the model is divided into multiple sub-models, known as experts. During training and inference, only a subset of the experts is activated to process each input.
*2 Foundation models are models that have been pre-trained on vast amounts of data and can then be fine-tuned for specific tasks or applications.
*3 An instruct model is a version of a foundation model fine-tuned on instruction-style data, enabling the model to respond to prompted instructions.
*4 About the Apache 2.0 License: https://www.apache.org/licenses/LICENSE-2.0
*5 Rakuten Group Official Hugging Face repository: https://huggingface.co/Rakuten
*6 SimPO uses the average log probability of the model's output sequence as the implicit reward instead of relying on a reference model. This method reduces computational overhead and enables preference optimization of larger models: https://arxiv.org/abs/2405.14734
*7 Evaluations were carried out on Japanese MT-Bench, a set of 80 challenging open-ended questions for evaluating chat assistants across eight dimensions: writing, roleplay, reasoning, math, coding, extraction, STEM and humanities: https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge
Responses are evaluated with GPT-4o (gpt-4o-2024-05-13) as the judge, in line with a public leaderboard.
*8 Scores for other models are taken from a public leaderboard maintained by Weights and Biases on January 27, 2025: https://wandb.ai/wandb-japan/llm-leaderboard3/reports/Nejumi-LLM-3--Vmlldzo3OTg2NjM2
*9 The size of 8x7B models is less than 56B because parameters other than the experts are shared.