LLM-jp

News

Mid-Training LLM-jp on OLMo2 Data: Setup, Results, and Practical Tips

We mid-trained the LLM-jp model on the OLMo2 mid-training datasets released by Ai2 and evaluated the result.
The results were clear: applying mid-training led to a significant improvement on GSM8K.
For further details, please see the accompanying blog post.