Learning Korean with MLX feat. Mistral 7B
Goal
- Train on Korean data with MLX
- Build a text-generating AI that writes wuxia novels in the style of Jin Yong
Result: Failure
But it did get a little better at Korean
Training Process
- Machine: M1 Pro
- Framework: MLX
- Training model: Mistral 7B
- Training data: The Return of the Condor Heroes, converted into JSONL format myself
- Time spent: 1 hour
- Training method: LoRA
- LoRA script: python lora.py --model ./mlx_model --train --iters 600
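The JSONL conversion step mentioned above can be sketched roughly as follows. This is not the script I actually used; it is a minimal illustration that assumes the one-object-per-line {"text": ...} layout and the train.jsonl / valid.jsonl file names. The function names, the 1000-character chunk size, and the 10% validation split are all hypothetical choices.

```python
import json
from pathlib import Path


def chunk_text(text, max_chars=1000):
    """Group paragraphs into chunks of roughly max_chars characters.

    A single paragraph longer than max_chars is kept whole, so a chunk
    can exceed the limit in that case.
    """
    chunks, current = [], ""
    for para in text.split("\n"):
        para = para.strip()
        if not para:
            continue  # skip blank lines
        # Start a new chunk when adding this paragraph would pass the limit.
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def write_jsonl(chunks, path):
    """Write one {"text": ...} JSON object per line, keeping Korean readable."""
    with open(path, "w", encoding="utf-8") as f:
        for chunk in chunks:
            f.write(json.dumps({"text": chunk}, ensure_ascii=False) + "\n")


def build_dataset(text, out_dir, valid_ratio=0.1, max_chars=1000):
    """Split the chunks into train.jsonl and valid.jsonl (assumed file names)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    chunks = chunk_text(text, max_chars)
    # Hold out the tail of the book for validation; at least one chunk
    # when there is more than one chunk in total.
    n_valid = max(1, int(len(chunks) * valid_ratio)) if len(chunks) > 1 else 0
    split = len(chunks) - n_valid
    write_jsonl(chunks[:split], out / "train.jsonl")
    write_jsonl(chunks[split:], out / "valid.jsonl")
    return split, n_valid
```

Calling build_dataset(book_text, "data") on the full text of a novel would produce a data directory that a training script could read, assuming it expects those two files in that layout.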
Training Results
Mistral 7B base model: barely speaks Korean

LoRA: it does respond in Korean

python lora.py --model ./mlx_model --adapter-file ./adapters.npz --max-tokens 50 --prompt "서독 구양봉이 사용하는 무공의 위력은 "
Presumed Cause of Failure
Insufficient Korean training in the base model: training on an English original (Will Durant's The Story of Civilization, Volume 13, Greece-Rome) works well
Lack of data: 8 books’ worth of data seems insufficient
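To put a number on "insufficient", the size of the training file can be measured before training. The sketch below is hypothetical rather than something I ran for this post; it assumes a JSONL file whose records carry the text under a "text" key and simply counts records and characters.

```python
import json


def jsonl_stats(path, key="text"):
    """Return (record count, total characters) for a JSONL file."""
    n_records, n_chars = 0, 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            n_records += 1
            n_chars += len(json.loads(line)[key])
    return n_records, n_chars
```

Character counts are only a rough proxy for data size: how many tokens the same text becomes depends on the tokenizer, and a tokenizer built mostly for English may split Korean text into more tokens per character.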
Future Tasks
Collect high-quality data for Korean training
Gain a deeper understanding of MLX
Review successful Korean training examples
20240130