Soket AI: Frontier Bits
About
Exploring the Frontier of AI
Categories
All (3)
Indic Languages (1)
LLM (1)
LLMs (2)
Model Initialization (1)
MuonClip (1)
Optimisation (1)
Transfer Learning (2)
QK-Clip: Taking Muon one step further on the road to scale up
LLM, MuonClip, Optimisation
Translated to English. Original can be found here: QK-Clip:让Muon在Scaleup之路上更进一步
Abhishek Upperwal (Translated) | Original: Jianlin Su
Jul 12, 2025
Transfer Learning for Low-Resource Languages in LLMs
LLMs, Transfer Learning, Indic Languages
Large Language Models (LLMs) rely on huge training corpora – often trillions of tokens – which gives an advantage to high-resource languages like English. In contrast, many…
Abhishek Upperwal
Apr 18, 2025
Transferring Weight Distributions for Efficient Language Model Initialization
LLMs, Transfer Learning, Model Initialization
Training large language models (LLMs) from scratch requires enormous amounts of compute and data. Typically, this involves initializing model weights randomly and then…
Abhishek Upperwal
Apr 18, 2025