Exploring the Frontier of AI – Soket AI: Frontier Bits

QK-Clip: Taking Muon one step further on the road to scale up

Translated into English. The original can be found here: QK-Clip：让Muon在Scaleup之路上更进一步

Abhishek Upperwal (Translator) | Original: Jianlin Su

Transfer Learning for Low-Resource Languages in LLMs

Transfer Learning

Indic Languages

Large Language Models (LLMs) rely on huge training corpora – often trillions of tokens – which gives an advantage to high-resource languages like English. In contrast, many…

Abhishek Upperwal

Transferring Weight Distributions for Efficient Language Model Initialization

Transfer Learning

Model Initialization

Training large language models (LLMs) from scratch requires enormous amounts of compute and data. Typically, this involves initializing model weights randomly and then…

Abhishek Upperwal