1 Comment

> LiteLlama-460M-1T

I'm a bit curious: the model was trained only on RedPajama, which is a pre-training corpus for next-word prediction. Can it be evaluated directly on MMLU, using zero-shot or few-shot prompting?
