Discussion about this post

Andreas

> LiteLlama-460M-1T

I'm a bit curious: the model was trained only on RedPajama, which is a pre-training corpus for next-word prediction. Can it be evaluated directly on the MMLU task, in a zero-shot or few-shot setting?
