Overturning OpenAI's Conclusion: DeepMind Redefines the Relationship Between Parameters and Scale in Pretraining
大数据文摘
2022-12-22 06:25
Training Compute-Optimal Large Language Models
https://arxiv.org/pdf/2203.15556.pdf
[1] Scaling Laws for Neural Language Models
[2] https://www.lesswrong.com/posts/midXmMb2Xg37F2Kgn/new-scaling-laws-for-large-language-models
[3] https://www.zhihu.com/question/570189639/answer/2787763735