Hugging Face releases new PyTorch library "Accelerate": multi-GPU, TPU, and mixed-precision training
Most high-level PyTorch libraries support distributed and mixed-precision training, but the abstractions they introduce force users to learn a new API to customize their training loop. Many PyTorch users want full control over their training loop without writing and maintaining the boilerplate that training requires. Hugging Face's newly released library, Accelerate, addresses exactly this problem. The diff below shows the handful of changes needed to make a standard PyTorch training loop run on any distributed setup (lines prefixed with + are added, lines prefixed with - are removed):
  import torch
  import torch.nn.functional as F
  from datasets import load_dataset
+ from accelerate import Accelerator

+ accelerator = Accelerator()
- device = 'cpu'
+ device = accelerator.device

  model = torch.nn.Transformer().to(device)
  optim = torch.optim.Adam(model.parameters())

  dataset = load_dataset('my_dataset')
  data = torch.utils.data.DataLoader(dataset, shuffle=True)

+ model, optim, data = accelerator.prepare(model, optim, data)

  model.train()
  for epoch in range(10):
      for source, targets in data:
          source = source.to(device)
          targets = targets.to(device)

          optim.zero_grad()

          output = model(source)
          loss = F.cross_entropy(output, targets)

+         accelerator.backward(loss)
-         loss.backward()

          optim.step()
Accelerate can also handle device placement for you, which shrinks the change to the original code further still; in that case the explicit .to(device) calls can be dropped entirely:

  import torch
  import torch.nn.functional as F
  from datasets import load_dataset
+ from accelerate import Accelerator

+ accelerator = Accelerator()
- device = 'cpu'

- model = torch.nn.Transformer().to(device)
+ model = torch.nn.Transformer()
  optim = torch.optim.Adam(model.parameters())

  dataset = load_dataset('my_dataset')
  data = torch.utils.data.DataLoader(dataset, shuffle=True)

+ model, optim, data = accelerator.prepare(model, optim, data)

  model.train()
  for epoch in range(10):
      for source, targets in data:
-         source = source.to(device)
-         targets = targets.to(device)

          optim.zero_grad()

          output = model(source)
          loss = F.cross_entropy(output, targets)

+         accelerator.backward(loss)
-         loss.backward()

          optim.step()
The same script can then run on any kind of setup. On your machine, first answer a short questionnaire about your hardware to generate a config file, then launch the script:

accelerate config
accelerate launch my_script.py --args_to_my_script
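As an illustration, launching the same script on two GPUs of a single machine might look like the following; these flags exist in the accelerate CLI, but their exact set varies across Accelerate versions, so treat this as a sketch rather than a canonical invocation:

accelerate launch --multi_gpu --num_processes 2 my_script.py --args_to_my_script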
Stripped down to its essentials, Accelerate adds just three lines to a standard training loop:

accelerator = Accelerator()
model, optim, data = accelerator.prepare(model, optim, data)
accelerator.backward(loss)
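To make the pattern concrete, here is a minimal self-contained sketch of such a loop. It mirrors the examples above, but because 'my_dataset' is only a placeholder, random tensors and a linear model (both hypothetical stand-ins, not from the original example) are substituted so the script runs as-is:

# Minimal Accelerate training loop; the model, data, and
# hyperparameters are toy stand-ins, not part of the original article.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# Toy classification data: 256 random 32-dim vectors, 4 classes.
inputs = torch.randn(256, 32)
labels = torch.randint(0, 4, (256,))
data = DataLoader(TensorDataset(inputs, labels), batch_size=16, shuffle=True)

model = torch.nn.Linear(32, 4)  # stand-in for a real model
optim = torch.optim.Adam(model.parameters())

# prepare() wraps the objects for the current setup (CPU, GPU(s), TPU, ...)
# and, as in the second example above, also handles device placement.
model, optim, data = accelerator.prepare(model, optim, data)

model.train()
for epoch in range(3):
    for source, targets in data:
        optim.zero_grad()
        output = model(source)
        loss = F.cross_entropy(output, targets)
        accelerator.backward(loss)  # replaces loss.backward()
        optim.step()
    # accelerator.print only prints on the main process
    accelerator.print(f"epoch {epoch}: loss {loss.item():.4f}")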
Accelerate currently supports the following setups:

CPU
single GPU
multi-GPU on a single node (machine)
multi-GPU across multiple nodes (machines)
TPU
FP16 with native AMP (apex support is on the roadmap)
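Mixed precision can be enabled either through the accelerate config questionnaire or directly in code. A minimal sketch, assuming a recent Accelerate version (recent releases accept a mixed_precision argument; the very first releases used fp16=True instead):

from accelerate import Accelerator

# Enable FP16 training via native AMP; the rest of the training
# loop is unchanged. Keyword as in recent Accelerate versions.
accelerator = Accelerator(mixed_precision="fp16")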