Data Parallelism: How to Train Deep Learning Models on Multiple GPUs

Modern deep learning problems rely on increasingly large datasets and increasingly complex models, so training them effectively and efficiently requires significant computational power. Learning to distribute data across multiple GPUs during training opens up a wide range of new deep learning applications.

Using multi-GPU systems effectively also reduces training time, which allows faster application development and faster iteration cycles. Teams that can train on multiple GPUs have an edge: they build models on more data in less time, with greater engineer productivity.

This workshop teaches you techniques for data-parallel deep learning training on multiple GPUs to shorten the training time required for data-intensive applications. Working with deep learning tools, frameworks, and workflows to perform neural network training, you’ll learn how to decrease model training time by distributing data to multiple GPUs while retaining the accuracy of training on a single GPU.

In this workshop, attendees will learn how to:

  • Perform data-parallel deep learning training with multiple GPUs
  • Achieve maximum training throughput to make the best use of multiple GPUs
  • Distribute training to multiple GPUs using PyTorch Distributed Data Parallel (DDP)
  • Understand and apply the algorithmic considerations that affect multi-GPU training performance and accuracy

Tools, libraries, and frameworks: PyTorch, PyTorch Distributed Data Parallel, NVIDIA Collective Communications Library (NCCL)
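
As a rough illustration of the idea behind data-parallel training, the sketch below splits one batch into equal shards, computes a gradient on each shard, and averages the results. It is not workshop material and does not use PyTorch DDP or NCCL: plain Python lists stand in for GPUs, a simple average stands in for the all-reduce step, and the model (y = w * x), the data, and all function names are assumptions made for the example. With equal-sized shards, the averaged gradient matches the gradient computed on the whole batch by a single worker, which is why accuracy can be retained.

```python
# Illustrative sketch only: no GPUs, no PyTorch. All names here are hypothetical.

def gradient(w, xs, ys):
    """Gradient of mean squared error for the model y = w * x over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def shard(items, num_workers):
    """Split a batch into equal contiguous shards, one per worker."""
    size = len(items) // num_workers
    return [items[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_gradient(w, xs, ys, num_workers):
    """Each worker computes a gradient on its own shard; the results are averaged.

    The average plays the role that an all-reduce plays on real hardware.
    """
    if len(xs) % num_workers != 0:
        raise ValueError("batch size must be divisible by the number of workers")
    local_grads = [
        gradient(w, shard_x, shard_y)
        for shard_x, shard_y in zip(shard(xs, num_workers), shard(ys, num_workers))
    ]
    return sum(local_grads) / num_workers


# Toy data generated from y = 2 * x (an assumption for the example).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
w = 0.5

single = gradient(w, xs, ys)                                  # one worker, full batch
parallel = data_parallel_gradient(w, xs, ys, num_workers=4)   # four workers, two samples each

print(single, parallel)
```

The equivalence relies on every shard having the same size; with uneven shards the per-worker gradients would need to be weighted by shard size before averaging.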

More info and Registration

To register for this event please visit the following URL: https://events.it4i.cz/event/195/registrations/99/

Date And Time

04-10-23 @ 08:00 to 04-10-23 @ 16:00
