Transformer Engine

From HPCWIKI

What is Transformer Engine (TE)

Most deep learning frameworks train with FP32 by default. However, full FP32 is not essential to achieve full accuracy for many deep learning models. Mixed-precision training, which combines single-precision (FP32) with a lower-precision format (FP16) when training a model, yields significant speedups with minimal accuracy differences compared to pure FP32 training. The Hopper and Ada GPU architectures introduced FP8 precision, which offers improved performance over FP16 with no degradation in accuracy. Although all major deep learning frameworks support FP16, FP8 support is not yet available natively in those frameworks.
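The trade-off between the formats comes down to their bit layouts. A short sketch, using the standard IEEE half-precision layout and the two FP8 variants (E4M3 and E5M2) supported by Hopper/Ada hardware, shows the maximum finite value each format can represent:

```python
# FP16 (IEEE half): 5 exponent bits (bias 15), 10 mantissa bits;
# the top exponent field is reserved for inf/NaN.
fp16_max = (2 - 2**-10) * 2**15          # 65504.0

# FP8 E5M2 (range-oriented): 5 exponent bits (bias 15), 2 mantissa bits;
# top exponent reserved for inf/NaN, like IEEE formats.
e5m2_max = (2 - 2**-2) * 2**15           # 57344.0

# FP8 E4M3 (precision-oriented): 4 exponent bits (bias 7), 3 mantissa bits;
# only the all-ones exponent-and-mantissa pattern encodes NaN, so the top
# exponent remains usable and the largest mantissa is 0b110 -> 1.75.
e4m3_max = (1 + 2**-1 + 2**-2) * 2**8    # 448.0

print(fp16_max, e5m2_max, e4m3_max)
```

The much smaller dynamic range of FP8, especially E4M3, is why FP8 training relies on per-tensor scaling factors, which TE manages automatically.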


NVIDIA® Transformer Engine is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.[1]


TE addresses the lack of native FP8 support by providing two APIs: a Python API consisting of modules to easily build a Transformer layer, and a framework-agnostic C++ library including the structs and kernels needed for FP8 training. Together, these greatly simplify mixed-precision training for users.

TE is open source under the Apache-2.0 license and provides:

  • Support for FP8 on NVIDIA Hopper and NVIDIA Ada GPUs
  • Support for optimizations across all precisions (FP16, BF16) on the NVIDIA Ampere GPU architecture and later
  • Optimizations (e.g. fused kernels) for Transformer models
  • Easy-to-use modules for building Transformer layers with FP8 support

Transformer Engine has been integrated with popular LLM frameworks such as:[2]

  • DeepSpeed
  • Hugging Face Accelerate
  • Lightning
  • MosaicML Composer
  • NVIDIA JAX Toolbox
  • NVIDIA Megatron-LM
  • NVIDIA NeMo Framework
  • Amazon SageMaker Model Parallel Library
  • Levanter
  • Hugging Face Nanotron - Coming soon!
  • Colossal-AI - Coming soon!
  • PeriFlow - Coming soon!
  • GPT-NeoX - Coming soon!

TE introduction videos

References