transformer architecture

The transformer architecture is a neural network design that models dependencies within a sequence using self-attention instead of recurrence or convolution.
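To make self-attention concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It assumes identity query, key, and value projections, whereas a real model learns separate weight matrices for each. The names `softmax`, `self_attention`, and `tokens` are illustrative and do not come from any library.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    Assumption: queries, keys, and values are the token vectors themselves
    (no learned projections), which keeps the sketch short.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Compare this position with every position, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all value vectors.
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, tokens)) for i in range(dim)]
        )
    return outputs, weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Every position reads from every other position in a single step, which is what lets the model relate distant tokens without stepping through the sequence one element at a time.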

A standard transformer stacks encoder and decoder blocks, each built from multi-head self-attention and position-wise feed-forward layers wrapped in residual connections and layer normalization.
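The wrapping described above can be sketched as a small "add and norm" helper around a feed-forward sublayer. This is a simplified illustration under several assumptions: it uses the post-norm ordering (normalize after adding the residual), omits biases and the learned scale and shift of layer normalization, and uses made-up weights `w1` and `w2`. All function names are hypothetical.

```python
import math


def layer_norm(vector, eps=1e-5):
    """Normalize one vector to zero mean and (roughly) unit variance."""
    mean = sum(vector) / len(vector)
    var = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(var + eps) for v in vector]


def feed_forward(vector, w1, w2):
    """Position-wise feed-forward: linear, ReLU, linear (biases omitted).

    w1 holds one weight vector per hidden unit,
    w2 holds one weight vector per output unit.
    """
    hidden = [max(0.0, sum(v * w for v, w in zip(vector, unit))) for unit in w1]
    return [sum(h * w for h, w in zip(hidden, unit)) for unit in w2]


def add_and_norm(vector, sublayer):
    """Residual connection followed by layer normalization (post-norm)."""
    update = sublayer(vector)
    return layer_norm([x + u for x, u in zip(vector, update)])


# Illustrative weights: 2 inputs -> 3 hidden units -> 2 outputs.
w1 = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]]
w2 = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]

x = [2.0, -1.0]
y = add_and_norm(x, lambda v: feed_forward(v, w1, w2))
```

The same `add_and_norm` wrapper would apply to the attention sublayer. Some models normalize before the sublayer instead (pre-norm), so treat the ordering here as one option rather than a rule.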

Transformers can be specialized for different goals, such as encoder-only models for representation and discrimination, decoder-only models for autoregressive generation, and encoder–decoder models for sequence-to-sequence tasks.
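One practical difference between these variants is which positions may attend to each other. Encoder-only models typically let every position see the whole sequence, while decoder-only models apply a causal mask so that each position sees only itself and earlier positions. The following sketch builds both masks; `attention_mask` is a hypothetical helper, not a library function.

```python
def attention_mask(seq_len, causal):
    """Return a seq_len x seq_len grid where True means i may attend to j."""
    return [
        [(j <= i) if causal else True for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Bidirectional mask, as typically used by encoder-only models.
encoder_mask = attention_mask(3, causal=False)

# Causal mask, as used by decoder-only models for autoregressive generation.
decoder_mask = attention_mask(3, causal=True)
```

In an encoder-decoder model, the encoder side would use the bidirectional mask and the decoder side the causal one, with the decoder additionally attending to the encoder's outputs.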

Tutorial

Hugging Face Transformers: Leverage Open-Source AI in Python

As the AI boom continues, the Hugging Face platform stands out as the leading open-source model hub. In this tutorial, you'll get hands-on experience with Hugging Face and the Transformers library in Python.

For additional information on related topics, take a look at the following resources:

PyTorch vs TensorFlow for Your Python Deep Learning Project (Tutorial)
Hugging Face Transformers (Quiz)
Python Deep Learning: PyTorch vs Tensorflow (Course)

By Leodanis Pozo Ramos • Updated May 29, 2026
