
Topic 1: The Bottleneck of Sequential Models (Introduction)
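Break the sequential bottleneck! Compare the O(n) sequential operations an RNN needs to process a length-n sequence with the Transformer, whose self-attention layers process all positions in parallel. Analyze what this difference means for hardware efficiency and why it drove the shift to self-attention.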
Content adapted from Attention Is All You Need by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin.
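
To make the contrast above concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes a plain tanh RNN and a single unmasked attention head. The function names, weight matrices, and the step counter are hypothetical, introduced only for illustration. The RNN must loop over positions because each hidden state depends on the previous one, while self-attention computes every position with a few matrix products.

```python
import numpy as np

def rnn_forward(x, W_x, W_h):
    """Plain tanh RNN (illustrative). h_t depends on h_{t-1}, so the loop
    over t cannot be parallelized across positions."""
    n, _ = x.shape
    h = np.zeros(W_h.shape[0])
    states = []
    sequential_steps = 0
    for t in range(n):                      # n dependent steps: O(n)
        h = np.tanh(x[t] @ W_x + h @ W_h)
        states.append(h)
        sequential_steps += 1
    return np.stack(states), sequential_steps

def self_attention(x, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention (illustrative, no mask).
    All positions are handled by the same matrix products, with no loop
    over positions."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(k.shape[-1])         # (n, n) pairwise scores
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    sequential_steps = 1                    # one position-independent pass: O(1)
    return weights @ v, sequential_steps

rng = np.random.default_rng(0)
n, d = 6, 4                                 # assumed toy sizes
x = rng.normal(size=(n, d))
W_x, W_h = rng.normal(size=(d, d)), rng.normal(size=(d, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

rnn_out, rnn_steps = rnn_forward(x, W_x, W_h)
attn_out, attn_steps = self_attention(x, W_q, W_k, W_v)
print(rnn_out.shape, rnn_steps)    # (6, 4) 6
print(attn_out.shape, attn_steps)  # (6, 4) 1
```

The trade-off is that the attention scores form an n × n matrix, so parallelism is bought with work that grows quadratically in sequence length. The step counter here only tracks dependent steps along the sequence, not total arithmetic.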