Transformer architecture for machine translation explained


Let’s say we want to translate from English to German.

We can clearly see that we cannot translate each word directly from one language to the other. For example, the English word “the” can be translated as “der”, “die”, or “das”, depending on the gender of the noun. Word order can also vary greatly…
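
To make the ambiguity concrete, here is a minimal sketch of a naive word-by-word lookup. The tiny `LEXICON` and the helper names `candidates` and `is_ambiguous` are hypothetical and exist only for illustration; they are not part of any real translation system.

```python
# Hypothetical toy lexicon: each English word maps to all of its German candidates.
LEXICON = {
    "the": ["der", "die", "das"],  # the right article depends on the noun's gender
    "dog": ["Hund"],
    "cat": ["Katze"],
    "house": ["Haus"],
}


def candidates(sentence):
    """Return the German candidates for each word, kept in English word order."""
    return [LEXICON.get(word, [word]) for word in sentence.lower().split()]


def is_ambiguous(sentence):
    """True if at least one word has more than one possible translation."""
    return any(len(options) > 1 for options in candidates(sentence))


print(candidates("the dog"))    # the lookup alone cannot pick between der/die/das
print(is_ambiguous("the cat"))
```

A lookup like this can only list the options for “the”. Choosing the right one requires looking at the noun that follows, and the output order simply copies the English order, which is exactly the kind of context a translation model has to take into account.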


