Thursday, May 28, 2020

Encoder Decoder Explained

Neural Machine Translation

x_1, x_2, ..., x_Tx - input sentence
y_1, y_2, ..., y_Ty - output sentence
Tx != Ty in general, i.e., the input sentence and the output sentence can have different lengths.
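Here is a minimal sketch of what such an encoder-decoder could look like in PyTorch. The class names, GRU layers, and layer sizes are illustrative assumptions rather than a reference implementation; the point is that the encoder compresses x_1..x_Tx into one fixed-size vector, and the decoder generates y_1..y_Ty from it one word at a time.

import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Reads the source sentence x_1..x_Tx and compresses it into a single context vector."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, src):                   # src: (batch, Tx) token ids
        embedded = self.embed(src)             # (batch, Tx, embed_dim)
        _, hidden = self.rnn(embedded)         # hidden: (1, batch, hidden_dim)
        return hidden                          # fixed-size encoding of the whole sentence

class Decoder(nn.Module):
    """Generates the target sentence y_1..y_Ty one token at a time, seeded with the encoder's context."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, prev_token, hidden):     # prev_token: (batch, 1), previously generated word
        embedded = self.embed(prev_token)      # (batch, 1, embed_dim)
        output, hidden = self.rnn(embedded, hidden)
        logits = self.out(output.squeeze(1))   # (batch, vocab_size) scores for the next word
        return logits, hidden

Notice that the decoder only ever sees the single context vector produced by the encoder, no matter how long the input sentence is.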


The problem with regular encoder-decoder architectures arises with long sentences, because RNNs struggle to compress a long input into a single fixed-size encoding. For example, when translating a long sentence, we probably don't read the whole sentence and then translate it; the human mind reads parts of the sentence and processes the translation as it goes. This leads us to attention models: while translating a word, the model weights the input words differently, focusing on the parts most relevant to that word. We will cover attention models in a separate post. Next we explore another encoder-decoder architecture where the input is an image, so the encoder produces an image encoding.



Image Caption Generation

Encoder : AlexNet or any other computer vision model can generate the image encoding
Decoder : An RNN-like architecture can decode the encoding to generate the image caption
y_1, y_2, ..., y_T - generated caption, one word per time step
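A rough sketch of how this could be wired up in PyTorch, assuming torchvision's pretrained AlexNet as the encoder backbone (the class names and layer sizes are illustrative assumptions): the CNN produces a fixed-size image encoding, which is then used as the initial hidden state of the RNN decoder that emits the caption words.

import torch.nn as nn
from torchvision import models

class ImageEncoder(nn.Module):
    """Turns an image into a fixed-size feature vector using a pretrained AlexNet backbone."""
    def __init__(self, encoding_dim=512):
        super().__init__()
        alexnet = models.alexnet(pretrained=True)
        self.features = alexnet.features               # convolutional layers only
        self.pool = nn.AdaptiveAvgPool2d((6, 6))
        self.fc = nn.Linear(256 * 6 * 6, encoding_dim) # project to the desired encoding size

    def forward(self, images):                          # images: (batch, 3, H, W)
        x = self.pool(self.features(images)).flatten(1)
        return self.fc(x)                               # (batch, encoding_dim)

class CaptionDecoder(nn.Module):
    """RNN that starts from the image encoding and emits caption words y_1..y_T."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        # hidden_dim must match the encoder's encoding_dim, since the encoding seeds the hidden state
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, captions, image_encoding):        # captions: (batch, T) token ids
        hidden = image_encoding.unsqueeze(0)            # (1, batch, hidden_dim) initial hidden state
        outputs, _ = self.rnn(self.embed(captions), hidden)
        return self.out(outputs)                        # (batch, T, vocab_size) word scores

Compared with translation, only the encoder changes: the decoder still generates a word sequence, but its starting context comes from an image instead of a source sentence.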

