Since its debut in Neural Machine Translation (NMT), the attention mechanism has appeared widely and frequently across studies in Natural Language Processing (NLP) and Computer Vision (CV). As I am unfamiliar with CV, here I will only talk a bit about what attention is, both intuitively and in implementation, how it works, and hopefully why it proves useful.
