
# structural attention

## motivation

Attention works as a soft-selection module, so it can model structural dependencies implicitly.

## definition

Let $x = [x_1, \ldots, x_n]$ represent a sequence of inputs, let $q$ be a query, and let $z$ be a categorical latent variable with sample space $\{1, \ldots, n\}$. The input is accessed through an attention distribution $z \sim p(z|x,q)$. The context over the sequence is defined as the expectation $c = \mathbb{E}_{z \sim p(z|x,q)}[f(x,z)]$, where $f(x,z)$ is an *annotation function*. In this definition, the annotation function plays the role of the selection function in conventional attention, $f(x,z) = x_z$. The context vector can then be computed as a simple sum:

$$\textbf{c} = \mathbb{E}_{z \sim p(z|x,q)}[f(x,z)] = \sum_{i=1}^{n} p(z=i|x,q)\,\textbf{x}_i$$
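To make the definition concrete, here is a minimal NumPy sketch of this conventional (categorical) attention. The text above does not specify how $p(z|x,q)$ is parameterized, so the dot-product score $x_i^\top q$ used below is an assumption, and the function names are hypothetical.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over the last axis."""
    scores = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)

def attention_context(x, q):
    """Compute c = E_{z ~ p(z|x,q)}[f(x,z)] with f(x,z) = x_z.

    x: (n, d) array, the sequence of input annotations x_1..x_n
    q: (d,) array, the query vector
    """
    # p(z = i | x, q): assumed dot-product score, one common choice.
    p_z = softmax(x @ q)          # shape (n,), a categorical distribution
    # Context vector as the expectation: sum_i p(z=i|x,q) * x_i.
    c = p_z @ x                   # shape (d,)
    return c, p_z

# Tiny usage example with random inputs.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))       # n = 5 inputs of dimension d = 4
q = rng.normal(size=4)
c, p_z = attention_context(x, q)
print(p_z.sum())                  # 1.0: p(z|x,q) is a valid distribution
print(c.shape)                    # (4,)
```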
## method