ihit's diary

For quick notes

Summary of multidimensional derivatives

I couldn't follow equation (5.13) for multidimensional derivatives in ゼロから作るDeep Learning (Deep Learning from Scratch), so I wrote it out by hand. For personal reasons the order of the product of X and W is reversed relative to the book, but the essence should be the same.

Definitions first:
\begin{eqnarray}
\mathbf{W} &=& \left( \begin{array}{cc} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \end{array} \right) \\
\mathbf{X} &=& \left( \begin{array}{c} x_{1} \\ x_{2} \end{array} \right) \\
\mathbf{B} &=& \left( \begin{array}{c} b_{1} \\ b_{2} \\ b_{3} \end{array} \right) \\
\mathbf{Y} &=& \mathbf{W} \cdot \mathbf{X} + \mathbf{B}
\end{eqnarray}
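To make the shapes concrete, here is a minimal sketch of the forward computation in plain Python. The numeric values and the function name `affine` are made up for illustration and do not come from the book; column vectors are stored as flat lists.

```python
# Hypothetical values, chosen only to make the shapes concrete.
W = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]      # 3 x 2 matrix
X = [0.5, -1.0]       # 2 x 1 column vector, stored as a flat list
B = [0.1, 0.2, 0.3]   # 3 x 1 column vector, stored as a flat list


def affine(W, X, B):
    """Return Y = W . X + B, i.e. y_i = sum_j w_ij * x_j + b_i."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, X)) + b_i
            for row, b_i in zip(W, B)]


Y = affine(W, X, B)
print(Y)  # one entry per row of W, so Y has the same shape as B
```

Since W is 3×2 and X is 2×1, Y comes out 3×1, the same shape as B. That shape matching is worth keeping in mind for the derivatives that follow.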

From here, let's actually write out the multidimensional derivatives.
\begin{eqnarray}
\frac{\partial L}{\partial \mathbf{B}} &=& \left( \frac{\partial \mathbf{Y}}{\partial \mathbf{B}} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \left( \frac{\partial \left(\mathbf{W} \cdot \mathbf{X} + \mathbf{B}\right)}{\partial \mathbf{B}} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \left( \frac{\partial \left( \begin{array}{c} w_{11}x_{1}+w_{12}x_{2}+b_{1} \\ w_{21}x_{1}+w_{22}x_{2}+b_{2} \\ w_{31}x_{1}+w_{32}x_{2}+b_{3} \end{array} \right)}{\partial \left( \begin{array}{c} b_{1} \\ b_{2} \\ b_{3} \end{array} \right)} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \left( \begin{array}{ccc}
\frac{\partial (w_{11}x_{1}+w_{12}x_{2}+b_{1})}{\partial b_{1}} & \frac{\partial (w_{11}x_{1}+w_{12}x_{2}+b_{1})}{\partial b_{2}} & \frac{\partial (w_{11}x_{1}+w_{12}x_{2}+b_{1})}{\partial b_{3}} \\
\frac{\partial (w_{21}x_{1}+w_{22}x_{2}+b_{2})}{\partial b_{1}} & \frac{\partial (w_{21}x_{1}+w_{22}x_{2}+b_{2})}{\partial b_{2}} & \frac{\partial (w_{21}x_{1}+w_{22}x_{2}+b_{2})}{\partial b_{3}} \\
\frac{\partial (w_{31}x_{1}+w_{32}x_{2}+b_{3})}{\partial b_{1}} & \frac{\partial (w_{31}x_{1}+w_{32}x_{2}+b_{3})}{\partial b_{2}} & \frac{\partial (w_{31}x_{1}+w_{32}x_{2}+b_{3})}{\partial b_{3}}
\end{array} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right) \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \frac{\partial L}{\partial \mathbf{Y}}
\end{eqnarray}
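As a quick sanity check on ∂L/∂B = ∂L/∂Y, the result can be compared against a central-difference numerical gradient. This is only a sketch under assumptions: the loss L = ½·Σ yᵢ² is a stand-in chosen because its gradient ∂L/∂yᵢ = yᵢ is trivial to write down, and the numeric values and helper names are hypothetical.

```python
def affine(W, X, B):
    """Y = W . X + B with column vectors stored as flat lists."""
    return [sum(w * x for w, x in zip(row, X)) + b for row, b in zip(W, B)]


def loss(Y):
    # Hypothetical loss for illustration: L = 0.5 * sum(y_i^2), so dL/dy_i = y_i.
    return 0.5 * sum(y * y for y in Y)


# Hypothetical values.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
X = [0.5, -1.0]
B = [0.1, 0.2, 0.3]

Y = affine(W, X, B)
dY = list(Y)  # analytic dL/dY for the loss above

# Numerical dL/dB by central differences, one component of B at a time.
h = 1e-4
dB_numeric = []
for i in range(len(B)):
    B_plus = list(B)
    B_plus[i] += h
    B_minus = list(B)
    B_minus[i] -= h
    diff = loss(affine(W, X, B_plus)) - loss(affine(W, X, B_minus))
    dB_numeric.append(diff / (2 * h))

# The derivation says dL/dB should equal dL/dY component by component.
max_err_B = max(abs(n - a) for n, a in zip(dB_numeric, dY))
print(max_err_B)  # should be tiny (floating-point noise)
```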

Next up is this one:
\begin{eqnarray}
\frac{\partial L}{\partial \mathbf{X}} &=& \left( \frac{\partial \mathbf{Y}}{\partial \mathbf{X}} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \left( \frac{\partial \left(\mathbf{W} \cdot \mathbf{X} + \mathbf{B}\right)}{\partial \mathbf{X}} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \left( \begin{array}{cc}
\frac{\partial (w_{11}x_{1}+w_{12}x_{2}+b_{1})}{\partial x_{1}} & \frac{\partial (w_{11}x_{1}+w_{12}x_{2}+b_{1})}{\partial x_{2}} \\
\frac{\partial (w_{21}x_{1}+w_{22}x_{2}+b_{2})}{\partial x_{1}} & \frac{\partial (w_{21}x_{1}+w_{22}x_{2}+b_{2})}{\partial x_{2}} \\
\frac{\partial (w_{31}x_{1}+w_{32}x_{2}+b_{3})}{\partial x_{1}} & \frac{\partial (w_{31}x_{1}+w_{32}x_{2}+b_{3})}{\partial x_{2}}
\end{array} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \left( \begin{array}{cc} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \end{array} \right)^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \\
&=& \mathbf{W}^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}}
\end{eqnarray}

Finally, the partial derivative with respect to a matrix:
\begin{eqnarray}
\frac{\partial L}{\partial \mathbf{W}} &=& \frac{\partial L}{\partial \mathbf{Y}} \cdot \left( \frac{\partial \mathbf{Y}}{\partial \mathbf{W}} \right)^{\mathrm{T}} \\
&=& \frac{\partial L}{\partial \mathbf{Y}} \cdot \left( \frac{\partial \left(\mathbf{W} \cdot \mathbf{X}+\mathbf{B}\right)}{\partial \mathbf{W}} \right)^{\mathrm{T}} \\
&=& \frac{\partial L}{\partial \mathbf{Y}} \cdot \mathbf{X}^{\mathrm{T}}
\end{eqnarray}

Hmm, I don't really get this one…
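Even without a full grasp of the matrix chain rule, the two end results, ∂L/∂X = Wᵀ·∂L/∂Y and ∂L/∂W = ∂L/∂Y·Xᵀ, can at least be checked numerically. The sketch below again assumes a stand-in loss L = ½·Σ yᵢ² (so ∂L/∂Y = Y); the values and helper names are hypothetical.

```python
def affine(W, X, B):
    """Y = W . X + B with column vectors stored as flat lists."""
    return [sum(w * x for w, x in zip(row, X)) + b for row, b in zip(W, B)]


def loss(Y):
    # Hypothetical loss for illustration: L = 0.5 * sum(y_i^2), so dL/dy_i = y_i.
    return 0.5 * sum(y * y for y in Y)


def central_diff(f, h=1e-4):
    """Numerical derivative of f at 0, where f(t) is the loss after a shift of t."""
    return (f(h) - f(-h)) / (2 * h)


# Hypothetical values.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
X = [0.5, -1.0]
B = [0.1, 0.2, 0.3]
dY = affine(W, X, B)  # analytic dL/dY for the loss above

# Analytic results from the derivation.
dX_analytic = [sum(W[i][j] * dY[i] for i in range(3)) for j in range(2)]  # W^T . dL/dY
dW_analytic = [[dY[i] * X[j] for j in range(2)] for i in range(3)]        # dL/dY . X^T

# Numerical dL/dX: perturb one component of X at a time.
dX_numeric = []
for j in range(2):
    def shift_x(t, j=j):
        X_shifted = list(X)
        X_shifted[j] += t
        return loss(affine(W, X_shifted, B))
    dX_numeric.append(central_diff(shift_x))

# Numerical dL/dW: perturb one entry of W at a time.
dW_numeric = [[0.0, 0.0] for _ in range(3)]
for i in range(3):
    for j in range(2):
        def shift_w(t, i=i, j=j):
            W_shifted = [list(row) for row in W]
            W_shifted[i][j] += t
            return loss(affine(W_shifted, X, B))
        dW_numeric[i][j] = central_diff(shift_w)

max_err_X = max(abs(n - a) for n, a in zip(dX_numeric, dX_analytic))
max_err_W = max(abs(dW_numeric[i][j] - dW_analytic[i][j])
                for i in range(3) for j in range(2))
print(max_err_X, max_err_W)  # both should be tiny (floating-point noise)
```

Note that `dW_analytic` is 3×2, the same shape as W, and `dX_analytic` has two entries, the same shape as X.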

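Having worked through all this, I found there are two things I don't understand:

  1. The chain rule for matrix calculations (why the transpose? why does $$\frac{\partial L}{\partial \mathbf{Y}}$$ sit on opposite sides for W and X? and so on)

  2. Differentiating by a matrix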
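One possible way into both questions, offered as a sketch rather than a rigorous treatment, is to drop the matrix notation and apply the ordinary scalar chain rule to each component. From the definitions above, $$y_i = \sum_j w_{ij} x_j + b_i$$ (the index names i, j, k are introduced here for illustration). Assuming the convention that the derivative of L by a vector or matrix has the same shape as that vector or matrix, with entries $$\frac{\partial L}{\partial x_j}$$ and $$\frac{\partial L}{\partial w_{ij}}$$:
\begin{eqnarray}
\frac{\partial L}{\partial x_j} &=& \sum_i \frac{\partial L}{\partial y_i} \frac{\partial y_i}{\partial x_j} = \sum_i w_{ij} \frac{\partial L}{\partial y_i} = \left( \mathbf{W}^{\mathrm{T}} \cdot \frac{\partial L}{\partial \mathbf{Y}} \right)_j \\
\frac{\partial L}{\partial w_{ij}} &=& \sum_k \frac{\partial L}{\partial y_k} \frac{\partial y_k}{\partial w_{ij}} = \frac{\partial L}{\partial y_i} x_j = \left( \frac{\partial L}{\partial \mathbf{Y}} \cdot \mathbf{X}^{\mathrm{T}} \right)_{ij}
\end{eqnarray}

Read this way, the transpose and the left/right placement look like bookkeeping: they are the arrangement that makes the product come out with the right shape (2×1 for X, 3×2 for W). In the second line only the k = i term survives, because $$y_k$$ depends on $$w_{ij}$$ only when k = i.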

Guess I'll look into it a bit more.