[BERT] Parameter sharing across positions in the FFNN


카제xd 2023. 8. 21. 16:18

While the linear transformations are the same across different positions, they use different parameters from layer to layer.

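-> In BERT the FFNN is position-wise: within one layer, the same FFNN (the same weights) is applied to every position separately and identically, so the parameters are shared across positions. What differs is the layer: each layer has its own FFNN with its own weights.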
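The quoted sentence can be checked with a toy example. Below is a minimal NumPy sketch, not BERT's actual code: the sizes, the random weights, and the helper names `make_ffn_params` and `ffn` are assumptions made up for illustration, and it uses the ReLU form FFN(x) = max(0, xW1 + b1)W2 + b2 from Vaswani et al. (BERT uses GELU instead, which does not change how the weights are shared).

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_ff = 4, 8, 32  # toy sizes, chosen only for illustration


def make_ffn_params():
    """Create one (W1, b1, W2, b2) set; each layer owns exactly one such set."""
    return (rng.normal(size=(d_model, d_ff)), np.zeros(d_ff),
            rng.normal(size=(d_ff, d_model)), np.zeros(d_model))


def ffn(x, params):
    """FFN(x) = max(0, x W1 + b1) W2 + b2, applied along the last axis."""
    w1, b1, w2, b2 = params
    return np.maximum(0.0, x @ w1 + b1) @ w2 + b2


layer1, layer2 = make_ffn_params(), make_ffn_params()
x = rng.normal(size=(seq_len, d_model))  # one d_model vector per position

# Whole sequence at once vs. one position at a time with the SAME weights.
out_all = ffn(x, layer1)
out_per_position = np.stack([ffn(x[i], layer1) for i in range(seq_len)])
same_across_positions = np.allclose(out_all, out_per_position)

# Put the same vector at positions 0 and 3: a shared FFN must map both
# to the same output, regardless of where they sit in the sequence.
x_dup = x.copy()
x_dup[3] = x_dup[0]
y = ffn(x_dup, layer1)
identical_inputs_identical_outputs = np.allclose(y[0], y[3])

# A different layer has its own parameters, so its output differs.
differs_across_layers = not np.allclose(ffn(x, layer1), ffn(x, layer2))

print(same_across_positions, identical_inputs_identical_outputs,
      differs_across_layers)
```

The parameter shapes depend only on `d_model` and `d_ff`, never on `seq_len`, which is another way to see that there is no separate FFNN per position.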

[Source]

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
