[GEN] prompt engineering vs. prompt tuning (+PEFT, ICL)

Learning Log

카제xd 2023. 8. 22. 16:43

1. prompt engineering (uses manually-provided “hard prompts”)

- Prompt engineering is the effort of finding prompts that make an LLM give the good answers we expect.

- In-context learning (ICL) is a specific method of prompt engineering where demonstrations of the task are provided to the model as part of the prompt (in natural language). Unlike fine-tuning, it leaves the LLM itself untouched; the approach is to ask the question well at inference time (when querying). Depending on how many examples are given with the question, it is called zero-shot (none) or few-shot (a handful).
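To make the zero-shot vs. few-shot distinction concrete, here is a minimal sketch of building a hard prompt with and without demonstrations. The `build_prompt` helper, the "Review / Sentiment" template, and the example reviews are all hypothetical and made up for illustration; the only point is that the model is untouched and the prompt text alone changes.

```python
def build_prompt(question, demonstrations=()):
    """Build a hard prompt (hypothetical template).

    No demonstrations -> zero-shot prompt.
    One or more demonstrations -> few-shot prompt.
    """
    blocks = []
    for text, label in demonstrations:
        # Each demonstration is a solved example of the task.
        blocks.append(f"Review: {text}\nSentiment: {label}\n")
    # The actual question is left open for the model to complete.
    blocks.append(f"Review: {question}\nSentiment:")
    return "\n".join(blocks)


question = "The plot was dull."

# Zero-shot: only the question, no examples.
zero_shot = build_prompt(question)

# Few-shot: the same question preceded by two made-up demonstrations.
few_shot = build_prompt(
    question,
    [("I loved it.", "positive"), ("Waste of time.", "negative")],
)

print(zero_shot)
print("-----")
print(few_shot)
```

Either string would be sent to the same, unmodified LLM; only the number of in-prompt examples differs.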

2. prompt tuning (uses “soft prompts” that are generated by a small set of learnable parameters.)

prompt tuning: a single pre-trained model is reused for every downstream task, so the efficient-serving benefit of a frozen model is retained.

- A form of Parameter-efficient fine-tuning (PEFT), which freezes most of the parameters of a pre-trained LLM and fine-tunes only a small subset of them.

- effective mechanism for learning “soft prompts” to condition frozen language models to perform specific downstream tasks.

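- The pre-trained model is frozen, and only k additional tunable tokens per downstream task are allowed to be prepended to the input text. In the paper <The Power of Scale for Parameter-Efficient Prompt Tuning>, a tuned prompt needs only 20,480 parameters per task, assuming a prompt length of 5 tokens (a reduction of over five orders of magnitude compared with keeping a full tuned copy of the model).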

- Instead of manipulating the input tokens, we use a fixed set of prompts P for each task (e.g. translation or sentiment analysis) and we use a small set of prompt weights WP to manipulate the actual input embedding values that we feed into the base model so as to minimize the cross entropy loss associated with the prompt responses in the dataset.

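- Prompt tuning adapts the LLM with a special dataset that contains input prompts and the outputs the user wants for those prompts, which optimizes the model's behavior and improves its ability to handle similar prompts in the future.

- Prompt tuning does not change the weights of the pre-trained model; it learns only the weights that correspond to the prompt fed into the model.

- Because the prompt weights are far smaller than the model weights, prompt tuning has the advantage of training faster than fine-tuning.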

- Unlike hard prompts, AI-designed soft prompts are unrecognizable to the human eye. Each prompt consists of an embedding, or string of numbers, that distills knowledge from the larger model. One drawback of prompt-tuning is its lack of interpretability. The AI discovers prompts optimized for a given task but can’t explain why it chose those embeddings.

(Source: https://velog.io/@jukyung-j/The-Power-of-Scale-for-Parameter-Efficient-Prompt-Tuning-%EB%A6%AC%EB%B7%B0)
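The mechanics described in the bullets above (freeze the model, prepend k tunable prompt vectors, minimize cross entropy) can be sketched with a toy example in plain Python. This is only an illustration under loud assumptions: the "frozen model" here is a made-up linear classifier over the mean token embedding, not a real LLM, and every name and size (`FROZEN_W`, `EMBED_DIM`, the input vectors) is hypothetical. The last lines also check the 20,480 figure, assuming an embedding dimension of 4,096, which is my reading of where that number comes from and is not stated in these notes.

```python
import math
import random

random.seed(0)

EMBED_DIM = 4    # toy embedding size (hypothetical)
PROMPT_LEN = 2   # k tunable soft-prompt tokens
NUM_CLASSES = 2

# Stand-in for the frozen pre-trained model: a fixed linear classifier
# applied to the mean of all token embeddings. It is never updated.
FROZEN_W = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for _ in range(NUM_CLASSES)]


def forward(soft_prompt, input_embeds):
    """Prepend the soft prompt to the input embeddings, run the frozen model."""
    seq = soft_prompt + input_embeds
    n = len(seq)
    mean = [sum(tok[j] for tok in seq) / n for j in range(EMBED_DIM)]
    logits = [sum(w[j] * mean[j] for j in range(EMBED_DIM)) for w in FROZEN_W]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps], n


def train_step(soft_prompt, input_embeds, label, lr=0.5):
    """One gradient step on the cross-entropy loss; only the prompt changes."""
    probs, n = forward(soft_prompt, input_embeds)
    loss = -math.log(probs[label])
    # Gradient of cross entropy w.r.t. the logits: softmax minus one-hot.
    dlogits = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    # Back through the frozen linear layer to the mean embedding.
    dmean = [sum(FROZEN_W[c][j] * dlogits[c] for c in range(NUM_CLASSES))
             for j in range(EMBED_DIM)]
    # Each prompt token contributes 1/n to the mean, so scale by 1/n.
    for tok in soft_prompt:
        for j in range(EMBED_DIM):
            tok[j] -= lr * dmean[j] / n
    return loss


soft_prompt = [[0.0] * EMBED_DIM for _ in range(PROMPT_LEN)]
input_embeds = [[0.1, -0.2, 0.3, 0.0],
                [0.5, 0.1, -0.4, 0.2],
                [0.0, 0.3, 0.1, -0.1]]

frozen_before = [row[:] for row in FROZEN_W]
losses = [train_step(soft_prompt, input_embeds, label=1) for _ in range(50)]

trainable_params = PROMPT_LEN * EMBED_DIM  # the only learned numbers
paper_scale_params = 5 * 4096              # 5 prompt tokens x assumed dim 4,096

print("first loss:", losses[0], "last loss:", losses[-1])
print("frozen weights unchanged:", FROZEN_W == frozen_before)
print("trainable parameters (toy):", trainable_params)
print("prompt parameters at the assumed paper scale:", paper_scale_params)
```

The loss goes down while `FROZEN_W` stays exactly as it was, which is the whole idea: one frozen model can be shared, and each task stores only its own small prompt matrix.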

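Differences:

- Prompt engineering is somewhat more effective than prompt tuning because it gives more control over the output.

- Prompt engineering takes more time than prompt tuning because it requires more human input.

- Prompt tuning is more automated. The user only has to provide the prompt, and the LLM does the rest of the work.

- In prompt tuning, the prompt is generated automatically by the model and requires no human input.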

- One technique that straddles the line between prompt engineering and fine tuning is called “prompt tuning”.

[Sources]

1. https://moon-walker.medium.com/the-art-of-prompt-engneering-1-prompt-engineering%EC%9D%B4%EB%9E%80-%EB%AC%B4%EC%97%87%EC%9D%B8%EA%B0%80-4a7a88ce67c

2. https://mlops.community/fine-tuning-vs-prompt-engineering-llms/

3. https://www.kaggle.com/code/shreyasajal/prompt-tuning-bert-commonlit-readability

4. https://jins-sw.tistory.com/51

5. https://research.ibm.com/blog/what-is-ai-prompt-tuning

6. https://velog.io/@jukyung-j/The-Power-of-Scale-for-Parameter-Efficient-Prompt-Tuning-%EB%A6%AC%EB%B7%B0