Few-Shot Learning Helps Machine Learning Models Improve Performance

 

Few-Shot Learning is a technique that guides a machine learning model's predictions by providing a small number of examples at inference time, unlike traditional fine-tuning, which requires large amounts of training data. The technique has mostly been used in computer vision, but recent language models such as GPT-Neo and GPT-3 have brought it to natural language processing (NLP). In NLP, a Few-Shot Learning prompt usually consists of three main components: a task description, examples, and a prompt, allowing the model to generalize to related but unseen tasks from just a few examples. Combining GPT-Neo with the 🤗 Accelerated Inference API makes Few-Shot Learning practical: you can generate your own predictions and control text generation by tuning hyperparameters such as the maximum number of new tokens, the temperature, and the end sequence.

📌 Few-Shot Learning is a technique that guides a machine learning model's predictions with a small number of examples at inference time; unlike traditional fine-tuning, which requires large amounts of training data, it is well suited to natural language processing (NLP).

🔍 In NLP, a Few-Shot Learning prompt usually consists of three main components: a task description, examples, and a prompt, allowing the model to generalize to related but unseen tasks from just a few examples.

🚀 Combining GPT-Neo with the 🤗 Accelerated Inference API makes Few-Shot Learning practical: you can generate your own predictions and control text generation by tuning hyperparameters such as the maximum number of new tokens, the temperature, and the end sequence.

🧮 Creating Few-Shot NLP examples can be challenging, because the examples have to communicate the task the model should perform, and models are very sensitive to how the examples are written.

📈 OpenAI's research shows that few-shot prompting ability improves as the number of language model parameters grows; GPT-Neo, a GPT-style model trained on the Pile dataset, works best on text that matches the distribution of its training data.

Cross post from huggingface.co/blog

In many Machine Learning applications, the amount of available labeled data is a barrier to producing a high-performing model. The latest developments in NLP show that you can overcome this limitation by providing a few examples at inference time with a large language model - a technique known as Few-Shot Learning. In this blog post, we'll explain what Few-Shot Learning is, and explore how a large language model called GPT-Neo, and the 🤗 Accelerated Inference API, can be used to generate your own predictions.

What is Few-Shot Learning?

Few-Shot Learning refers to the practice of feeding a machine learning model with a very small amount of training data to guide its predictions, like a few examples at inference time, as opposed to standard fine-tuning techniques which require a relatively large amount of training data for the pre-trained model to adapt to the desired task with accuracy.

This technique has been mostly used in computer vision, but with some of the latest Language Models, like EleutherAI GPT-Neo and OpenAI GPT-3, we can now use it in Natural Language Processing (NLP).

In NLP, Few-Shot Learning can be used with Large Language Models, which have learned to perform a wide number of tasks implicitly during their pre-training on large text datasets. This enables the model to generalize, that is to understand related but previously unseen tasks, with just a few examples.

Few-Shot NLP examples consist of three main components:

    Task Description: A short description of what the model should do, e.g. "Translate English to French"
    Examples: A few examples showing the model what it is expected to predict, e.g. "sea otter => loutre de mer"
    Prompt: The beginning of a new example, which the model should complete by generating the missing text, e.g. "cheese => "
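
Put together, these three components form a single text prompt. Below is a minimal sketch of how such a prompt can be assembled in Python, using the English-to-French example above (the "peppermint" example is added here purely for illustration):

# A few-shot prompt is plain text: a task description, a few examples,
# and an unfinished example the model should complete.
task_description = "Translate English to French:"
examples = [
    "sea otter => loutre de mer",
    "peppermint => menthe poivrée",  # illustrative extra example
]
prompt = "cheese => "  # the model should generate the missing translation

few_shot_prompt = "\n".join([task_description, *examples, prompt])
print(few_shot_prompt)
# Translate English to French:
# sea otter => loutre de mer
# peppermint => menthe poivrée
# cheese =>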

Image from Language Models are Few-Shot Learners

Creating these few-shot examples can be tricky, since you need to articulate the “task” you want the model to perform through them. A common issue is that models, especially smaller ones, are very sensitive to the way the examples are written.

An approach to optimize Few-Shot Learning in production is to learn a common representation for a task and then train task-specific classifiers on top of this representation.
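
As a rough sketch of that idea (not from the original post; the encoder model, libraries, and toy data below are assumptions), one could freeze a pretrained encoder as the shared representation and train a lightweight task-specific classifier on top of it:

# Hypothetical sketch: shared representation + task-specific classifier head.
from sentence_transformers import SentenceTransformer  # assumed dependency
from sklearn.linear_model import LogisticRegression

# Shared representation: a frozen, pretrained sentence encoder.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

# A handful of labeled examples for one specific task (here: sentiment).
texts = ["I loved this movie", "Terrible service", "Great food", "Would not recommend"]
labels = [1, 0, 1, 0]

# Task-specific head trained on top of the shared embeddings.
features = encoder.encode(texts)
classifier = LogisticRegression().fit(features, labels)

print(classifier.predict(encoder.encode(["The food was fantastic"])))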

OpenAI showed in the GPT-3 paper that the few-shot prompting ability improves with the number of language model parameters.

Image from Language Models are Few-Shot Learners

Let's now take a look at how GPT-Neo and the 🤗 Accelerated Inference API can be used to generate your own Few-Shot Learning predictions!


What is GPT-Neo?

GPT-Neo is a family of transformer-based language models from EleutherAI based on the GPT architecture. EleutherAI's primary goal is to train a model that is equivalent in size to GPT-3 and make it available to the public under an open license.

All of the currently available GPT-Neo checkpoints are trained with the Pile dataset, a large text corpus that is extensively documented in (Gao et al., 2021). As such, it is expected to function better on the text that matches the distribution of its training text; we recommend keeping this in mind when designing your examples.


🤗 Accelerated Inference API

The Accelerated Inference API is our hosted service to run inference on any of the 10,000+ models publicly available on the 🤗 Model Hub, or your own private models, via simple API calls. The API includes acceleration on CPU and GPU with up to 100x speedup compared to out of the box deployment of Transformers.

To integrate Few-Shot Learning predictions with GPT-Neo in your own apps, you can use the 🤗 Accelerated Inference API with the code snippet below. You can find your API Token here; if you don't have an account, you can get started here.

import json
import requests

API_TOKEN = ""

def query(payload='', parameters=None, options={'use_cache': False}):
    API_URL = "https://api-inference.huggingface.co/models/EleutherAI/gpt-neo-2.7B"
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    body = {"inputs": payload, 'parameters': parameters, 'options': options}
    response = requests.request("POST", API_URL, headers=headers, data=json.dumps(body))
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        return "Error:" + " ".join(response.json()['error'])
    else:
        return response.json()[0]['generated_text']

parameters = {
    'max_new_tokens': 25,  # number of generated tokens
    'temperature': 0.5,    # controlling the randomness of generations
    'end_sequence': "###"  # stopping sequence for generation
}

prompt = "...."  # few-shot prompt

data = query(prompt, parameters)
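
As a concrete illustration (a hypothetical prompt, not the one elided above), the snippet below builds a small sentiment-classification prompt whose examples are separated by "###", matching the end_sequence parameter, and passes it to the query function defined above:

# Hypothetical few-shot prompt: tweet sentiment classification.
# Each example ends with "###" so generation stops at the end_sequence.
prompt = """Tweet: "I hate it when my phone battery dies."
Sentiment: Negative
###
Tweet: "My day has been great!"
Sentiment: Positive
###
Tweet: "This new music video was incredible."
Sentiment:"""

data = query(prompt, parameters)
print(data)  # the completion should read something like " Positive"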

Practical Insights

Here are some practical insights, which help you get started using GPT-Neo and the 🤗 Accelerated Inference API.

Since GPT-Neo (2.7B) is about 60x smaller than GPT-3 (175B), it does not generalize as well to zero-shot problems and needs 3-4 examples to achieve good results. When you provide more examples, GPT-Neo understands the task and takes the end_sequence into account, which allows us to control the generated text quite well.

Image: benefit of additional examples

The hyperparameters End Sequence, Token Length & Temperature can be used to control the text generation of the model, and you can use this to your advantage to solve the task you need. The Temperature controls the randomness of your generations; a lower temperature results in less random generations, and a higher temperature results in more random generations.

Image: benefit of hyperparameter tuning

In the example, you can see how important it is to define your hyperparameters. These can make the difference between solving your task or failing miserably.
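
To make that concrete, here is a hedged sketch of two settings you might compare with the query function above; the exact values are illustrative, not recommendations:

# Near-deterministic generation that stops at the first "###".
conservative = {
    'max_new_tokens': 25,
    'temperature': 0.1,
    'end_sequence': "###",
}

# More random generation; without an end_sequence the model keeps
# generating until max_new_tokens is reached.
creative = {
    'max_new_tokens': 25,
    'temperature': 1.0,
}

for params in (conservative, creative):
    print(query(prompt, params))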


To use GPT-Neo or any Hugging Face model in your own application, you can start a free trial of the 🤗 Accelerated Inference API. If you need help mitigating bias in models and AI systems, or leveraging Few-Shot Learning, the 🤗 Expert Acceleration Program can offer your team direct premium support from the Hugging Face team.
