无服务器BERT部署实践

4 months ago I wrote the article"Serverless BERT with HuggingFace and AWS Lambda",which demonstrated how to use BERT in a serverless way with AWS Lambda and the Transformers Library from HuggingFace.

In this article, I already predicted that"BERT and its fellow friends RoBERTa, GPT-2, ALBERT, and T5 will drive business and business ideas in the next few years and will change/disrupt business areas like the internet once did."

Since then the usage of BERT in Google Search increased from10% of English queries to almost100% of English-based queries. But that's not it. Google powers now over70 languages with BERT for Google Search.

https://youtu.be/ZL5x3ovujiM?t=484

In this article, we are going to tackle all the drawbacks from my previous article like model load time, and dependencysize, and usage.

We are going to build the same "Serverless BERT powered Question-Answering API" as last time. But instead of usingcompressing techniques to fit our Python dependencies into our AWS Lambda function, we are using a tool calledefsync. I built efsync to automatically upload dependencies to an AWS EFSfilesystem and then mount them into our AWS Lambda function. This allows us to include our machine learning model intoour function without the need to load it from S3.

TL;DR;

We are going to build a serverless Question-Answering API using the Serverless Framework,AWS Lambda, AWS EFS,efsync, Terraform, thetransformers Library from HuggingFace, and a mobileBert model fromGoogle fine-tuned on SQuADv2.

You find the complete code for it in thisGithub repository.

Serverless Frahttps://www.philschmid.de/static/blog/new-serverless-bert-with-huggingface-aws-lambda/serverless-logo.pngambda/serverless-logo.png" alt="serverless-logo">
The Serverless Framework helps us develop and deploy AWS Lambda functions. It’s a CLIthat offers structure, automation, and best practices right out of the box.

https://aws.amazon.com/de/lambda/features/
AWS Lambda is a serverless computing service that lets yourun code without managing servers. It executes your code only when required and scales automatically, from a fewrequests per day to thousands per second.

Amazon Elastic File System (EFS)

Amazon EFS is a fully-managed service that makes it easy to set up, scale, andcost-optimize file storage in the Amazon Cloud. Since June 2020 you can mount AWS EFS to AWS Lambda functions

Efsync

Efsync is a CLI/SDK tool, which automatically syncs files and dependencies toAWS EFS. It enables you to install dependencies with the AWS Lambda runtime directly into your EFS filesystem and usethem in your AWS Lambda function.

https://www.terraform.io/logos.html
Terraform is an Infrastructure as Code (IaC) tool for building cloud-native infrastructuresafely and efficiently. Terraform enables you to use HCL (HashiCorp Configuration Language) to describe yourcloud-native infrastructure.

Transformers library provides state-of-the-art machine learningarchitectures like BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, T5 for Natural Language Understanding (NLU) and NaturalLanguage Generation (NLG). It also provides thousands of pre-trained models in 100+ different languages.

Tutorial
Before we get started, make sure you have the Serverless Framework andTerraform configured and set up. Furthermore, you need access to an AWS Account to createan EFS Filesystem, API Gateway, and the AWS Lambda function.
In the tutorial, we are going to build a Question-Answering API with a pre-trained `BERT` model from Google.
We are going to send a context (small paragraph) and a question to the lambda function, which will respond with theanswer to the question.
What are we going to do:
create the required infrastructure using `terraform`.use `efsync` to upload our Python dependencies to AWS EFS.create a Python Lambda function with the Serverless Framework.add the `BERT`model to our function and create an inference pipeline.Configure the `serverless.yaml`, add EFS and set up an API Gateway for inference.deploy & test the function.
You will need a new IAM user called `serverless-bert` with `AdministratorAccess` and configured it with the AWS CLIusing `aws configure --profile serverless-bert`. This IAM user is used in the complete tutorial. If you don´t know howto do this check out this link.
Note: I don´t recommend create a IAM User for production usage with `AdministratorAccess`
Before we start, I want to say that we're not gonna go into detail for every step. If you want to understand more abouthow to use Deep Learning in AWS Lambda I suggest you check out my other articles:
You find the complete code in this Github repository.

Create the required infrastructure using `terraform`

At first, we define and create the required infrastructure using terraform. If you haven´t set it up you can check outthis tutorial.

As infrastructure, we need an AWS EFS filesystem, an access point, and a mount target to be able to use it in our AWSLambda function. We could also create a VPC, but for the purpose of this tutorial, we are going to use the default VPCand his subnets.

Next, we create a directory serverless-bert/, which contains all code for this tutorial with a subfolder terraform/including our main.tf file.

mkdir serverless-bert serverless-bert/terraform && touch serverless-bert/terraform/main.tf

Afterwards, we open the main.tf with our preferred IDE and add the terraform resources. I provided a basic templatefor all of them. If you want to customize them or add extra resources check out thedocumentation for all possibilities.

# providerprovider "aws" { region                  = "eu-central-1"  shared_credentials_file = "~/.aws/credentials"  profile                 = "serverless-bert"} # get all available availability zones data "aws_vpc" "default" {  default = true} data "aws_subnet_ids" "subnets" {  vpc_id = data.aws_vpc.default.id} # EFS File System resource "aws_efs_file_system" "efs" {  creation_token = "serverless-bert"} # Access Point resource "aws_efs_access_point" "access_point" {  file_system_id = aws_efs_file_system.efs.id} # Mount Targets resource "aws_efs_mount_target" "efs_targets" {  for_each = data.aws_subnet_ids.subnets.ids  subnet_id      = each.value  file_system_id = aws_efs_file_system.efs.id} ## SSM Parameter for serverless# resource "aws_ssm_parameter" "efs_access_point" {  name      = "/efs/accessPoint/id"  type      = "String"  value     = aws_efs_access_point.access_point.id  overwrite = true}

To change the name of EFS you can edit the value creation_token in the aws_efs_filesystem resource. Otherwise, thename of the EFS will be "serverless-bert". Additionally, we create an SSM parameter for the efs_access_point_id at theend to use it later in our serverless.yaml.

To use terraform we first run terraform init to initialize our project and provider (AWS). Be aware we have to be inthe terraform/ directory.

Afterwards, we check our IaC definitions with terraform plan

Whenhttps://www.philschmid.dehttps://www.philschmid.de/static/blog/new-serverless-bert-with-huggingface-aws-lambda/terraform-apply.png

Use `efsync` to upload our Python dependencies to AWS EFS

The next step is to add and install our dependencies on our AWS EFS filesystem. Therefore we use a tool called efsync.I created efsync to install dependencies with the AWS Lambda runtime directlyinto your EFS filesystem and use them in your AWS Lambda function.

install efsync by running pip3 install efsync

After it is installed we create a requirements.txt in our root directory serverless-bert/ and add our dependenciesto it.

https://download.pytorch.org/whl/cpu/torch-1.5.0%2Bcpu-cp38-cp38-linux_x86_64.whltransformers==3.4.0

Efsync provides different configurations. This time we use the CLI with ayaml configuration. For that, we create an efsync.yaml file in our root directory.

#standard configurationefs_filesystem_id: <efs-filesystem-id> # aws efs filesystem idsubnet_Id: <subnet-id-of-mount-target> # subnet of which the efs is running inec2_key_name: efsync-asd913fjgq3 # required key name for starting the ec2 instanceclean_efs: all # Defines if the EFS should be cleaned up before. values: `'all'`,`'pip'`,`'file'` uploading# aws profile configurationaws_profile: serverless-bert # aws iam profile with required permission configured in .aws/credentialsaws_region: eu-central-1 # the aws region where the efs is running # pip dependencies configurationsefs_pip_dir: lib # pip directory on ec2python_version: 3.8 # python version used for installing pip dependencies -> should be used as lambda runtime afterwadsrequirements: requirements.txt # path + file to requirements.txt which holds the installable pip dependencies

Here we have to adjust the values of

TL;DR;

https://aws.amazon.com/de/lambda/features/
AWS Lambda is a serverless computing service that lets yourun code without managing servers. It executes your code only when required and scales automatically, from a fewrequests per day to thousands per second.

Amazon Elastic File System (EFS)

Efsync

https://www.terraform.io/logos.html
Terraform is an Infrastructure as Code (IaC) tool for building cloud-native infrastructuresafely and efficiently. Terraform enables you to use HCL (HashiCorp Configuration Language) to describe yourcloud-native infrastructure.

Transformers library provides state-of-the-art machine learningarchitectures like BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, T5 for Natural Language Understanding (NLU) and NaturalLanguage Generation (NLG). It also provides thousands of pre-trained models in 100+ different languages.

Tutorial

Create the required infrastructure using `terraform`

Use `efsync` to upload our Python dependencies to AWS EFS

Create a Python Lambda function with the Serverless Framework

Add the `BERT`model to our function and create an inference pipeline

Configure the `serverless.yaml`, add EFS, and set up an API Gateway for inference.

Deploy & Test the function

Conclusion

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签

TL;DR;

https://aws.amazon.com/de/lambda/features/AWS Lambda is a serverless computing service that lets yourun code without managing servers. It executes your code only when required and scales automatically, from a fewrequests per day to thousands per second.

Amazon Elastic File System (EFS)

Efsync

https://www.terraform.io/logos.htmlTerraform is an Infrastructure as Code (IaC) tool for building cloud-native infrastructuresafely and efficiently. Terraform enables you to use HCL (HashiCorp Configuration Language) to describe yourcloud-native infrastructure.

The Transformers library provides state-of-the-art machine learningarchitectures like BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, T5 for Natural Language Understanding (NLU) and NaturalLanguage Generation (NLG). It also provides thousands of pre-trained models in 100+ different languages.

Tutorial

Create the required infrastructure using terraform

Use efsync to upload our Python dependencies to AWS EFS

Create a Python Lambda function with the Serverless Framework

Add the BERTmodel to our function and create an inference pipeline

Configure the serverless.yaml, add EFS, and set up an API Gateway for inference.

Deploy & Test the function

Conclusion

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签

https://aws.amazon.com/de/lambda/features/
AWS Lambda is a serverless computing service that lets yourun code without managing servers. It executes your code only when required and scales automatically, from a fewrequests per day to thousands per second.

https://www.terraform.io/logos.html
Terraform is an Infrastructure as Code (IaC) tool for building cloud-native infrastructuresafely and efficiently. Terraform enables you to use HCL (HashiCorp Configuration Language) to describe yourcloud-native infrastructure.

Transformers library provides state-of-the-art machine learningarchitectures like BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, T5 for Natural Language Understanding (NLU) and NaturalLanguage Generation (NLG). It also provides thousands of pre-trained models in 100+ different languages.

Create the required infrastructure using `terraform`

Use `efsync` to upload our Python dependencies to AWS EFS

Add the `BERT`model to our function and create an inference pipeline

Configure the `serverless.yaml`, add EFS, and set up an API Gateway for inference.