philschmid RSS feed, September 30, 19:14
A Guide to Using AWS Lambda with EFS

 

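This article shows how to combine AWS Lambda with Amazon EFS to work around the storage limits that serverless architectures impose on deep learning applications. By mounting an EFS file system into a Lambda function, users can access large machine learning libraries and models such as TensorFlow, PyTorch, and BERT. The article walks through configuring the EFS mount with the Serverless Framework and provides sample code that imports and uses pandas and pyjokes in a Python Lambda function. It also covers uploading files to EFS with AWS DataSync or manually, and touches on the function's performance and scaling behavior.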

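😊 Combining AWS Lambda with EFS addresses the storage limits of serverless architectures in deep learning applications and allows access to large machine learning models and libraries.

📚 Configure the EFS mount with the Serverless Framework: the steps include setting custom variables, using CloudFormation extensions, and adjusting handler.py to import and use the dependencies stored on EFS.

🔧 Upload files to EFS with AWS DataSync or manually, so that the Lambda function can access the dependencies it needs, such as pip packages and machine learning models.

🚀 The Lambda function scales automatically to many parallel requests. In the article's test, the first (cold) request took around 8 seconds and warm requests around 100-150ms.

📈 A working example, a Lambda function that uses pandas and pyjokes, demonstrates that the EFS mount works and gives readers an implementation guide to follow.

"Just like wireless internet has wires somewhere, serverless architectures still have servers somewhere. What ‘serverless’ really means is that as a developer you don’t have to think about those servers. You just focus on code." - serverless.com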

This focus is only possible if we make some tradeoffs. Currently, all serverless FaaS services such as AWS Lambda, Google Cloud Functions, and Azure Functions have limits. For example, there is no real state, and memory cannot be configured without limit.

These limitations have led to serverless architectures being used more for software development and less for machine learning, especially deep learning.

A big hurdle to overcome in serverless deep learning with tools like AWS Lambda, Google Cloud Functions, and Azure Functions is storage. TensorFlow and PyTorch are huge, and newer "state of the art" models like BERT are over 300MB in size. So far it was only possible to use them with some compression techniques. You can check out two of my posts on how you could do this.

But last month AWS announced mountable storage for serverless functions. They added support for Amazon Elastic File System (EFS), a scalable and elastic NFS file system. This allows you to mount your AWS EFS filesystem to your AWS Lambda function.

In their blog post, they explain how to connect an AWS Lambda function to AWS EFS. The blog post is very nice, definitely check it out.

In this post, we are going to do the same, but a bit better: using the Serverless Framework and without the manual work.

PREVIEW: I am building a CLI tool called efsync, which enables you to automatically upload files (pip packages, ML models, ...) to an EFS file system.

Until efsync is finished, you can use AWS DataSync to upload your data to an AWS EFS file system.


What is AWS Lambda?

You are probably familiar with AWS Lambda, but to make things clear: AWS Lambda is a computing service that lets you run code without managing servers. It executes your code only when required and scales automatically, from a few requests per day to thousands per second. You only pay for the compute time you consume; there is no charge when your code is not running.

https://aws.amazon.com/de/lambda/features/


What is AWS EFS?

Amazon EFS is a fully-managed service that makes it easy to set up, scale, and cost-optimize file storage in the Amazon Cloud. Amazon EFS-filesystems can automatically scale from gigabytes to petabytes of data without needing to provision storage. Amazon EFS is designed to be highly durable and highly available. With Amazon EFS, there is no minimum fee or setup costs, and you pay only for what you use.


Serverless Framework

The Serverless Framework helps us develop and deploy AWS Lambda functions. It’s a CLI that offers structure, automation, and best practices right out of the box. It also allows us to focus on building sophisticated, event-driven, serverless architectures.

If you aren’t familiar with the Serverless Framework or haven’t set it up, take a look at this quick-start with the Serverless Framework.


Tutorial

We build an AWS Lambda function with python3.8 as runtime, which is going to import and use pip packages located on our EFS-filesystem. As an example, we use pandas and pyjokes. They could easily be replaced by TensorFlow or PyTorch.

Before we get started, make sure you have the Serverless Framework configured and an EFS-filesystem set up with the required dependencies. We are not going to cover the steps on how to install the dependencies and upload them to EFS in this blog post. You can either use AWS DataSync or start an EC2 instance, connect with ssh, mount the EFS-filesystem with amazon-efs-utils, and use pip install -t to install the pip packages on EFS.
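The manual route mentioned above (EC2 instance, mounted file system, pip install -t) can also be scripted. The following is a minimal sketch, assuming the EFS-filesystem is already mounted on the instance at /mnt/efs and the packages should land in a lib/ subdirectory. The helper names build_pip_command and install_to_efs are made up for this illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_pip_command(packages, target_dir):
    """Return the pip command that installs `packages` into `target_dir`."""
    # `--target` is the long form of `pip install -t`.
    return [sys.executable, "-m", "pip", "install",
            "--target", str(target_dir), *packages]


def install_to_efs(packages, mount_dir="/mnt/efs", lib_subdir="lib"):
    """Install pip packages into <mount_dir>/<lib_subdir>.

    Both default paths are assumptions; adjust them to your own mount point.
    """
    target_dir = Path(mount_dir) / lib_subdir
    target_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pip_command(packages, target_dir), check=True)
    return target_dir
```

On the instance you would call install_to_efs(["pandas", "pyjokes"]). Packages with compiled parts, such as pandas, should be installed with a Python version that matches the Lambda runtime (python3.8 here).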

We are going to:

    1. create a Python Lambda function with the Serverless Framework
    2. configure the serverless.yaml and add our EFS-filesystem as mount volume
    3. adjust the handler.py and import pandas and pyjokes from EFS
    4. deploy & test the function

1. Create a Python Lambda function

First, we create our AWS Lambda function by using the Serverless CLI with the aws-python3 template.

serverless create --template aws-python3 --path serverless-efs

This CLI command creates a new directory containing a handler.py, .gitignore, and serverless.yaml file. The handler.py contains some basic boilerplate code.

import json


def hello(event, context):
    body = {
        "message": "Go Serverless v1.0! Your function executed successfully!",
        "input": event
    }

    response = {
        "statusCode": 200,
        "body": json.dumps(body)
    }

    return response

2. Configure the serverless.yaml and add our EFS-filesystem as mount volume

I provide the complete serverless.yaml for this example, but we only go through the details we need for our EFS-filesystem and leave out all standard configurations. If you want to learn more about the serverless.yaml, I suggest you check out Scaling Machine Learning from ZERO to HERO. In this article, I went through each configuration and explained its usage.

service: blog-serverless-efs

plugins:
  - serverless-pseudo-parameters

custom:
  efsAccessPoint: <your-efs-accesspoint>
  LocalMountPath: <mount-directory-in-aws-lambda-function>
  subnetsId: <subnetid-in-which-efs-is>
  securityGroup: <any-security-group>

provider:
  name: aws
  runtime: python3.8
  region: eu-central-1

package:
  exclude:
    - node_modules/**
    - .vscode/**
    - .serverless/**
    - .pytest_cache/**
    - __pycache__/**

functions:
  joke:
    handler: handler.handler
    environment: # Environment variables for this function
      MNT_DIR: ${self:custom.LocalMountPath}
    vpc:
      securityGroupIds:
        - ${self:custom.securityGroup}
      subnetIds:
        - ${self:custom.subnetsId}
    iamManagedPolicies:
      - arn:aws:iam::aws:policy/AmazonElasticFileSystemClientReadWriteAccess
    events:
      - http:
          path: joke
          method: get

resources:
  extensions:
    # Name of function <joke>
    JokeLambdaFunction:
      Properties:
        FileSystemConfigs:
          - Arn: 'arn:aws:elasticfilesystem:${self:provider.region}:#{AWS::AccountId}:access-point/${self:custom.efsAccessPoint}'
            LocalMountPath: '${self:custom.LocalMountPath}'

First, we need to install the serverless-pseudo-parameters plugin with the following command.

npm install serverless-pseudo-parameters

We use the serverless-pseudo-parameters plugin to reference our AWS::AccountId in the serverless.yaml. All custom variables we need are defined under custom.

    efsAccessPoint should be the ID of your EFS access point. You can find it in the AWS Management Console under EFS. It should look similar to this: fsap-0a31095162dd0ca44
    LocalMountPath is the path under which EFS is mounted in the AWS Lambda function.
    subnetsId should be the ID of a subnet in which the EFS-filesystem is available. If you started your filesystem in multiple Availability Zones, you can choose the one you want.
    securityGroup can be any security group in the AWS account. We need this to deploy our AWS Lambda function into the required subnet. We can use the default security group ID. It should look like this: sg-1018g448.
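Typos in these four values usually only show up at deploy time, so a quick local check can save a round trip. The sketch below is an illustration, not part of the original setup: the helper check_custom_values is hypothetical, the ID prefixes follow the examples above, and the mount path pattern (a single directory directly under /mnt/) is an assumption based on what Lambda accepts for LocalMountPath.

```python
import re

# Assumed pattern for Lambda's LocalMountPath: a single directory under /mnt/.
MOUNT_PATH_PATTERN = re.compile(r"^/mnt/[a-zA-Z0-9_.\-]+$")

# Expected ID prefixes for the other custom values.
ID_PREFIXES = {
    "efsAccessPoint": "fsap-",
    "subnetsId": "subnet-",
    "securityGroup": "sg-",
}


def check_custom_values(custom):
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    for key, prefix in ID_PREFIXES.items():
        value = custom.get(key, "")
        if not value.startswith(prefix):
            problems.append(f"{key} should start with '{prefix}', got '{value}'")
    mount_path = custom.get("LocalMountPath", "")
    if not MOUNT_PATH_PATTERN.match(mount_path):
        problems.append(
            f"LocalMountPath should look like /mnt/<name>, got '{mount_path}'"
        )
    return problems


example = {
    "efsAccessPoint": "fsap-0a31095162dd0ca44",
    "LocalMountPath": "/mnt/efs",
    "subnetsId": "subnet-00000000",  # placeholder ID, not a real subnet
    "securityGroup": "sg-1018g448",
}
print(check_custom_values(example))  # → []
```

The check only looks at the shape of the values. It cannot tell whether the access point, subnet, or security group actually exist in your account.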

We utilize CloudFormation extensions to mount the EFS-filesystem after our Lambda function is created. Therefore we use this little snippet. Extensions can be used to override CloudFormation resources.

resources:
  extensions:
    # Name of function <joke>
    JokeLambdaFunction:
      Properties:
        FileSystemConfigs:
          - Arn: "arn:aws:elasticfilesystem:${self:provider.region}:#{AWS::AccountId}:access-point/${self:custom.efsAccessPoint}"
            LocalMountPath: "${self:custom.LocalMountPath}"
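To make the Arn line in the snippet easier to read, here is a small sketch of the string it resolves to once the region, the account ID, and the access point ID are filled in. The function name build_access_point_arn and the account ID 123456789012 are placeholders chosen for this illustration.

```python
def build_access_point_arn(region, account_id, access_point_id):
    """Assemble an EFS access point ARN in the same shape as the snippet above."""
    return (
        f"arn:aws:elasticfilesystem:{region}:{account_id}"
        f":access-point/{access_point_id}"
    )


# Placeholder account ID; region and access point ID are the ones used in this post.
arn = build_access_point_arn("eu-central-1", "123456789012", "fsap-0a31095162dd0ca44")
print(arn)
# → arn:aws:elasticfilesystem:eu-central-1:123456789012:access-point/fsap-0a31095162dd0ca44
```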

3. Adjust the handler.py and import pandas and pyjokes from EFS

The last step before we can deploy is to adjust our handler.py and import pandas and pyjokes from EFS. In my example, I used /mnt/efs as LocalMountPath and installed my pip packages in lib/.

To use the dependencies from our EFS-filesystem we have to add our LocalMountPath to our PYTHONPATH, that is, to sys.path. Therefore we add a small try/except statement at the top of our handler.py, which appends /mnt/efs/lib to sys.path. Lastly, we add some demo calls to show that our 2 dependencies work.

try:
    import sys
    import os

    # make the pip packages installed on EFS importable
    sys.path.append(os.environ['MNT_DIR'] + '/lib')  # nopep8 # noqa
except KeyError:
    # MNT_DIR is not set, e.g. when running outside of AWS Lambda
    pass

import json
import pyjokes
from pandas import DataFrame


def handler(event, context):
    data = {'Product': ['Desktop Computer', 'Tablet', 'iPhone', 'Laptop'],
            'Price': [700, 250, 800, 1200]
            }

    df = DataFrame(data, columns=['Product', 'Price'])

    body = {
        "frame": df.to_dict(),
        "joke": pyjokes.get_joke()
    }

    response = {
        "statusCode": 200,
        "body": json.dumps(body)
    }

    return response
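If you prefer an explicit check over the try/except, the path setup could be sketched as a small helper like the one below. The function add_efs_lib_to_path is hypothetical and not part of the original handler. It assumes the MNT_DIR environment variable from the serverless.yaml and the lib/ subdirectory used in this post.

```python
import os
import sys


def add_efs_lib_to_path(env=os.environ, lib_subdir="lib"):
    """Append <MNT_DIR>/<lib_subdir> to sys.path; return the path or None."""
    mount_dir = env.get("MNT_DIR")
    if not mount_dir:
        return None  # MNT_DIR is not set, e.g. when running locally
    lib_dir = os.path.join(mount_dir, lib_subdir)
    if not os.path.isdir(lib_dir):
        return None  # file system not mounted or packages not installed yet
    if lib_dir not in sys.path:
        sys.path.append(lib_dir)
    return lib_dir
```

Called at the top of handler.py, before the pandas and pyjokes imports, it returns None instead of raising when the mount is missing, which makes the cause of a later import error easier to log.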

4. Deploy & Test the function

In order to deploy the function we only have to run serverless deploy.

[Screenshot: serverless bash deployment]

To test our Lambda function we can use Insomnia, Postman, or any other REST client.

[Screenshot: insomnia-request]

The first request to the cold AWS Lambda function took around 8 seconds. After it is warmed up, it takes around 100-150ms, as you can see in the screenshot.

The best thing is that our AWS Lambda function scales up automatically when several requests come in, up to thousands of parallel requests, without any worries.

If you rebuild this, keep in mind that the first request can take a while.
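The difference between the cold and the warm request described above can also be measured from a script instead of a REST client. This is a minimal sketch using only the standard library. The helper names are made up, and the URL you pass in is whatever endpoint serverless deploy prints for the joke function.

```python
import time
import urllib.request


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fetch(url, timeout=30):
    """GET the URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def measure_requests(url, count=5):
    """Return per-request latencies in seconds.

    If the function has been idle, the first entry includes the cold start.
    """
    return [timed(fetch, url)[1] for _ in range(count)]
```

Calling measure_requests with your endpoint should show one slow entry followed by much faster ones, as long as the function was cold before the first call.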


You can find the GitHub repository with the complete code here.

Thanks for reading. If you have any questions, feel free to contact me or comment on this article. You can also connect with me on Twitter or LinkedIn.
