An Engineering Approach To Deploying A TensorFlow Based API on AWS GPU Instances

Our Data Engineering team trained a model using real estate images in order to infer what those images were of – bathroom, bedroom, swimming pool, etc. The model was served using a dockerized version of TensorFlow Serving and wrapped in a Python REST service. This was a giant leap forward in performance and accuracy over our previous attempt, which utilized a COTS Machine Learning API. What we observed, though, was that under production loads it was still not fast enough.

An important point is that TensorFlow Serving is by default compiled to run on the CPU. In order to be successful at launching a production-ready image classification API, you need both Data Engineering and Software Engineering disciplines. In this post, we are going to show you how to productionize a pre-trained model using TensorFlow Serving in AWS. This will involve building TensorFlow Serving for GPU instances.

To Docker or Not to Docker

Dockerizing your model is a great option if you are not taking advantage of any GPU-specific hardware and are running on the CPU. It drastically simplifies your deployments and allows you to run on virtually any kind of hardware with Docker installed. Once GPUs are involved, however, there are several limitations:

- Amazon ECS AMIs are not supported by nvidia-docker2.
- There are many AWS customizations to the ECS image (e.g. …).
- Only a single TensorFlow Serving container can be run at a time.

In our situation, it made much more sense to not use Docker at all. If you still think that Docker is a fit for your use case, we will post a future article on how to use Docker with GPU support.

We considered a few different Instance types. Depending on what Instance you choose, you will need to determine the compute capability of that Instance. For the p2.xlarge, TensorFlow supports a compute capability of 3.5. For our purposes, we found that the p2.xlarge was sufficient and at a good price point.

When spinning up your Instance, select an AMI based on Ubuntu 16.04, as this is the simplest and most used path. The Deep Learning Base AMI (Ubuntu) created by AWS serves as a good starting point, since it already has the nVidia driver installed as well as CUDA 8.0 and CUDA 9.0.

In order to take full advantage of the GPU, TensorFlow Serving will need to be built with GPU support. SSH into your new Instance and run through the following steps. It is assumed that all these steps are run with sudo su.

Step #1 – Install the necessary libraries

apt-get update
apt-get upgrade
apt-get install -y \
  build-essential \
  curl \
  git \
  libfreetype6-dev \
  libpng12-dev \
  libzmq3-dev \
  mlocate \
  pkg-config \
  python-dev \
  python-numpy \
  python-pip \
  software-properties-common \
  swig \
  zip \
  zlib1g-dev \
  libcurl3-dev \
  wget

Step #2 – Install GRPC Support

pip install mock grpcio awscli boto3 flask pillow requests autograd

Bazel is the build tool used to build TensorFlow Serving and the model.
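With Bazel installed, a GPU build of TensorFlow Serving looked roughly like the following in the TF Serving 1.x source tree. This is a sketch, not the post's exact commands: the repository URL and build target are from the upstream project, and exact flags vary by version.

```shell
# Sketch: build TensorFlow Serving's model server with CUDA support (TF Serving 1.x era).
git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving

# -c opt enables optimizations; --config=cuda compiles in GPU support.
bazel build -c opt --config=cuda \
  tensorflow_serving/model_servers:tensorflow_model_server

# The resulting binary lands under:
#   bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
```

The build is lengthy on a p2.xlarge, so it is worth doing once and baking the result into your own AMI.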
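The compute capability feeds directly into the build. When compiling TensorFlow from source, its `./configure` step reads environment variables such as `TF_CUDA_COMPUTE_CAPABILITIES`; a sketch for the 3.5 target discussed here (versions shown are assumptions matching the CUDA 9.0 install on this AMI):

```shell
# Sketch: configure-time environment for a GPU build of TensorFlow (TF 1.x era).
# The compute capability must match your Instance's GPU; this post targets 3.5.
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=9.0
export TF_CUDA_COMPUTE_CAPABILITIES=3.5
```

Setting the capability explicitly keeps the build from compiling kernels for hardware you will never run on.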
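For comparison, the CPU-only Docker path really is simple, which is why it is attractive when you do not need the GPU. A minimal sketch using the official `tensorflow/serving` image (the model name `rooms` and host path are illustrative, not from the post):

```shell
# Sketch: CPU-only TensorFlow Serving in Docker.
# Port 8500 is gRPC, 8501 is the REST API; model name/path are examples.
docker run -p 8500:8500 -p 8501:8501 \
  --mount type=bind,source=/models/rooms,target=/models/rooms \
  -e MODEL_NAME=rooms -t tensorflow/serving
```

It is exactly this convenience that is lost on GPU instances, for the reasons listed above.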
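The Python REST wrapper mentioned in the introduction can be sketched with the libraries this post installs (flask, pillow, requests). Everything here is an assumption for illustration: the model name `rooms`, the endpoint paths, and the 224x224 input size are placeholders, and the app forwards to TensorFlow Serving's standard REST predict endpoint rather than the gRPC interface the original service may have used.

```python
# Sketch: a small Flask wrapper that forwards images to TensorFlow Serving.
import io

from flask import Flask, jsonify, request
from PIL import Image
import requests

# Assumption: TF Serving's REST API on its default port, model named "rooms".
TF_SERVING_URL = "http://localhost:8501/v1/models/rooms:predict"

app = Flask(__name__)


def image_to_instance(raw_bytes, size=(224, 224)):
    """Decode an image, resize it, and scale pixels to [0, 1] as nested lists."""
    img = Image.open(io.BytesIO(raw_bytes)).convert("RGB").resize(size)
    return [[[channel / 255.0 for channel in img.getpixel((x, y))]
             for x in range(size[0])]
            for y in range(size[1])]


@app.route("/classify", methods=["POST"])
def classify():
    # Body of the POST is the raw image; wrap it in TF Serving's JSON format.
    instance = image_to_instance(request.get_data())
    resp = requests.post(TF_SERVING_URL, json={"instances": [instance]})
    resp.raise_for_status()
    return jsonify(resp.json()["predictions"])

# To serve: app.run(host="0.0.0.0", port=5000)
```

In production you would batch requests and reuse connections, but the shape of the service is the same: decode, forward to the model server, return predictions.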