
How to train and deploy a model in script mode on SageMaker without using a Jupyter notebook instance (serverless)?

I have been using a Jupyter notebook instance to spin up a training job (on a separate instance) and deploy the endpoint (on another instance). I am using the SageMaker TensorFlow APIs for this, as shown below:

from sagemaker.tensorflow import TensorFlow

# create a TensorFlow estimator and provide an entry point script
tf_estimator = TensorFlow(entry_point='tf-train.py', role='SageMakerRole',
                      train_instance_count=1, train_instance_type='ml.p2.xlarge',
                      framework_version='1.12', py_version='py3')

# train the model on data in S3 and save the model artifacts to S3
tf_estimator.fit('s3://bucket/path/to/training/data')

# deploy the model on another instance using the checkpoints saved on S3
predictor = tf_estimator.deploy(initial_instance_count=1,
                         instance_type='ml.c5.xlarge',
                         endpoint_type='tensorflow-serving')

I have been doing all of these steps through a Jupyter notebook instance. Which AWS services can I use to get rid of the dependency on the notebook instance and automate training and deploying the model in a serverless fashion?

exAres asked Aug 31 '25 05:08


1 Answer

I recommend AWS Step Functions. I have been using it to schedule SageMaker Batch Transform and preprocessing jobs, since it integrates with CloudWatch Events rules. It can also run training jobs and hyperparameter tuning jobs, and it integrates with Lambda. You can use the Step Functions Data Science SDK for SageMaker, or use Step Functions directly by defining state machines. Some examples and documentation:

https://aws.amazon.com/about-aws/whats-new/2019/11/introducing-aws-step-functions-data-science-sdk-amazon-sagemaker/

https://docs.aws.amazon.com/step-functions/latest/dg/connect-sagemaker.html
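To make the state machine route concrete, here is a minimal sketch of what a train-then-deploy definition could look like, built as a plain Python dict in Amazon States Language. It chains the four SageMaker service integrations (create training job, create model, create endpoint config, create endpoint). The function names `build_definition` and `register`, the `job_name` input field, the image URIs, the volume size, and the runtime limit are all assumptions for illustration, not something from the question. In particular, how the `tf-train.py` script reaches the training container (the SageMaker SDK normally handles that) is not shown; `hyperparameters` is just passed through.

```python
import json


def build_definition(role_arn, training_image, serving_image,
                     train_s3_uri, output_s3_uri, hyperparameters=None):
    """Return an Amazon States Language definition: train, then deploy."""
    states = {
        "Train": {
            "Type": "Task",
            # ".sync" makes Step Functions wait until the training job finishes.
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "Parameters": {
                "TrainingJobName.$": "$.job_name",  # taken from execution input
                "RoleArn": role_arn,
                "AlgorithmSpecification": {
                    "TrainingImage": training_image,  # placeholder image URI
                    "TrainingInputMode": "File",
                },
                "HyperParameters": hyperparameters or {},
                "InputDataConfig": [{
                    "ChannelName": "training",
                    "DataSource": {"S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }},
                }],
                "OutputDataConfig": {"S3OutputPath": output_s3_uri},
                "ResourceConfig": {
                    "InstanceCount": 1,
                    "InstanceType": "ml.p2.xlarge",
                    "VolumeSizeInGB": 30,  # assumed value
                },
                "StoppingCondition": {"MaxRuntimeInSeconds": 86400},  # assumed
            },
            "ResultPath": "$.train",  # keep the original input alongside
            "Next": "CreateModel",
        },
        "CreateModel": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createModel",
            "Parameters": {
                "ModelName.$": "$.job_name",
                "ExecutionRoleArn": role_arn,
                "PrimaryContainer": {
                    "Image": serving_image,  # placeholder image URI
                    # Artifact location reported by the finished training job.
                    "ModelDataUrl.$": "$.train.ModelArtifacts.S3ModelArtifacts",
                },
            },
            "ResultPath": "$.model",
            "Next": "CreateEndpointConfig",
        },
        "CreateEndpointConfig": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createEndpointConfig",
            "Parameters": {
                "EndpointConfigName.$": "$.job_name",
                "ProductionVariants": [{
                    "VariantName": "AllTraffic",
                    "ModelName.$": "$.job_name",
                    "InitialInstanceCount": 1,
                    "InstanceType": "ml.c5.xlarge",
                }],
            },
            "ResultPath": "$.endpoint_config",
            "Next": "CreateEndpoint",
        },
        "CreateEndpoint": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createEndpoint",
            "Parameters": {
                "EndpointName.$": "$.job_name",
                "EndpointConfigName.$": "$.job_name",
            },
            "End": True,
        },
    }
    return {"StartAt": "Train", "States": states}


def register(name, definition, sfn_role_arn):
    """Create the state machine; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so building the definition needs no AWS SDK
    sfn = boto3.client("stepfunctions")
    return sfn.create_state_machine(
        name=name, definition=json.dumps(definition), roleArn=sfn_role_arn)
```

Once registered, an execution can be started with an input such as `{"job_name": "tf-train-001"}`, either from a scheduled CloudWatch Events rule or from a Lambda function, so no notebook instance is involved. Training job names must be unique, so the caller would need to generate a fresh `job_name` per run.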

thePurplePython answered Sep 02 '25 20:09