Because of a faulty score.py file in my InferenceConfig, a Model.deploy to Azure Machine Learning using ACI failed. I wanted to create the endpoint in the cloud, but the only state I can see in the portal is Unhealthy. My local script that deploys the model keeps running until it times out at the service.wait_for_deployment(show_output=True) statement.
Is there a way to get more insight into the actual reason or error message behind the deployment turning "Unhealthy"?
Usually the timeout is caused by an error in the init() function of the scoring script. You can retrieve the detailed container logs with print(service.get_logs()) to find the underlying Python error.
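As a rough sketch of how this could fit into the deployment script, the helper below waits for the deployment and, if the service does not end up Healthy, prints and returns the logs. The function name wait_and_collect_logs is hypothetical and not part of the SDK; it assumes service is the Webservice object returned by Model.deploy, and it catches a broad Exception only so that the snippet stays free of SDK imports.

```python
def wait_and_collect_logs(service, show_output=True):
    """Wait for a deployment and return the container logs if it is not healthy.

    Assumption: `service` behaves like an azureml.core.webservice.Webservice,
    i.e. it exposes wait_for_deployment(), get_logs() and a `state` attribute.
    This helper is illustrative only and is not part of the Azure ML SDK.
    """
    try:
        service.wait_for_deployment(show_output=show_output)
    except Exception as exc:
        # The wait can fail or time out when score.py raises inside init().
        print(f"Deployment did not complete: {exc}")

    # Check the state as well, in case the wait returned without raising.
    if service.state != "Healthy":
        logs = service.get_logs()
        print(logs)
        return logs
    return None


# Hypothetical usage (workspace, model and config names are placeholders):
# service = Model.deploy(ws, "my-service", [model], inference_config, deployment_config)
# wait_and_collect_logs(service)
```

The Python traceback from score.py, if there is one, should appear near the end of the printed logs.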
For a more comprehensive troubleshooting guide, see:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-troubleshoot-deployment