amazon web services - AWS Sagemaker - ClientError: Data download failed:Could not download

Question

asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

I encountered an error when I launched a training job from my notebook instance. This is what it says: "UnexpectedStatusException: Error for Training job tensorflow-training-2021-01-26-09-55-05-768: Failed. Reason: ClientError: Data download failed:Could not download s3://forex-model-data/data/train2001_2020.npz: insufficient disk space"

I launched the training job on several instance types, running 3 epochs on each: ml.c5.4xlarge, ml.c5.18xlarge, and ml.m5.24xlarge. I also have two sets of training data, train2001_2020.npz and train2016_2020.npz.

First, I ran train2001_2020 on ml.c5.18xlarge and ml.c5.18xlarge, and the training jobs completed. Then I switched to train2016_2020 and ran it on ml.c5.4xlarge and ml.c5.18xlarge, and that went well too. When I then tried ml.m5.24xlarge, I got the error quoted above, even though my dataset was train2016_2020, not train2001_2020. When I reran the job on all the other instance types, it failed with the same error. What happened?

I stopped the instances and refreshed everything, but I encountered the same issue.

Question from: https://stackoverflow.com/questions/65902366/aws-sagemaker-clienterror-data-download-failedcould-not-download

1 Answer

深蓝 · Answer 1 · 2021-10-06T19:15:11+0000

It's not entirely clear which tests you are running, but that error usually means there is not enough disk space on the instance used for the training job. You can try increasing the additional storage volume for the instance (if you are using the SageMaker SDK in a notebook, you can set this in the estimator parameters).
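As a rough sketch of what that could look like with the SageMaker Python SDK: the estimator accepts a `volume_size` argument (in GB) for the storage volume attached to the training instance. The entry point name, framework version, Python version, headroom factor, and 30 GB floor below are illustrative assumptions, not values taken from the question, and the helper functions are hypothetical.

```python
import math


def suggested_volume_size_gb(dataset_bytes, headroom=2.0, minimum_gb=30):
    """Pick a volume size in GB that leaves headroom over the dataset size.

    The headroom factor and the minimum are illustrative assumptions.
    """
    needed_gb = math.ceil(dataset_bytes * headroom / 1024 ** 3)
    return max(minimum_gb, needed_gb)


def estimator_kwargs(role, instance_type, volume_size_gb):
    """Collect the estimator arguments, including the larger volume."""
    return {
        "entry_point": "train.py",    # hypothetical training script
        "role": role,                 # your SageMaker execution role ARN
        "instance_count": 1,
        "instance_type": instance_type,
        "framework_version": "2.3",   # assumption: adjust to your setup
        "py_version": "py37",         # assumption: adjust to your setup
        "volume_size": volume_size_gb,  # storage volume size in GB
    }


def build_estimator(**kwargs):
    """Create the TensorFlow estimator (requires the sagemaker package)."""
    from sagemaker.tensorflow import TensorFlow

    return TensorFlow(**kwargs)
```

In a notebook you would then call something like `build_estimator(**estimator_kwargs(role, "ml.c5.4xlarge", suggested_volume_size_gb(size_in_bytes)))` and pass the S3 path to `fit()` as before. Keep in mind that the .npz file may need extra room beyond its size in S3 once it is downloaded and loaded, so leave some margin.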
