I'm using Dask on a cluster and I'd like to terminate all the workers as soon as all the jobs are done.
I tried to do this with the retire_workers method, but that doesn't seem to kill the workers.
Here is a minimal example:
import time
import os
from dask.distributed import Client

def long_func(x):
    time.sleep(2)
    return 1

if __name__ == '__main__':
    C = Client(scheduler_file='sched.json')

    # Submit ten tasks and block until they have all finished.
    res = []
    for i in range(10):
        res.append(C.submit(long_func, i))
    for r in res:
        r.result()

    # Ask the scheduler to retire (and supposedly close) every worker.
    workers = list(C.scheduler_info()['workers'])
    # C.run(lambda: os._exit(0), workers=workers)  # brute-force alternative
    C.retire_workers(workers=workers, close_workers=True)
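A quick way to check whether the workers actually went away is to poll the scheduler after the retire_workers call (a small sketch, not part of the original script; it reuses C and time from above):

# Sketch: poll the scheduler for up to ~30 seconds to see whether the
# retired workers actually shut down (assumes C from the script above).
for _ in range(15):
    remaining = C.scheduler_info()['workers']
    print(f'{len(remaining)} worker(s) still registered')
    if not remaining:
        break
    time.sleep(2)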
The scheduler and a worker were started with these commands:
dask-scheduler --scheduler-file sched.json
dask-worker --scheduler-file sched.json --nthreads=1 --lifetime='5minutes'
The hope was that, once the Python code above finished (about 20 seconds of work: ten 2-second tasks on a single thread), the worker would terminate instead of staying up for its full 5-minute lifetime, but it does not. Any advice on how to fix that?
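For contrast, a blunter alternative would be Client.shutdown(), which does stop the workers but also terminates the scheduler, so it is not a drop-in replacement for retire_workers (a minimal sketch, assuming the same sched.json):

from dask.distributed import Client

# Sketch: Client.shutdown() closes the connected scheduler *and* its
# workers. Heavier-handed than retire_workers: the scheduler dies too.
C = Client(scheduler_file='sched.json')
C.shutdown()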
question from: https://stackoverflow.com/questions/66052839/terminating-dask-workers-after-jobs-are-done