Hey, I am experiencing a timeout when downloading a dataset. I would like to be able to increase this timeout, either through a longer default or via an environment variable.
Reproduction
I am loading the dataset with load_dataset("allenai/c4", "en", streaming=True) in streaming mode and get the error below.
This only happens when using torchrun with 8 workers; with 2 workers it works. My guess is that the workers fight for bandwidth, leading to the timeout when there are too many workers.
I actually "fixed" the issue locally by patching the timeout on this line: huggingface_hub/src/huggingface_hub/hf_api.py, line 2306 in 5ff2d15.
I would like to be able to increase this timeout in a supported way.
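For reference, a library-agnostic sketch of the kind of local workaround described above: a helper that raises any timeout below a chosen floor, which one could then apply by monkeypatching the HTTP layer. The floor value of 300 seconds is arbitrary and for illustration only.

```python
def with_timeout_floor(timeout, floor=300.0):
    """Raise `timeout` to at least `floor` seconds.

    `None` (meaning "no timeout") passes through unchanged.
    `floor=300.0` is an illustrative value, not a recommendation.
    """
    if timeout is None:
        return None
    return max(float(timeout), floor)

# One way to apply it globally (sketch, untested against any specific
# huggingface_hub version) is to wrap requests.Session.request:
#
#   import requests
#   _orig = requests.Session.request
#   def _patched(self, method, url, **kw):
#       if "timeout" in kw:
#           kw["timeout"] = with_timeout_floor(kw["timeout"])
#       return _orig(self, method, url, **kw)
#   requests.Session.request = _patched
```

This avoids editing the installed library in place, but it is still a blunt, process-wide patch rather than a supported configuration knob.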
Thanks in advance 🙏
Logs
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2491, in repo_info
return method(
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2363, in dataset_info
r = get_session().get(path, headers=headers, timeout=timeout, params=params)
File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 66, in send
return super().send(request, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 532, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: (ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co
Hi @samsja, thanks for reporting and sorry for the delay. This timeout value is actually hard-coded to 100s in the datasets library (see here). Do you think your workers are blocked in a way that requires more than 100s to complete the HTTP call?
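To make the request concrete, an environment-variable override of that hard-coded default could look like the sketch below. The variable name DATASETS_REQUEST_TIMEOUT is hypothetical, not an existing setting in the datasets library.

```python
import os

DEFAULT_TIMEOUT = 100.0  # the value currently hard-coded in datasets (per the comment above)

def get_request_timeout():
    """Return the HTTP timeout in seconds.

    Reads the hypothetical DATASETS_REQUEST_TIMEOUT environment variable,
    falling back to the current 100s default when it is unset or invalid.
    """
    raw = os.environ.get("DATASETS_REQUEST_TIMEOUT")
    if raw is None:
        return DEFAULT_TIMEOUT
    try:
        return float(raw)
    except ValueError:
        return DEFAULT_TIMEOUT
```

Callers would then pass get_request_timeout() instead of the literal, so users launching many torchrun workers could export a larger value without patching the library.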