feat: standardize devices parameter and device initialization #3062
Conversation
haystack/document_stores/memory.py
Outdated
@@ -64,6 +65,8 @@ def __init__(
you have at least `embedding_dim`*`scoring_batch_size`*4 bytes available in GPU memory.
Since the data is originally stored in CPU memory there is little risk of overruning memory
when running on CPU.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
I would consider changing the doc string to something like "List of torch devices (e.g. cuda, cpu, mps) to limit inference to certain devices ..." since a torch.device does not have to be a GPU. What do you think?
Oh sorry, I just read the docstring for initialize_device_settings and see that devices is only used if use_cuda is True, so I think we can leave this unchanged. Perhaps add a sentence explaining that this list is only used if use_gpu is set to True.
+1, excellent idea
I tried this one but it doesn't seem like it is going to work out of the box. For Tutorial 3, during pipeline run I get the following exception:
File "/Users/vblagoje/workspace/haystack/haystack/pipelines/base.py", line 512, in run
node_output, stream_id = self._run_node(node_id, node_input)
File "/Users/vblagoje/workspace/haystack/haystack/pipelines/base.py", line 439, in _run_node
return self.graph.nodes[node_id]["component"]._dispatch_run(**node_input)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/base.py", line 201, in _dispatch_run
return self._dispatch_run_general(self.run, **kwargs)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/base.py", line 245, in _dispatch_run_general
output, stream = run_method(**run_inputs, **run_params)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/reader/base.py", line 84, in run
results = predict(query=query, documents=documents, top_k=top_k)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/reader/base.py", line 145, in wrapper
ret = fn(*args, **kwargs)
File "/Users/vblagoje/workspace/haystack/haystack/nodes/reader/farm.py", line 867, in predict
predictions = self.inferencer.inference_from_objects(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 596, in inference_from_objects
return self.inference_from_dicts(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 578, in inference_from_dicts
return Inferencer.inference_from_dicts(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 349, in inference_from_dicts
return list(predictions)
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 424, in _inference_with_multiprocessing
predictions = self._get_predictions_and_aggregate(dataset, tensor_names, baskets)
File "/Users/vblagoje/workspace/haystack/haystack/modeling/infer.py", line 512, in _get_predictions_and_aggregate
logits = self.model.forward(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/model/adaptive_model.py", line 484, in forward
output_tuple = self.language_model.forward(
File "/Users/vblagoje/workspace/haystack/haystack/modeling/model/language_model.py", line 364, in forward
return self.model(**params, return_dict=return_dict)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py", line 841, in forward
embedding_output = self.embeddings(
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py", line 128, in forward
inputs_embeds = self.word_embeddings(input_ids)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 160, in forward
return F.embedding(
File "/Users/vblagoje/miniconda3/envs/haystack/lib/python3.8/site-packages/torch/nn/functional.py", line 2206, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Placeholder storage has not been allocated on MPS device!
And yes, I have initialised FarmReader with:
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", devices=[torch.device("mps")])
Do you @sjrl have M1/M2 silicon to try it out as well? Use the vblagoje:t_fix_devices branch, of course.
Hey @vblagoje what version of PyTorch do you have installed? It looks like this issue could be related to the issue described here: https://discuss.pytorch.org/t/torch-embedding-fails-with-runtimeerror-placeholder-storage-has-not-been-allocated-on-mps-device/152124/6
I traced this issue to the QA inferencer class constructor not passing the devices parameter.
Nice, great catch!
I'll add the devices parameter in infer.py but we'll have to clean it up soon after we integrate this PR.
What cleanup do you mean exactly here?
But even with this fix my kernel dies due to some MPS torch unsupported ops.
Does this mean we need to wait on PyTorch to support more ops with MPS before we can use a FARMReader on MPS?
The cleanup is related to multiprocessing. I'll create an issue about it today. Yes, unfortunately, I was not able to run FARMReader. Here is the error:
NotImplementedError: The operator 'aten::cumsum.out' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
Using CPU as fallback blew up the kernel. Seems risky even if it worked. So it seems like we'll have to wait a bit more.
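For reference, a minimal sketch of enabling that fallback (assuming the environment variable has to be set before torch is imported, and using the devices parameter from this branch):

import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"  # set before importing torch so unsupported MPS ops fall back to CPU

import torch
from haystack.nodes import FARMReader

reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2",
    devices=[torch.device("mps")],  # ops without MPS kernels run on CPU and are slower
)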
We can like/vote for the comments mentioning aten::cumsum.out
Using CPU as fallback blew up the kernel. Seems risky even if it worked. So it seems like we'll have to wait a bit more.
It seems like the fallback only works if you build pytorch from source: pytorch/pytorch#77764 (comment)
There is also a separate issue for aten::cumsum.out in PyTorch here: pytorch/pytorch#79112
haystack/modeling/utils.py
Outdated
@@ -68,8 +68,12 @@ def initialize_device_settings(
devices_to_use = [torch.device("cpu")]
n_gpu = 0
elif devices:
devices_to_use = devices
n_gpu = sum(1 for device in devices if "cpu" not in device.type)
if isinstance(devices[0], str):
We could consider replacing this if statement with any(isinstance(x, str) for x in devices), which will check if any of the devices provided are of type string. For example, if the user happens to provide [torch.device('cuda:0'), 'cuda:1'], then we would still construct a list of only torch.device types. This will work because torch.device(torch.device('cuda')) still returns torch.device('cuda'), so there is no danger there.
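If it helps, a rough sketch of that version of the elif branch, following the variable names in the diff above:

elif devices:
    if any(isinstance(device, str) for device in devices):
        # Normalize mixed input such as [torch.device('cuda:0'), 'cuda:1'] to torch.device objects.
        # torch.device(torch.device('cuda')) returns torch.device('cuda'), so existing objects pass through unchanged.
        devices_to_use = [torch.device(device) for device in devices]
    else:
        devices_to_use = devices
    n_gpu = sum(1 for device in devices_to_use if "cpu" not in device.type)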
Great point, thank you @sjrl
haystack/document_stores/memory.py
Outdated
@@ -64,6 +65,8 @@ def __init__(
you have at least `embedding_dim`*`scoring_batch_size`*4 bytes available in GPU memory.
Since the data is originally stored in CPU memory there is little risk of overruning memory
when running on CPU.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Does it mean the following: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
haystack/document_stores/memory.py
Outdated
@@ -64,6 +65,8 @@ def __init__(
you have at least `embedding_dim`*`scoring_batch_size`*4 bytes available in GPU memory.
Since the data is originally stored in CPU memory there is little risk of overruning memory
when running on CPU.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Unused if `use_gpu` is False.
Unused if `use_gpu` is False.
This parameter is not used if you set `use_gpu` to False.
haystack/modeling/infer.py
Outdated
@@ -70,11 +71,13 @@ def __init__(
:func:`~farm.infer.Inferencer.close_multiprocessing_pool` after you are
done using this class. The garbage collector will not do this for you!
:param disable_tqdm: Whether to disable tqdm logging (can get very verbose in multiprocessing)
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Can we change it to: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
@@ -97,6 +98,9 @@ def __init__(
:param embed_title: Embedded the title of passage while generating embedding
:param prefix: The prefix used by the generator's tokenizer.
:param use_gpu: Whether to use GPU. Falls back on CPU if no GPU is available.
:param progress_bar: Whether to show a progress bar, defaults to True.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Can we change to: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
@@ -341,6 +346,8 @@ def __init__(
:param min_length: Minimum length of generated text
:param num_beams: Number of beams for beam search. 1 means no beam search.
:param use_gpu: Whether to use GPU or the CPU. Falls back on CPU if no GPU is available.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Does it mean the following: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
@@ -66,15 +66,14 @@ def __init__(
only predicts a single label. For multi-label predictions, no scaling is applied. Set this
to False if you do not want any scaling of the raw predictions.
:param progress_bar: Whether to show a progress bar while processing the documents.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Does it mean the following: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
haystack/nodes/reader/table.py
Outdated
@@ -104,6 +105,8 @@ def __init__(
:param max_seq_len: Max sequence length of one input table for the model. If the number of tokens of
query + table exceed max_seq_len, the table will be truncated by removing rows until the
input size fits the model.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Does it mean the following: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
Yes @agnieszka-m, but see #3062 (comment) for the wording. We shouldn't say GPUs, as that implies GPUs are the only devices that can be used; in fact any device available on the machine and addressable through torch.device (cuda, cpu, mps, xla, etc.) works.
Why not simply keep what it is now: List of torch devices (e.g. cuda, cpu, mps) to limit inference to certain devices and not use all available ones (e.g. [torch.device('cuda:0')]). Unused if use_gpu is False. How would you reword it @agnieszka-m?
How about: List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices. A list containing torch device objects and/or strings is supported (e.g. [torch.device('cuda:0'), "mps", "cuda:1"]). Specifying use_gpu=False cancels the devices parameter and falls back to a single cpu device. @agnieszka-m @sjrl?
This looks good! I would only suggest a few edits:
List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices.
A list containing torch device objects and/or strings is supported (For example [torch.device('cuda:0'), "mps", "cuda:1"]).
When specifying `use_gpu=False` the devices parameter is not used and a single cpu device is used for inference.
This is perfect, I'll go with @sjrl's version @agnieszka-m
@@ -66,10 +68,12 @@ def __init__(
:param max_seq_len: max sequence length of one input text for the model
:param doc_stride: length of striding window for splitting long texts (used if len(text) > max_seq_len)
:param batch_size: Number of documents to process at a time.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Does it mean the following: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
@@ -88,10 +90,12 @@ def __init__(
Important: The summary will depend on the order of the supplied documents!
:param batch_size: Number of documents to process at a time.
:param progress_bar: Whether to show a progress bar.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Does it mean the following: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
@@ -63,10 +65,12 @@ def __init__(
:param clean_up_tokenization_spaces: Whether or not to clean up the tokenization spaces. (default True)
:param use_gpu: Whether to use GPU or the CPU. Falls back on CPU if no GPU is available.
:param progress_bar: Whether to show a progress bar.
:param devices: List of GPU devices to limit inference to certain GPUs and not use all available ones (e.g. [torch.device('cuda:0')]).
Does it mean the following: List of GPU devices you want to limit inference to so as not to use all available GPUs. Example: [torch.device('cuda:0')].
@sjrl, this one is ready for review now. cc @bogdankostic
@@ -341,6 +348,10 @@ def __init__(
:param min_length: Minimum length of generated text
:param num_beams: Number of beams for beam search. 1 means no beam search.
:param use_gpu: Whether to use GPU or the CPU. Falls back on CPU if no GPU is available.
:param devices: List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices.
Could you add the missing docstring for the progress_bar?
self.model = _Text2SpeechModel.from_pretrained(
model_name_or_path, device=devices[0].type, **(transformers_params or {})
model_name_or_path, device=resolved_devices[0].type, **(transformers_params or {})
From what I understand here, it looks like only one device can be used to load the _Text2SpeechModel. If that is true, should we enforce that devices can only have a length of 1 using an assert statement? Otherwise users might be confused as to why only one device is used if they provide multiple.
Thinking more on this, I wonder if it would be better to change the parameter from devices to device and make it type Union[str, torch.device] (so removing the List aspect altogether). Then we could just make sure to pass device as a list to initialize_device_settings(). For example,
resolved_devices, _ = initialize_device_settings(devices=[device], use_cuda=use_gpu, multi_gpu=False)
What do you think?
I thought about this as well. But is it possible that in the future there'll be support for multiple devices for components? The best trade-off seems to be to allow a list but to log a warning that if multiple devices are passed, the first one is selected. We can do this in all cases where only one device is currently supported. Thoughts?
Yeah, that sounds good to me. I like the idea of just logging a warning, but having the code still run.
devices, _ = initialize_device_settings(use_cuda=use_gpu, multi_gpu=False)
device = 0 if devices[0].type == "cuda" else -1
resolved_devices, _ = initialize_device_settings(devices=devices, use_cuda=use_gpu, multi_gpu=False)
device = 0 if resolved_devices[0].type == "cuda" else -1
Does this mean that only cuda or cpu is supported for this document classifier? If so, we may want to update the docstring to say that mps isn't supported for this class.
Wow, good catch. There's no reason not to support mps here. I'll remove this.
This line looks to be present in quite a few places throughout Haystack just as a heads up.
Well, actually, right now pipelines do not support devices other than cuda or cpu. HF unfortunately uses: torch.device("cpu" if device < 0 else f"cuda:{device}")
Oh, that's frustrating. They have a comment related to this issue here: huggingface/transformers#17342 (comment). Do you know if this line from HF, torch.device("cpu" if device < 0 else f"cuda:{device}"), has been updated yet? If not, we could think about opening an issue.
Ahh, it appears this has been fixed in the current main branch of HF: https://github.com/huggingface/transformers/blob/d90a36d192e2981a41122c30a765c63158dd0557/src/transformers/pipelines/base.py#L763-L773
I think we will need to wait until they release a new version of the Transformers library before we can add support for mps here.
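For context, a hypothetical sketch of what this would allow once a transformers release containing that change is pinned (the model name is purely illustrative):

from transformers import pipeline

classifier = pipeline(
    task="zero-shot-classification",
    model="cross-encoder/nli-distilroberta-base",  # illustrative model choice
    device="mps",  # newer transformers accept a string or torch.device here instead of only an int
)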
Update: good news @sjrl, HF will support MPS for pipelines in the next minor release. See huggingface/transformers#18494 for more details.
):
super().__init__()

self.devices, _ = initialize_device_settings(use_cuda=use_gpu, multi_gpu=False)
self.devices, _ = initialize_device_settings(devices=devices, use_cuda=use_gpu, multi_gpu=False)
Also, consider adding an assert here for devices to be length 1 since we only support a single device for the model to be loaded onto.
Additionally, below (on line 58) we have the line device=0 if self.devices[0].type == "cuda" else -1, which makes me concerned that mps would not be supported for loading this model.
Yes, not supported on mps right now. Will make notes for the components where only cpu and cuda are supported.
@@ -81,6 +86,10 @@ def __init__(
:type batch_size: int (optional)
:param progress_bar: Whether to show a progress bar, defaults to True.
:type progress_bar: bool (optional)
:param devices: List of torch devices (e.g. cuda, cpu, mps) to limit CrossEncoder inference to specific devices.
Please add a docstring for use_gpu
@@ -98,9 +107,9 @@ def __init__(
)
else:
raise ValueError("Provide either a QuestionGenerator or a non-empty list of questions/document pairs.")

self.devices, _ = initialize_device_settings(devices=devices, use_cuda=use_gpu, multi_gpu=False)
Consider adding an assert for devices to be of length 1 since we do not support multi-GPU loading.
+1, but make it an actual check rather than an assert, as asserts can be removed during code optimization.
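One way to express that as an explicit runtime check instead of an assert (raising here for illustration; the thread elsewhere settles on a warn-and-continue variant):

resolved_devices, _ = initialize_device_settings(devices=devices, use_cuda=use_gpu, multi_gpu=False)
if len(resolved_devices) != 1:
    # A plain check still runs under python -O, unlike an assert statement.
    raise ValueError(
        f"{self.__class__.__name__} supports exactly one device, but got {len(resolved_devices)}: {resolved_devices}"
    )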
devices, _ = initialize_device_settings(use_cuda=use_gpu, multi_gpu=False)
device = 0 if devices[0].type == "cuda" else -1
resolved_devices, _ = initialize_device_settings(devices=devices, use_cuda=use_gpu, multi_gpu=False)
device = 0 if resolved_devices[0].type == "cuda" else -1
Same comment as above about whether mps would be supported here.
""" | ||
super().__init__() | ||
self.devices, _ = initialize_device_settings(use_cuda=use_gpu, multi_gpu=False) | ||
self.devices, _ = initialize_device_settings(devices=devices, use_cuda=use_gpu, multi_gpu=False) |
Same comment as above for adding an assert for devices being length 1.
:param devices: List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices.
A list containing torch device objects and/or strings is supported (For example
[torch.device('cuda:0'), "mps", "cuda:1"]). When specifying `use_gpu=False` the devices
parameter is not used and a single cpu device is used for inference.
Duplicate docstring for devices. It appears again a few lines down. Please delete one.
Thanks Bogdan. It will be supported in 4.22.2, which should be released imminently. After we bump to that HF release we'll integrate this change. Hopefully we can release it in 1.8. We'll see. Anyways, no rush to integrate this one until the HF upgrade and a thorough test on mps devices as well.
@sjrl @bogdankostic please have a quick look at the last few commits and the changes made.
if len(resolved_devices) > 1:
logger.warning(
f"Multiple devices are not supported in {self.__class__.__name__} inference, "
f"using the first device {resolved_devices}."
Change to resolved_devices[0] to grab the first device.
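Applied to the snippet above, that would read:

if len(resolved_devices) > 1:
    logger.warning(
        f"Multiple devices are not supported in {self.__class__.__name__} inference, "
        f"using the first device {resolved_devices[0]}."
    )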
Looks good! One last comment. Also, I would consider adding a task to the description to remind us to pin the HF library to the new version in our
Looks good to me as well! Looking forward to using this soon 🚀
HF released transformers v4.21.2, which includes the pipeline fixes we were waiting for. Can we pin this transformers release for our 1.8 release @julian-risch @masci?
Yes, sounds good. As preparation, what you could already do now is create a PR that pins the version to 4.21.2 and check whether any tests are failing, for example due to deprecated/discontinued features or breaking changes. Further, you can already have a look at the changelog and check for any changes that affect Haystack in the jump from the currently pinned version 4.20.1 to 4.21.2. If there are any relevant changes, please add them to the description of that PR.
Although the torch mps backend still does not have some crucial operators implemented, and we therefore cannot yet use Haystack on Apple Silicon, I think we should integrate this PR now. We are still early on our path to the 1.9 release and we can catch potential issues sooner. I am mostly concerned about subtle bugs that might not be visible to our test suite. Once torch implements the operators we need, we can try out Haystack on Apple Silicon and hopefully it will work. That way we'll have support in the 1.9 release and not later. LMK what you guys think @masci @julian-risch @ZanSara @brandenchan @bogdankostic
This looks really good! Thanks for hunting down all the places in Haystack that use GPUs!
@agnieszka-m, I think we addressed the issues you raised. If so, would you please approve as well? We can't merge without your approval.
Related Issues
Proposed Changes:
initialize_device_settings helper function
How did you test it?
Not tested yet - in preparation
Todo
Update to transformers v4.21.2
Run test suite on Apple Silicon using the devices parameter in components where appropriate