For larger datasets, tokenizing on a single core is slow when many cores are available. I'd suggest wrapping the relevant function in a process pool, or passing the pool in as an argument and calling `Pool.map`.
Happy to make a PR if it's a good fit for the repo.
vampire/scripts/preprocess_data.py
Line 26 in 2613609
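A rough sketch of the "pass the pool as an argument" option, to make the proposal concrete. This is not the code in `preprocess_data.py`: `tokenize` is a hypothetical whitespace tokenizer standing in for whatever the script really calls, and `tokenize_all` and its `chunksize` default are names and values I made up for illustration.

```python
from multiprocessing import Pool


def tokenize(text):
    # Hypothetical stand-in tokenizer: lowercase and split on whitespace.
    # The real script's tokenizer would go here instead.
    return text.lower().split()


def tokenize_all(texts, pool=None, chunksize=64):
    # With no pool supplied, keep the existing single-core behaviour.
    if pool is None:
        return [tokenize(t) for t in texts]
    # Pool.map returns results in input order; chunksize batches several
    # texts into each task to reduce inter-process overhead.
    return pool.map(tokenize, texts, chunksize=chunksize)


if __name__ == "__main__":
    texts = ["The quick brown fox", "Jumps over the lazy dog"] * 1000
    # Pool() with no argument sizes itself to the available CPUs.
    with Pool() as pool:
        tokens = tokenize_all(texts, pool=pool)
    print(len(tokens), tokens[0])
```

Making the pool optional means existing callers keep working unchanged, and the function passed to `Pool.map` has to live at module level so it can be pickled and sent to the workers.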