Using pandas and pyarrow may improve the performance of collect operations (such as `columns_to_list`). On the other hand, both pandas and pyarrow are optional dependencies of PySpark SQL. Should we use them or not? And if we do, would it be a good idea to isolate every call to these libraries so that the other functions keep working when they are not installed?
@MrPowers @jeffbrennan