Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly use Python3's buffer interface. #1836

Merged
merged 4 commits into from
Sep 5, 2016

Conversation

sceeter89
Copy link
Contributor

Description

When using luigi with Python 3 there is issue running PigTask. Because output in track_and_progress is both read to unicode string and copied to temporary file, it failed as conversion to string happens before writing to file. I simply moved decoding after writing to temporary file.

Motivation and Context

Without this change, running PigTask ended with following error:

File "/projects/big-data/luigi-tasks/libs/pig_task.py", line 57, in on_execute
PigTask.run(self)
File "/home/etl/.projects/big-data/lib/python3.4/site-packages/luigi/contrib/pig.py", line 127, in run
self.track_and_progress(cmd)
File "/home/etl/.projects/big-data/lib/python3.4/site-packages/luigi/contrib/pig.py", line 149, in >track_and_progress
temp_stdout.write(line)
TypeError: 'str' does not support the buffer interface

And while Pig script finished successfully, luigi marked it as failed.

Have you tested this? If so, how?

After making this change locally our Pig scripts run without issues using Python3.4

@mention-bot
Copy link

@sceeter89, thanks for your PR! By analyzing the annotation information on this pull request, we identified @gpoulin, @jeremykarn and @toidi to be potential reviewers

@Tarrasch
Copy link
Contributor

Can you add a test case?

@sceeter89
Copy link
Contributor Author

I'll give it a shot afternoon.

@sceeter89
Copy link
Contributor Author

@Tarrasch Instead of adding new test case I changed fake Popen stuff to let PigTask read input (by returning None from poll at least once). It was enough to make all tests fail. And applying fix fixed that.

@Tarrasch
Copy link
Contributor

Tarrasch commented Sep 1, 2016

LGTM. But someone knowing pig should review.

@sceeter89
Copy link
Contributor Author

Actually this issue has nothing to do with Pig itself, but if you know such a person it would be great to let her/him know.

@Tarrasch
Copy link
Contributor

Tarrasch commented Sep 1, 2016

Okay. If nobody comments for a while I suppose this can be merged. 👍

@gpoulin
Copy link
Contributor

gpoulin commented Sep 1, 2016

LGTM, I would just change line 130 to temp_stdout = tempfile.TemporaryFile('wb') to make it explicit that the temporary file is expecting binary and not string.

@sceeter89
Copy link
Contributor Author

@gpoulin Good point, thank you for that, I've just pushed this change.

@erikbern
Copy link
Contributor

erikbern commented Sep 2, 2016

lgtm

@Tarrasch Tarrasch merged commit 2a5eef7 into spotify:master Sep 5, 2016
@sceeter89 sceeter89 deleted the pig-python3 branch September 5, 2016 08:35
This was referenced Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants