Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PySparkTask: handle special characters in name (#2778) #2779

Merged
merged 2 commits into from
Sep 20, 2019

Conversation

mrk-its
Copy link
Contributor

@mrk-its mrk-its commented Sep 9, 2019

Fixes #2778

Description

It cleans name property before it is used to derive run_path and run_pickle properties.

Motivation and Context

If name property contains some special characters PySparkTask fails: #2778

Have you tested this? If so, how?

Unit test is included

Copy link
Member

@honnix honnix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GoodDok Would you like to take another look? I see you made recent commits to this file.

Copy link
Contributor

@GoodDok GoodDok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes seem fine to me, thank you!

@@ -298,8 +299,9 @@ def app_command(self):
return [self.app, pickle_loc] + self.app_options()

def run(self):
self.run_path = tempfile.mkdtemp(prefix=self.name)
self.run_pickle = os.path.join(self.run_path, '.'.join([self.name.replace(' ', '_'), 'pickle']))
name = re.sub(r'[^\w]', '_', self.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename variable name to something like run_path_prefix (maybe some better option) due to two reasons:

  • it is actually a path, not a name anymore
  • we must be having a hard time dealing with name and self.name in the same place

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Renamed.

@mrk-its
Copy link
Contributor Author

mrk-its commented Sep 16, 2019

@Tarrasch @dlstadther could you take a look? Thanks!

Copy link
Collaborator

@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to me

@honnix honnix merged commit 865cc4e into spotify:master Sep 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PySparkTask fails if name contains invalid characters
4 participants