Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make getpaths type-aware, add getpaths test. #2113

Merged
merged 3 commits into from
May 26, 2017
Merged

Conversation

riga
Copy link
Contributor

@riga riga commented May 14, 2017

Description

This PR slightly updates luigi.task.getpaths which is used to replace tasks in the structured object returned by Task.requires with their respective outputs. The change makes that procedure type-aware, so e.g. when the struct's class is a subclass of dict, the returned struct has the same class.

Motivation and Context

We often rely on the order in which we define our task requirements, e.g. via an OrderedDict. However, when looping over self.input(), that order might not be preserved at the moment, because the returned object type is a plain dict.

Tests

I added task_test.TaskTest.test_getpaths and checked for several types.

@mention-bot
Copy link

@riga, thanks for your PR! By analyzing the history of the files in this pull request, we identified @erikbern, @Tarrasch and @freider to be potential reviewers.

@riga
Copy link
Contributor Author

riga commented May 14, 2017

Seems like the TOXENV=py36-nonhdfs failed, but I don't have a clue why...

@riga
Copy link
Contributor Author

riga commented May 22, 2017

Could you give me a hint how to check what went wrong?

Copy link
Contributor

@Tarrasch Tarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write a test case using an in immutable type please?

luigi/task.py Outdated
@@ -817,19 +817,19 @@ def getpaths(struct):
if isinstance(struct, Task):
return struct.output()
elif isinstance(struct, dict):
r = {}
r = struct.__class__()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this? The constructor for the dict subtype? Will this work if it's a immutable type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. If it's a subtype of dict it should always be mutable.

@Tarrasch
Copy link
Contributor

Could you give me a hint how to check what went wrong?

I just needed to rerun travis on that. Sorry never your fault.

@riga
Copy link
Contributor Author

riga commented May 25, 2017

Can you write a test case using an in immutable type please?

Should be covered with the tuple test, right?
https://github.com/spotify/luigi/pull/2113/files#diff-7fe10b1a62e793e1535a8477614c1b30R96

I can add more tests if you want.

Copy link
Contributor

@Tarrasch Tarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant for you to check that you can have a immutable dict-type/Mapping, but maybe that's overkill for now. This patch is still a strict improvement..

self.assertIsInstance(struct["OrderedDict"], collections.OrderedDict)
self.assertIsInstance(struct["list"], list)
self.assertIsInstance(struct["tuple"], tuple)
self.assertIsInstance(struct["generator"], list)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably only check for being iterable here.

for k, v in six.iteritems(struct):
r[k] = getpaths(v)
return r
return struct.__class__((k, getpaths(v)) for k, v in six.iteritems(struct))
Copy link
Contributor Author

@riga riga May 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I assumed that even immutable mappings (such as a "frozendict"-like mapping) shares the functionality of a standard dict to take a list/generator of tuples in the constructor.

AFAIK there isn't a builtin immutable mapping, but people could use implementations such as frozendict or this implementation, which should work here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this would crash if you use a collections.Counter. But if you use that for this purpose, you totally deserve it. ;p

To be clear. This code is better, there's no need for a fix. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is better, there's no need for a fix. :)

You mean

for k, v in six.iteritems(struct):
    r[k] = getpaths(v)
    return r

? Sure, I can revert that commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think the current code is better. If I have to choose between supporting custom frozen dicts or collections.Counter. I would pick the former any day (for this instance)

@Tarrasch Tarrasch merged commit 20bd7b0 into spotify:master May 26, 2017
@Tarrasch
Copy link
Contributor

@riga, I merged this, but took the liberty to greatly reword the commit message. I hope you don't mind.

Thanks for this patch! :)

This was referenced Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants