Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse .dump-hi files of library components only once #2658

Merged
merged 1 commit into from
Oct 3, 2016

Conversation

sjakobi
Copy link
Member

@sjakobi sjakobi commented Oct 1, 2016

This fixes a performance regression introduced in 5f3ffe5.

@sjakobi
Copy link
Member Author

sjakobi commented Oct 1, 2016

With this change a null build of the amazonka project (78 packages) takes ~14s on my machine vs. ~22s with v1.2.0.

@@ -734,14 +734,14 @@ libraryFiles lib = do
resolveFilesAndDeps
Nothing
(dirs ++ [dir])
(names <> exposed)
names
Copy link
Member Author

@sjakobi sjakobi Oct 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line alone would be a sufficient fix (see names = bnames ++ exposed!) but I felt that Set is the better datastructure than [] in this case. Order isn't doesn't seem to be important and we don't want any duplicates.

I actually think that we should use Set instead of [] much more pervasively to prevent ourselves from wondering whether duplicates and order actually matter in each case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning for not having it be a Set is that there is cost to constructing the Set, and we always convert it from / to a list. It is semantically more correct, though, indeed!

@mgsloan
Copy link
Contributor

mgsloan commented Oct 2, 2016

I am fine with the use of Set, but are you sure it's worth it? Seems to me like it's just extra work at runtime. It does encode more invariants (no duplicates), but I it will cost some performance for construction / enumeration. Why not just keep it lists?

@sjakobi sjakobi force-pushed the less-dump-hi-parsing branch from 4ea9fbf to 28b012e Compare October 3, 2016 22:00
This fixes a performance regression introduced in
5f3ffe5.
@sjakobi sjakobi force-pushed the less-dump-hi-parsing branch from 28b012e to fe20f8d Compare October 3, 2016 22:08
@sjakobi
Copy link
Member Author

sjakobi commented Oct 3, 2016

I have undone the switch to Set for now, will move the discussion on Set vs [] to a separate issue.

@mgsloan
Copy link
Contributor

mgsloan commented Oct 4, 2016

Thanks, glad this is fixed! :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants