Skip to content

Conversation

@xianwen
Copy link
Contributor

@xianwen xianwen commented Sep 14, 2019

Compare the timestamps of target Unit file and the source Unit file, and skip the Unit remapping and writing if the target Unit file is newer than the source Unit file.

This PR addresses #6

Copy link
Contributor

@kastiglione kastiglione left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xianwen! I have a couple inline comments.

Can you format with clang-format?

index-import.cpp Outdated
StringRef inputFile,
IndexUnitWriter &writer) {
std::string error;
auto IsUptodateOpt = writer.isUnitUpToDateForOutputFile(outputFile, inputFile, error);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this file/project variables start with a lower case letter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

index-import.cpp Outdated

// Check if the unit file is already up to date
SmallString<256> outputFileFullPath;
path::append(outputFileFullPath, workingDir, outputFile);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is outputFile guaranteed to be relative? I think it can be absolute, I'll have to check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case, we'd want to use the output file path before it gets remapped. For our builds, we don't copy the .o files, but we remap because that's what Xcode uses to lookup records in the index.

Maybe we should check both the path before and after remapping?

Copy link
Contributor Author

@xianwen xianwen Sep 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kastiglione, thanks for your comment!

Is outputFile guaranteed to be relative? I think it can be absolute, I'll have to check.

You're right, it can be absolute. I'll update this part.

For our builds, we don't copy the .o files, but we remap because that's what Xcode uses to lookup records in the index.

We need to use the remapped output file path even though the file wouldn't exist. This outputFileFullPath here is only used by isUnitUpToDate to calculate the Unit file name and therefore determine whether the Unit file is already up to date. Even though the ".o" file doesn't exist, we still need to calculate the Unit file path based on the remapped path. Because as you mentioned, that's what Xcode uses to lookup the Unit file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rename outputFileFullPath to remappedOutputFilePath to make it clearer.

Copy link
Contributor

@kastiglione kastiglione Sep 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks. I misinterpreted this logic. Carry on :)

@kastiglione
Copy link
Contributor

I think I have a need for this now. @xianwen do you plan to pick this back up any time soon?

@xianwen
Copy link
Contributor Author

xianwen commented Jan 2, 2020

Hi, @kastiglione, happy new year!
Yes, I'll take a stab at addressing this these few weeks.

@xianwen
Copy link
Contributor Author

xianwen commented Jan 7, 2020

Hi, @kastiglione:
All comments are addressed. PTAL when you got a chance.
Thanks!

@kastiglione
Copy link
Contributor

@xianwen looks like you'll need to rebase before we can merge.

@kastiglione
Copy link
Contributor

Thanks for coming back to it.

@xianwen
Copy link
Contributor Author

xianwen commented Jan 7, 2020

@kastiglione, merge conflict resolved :)

@kastiglione
Copy link
Contributor

I'll try this out on Wednesday.

@kastiglione
Copy link
Contributor

ok, looking at this today!

@kastiglione
Copy link
Contributor

I believe this may not help all use cases, particularly for Bazel and Buck. It depends how you integrate with Xcode.

Since the timestamp check is based on the outputFile, which is the path to the Xcode object file. This is where Xcode puts the object file that corresponds to any given source file. However, Bazel (and I presume Buck) do not write their object files into DerivedData.

For this to work, external build system integrations would have to copy object files from their build directories, into Xcode's build directory (ie DerivedData).

What do you think about instead comparing the time stamps of the input unit files, to the outputs. If the input unit file is not newer than an existing file at the output path, then we could skip writing reading->remapping->writing that unit file. This would improve performance for incremental imports.

@kastiglione
Copy link
Contributor

also very sorry this took me forever to reply too, I love the idea

@brentleyjones
Copy link
Contributor

Since the timestamp check is based on the outputFile, which is the path to the Xcode object file. This is where Xcode puts the object file that corresponds to any given source file. However, Bazel (and I presume Buck) do not write their object files into DerivedData.

The timestamp check is actually just based on the two unit files. The output filename is only used to generate the destination unit filename. The timestamp check is based on the two unit files:

https://github.com/apple/swift-clang/blob/b4c4d68d76e1432032a930efade53a406e1a3c70/lib/Index/IndexUnitWriter.cpp#L219-L253

I ran this locally and it reduced a 22 second import to a 10 second import.

@kastiglione
Copy link
Contributor

thanks for clarifying that @brentleyjones. I misinterpreted that API, the name alone sounds like the object file is involved.

@brentleyjones
Copy link
Contributor

Inlining the check, and not creating a IndexUnitWriter, reduced the time down to 5 seconds for me. About 2 seconds for cloning records and 3 for checking timestamps.

@kastiglione
Copy link
Contributor

kastiglione commented Apr 30, 2020

that's great, when you showed the numbers my instinct was that it could be optimized to avoid more work

@kastiglione
Copy link
Contributor

@DavidGoldman has also talked parallelizing along unit files instead of along indexstores. That would also help the perf.

@kastiglione
Copy link
Contributor

seems like it's safe to merge this then

@segiddins
Copy link

@kastiglione thoughts on getting this merged / released?

@kastiglione
Copy link
Contributor

@segiddins I'm going to review #46 first, I believe it subsumes this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants