-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove BinaryFormatter from GetSDKReferenceFiles #6324
Remove BinaryFormatter from GetSDKReferenceFiles #6324
Conversation
I would hardcode the translation logic for each type T that we need, and just write the count of key-value pairs, followed by each key and value separately. |
Can you explain why we need the counts? I haven't looked at how complete our test coverage of this task was, but it seems to pass whatever tests we previously had without it. |
I might be not understanding it well, but how are you going to read the keys and values if you don’t know how many there are? |
Looks like the answer is: there weren't tests. Makes it easy to pass them 😋 I was imagining something like looking for closing elements. In json, for instance, they have { "key1": "val1", "key2": "val2" }. You don't need a count because the opening and closing braces match. I don't know if there's an equivalent here, but I now think it just takes references and gets or sets them based on what sort of data you ask for, so since the dictionary would have started out full in serializing, it would serialize properly. Then, in deserialization, the dictionary would be null, and it would throw an exception. If I'd initialized the dictionary, it would still be empty and remain so. I added a count. I should probably add tests. |
src/Tasks/GetSDKReferenceFiles.cs
Outdated
@@ -1205,7 +1194,7 @@ private static IEnumerable<string> GetAllReferenceDirectories(string sdkRoot) | |||
/// </summary> | |||
/// <remarks>This is a serialization format. Do not change member naming.</remarks> | |||
[Serializable] | |||
private class SdkReferenceInfo | |||
internal class SdkReferenceInfo | |||
{ | |||
/// <summary> | |||
/// Constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're getting rid of useless summaries, might as well delete this one.
|
||
/// <summary> | ||
/// Dictionary which maps a directory to a list of file names within that directory. This is used to shortcut hitting the disk for the list of files inside of it. | ||
/// </summary> | ||
public ConcurrentDictionary<string, List<string>> DirectoryToFileList { get; } | ||
public ConcurrentDictionary<string, List<string>> DirectoryToFileList { get { return directoryToFileList; } } | ||
|
||
/// <summary> | ||
/// Hashset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove useless summary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has the bonus of deleting a summary that isn't actually accurate. Turns out if a mapping isn't in the dictionary, it doesn't look for it; it just assumes it isn't there.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
public SDKInfo(ITranslator translator) : this() | ||
{ | ||
Translate(translator); | ||
} | ||
|
||
/// <summary> | ||
/// Constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove useless summary
src/Tasks/GetSDKReferenceFiles.cs
Outdated
} | ||
|
||
/// <summary> | ||
/// A dictionary which maps a file path to a structure that contain some metadata information about that file. | ||
/// </summary> | ||
public ConcurrentDictionary<string, SdkReferenceInfo> PathToReferenceMetadata { get; } | ||
internal ConcurrentDictionary<string, SdkReferenceInfo> PathToReferenceMetadata { get { return pathToReferenceMetadata; } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this changed from public to internal, but other properties untouched?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this actually mattered, so I reverted it. See next comment for fiddling.
} | ||
} | ||
|
||
/// <summary> | ||
/// This class represents the context information used by the background cache serialization thread. | ||
/// </summary> | ||
private class SaveContext | ||
internal class SaveContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this need to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to fiddle with a lot of these to make them accessible to the test class. Then I downgraded a couple because properties were more visible than their containing class, which isn't really good style, but the latter wasn't necessary. As long as something prevents it from being public/protected, I don't think it particularly matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I hadn't reviewed the test so I didn't see any need for it.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
string[] keys = dictionary.Keys.ToArray(); | ||
for (int i = 0; i < count; i++) | ||
{ | ||
if (i < keys.Length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i < keys.Length once, it won't ever change right? For better at-a-glance-understanding™, I'd suggest creating some bool translatingOut = i < keys.Length
and using that bool. Or place a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. I put the if statement first, which should technically be more performant, too, though not in a particularly meaningful way.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
for (int i = 0; i < count; i++) | ||
{ | ||
translator.Translate(ref keys[i]); | ||
T value = dictionary[keys[i]]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the dictionary change while the loop runs? If a value is removed, this may throw. Do you need to lock? Also perhaps its better to enumerate key value pairs to avoid looking every key up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No—before this PR, there was only the property, and it had a getter but no setter, so it really shouldn't be a ConcurrentDictionary. It's been a ConcurrentDictionary since the initial code commit. Do you want me to change it to an ordinary dictionary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
up to you, if it feels safe
src/Tasks/GetSDKReferenceFiles.cs
Outdated
int count = dictionary.Count; | ||
translator.Translate(ref count); | ||
string[] keys = dictionary.Keys.ToArray(); | ||
if (keys.Length == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this effectively determining the translation direction (read/write)? If so, is there a more explicit way, perhaps asking the translator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like ITranslator has a TranslationDirection Mode
we can take advantage of here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok as is but I'm pretty curious if we can avoid the ConcurrentDictionary/Dictionary stuff.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
@@ -1246,63 +1218,79 @@ public SdkReferenceInfo(string fusionName, string imageRuntime, bool isWinMD, bo | |||
/// Structure that contains the on disk representation of the SDK in memory. | |||
/// </summary> | |||
/// <remarks>This is a serialization format. Do not change member naming.</remarks> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// <remarks>This is a serialization format. Do not change member naming.</remarks> |
Stale
src/Tasks/GetSDKReferenceFiles.cs
Outdated
@@ -1023,7 +1012,7 @@ internal SDKInfo GetCacheFileInfoFromSDK(string sdkRootDirectory, string[] sdkMa | |||
|
|||
PopulateRedistDictionaryFromPaths(directoryToFileList, redistDirectories); | |||
|
|||
var cacheInfo = new SDKInfo(references, directoryToFileList, FileUtilities.GetPathsHash(directoriesToHash)); | |||
var cacheInfo = new SDKInfo(references.ToDictionary(kvp => kvp.Key, kvp => kvp.Value, StringComparer.OrdinalIgnoreCase), directoryToFileList.ToDictionary(kvp => kvp.Key, kvp => kvp.Value, StringComparer.OrdinalIgnoreCase), FileUtilities.GetPathsHash(directoriesToHash)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a perf impact of this synchronization? Could you instead store an IDictionary
in the SDKInfo fields and on creation it's ConcurrentDictionary
/on deserialization it's a plain dictionary?
src/Tasks/GetSDKReferenceFiles.cs
Outdated
private Dictionary<string, SdkReferenceInfo> pathToReferenceMetadata; | ||
private Dictionary<string, List<string>> directoryToFileList; | ||
private int hash; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our/runtime's standard naming convention would make these:
private Dictionary<string, SdkReferenceInfo> pathToReferenceMetadata; | |
private Dictionary<string, List<string>> directoryToFileList; | |
private int hash; | |
private Dictionary<string, SdkReferenceInfo> _pathToReferenceMetadata; | |
private Dictionary<string, List<string>> _directoryToFileList; | |
private int _hash; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
info.IsWinMD = isWinmd; | ||
}, count => new Dictionary<string, SdkReferenceInfo>(count, StringComparer.OrdinalIgnoreCase)); | ||
|
||
translator.TranslateDictionary(ref _directoryToFileList, (ITranslator t, ref string s) => t.Translate(ref s), (ITranslator t, ref List<string> fileList) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny that the duplicate translation worked because the same method handles input/output. Good thing it's not a huge concern and fairly easy to catch.
I wonder if a certain type of test would catch this? (not blocking on this thought)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but it should work as long as it matches. It serialized the dictionary twice, then deserialized it twice. Setting it the second time overwrote the correct value with the correct value, so everything still worked.
That also means I think it would be very hard to write a test for it...the only two options I can think of are having some kind of counter on how many times we accessed/set each variable or having a check at the end for the total size of the serialized form. Neither is trivial, and it isn't a huge deal for something that gets hit as often as this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting it the second time overwrote the correct value with the correct value, so everything still worked.
I see, I was thinking maybe there'd be duplicate data somehow. Agreed that it's fine as is.
I think this is almost right, but TranslatorHelpers needs System.Collections.Concurrent on .NET 3.5, or I need to be able to translate a concurrent dictionary without translating it. (?) Any idea on how I can get around that?