Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: New AssemblyName-like type for TypeName parsing #100867

Closed
adamsitnik opened this issue Apr 10, 2024 · 14 comments · Fixed by #100094
Closed

[API Proposal]: New AssemblyName-like type for TypeName parsing #100867

adamsitnik opened this issue Apr 10, 2024 · 14 comments · Fixed by #100094
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Reflection.Metadata
Milestone

Comments

@adamsitnik
Copy link
Member

adamsitnik commented Apr 10, 2024

Background and Motivation

While working on the implementation of TypeName API (#97566), it turned out that it's impossible to set AssemblyName.CultureName without allocating a CultrueInfo instance (using public NS2.0 compatibile APIs).

set => _cultureInfo = (value == null) ? null : new CultureInfo(value);

The type name parser needs to support parsing from untrusted input and such non-predefined culture name may be used as a vector of attack, so the existing AssemblyName is not suitable for our needs.

Since AssemblyName suffers from some other design issues (mutability, a LOT of obsolete properties), I would like to introduce a new, immutable, lightweight version of AssemblyName type.

Proposed API

It's a concept and I need the help of API review board to choose the right name. Some of my ideas: ReadOnlyAssemblyName, AssemblyId, LibraryName.

namespace System.Reflection.Metadata;

public sealed class AssemblyNameInfo : IEquatable<AssemblyNameInfo>
{
    public AssemblyNameInfo(string name, Version? version = null, string? cultureName = null, AssemblyNameFlags flags = AssemblyNameFlags.None,
   Collections.Immutable.ImmutableArray<byte> publicKeyOrToken = default);
    public string Name { get; }
    public string FullName { get; }
    public Version? Version { get; }
    public string? CultureName { get; }
    public AssemblyNameFlags Flags { get; }
    public ImmutableArray<byte> PublicKeyOrToken { get; }
    
    public AssemblyName ToAssemblyName();
    
    public static AssemblyNameInfo Parse(ReadOnlySpan<char> assemblyName);
    public static bool TryParse(ReadOnlySpan<char> assemblyName, [NotNullWhenAttribute(true)] out AssemblyNameInfo? result);
}

The approved TypeName API needs following changes:

public sealed partial class TypeName : IEquatable<TypeName>
{
-    public string AssemblySimpleName { get; }
-    public System.Reflection.AssemblyName? GetAssemblyName();
+    public AssemblyNameInfo? AssemblyName { get; }
}

Usage Examples

Building high-level, opinionated parser on top of the low-level API:

AssemblyName GetAssemblyName(string unsafeInput)
{
    if (!AssemblyNameInfo.TryParse(unsafeInput, out AssemblyNameInfo info))
    {
        throw new ArgumentException("Invalid name!");
    }

    if (info.Name.AsSpan().IndexOfAny(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar) >= 0)
    {
        throw new InvalidOperationException("AssemblyName must not contain path characters.");
    }

    // the following line throws CultureNotFoundException for non-predefined cultures
    _ = CultureInfo.GetCultureInfo(info.CultureName, predefinedOnly: true);

    return info.ToAssemblyName();
}

Alternative Designs

The proposed design has one property PublickKeyOrToken that can express the public key or its token. It's a design choice inherited from AssemblyName that uses AssemblyNameFlags.PublicKey flag to express what given byte array is.

We could have two dedicated properties and ctor arguments to express that: PublicKey and PublicKeyToken.

Risks

Providing low quality design may require introducing another AssemblyName-like type in the future.

@adamsitnik adamsitnik added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Reflection.Metadata labels Apr 10, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
See info in area-owners.md if you want to be subscribed.

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Apr 10, 2024
@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Apr 10, 2024
@adamsitnik adamsitnik self-assigned this Apr 10, 2024
@teo-tsirpanis
Copy link
Contributor

How about naming it AssemblyNameInfo? Similar to TimeZone and TimeZoneInfo.

@jkotas
Copy link
Member

jkotas commented Apr 10, 2024

internal AssemblyName2();

Should this have a public constructor that allows you to set all constituent parts?

@jkotas
Copy link
Member

jkotas commented Apr 10, 2024

AssemblyName ToAssemblyName();

Should this have a FromAssemblyName method too?

@MichalStrehovsky
Copy link
Member

If you're looking for code that could use this, I'd want to switch the managed type system to this AssemblyName:

public interface IAssemblyDesc
{
/// <summary>
/// Gets the assembly name.
/// </summary>
AssemblyName GetName();
}
}

We're using AssemblyName right now and it's probably the only reason why we cannot turn on culture invariant mode on tools that use the managed type system. AssemblyName is unusable when culture invariant mode is enabled due to the cultureinfo.

@adamsitnik
Copy link
Member Author

How about naming it AssemblyNameInfo? Similar to TimeZone and TimeZoneInfo.

Great idea, I am going to use it as the name in the proposal.

Should this have a public constructor that allows you to set all constituent parts?

Not in my use case, but it makes total sense to have it.

Should this have a FromAssemblyName method too?

Not in my use case and a very similar internal type from NativeAOT RuntimeAssemblyName also did not need that, so for now I would say no.

If you're looking for code that could use this, I'd want to switch the managed type system to this AssemblyName:

Great!

@adamsitnik adamsitnik added blocking Marks issues that we want to fast track in order to unblock other important work api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Apr 11, 2024
@adamsitnik
Copy link
Member Author

adamsitnik commented Apr 12, 2024

I've tried to implement the API today and found out following things:

  1. If the ctor accepts a byte array, then the user may mutate that and in order to create an immutable array, we need to perform a copy. But, if the ctor accepts an immutable array, the users can use ImmutableCollectionsMarshal.AsImmutableArray(array), pass the result to the ctor and avoid the extra copy.
  2. When AssemblyName.FullName is called and a public key is present, a validation is being performed and it may throw SecurityException. I believe it's better to validate it in the ctor and throw ArgumentException
    public string FullName
    {
    get
    {
    if (string.IsNullOrEmpty(this.Name))
    return string.Empty;
    // Do not call GetPublicKeyToken() here - that latches the result into AssemblyName which isn't a side effect we want.
    byte[]? pkt = _publicKeyToken ?? AssemblyNameHelpers.ComputePublicKeyToken(_publicKey);
    return AssemblyNameFormatter.ComputeDisplayName(Name, Version, CultureName, pkt, Flags, ContentType);

    if (!IsValidPublicKey(publicKey))
    throw new SecurityException(SR.Security_InvalidAssemblyPublicKey);
  3. When AssemblyName.FullName is called and a public key token is present, a validation of the length is being performed and it may throw ArgumentException. I believe it's better to validate it in the ctor.
    return AssemblyNameFormatter.ComputeDisplayName(Name, Version, CultureName, pkt, Flags, ContentType);

    if (pkt.Length > PUBLIC_KEY_TOKEN_LEN)
    throw new ArgumentException();
  4. When AssemblyName.GetPublicKeyToken is called, it may compute the token if public key is present. Since we want to make it immutable this work needs to be performed at ctor.
    public byte[]? GetPublicKeyToken() => _publicKeyToken ??= AssemblyNameHelpers.ComputePublicKeyToken(_publicKey);

Please let me know if you believe that my observations are wrong.

@jkotas
Copy link
Member

jkotas commented Apr 14, 2024

This type should not be doing any validation of the PublicKey or PublicKeyToken; or try to compute PublicKeyToken from PublicKey. It should treat the PublicKeyToken/PublicKey as an opaque blob. (BTW: We have been going in this direction for the assembly loader in modern .NET. Strong names and strong name mismatches are ignored by the assembly loader, and the publickey/publickeytoken are just treated as opaque blobs that are passed around for backward compatibility.)

We may want to have just a single property PublicKeyOrToken and use AssemblyNameFlags.PublicKey to determine whether it is key or token.

The existing behavior of AssemblyName.FullName that converts PublicKey to PublicKeyToken is questionable. It is perfectly fine to have full assembly name with public key. I do not think we want copy this behavior in this type. The full name should match what you have constructed the AssemblyNameInfo with, without any conversions.

@MichalStrehovsky
Copy link
Member

How would be go about implementing IEquatable<AssemblyNameInfo> in the presence of PublicKey/PublicKeyToken? If we have two AssemblyNameInfos for the same assembly, but one using PublicKey and the other using PublicKeyToken, would they not be equal? Or would we ignore public key in equality checks? (And if we ignore it, do we have any concerns when S.R.Metadata is used on desktop, since it still ships as netstandard?)

@jkotas
Copy link
Member

jkotas commented Apr 15, 2024

I would expect that IEquatable<AssemblyNameInfo> would look for exact match of all components (the name can be compared as case insensitive).

IEquatable<AssemblyNameInfo> would not be usable for ref/def binding. As you have pointed out, the .NET universe has number of different implementations of ref/def bindings.

@bartonjs
Copy link
Member

bartonjs commented Apr 16, 2024

Video

  • An AssemblyNameInfo(AssemblyName) ctor was proposed, but decided it was not needed.
  • We talked about adding other properties from AssemblyName that are representable in the textual form, but decided they can be added later.
  • We removed the IEquatable-ness from the type and from TypeName. When we figure out what the semantics should be we can add it back.
namespace System.Reflection.Metadata;

public sealed class AssemblyNameInfo
{
    public AssemblyNameInfo(string name, Version? version = null, string? cultureName = null, AssemblyNameFlags flags = AssemblyNameFlags.None,
   Collections.Immutable.ImmutableArray<byte> publicKeyOrToken = default);
    public string Name { get; }
    public string FullName { get; }
    public Version? Version { get; }
    public string? CultureName { get; }
    public AssemblyNameFlags Flags { get; }
    public ImmutableArray<byte> PublicKeyOrToken { get; }
    
    public AssemblyName ToAssemblyName();

    public static AssemblyNameInfo Parse(ReadOnlySpan<char> assemblyName);
    public static bool TryParse(ReadOnlySpan<char> assemblyName, [NotNullWhenAttribute(true)] out AssemblyNameInfo? result);
}
-public sealed partial class TypeName : IEquatable<TypeName>
+public sealed partial class TypeName
{
-    public string AssemblySimpleName { get; }
-    public System.Reflection.AssemblyName? GetAssemblyName();
+    public AssemblyNameInfo? AssemblyName { get; }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed blocking Marks issues that we want to fast track in order to unblock other important work api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 16, 2024
@adamsitnik adamsitnik added this to the 9.0.0 milestone Apr 17, 2024
@Neme12
Copy link

Neme12 commented Apr 20, 2024

I really think TypeName should be TypeNameInfo to match AssemblyNameInfo for consistency if that's what AssemblyNameInfo will be called. (Not that I think TypeNameInfo is a great name - but AssemblyNameInfo isn't either).

@Neme12
Copy link

Neme12 commented Apr 20, 2024

Looking at AssemblyNameInfo, it seems it's only a bag of properties with a constructor with parameters that match each of the property. Should this be a record with init properties then? That has the benefit that users will be able to use with to get another instance with one property changed. But that would require IEquatable<T> and also wouldn't allow any validation across multiple properties, if there is anything like that.

@Neme12
Copy link

Neme12 commented Apr 20, 2024

I really think the types should implement IParsable<TSelf> since they already have Parse methods. Why not? They already have the functionality, so it's not any maintenance burden to just add the interface. My view is that a class should always implement standard interfaces corresponding to the methods it already provides, so that it can be used by someone when they're working with the interface or in a generic context. Otherwise it can't be since interfaces aren't duck typed in C#. Please consider taking this viewpoint and always adding standard interfaces corresponding to the functionality the class already provides. The same way that if a type has an Equals(T other) method, it should implement IEquatable<T>, or if it has a Dispose() method, it should implement IDisposable (unless it's a ref struct of course).

EDIT: I do admit that it doesn't exactly match the interface because it also takes an IFormatProvider and a format string, but .NET does this in many places, including on core types like bool, where there are no formatting options so the public method that is exposed only takes the input, and it implements the interface explicitly, ignoring the format string (although I personally think it should throw if the format string is not empty, because it won't be respected anyway, and because we might have the possibility of adding a few formatting options in the future via the format string (e.g. regular vs strict by r and s characters), and if it already works and always does the default thing, then we can't do that as that would be a big breaking change). EDIT2: Scratch that, I was wrong, IParsable doesn't take a format string, that's only the case for IFormattable. It only takes a culture, which is fine to ignore.

@github-actions github-actions bot locked and limited conversation to collaborators May 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Reflection.Metadata
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants