-
Notifications
You must be signed in to change notification settings - Fork 281
Prevent overflow for self-referencing schemas #185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I'll take a look at the code. Here's my comment based on the PR description: Would it be better if we handle this with the reader/walker/etc. instead of at the DOM level? The DOM having cycles should remain valid -- an object should be able to refer to itself, just like normal objects in C#. Whoever is operating on the DOM (reader/walker/etc.) should be able to stop the infinite loop once it is detected. In this case, the walker can memorize the node it has visited and make sure it does not visit the duplicate node. The reader can do something similar. I might be missing nuances here, so I'll take a look at the code and I might see the limitations with the method above. #Closed |
| var openApiDoc = new OpenApiStreamReader().Read(stream, out var diagnostic); | ||
|
|
||
| // Assert | ||
| var components = openApiDoc.Components; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var components = openApiDoc.Components; [](start = 16, length = 39)
Let's also check that components is parsed correctly. #Closed
|
|
||
|
|
||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Location-wise I think this should go into OpenApiComponentTests (which doesn't exist yet) - also with the yaml not containing sections other than components. #Closed
|
Ah. My mistake. I read your code and understand now what you are trying to do. I still think we should allow the graph to have a cycle, so let me think about how to do that for a bit. In reply to: 359091511 [](ancestors = 359091511) |
|
One way I can think of: Post-process the schema section in the deserialization of Components. After all the schemas are loaded, recursively delve into the schemas (into the fields in Schema object that are also a Schema object) and ensure that they reference the schemas in the Components.Schemas map. Essentially, we change a schema we see (possible just a dummy schema with $ref you created) to the real schema. This is possible since all schemas have already been loaded at this point. #Closed |
|
@PerthCharern Yeah, doing a post process to hook up the cycles will be my fallback approach. I'll see if I can find a cleaner way first. #Closed |
|
@PerthCharern After failing to find a solution to resolve these $ref on first pass, I'm thinking that using a second pass to resolve the $refs might actually address another concern. In another branch I have been working on resolving $refs to external documents and have the problem that external $refs need to be loaded asynchronously. Perhaps reading can be a three step process:
I'm not sure how to expose the API for this, but at least knowing that it could help with resolving external references gives me more confidence in breaking this down into multiple steps |
|
@darrelmiller That sounds good to me. I don't have any concerns with doing multiple passes to complete the reading process. Regarding exposing these as APIs, I am afraid I don't have much input for you now. One thing I can think of is we should expose the API to resolve external references independently since that sounds pretty useful as a standalone feature. Given that the logic would be similar regardless of what we expose, I think we can also discuss once you send out a PR. #Closed |
|
** NEW BEHAVIOUR ** Ok, this PR is finally ready for review.
Remote references are not yet supported. A new branch has been started to support that. There will need to be a new ReadAsync method to support resolving remote references. /cc @PerthCharern @xuzhg |
| /// <summary> | ||
| /// Indicates object is a placeholder reference to an actual object and does not contain valid data. | ||
| /// </summary> | ||
| public bool UnresolvedReference { get; set; } = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
= false; [](start = 54, length = 8)
nit, remove #Closed
|
|
||
| if (!reference.Type.HasValue) | ||
| { | ||
| throw new ArgumentException("Local reference must have type specified."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Local reference must have type specified." [](start = 44, length = 43)
nit, This should be a resource for consistency #Closed
|
|
||
| /// <summary> | ||
| /// Visits responses. | ||
| /// Visits headers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
headers [](start = 19, length = 7)
callbacks
| { | ||
| foreach (var header in headers) | ||
| { | ||
| Walk(header.Key.Replace("/", "~1"), () => Walk(header.Value)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.Replace("/", "~1") [](start = 35, length = 19)
nit, I thought we already handled this somewhere else. If not, can we do this in the VisitorBase.Enter, given that this is a JSON pointer requirement.
| /// <summary> | ||
| /// Convert all references to references of valid domain objects. | ||
| /// </summary> | ||
| ResolveRemoteReferences |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ResolveRemoteReferences [](start = 8, length = 23)
either ResolveAllReferences or ResolveLocalAndRemoteReferences #Closed
| switch (_settings.ReferenceResolution) | ||
| { | ||
| case ReferenceResolutionSetting.ResolveRemoteReferences: | ||
| throw new ArgumentException(Properties.SRResource.CannotResolveRemoteReferencesSynchronously); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throw new ArgumentException(Properties.SRResource.CannotResolveRemoteReferencesSynchronously); [](start = 20, length = 94)
This will be supported, right? Can we just go with NotImplementedException()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will be handled. If you want to resolve remote references then you should call the ReadAsync method instead. The ReadAsync will also load related documents into a workspace. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| walker.Walk(document); | ||
| break; | ||
| case | ||
| ReferenceResolutionSetting.DoNotResolveReferences: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, remove this awkward linebreak :)
| /// Reset loop tracking stack | ||
| /// </summary> | ||
| /// <param name="loopid"></param> | ||
| internal void ClearLoop(string loopid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
loopid [](start = 39, length = 6)
nit, loopId #Closed
| /// Exit from the context in cycle detection | ||
| /// </summary> | ||
| /// <param name="loopid"></param> | ||
| public void PopLoop(string loopid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
loopid [](start = 35, length = 6)
nit, loopId
| /// </summary> | ||
| /// <param name="loopId">Any unique identifier for a stack</param> | ||
| /// <param name="key">Identifier used to </param> | ||
| /// <returns></returns> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// [](start = 7, length = 24)
nit, you can just remove these non-filled tags #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, for this one, you might want to fill since it's not that obvious what the returned bool represents.
In reply to: 167710868 [](ancestors = 167710868)
| Walk(OpenApiConstants.RequestBody, () => Walk(operation.RequestBody)); | ||
| Walk(OpenApiConstants.Responses, () => Walk(operation.Responses)); | ||
| Walk(OpenApiConstants.Callbacks, () => Walk(operation.Callbacks)); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Walk the Tags array too? #Closed
| ResolveMap(components.Callbacks); | ||
| ResolveMap(components.Examples); | ||
| ResolveMap(components.Schemas); | ||
| ResolveMap(components.SecuritySchemes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be the job of the walker to do this "walking"?
Shouldn't this file just do reference-resolving for the object that's actually a reference and then leave the logic of getting into that object to the Walker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish I could do that. However, we replace the "unresolved" object that is empty, except for the Reference, with the completely filled object. Therefore, I need to detect the reference from the parent object that holds the pointer to the unresolved object so that we can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true. It's kinda unfortunate since that means we can't fully leverage the Walking capability.
If we revise the Walk method design to take in another variable (in this case the parent object), and then it can make changes to that variable, then that would work. But that's a big change, and it doesn't seem clean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PerthCharern Yes, the walker is not ideal in this scenario. However, it is expected to use the parent object to do certain types of work. e.g. Checking if a value is not null, needs to be done from above. Checking if there are no duplicate parameter names. Checking if two peer values are not contradicting each other.
Passing in the parent object is hard because you don't know its type. It would need to be passed in as an object and then cast to each of the possible parent types. In the end I think it is just as easy to do certain contextually sensitive work from the parent visit method.
| {s => s.StartsWith("x-"), (o, p, n) => o.AddExtension(p, n.CreateAny())} | ||
| }; | ||
|
|
||
| private const string schemaLoopId = "schema"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private const string schemaLoopId = "schema"; [](start = 6, length = 47)
remove
| // } | ||
| //}; | ||
|
|
||
| //schemaExtension.AllOf[0].Properties["child"] = schemaExtension; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fail the unit test due to the loop? If so, you can set xUnit to only delve down until specified level just for this test #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't figure out why it was failing. I kept getting an error that made no sense. Anyway, I just tried running it again and it is working now. #Resolved
PerthCharern
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🕐
PerthCharern
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
![]()
Fix for #182
UPDATE Major update to this PR. See below
This fix does address the problem in the parser. However, it produces a bit of an unusual DOM and I'm not sure this is what we want.
We detect the cycle whilst constructing the dom object, so when we discover there is a cycle we don't have direct access to the original object we are creating. What this currently does is create a second instance of the self-referenced object and the second instance contains just an OpenApiReference back to the original object. This means the object graph itself doesn't have a cycle in it. This is handy so that the Walker doesn't get stuck in loops. The reader doesn't have a problem because it only fully explodes objects that live under $/components/schemas so it doesn't even try to render the second instance of the object.
It does mean that someone could construct a graph manually with a cycle in it, that would break the walker. Again I think the reader would survive.
I will open an issue regarding the walker failing for graphs with cycles and we should consider if the reader should actually try and create a cycle in the DOM.