Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stricter pointer/ref checks & public segment encoding/decoding methods #47

Closed
brettz9 opened this issue Dec 30, 2015 · 38 comments
Closed

Comments

@brettz9
Copy link
Contributor

brettz9 commented Dec 30, 2015

Two related issues:

  1. Regarding a proper JSON Reference, the JSON Reference spec states:
 If a URI contains a fragment identifier, then the fragment should be
   resolved per the fragment resolution mechansim of the referrant
   document.  If the representation of the referrant document is JSON,
   then the fragment identifier SHOULD be interpreted as a
   [JSON-Pointer].

The JSON Pointer spec, in turn, says about JSON Pointers:

   Since the characters '~' (%x7E) and '/' (%x2F) have a special meaning
   in JSON Pointer, they need to be encoded as '~0' and '~1'
   respectively, when appearing in a reference token.

I would think therefore, that when executing isPtr upon encountering any tildes not followed by 0 or 1, false should be given. Similarly when executing isRef, especially if we detect that the target document is indeed JSON. This will also have the consequence of potentially throwing for this reason in pathFromPtr.

The spec seems to make no mention of any requirement to do this, however:

Evaluation of each reference token begins by decoding any escaped
   character sequence; this is performed by first transforming any
   occurrence of the sequence '~1' to '/', then transforming any
   occurrence of the sequence '~0' to '~'.

...but I think that when one asks whether something is a pointer or reference, it is good to know whether it has had proper escaping.

2 .I also think that as the JsonRefs tool, you should make the segment encoding and decoding functions public.

@whitlockjc
Copy link
Owner

  1. Good point. We can do this.
  2. Agreed.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 30, 2015

Are you open to reopening this for the sake of the first of my two requests? If so, it would be nice to be able to avoid throwing an error in the case where tildes are not followed by 0 or 1 since it is not technically illegal, and I also have a specific use-case for this (using tildes to surround variables within JSON references which may be substituted at some point, e.g., locale).

@whitlockjc
Copy link
Owner

I thought I handled them. Also, I'm not throwing an error for tildes followed by something other than 0 or 1, just returning false.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 30, 2015

Sorry about that then--haven't looked too carefully yet.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 30, 2015

As far as the tildes and error though, what I mean is that since isPtr will return false for such entries, the pathFromPtr (and in turn findRefs) functions will fail (not to mention not give the user the option to find them).

@whitlockjc
Copy link
Owner

Good point. #getRefDetails is there to help diagnose busted references, as in ones that don't show up in #findRefs and related. We could make it an option to support invalid references when finding/resolving references so at least the metadata would be there to help identify busted references.

All of this being said, if you have an invalid reference, for any reason, finding/resolving references will not process them so it's not really a big change. I think if a user expects to find/resolve a reference and it's not happening, advising them to use #getRefDetails would suffice.

Right now, these were the changes related to closing this issue. Please confirm that I've addressed everything:

  • I exposed #encodePath and #decodePath (I thought about exposing #encodePtr and #decodePtr but the first would be impossible to support because you couldn't tell if / was a path segment delimiter or part of a path segment. Without the first, the second didn't seem as useful but I'm open to suggestions.)
  • I updated #isPtr to validate the JSON Pointer tokens
  • I updated #encodePath and #decodePath to encode/decode in the proper order

Let me know if there is more and we'll reopen this.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

encodePtr and decodePtr were even more in the spirit of what I was looking for, but I hear you about the ambiguity.

It could be handled as JS does with encodeURI/encodeURIComponent, where using encodePtr/encodePtrComponent would cause the latter to escape / and the former to not escape, though one must know which to use. (The decode equivalents would seem to always need to convert ~1 to /, and thus one would only need decodePtrComponent unless decodePtr accepted a string and returned a path--which would be weird). Maybe it is just best to let the user convert to and from paths and supply to these methods you provided. So I guess I'm ok for now without adding any new methods.

As far as getRefDetails, since one must know the location in advance for getRefDetails, I don't think it will meet my needs. One use case I have, for example, is to find all of these poorly formed references and do variable substitutions (http://example.com/~var1~/and/~var2~) to make them well-formed and then have them resolved. I specifically use tildes to do this since they are (at least in the present spec) safe from having other meanings and can potentially alert me when I have not yet substituted the variables.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

Further to the discussion of finding invalid pointers, I think isPtr itself (and relevant calling functions) ought to accept a second argument as to whether the treatment should be strict or not (since the spec seems to technically allow poorly formed pointers).

@whitlockjc
Copy link
Owner

So you want me to remove the exposed decodePath and encodePath? I'm not against it, just wanting to be clear.

As for needing to know to use #getRefDetails, I understand. The biggest issue I've got with some of these APIs is based on my needs, I am pretty good. If you need something else, letting me know is how we'll get it sorted. One thing we could do is update #findRefs to allow for a configuration to return metadata for JSON Reference-like objects that do not pass validation and to include the reasons for them being invalid. #getRefDetails would do some of this but like you say, knowing you need to use it is painful. (This might not be a simple thing to implement since I'll have to break out a #isPtr and #isRef to throw errors and allow trapping them for #findRefs, #resolveRefs and friends. Shouldn't be too hard.)

In the end, I just want #isPtr and #isRef to just return booleans and to not throw errors. As long as I can keep it like that but enhance #findRefs, since all other APIs use it, we should be good to go. Any idea on a good configuration option? options.strictMode comes to mind, which would default to true of course.

@whitlockjc
Copy link
Owner

If I understand this correctly, these will be the next steps:

  • Remove decodePath and encodePath as they aren't really useful as-is and the pointer versions aren't really possible
  • Update #findRefs to allow you to specify some "strict mode" to allow you to find potential JSON References that are invalid

@whitlockjc whitlockjc reopened this Dec 31, 2015
@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

As far as decodePath and encodePath, on further reflection, I actually like them as is. I was just talking about the idea of encodePtr, etc., which I no longer think is necessary due to the reasons you mention.

@whitlockjc
Copy link
Owner

Okay. So keep the new methods as-is. :) What about the configuration option to make #findRefs find invalid references?

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

Everything you said on isPtr/isRef and findRefs seems good to me. I think you should be able to avoid throwing errors in those methods by just accepting another argument as to whether to check in strict mode or not, no? That might be useful since, even in non-strict mode, one may wish to check that '#' or '/' was at the beginning, for example.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

And maybe just rename strictMode to strict to simplify and to avoid confusion with JS' strict mode.

@whitlockjc
Copy link
Owner

The more I think about it, options.includeInvalid or options.includeInvalidRefs would be more clear. I've got the implementation in mind already.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

ok, cool.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

You saw my other comment about accepting another argument to isPtr to loosen the rules for what is valid or not as far as tildes?

@whitlockjc
Copy link
Owner

Yup.

@whitlockjc
Copy link
Owner

As I'm implementing this, I'm not sure this is very useful. I can see how having #findRefs find references that aren't strictly valid so that you can identify them but making isPtr and isRef support a new strict flag to allow you to return true for invalid JSON Pointers/References doesn't seem useful. I don't know how you'd use it that couldn't be provided to you via #findRefs. Thoughts?

@whitlockjc
Copy link
Owner

I do realize that you could manually do something like:

if (!JsonRefs.isPtr(val) && JsonRefs.isPtr(val, false)) {
  // This is a JSON Pointer like value that is strictly invalid but structurally valid
}

If that's the use case you want to allow, I can make that happen. Heck, as I type this I think this is how I will need to do this for #findRefs anyways but is it worth exposing?

@whitlockjc
Copy link
Owner

The reason I ask is I'm not sure why you'd use this directly unless you were in the business of rewriting the logic of #findRefs. If #findRefs were updated to work as we mentioned above, to include reference-like values that fail strict validation, it could be enough. Well, that's my opinion. I'll wait to see what your thoughts are.

@brettz9
Copy link
Contributor Author

brettz9 commented Dec 31, 2015

I guess I was mistaken in my OP though in that the spec does state it being a requirement for escaping:

...'~' needs to be encoded as '~0' and '/'
   needs to be encoded as '~1' when these characters appear in a
   reference token.

Nevertheless, I still think that if we have a public function isPtr, that it makes sense to be able to check by something meeting our non-strict definition (assuming you agree that it is useful to leverage tildes for variable substitution, being no other safe mechanism available).

@whitlockjc
Copy link
Owner

What do you think about an argument that basically says "throw an Error with the details as to why the JSON Pointer is invalid"? That seems more useful and #findRefs could use for its enhancement to include invalid JSON References.

@brettz9
Copy link
Contributor Author

brettz9 commented Jan 1, 2016

Sure, as long as it is exhaustive or only reports the tilde errors if there are no other errors...

@whitlockjc
Copy link
Owner

I'll get something together and we can discuss then.

@whitlockjc
Copy link
Owner

Let me run this by you first. Basically, I think the real features we want are the ability to identify why a candidate is not a valid JSON Pointer/Reference and to make it so that #findRefs can include metadata for JSON Reference like objects that fail strict validation but are otherwise structurally sound.

Is that right?

The reason I ask is because we can make it so that #isPtr and #isRef can throw Error details upon request but otherwise work like they do now. #findRefs could have a configuration to include invalid _(as in failed strict validation)_JSON References. I'm thinking aloud here. The hope is that #isPtr or #isRef will work like it does now, but with a way to identify why a JSON Pointer/Reference like structure isn't valid so that #findRefs can work the way we want.

@brettz9
Copy link
Contributor Author

brettz9 commented Jan 3, 2016

Yeah, I think that is right--though maybe one more addition that would be nice would be a chance for resolveRefs to recover from poorly formed references into well-formed ones.

@whitlockjc
Copy link
Owner

I just re-read the JSON Pointer spec again and I'm not sure that a dangling ~ is an error condition. The reason being is ~ is valid for any unescaped JSON Pointer and it would be impossible, without some explicit argument/option/API, to tell if we were operating on an escaped or unescaped JSON Pointer. (Something we've never done before in any of the APIs)

That being said, I think we should remove more strict #isPtr/#isRef validation for now. I do still plan on making it where #findRefs could include invalid JSON References as that would be very useful. Thoughts?

I'll be in the json-refs Gitter channel today so we can chat there if need be. (I'll update this thread for posterity if need be.)

@whitlockjc
Copy link
Owner

I'm an idiot. Disregard the part about removing token validation. JSON Pointer values should always be encoded and decoding only happens at the time at which you evaluate the JSON Pointer value. That being said, it will stay in.

whitlockjc added a commit that referenced this issue Jan 4, 2016
`#isPtr` and `#isRef` now have a second optional argument that dictates
whether the function should throw an `Error` with the details as to why
the provided value failed validation.

See #47
@whitlockjc
Copy link
Owner

I have two commits associated with this issue. I think I've handled the scenarios we wanted. Here are the changes:

  1. You can now call #isPtr and #isRef with an optional second argument to have the reason as to why a provided value is not a JSON Pointer/Reference respectively thrown as an Error.
  2. #getRefDetails now will process JSON Reference like objects and include the reason the object fails validation in its response
  3. Any API that supports an options argument now supports options.includeInvalid which means you can find JSON Reference like objects that are not valid. (Of course, this has no impact on resolving references other than if you set options.includeInvalid to true, the reference metadata returned will have information about it.)

Please let me know if I missed something or if you think things should work differently.

@brettz9
Copy link
Contributor Author

brettz9 commented Jan 5, 2016

Sorry getting to you now--not a good health day.

Everything looks good... If you're up for it, I think:

  1. There may still be some error causes not made explicit in docs (search for throw and see which functions are included).
  2. It would be nice if, in addition or instead of pure filtering for resolveRefs, one could also have a callback which returns a string and lets a person make an invalid reference to become valid (e.g., for variable substitution).

(Sorry, I'm not submitting PR's for these, but dealing with chronic fatigue.)

@whitlockjc
Copy link
Owner

Sorry to hear about your health issues, I hope you're feeling better. As for not submitting PRs, no sweat at all. Here are my responses:

  1. I'll double check and update whatever I run into
  2. Do you have a use case for this already? The reason I ask is it seems very specific to supporting invalid JSON References and we already provide facilities to identify, with details, what JSON References are invalid. I actually think doing this outside of json-refs would be very simple as well since you could use #findRefs(obj, {includeInvalid: true}), fix the references accordingly and then use #resolveRefs(newObj).

Unless you have a compelling reason why 2 should be in json-refs, I'm inclined to pass for now since it's fairly simple to do outside of json-refs.

@whitlockjc
Copy link
Owner

I've updated the jsdoc to be more detailed in the @throws tag where appropriate. I also added a few examples to #isPtr and #isRef to better explain things, as well as some #findRefs examples as well.

If I don't hear from you in a few hours, I will likely cut 2.0.0 and we can address the things you come up with in newer releases. I think 2.0.0 looks pretty solid and unless there is something glaringly wrong with the new APIs, we should be able to release and iterate on features/fixes without a major release.

@brettz9
Copy link
Contributor Author

brettz9 commented Jan 5, 2016

My use case is primarily for i18n. Usually locale files are stored separately according to language code, so if you want to include the needed locale file by JR, you need to substitute a variable within the JR with the current (dynamically detected) locale. (I then have JRs elsewhere in the main JSON document to target specific translated strings from within the JR'd locale file.)

While I think dynamic variable substitution ought to be a fairly general problem (whether i18n or something else) and separate calls would add some overhead in needing to traverse twice, there is admittedly no formal JR-spec'd way to do this, and as you've already made the use case possible through other means, it's your call I think on whether this use case sounds general purpose enough.

@whitlockjc
Copy link
Owner

I was just about to tell you that I didn't think this should be in this library but I can see how without something, it would be painful to use #resolveRefs without it. It's like you need a pre-processor for documents but since this is for JSON Pointers/References, the best I'd be able to do would be some way to process the metadata found via #findRefs so that you could do this on the fly.

Actually, as I type that you could use #findRefs, change its metadata to work the way you want and then use options.loaderOptions.processContent to do the same for remote references. I'm sure this isn't ideal but it could be a workaround until we figure out a good way to do this.

Do you think this feature should halt 2.0.0?

@brettz9
Copy link
Contributor Author

brettz9 commented Jan 5, 2016

No, please don't let it hold you back. Thank you for making the functionality possible, btw, and thanks also for being very responsive on all of the feedback/ideas!

@whitlockjc
Copy link
Owner

No sweat. I'll create a new issue with this request in it and tag you so you can fill in the gaps in my explanation if need be.

@brettz9
Copy link
Contributor Author

brettz9 commented Jan 5, 2016

That'd be great... (Note: mentioned in #54 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants