Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update value at path #153

Closed
jonestristand opened this issue May 13, 2022 · 20 comments
Closed

Update value at path #153

jonestristand opened this issue May 13, 2022 · 20 comments
Assignees
Labels
feature New feature or enhancement request

Comments

@jonestristand
Copy link
Contributor

Is your feature request related to a problem? Please describe.
It would be very useful to be able to insert or modify a value at a particular path, essentially the inverse of the at_path(...).value_or(...) method chain.

Describe the solution you'd like
Something like tbl.at_path(...).insert_or_update(...). At the moment it seems like I would need to parse the path string myself to find the correct parent table to then call insert_or_assign, which is unfortunate since the path-parsing code already exists in the library. Or is there a canonical way of doing this already that I'm missing?

@jonestristand jonestristand added the feature New feature or enhancement request label May 13, 2022
@marzer
Copy link
Owner

marzer commented May 14, 2022

At the moment it seems like I would need to parse the path string myself ... or is there a canonical way of doing this already that I'm missing?

Nope, you're not missing anything. The nodes + node_views don't have any awareness of parents so the context is lost once at_path() returns. There isn't really an elegant way of doing this in your own code without essentially re-implementing at_path().

The lack of parent state is also why tbl.at_path(...).insert_or_update(...) can't really work- the returned node_view would have to know how it got there to modify above itself on the hierarchy.

I'd really like to avoid adding parent state tracking to nodes if I can avoid it (it'd be an ABI break), but I'm also not sure what else to suggest. I'm torn between a few different solutions:

  • Adding an outparam to at_path to recieve last node found during path traversal, regardless of whether or not the full path could be resolved;
  • A callback-based path traversal API, like at_path() on steroids;
  • Hoisting the path parsing out into a separate toml::path type so you could do normal path-like operations as you might in any filesystem path API (e.g. get parent, get root, etc.)

Do any of those sound helpful?

@jonestristand
Copy link
Contributor Author

To me the last one seems like the most sensible way to do it. Would make it possible to do things like below, which seems relatively tidy...

By contrast, option 1 feels a bit hacky, and callbacks seem excessive for the problem.

Option 3 would allow something like:

template <typename T>
bool update_or_insert(toml::table& tbl, const std::string& path, const T& value) {
  auto path = tbl.parse_path(path); // perhaps exception or return value to indicate if path not valid for this table?
  auto parentNode = tbl.at_path(path.parent());
  if (parentNode.is_table()) {
    parentNode.as_table().insert_or_assign(path.leafName(), value);
  }
}

Thoughts? Could also be non-breaking if at_path still accepted a string OR a toml::path object?

@marzer
Copy link
Owner

marzer commented May 15, 2022

To me the last one seems like the most sensible way to do it.

Yeah, I agree.

Could also be non-breaking if at_path still accepted a string OR a toml::path object?

Indeed. If I implemented it via the toml::path route I'd add new at_path() overloads, rather than replace the existing ones. It would also allow me to add node_view::operator[] overloads for toml::path, which is a nice consistency bonus.

I'll try to make time for this on some weekend soon-ish, but it might be a little while. Fairly busy of late. If you need a workaround in the short-term you'd get pretty far just by ripping the library's implementation of at_path() out into your own codebase and modifying it as necessary: get_at_path()

@jonestristand
Copy link
Contributor Author

I've forked you and I'll have a go at getting it started. I'd agree with overloads but suggest that the std::string version of at_path should use the same parsing code as toml::path to avoid duplicating code in two places.

@marzer
Copy link
Owner

marzer commented May 15, 2022

I'd agree with overloads but suggest that the std::string version of at_path should use the same parsing code as toml::path to avoid duplicating code in two places.

It is perhaps 'unclean' to have two implementations, but there are real benefits to doing it separately - if you parse a full path out into some string-based data type (which is what toml::path would likely end up being), and then do the resolve, you have to do the full parse and allocate the backing store for the string, regardless of whether or not the path actually points to something. The current at_path() doesn't allocate at all (meaning it is also noexcept, except in the std::wstring_view case) and stops parsing the path as soon as it hits a path fragment that doesn't map to any node. Merging the approaches would lose those benefits.

@jonestristand
Copy link
Contributor Author

That's a good point, particularly the short-circuiting of path parsing is nice. The only way to do that with a toml::path would be if it has awareness of the table that it belongs to which seems like a bad idea (you may want to pass them around between tables). I'll have a go at implementing it with a similar pattern to std::filesystem::path and submit a PR :-)

@marzer
Copy link
Owner

marzer commented May 16, 2022

I'll have a go at implementing it with a similar pattern to std::filesystem::path and submit a PR :-)

Sounds great! Check out CONTRIBUTING.md if you haven't already, it has most of what you need to get going. I'll also be available to answer any questions, too, of course.

@jonestristand
Copy link
Contributor Author

Will do! One thing not mentioned in there - do you prefer PRs to be a separate branch? Or just a main -> main PR?

@marzer
Copy link
Owner

marzer commented May 16, 2022

Eh, I don't really have a strong preference. People like to complain that main -> main is bad somehow, but if it's coming from a fork and I do a squash + merge, it all comes out the same colour ¯\_(ツ)_/¯

@marzer
Copy link
Owner

marzer commented May 16, 2022

Actually, now that I think about it a bit more, this has the potential to be a fairly large feature so maybe shouldn't go straight into main - I might want to spend a bit of time doing a few additional bits-and-pieces. I'll create a paths branch for you to target and we can merge it into that as a 'staging' branch for the feature.

@jonestristand
Copy link
Contributor Author

jonestristand commented May 17, 2022

Just an FYI, am getting a lot >=W4 warnings around enums missing specific cases despite there being defaults. I know your CONTRIBUTING.md indicates no warnings, but these seem pre-existing (and reasonable)?
edit: this is in MSVC2022
edit: I see now that these get explicitly disabled in impl/preprocessor.h, not sure why it's ignoring that
edit: completely ignore this post, I'm an idiot and missed a preprocessor command in my path.h. doh!

@marzer
Copy link
Owner

marzer commented May 17, 2022

edit: completely ignore this post, I'm an idiot and missed a preprocessor command in my path.h. doh!

Heh, no worries. I do realize that the header-only + automagically-preprocessed-into-a-single-file nature of the lib does mean there's quite a bit of header + preprocessor shenanigans, and you're likely to run into a bit of this. Some advice/clarifications you might need:

  • preprocessor.h is basically god
  • .inl files are just what would be .cpp if the library weren't header only
  • many standard library <includes> have their own "std_XXXXX.h" wrapper header (for toml.hpp generation reasons)

@jonestristand
Copy link
Contributor Author

Thanks! I had missed a TOML_DISABLE_WARNINGS macro - all works properly now! Thanks for the heads up on the std_XXX etc. stuff, that's a good catch.

Thoughts on preferred error handling if an invalid path is provided? It does look like you have some places that throw parse errors, but at_path returns a nullptr which doesn't really make sense in a class that just contains a parsed path. Alternatively could return an empty path object and an toml::path::empty static to compare against for check?

@marzer
Copy link
Owner

marzer commented May 17, 2022

Oh, almost forgot. The single-header generator script has it's own "magic comments" for indicating that code should not be included in toml.hpp:

  • single-line comments starting with //#
  • multi-line blocks starting with //# {{ and ending with //# }}

I haven't bothered documenting these anywhere because until now contributors have only really wanted to fix minor bugs or do tooling-related stuff, so nobody has needed to do anything structurally-significant enough for it to matter. That will likely be true for you too, but just thought you should know that those comments actually have a semantic meaning.

@marzer
Copy link
Owner

marzer commented May 17, 2022

Thoughts on preferred error handling if an invalid path is provided? It does look like you have some places that throw parse errors, but at_path returns a nullptr which doesn't really make sense in a class that just contains a parsed path.

At path doesn't return a nullptr, it returns a node_view that has an internal pointer of null, which means that it's "falsy" (i.e. operator bool() returns false). I realize that's pedantic, but an important distinction. I would expect any version of at_path() or operator[] that returns a 'false' node_view for std::string_view input to also do the same thing for a toml::path, since it's just a string with extra steps, e.g.:

toml::table tbl = toml::parse(/* ... */);
toml::path path = /* ... */;
auto view = tbl[path]; // or tbl.at_path(path)
if (!view)
{
	system("format c:");
}

Using a parse_error isn't right for this, methinks, since that class has references to source documents etc, and really we're just talking about something much lower level "did the data contain something at this address".

@marzer
Copy link
Owner

marzer commented May 17, 2022

Unless I've misunderstood and you actually literally mean a parse failure of the input path, rather than a failed lookup? In which case... can at_path() even fail to parse, strictly-speaking? It might interpret the path in a way the user doesn't expect if they do weird things with whitespace, but I don't think there's an early-exit due to some sort of parsing (only if a corresponding node doesn't exist).

EDIT: ok, there's one parse failure path, for array indexers that are missing an valid index integer or a closing bracket... well then. Bearing in mind that the library needs to remain usable for people who choose to disable exceptions, it may be that simply having a path be 'falsy' via operator bool() when it encounters invalid path syntax would suffice?

@jonestristand
Copy link
Contributor Author

jonestristand commented May 17, 2022

Was just going to reply and then saw your edit. I'm talking about the latter. Agree that at_path(toml::path&) should behave identically to the string version. The three failures to prase that I can find are:

  1. Not having a closing bracket on an array indexer: "test[1"
  2. Not having an array indexer: "test[]"
  3. Having something other than whitespace or a separator after the array indexer expression: "test[1]a.b" or "test[1] a.b"

I like the idea of returning an empty path and evaluating it as a falsy value. Alternatively could return the path as parsed up until the failure point, and set a flag to evaluate as falsy. Finally, I'll add a get_parse_err() method or similar. Cheers, thanks!

Do you have a discord for the library at all or anything?

edit: just saw your code sample system call - gave me a good chuckle!

@marzer
Copy link
Owner

marzer commented May 17, 2022

Alternatively could return the path as parsed up until the failure point, and set a flag to evaluate as falsy. Finally, I'll add a get_parse_err() method or similar.

My personal preference here is all-or-nothing:

  • An OK path has the full segmented string, and operator bool() returns true
  • A failed path contains literally nothing, and operator bool() returns false

Since the failure surface area is limited to just broken array indexers I don't think it needs to be much more complicated than that. Partial results tend to cause as many problems as they solve as inevitably people use them without checking error states and they 'sort of' work (until they don't).

If a fail reason is desired I suppose it could be delivered via exception, but that would need to be opt-in somehow in the same spirit of std::vector's at() versus operator[]. I'd consider that fairly small-fry at this stage though.

Do you have a discord for the library at all or anything?

As of 5 seconds ago there is a 'gitter' for this repo: https://gitter.im/marzer/tomlplusplus

@jonestristand jonestristand mentioned this issue May 25, 2022
10 tasks
@marzer
Copy link
Owner

marzer commented May 26, 2022

@wakely This might be relevant to your interests, given our discussion in #118. @jonestristand has a PR open to add this feature - #156. It won't add the full path reconstructibility you wanted, but it might help simplify/improve whatever workaround you've come up with in the meantime.

@marzer
Copy link
Owner

marzer commented Jul 31, 2022

paths has been merged into master in e2edd69 :)

(sorry it took me so long)

@marzer marzer closed this as completed Jul 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or enhancement request
Projects
None yet
Development

No branches or pull requests

2 participants