std.json: support field aliases #8987
- if a struct T passed to parse(T) contains a pub const __tags of the form shown below, these aliases will be used as the json keys instead of the field name.
- adds std.mem.{indexOfSliceScalar, indexOfSliceScalarPos} which support finding aliases. adding these was my workaround for a compiler bug encountered when using a for loop for the same purpose.

The names __tags and alias were chosen by me ad hoc and may easily be changed.
@SpexGuy suspects that this is failing due to significant comptime usage and may have to wait for stage 2.
I messed up my rebase in the previous PR. Forgive me: I closed it and reopened this one after mikdusan suggested that the real issue was needing a rebase. Maybe this will pass CI now...
Sorry about that. With self-hosted you should be able to comptime to your heart's content. I am confident we can ship it by zig 0.9.0.
I'm not sure this is the best way to accomplish this feature: I was thinking it would be better to provide the aliases per parse/stringify invocation, via an option that contains a map of overrides.
@andrewrk looking forward to 0.9 when comptime will go brrr 👍. Should I leave this PR open until everything else is resolved? Or close it and plan to re-open / make a new PR after 0.9?
I'd like to reserve the PR queue for "ready to be reviewed & merged". If something can't be merge-ready due to being blocked by other (not immediately solvable) issues, then it should be an issue, and then resurrected as a PR when the time comes.
Sounds good. I opened #8993 as a reminder to revisit after 0.9. Will close this once the other issues with this PR are resolved.
@daurnimator i made some changes. i'm not sure what needs to be done to support escapes. are you saying that the escaped version of |
I'm saying you need to make sure
i.e. if Note that to do this you'll want to change your implementation to do most of the work inside the Also note typo in your commit message: |
- introduces `ParseOptions.field_aliases` as a `[]const [2][]const u8`. usage becomes: `.{ .field_aliases = &.{.{ "value", "__value" }} }`
- buildFieldAliases() now returns a `StringHashMapUnmanaged([]const u8)` and no longer requires excessive comptime memory.
Force-pushed from 52377d5 to eb536da.
thank you 👍. so i added this test which i expected to fail, but it's passing:

test "field alias escapes" {
    const S = struct { foo: u8 };
    const text =
        "{\"__f\x6fo\": 42}";
    const s = try parse(S, &TokenStream.init(text), .{
        .field_aliases = &.{.{ "foo", "__foo" }},
        .allocator = std.testing.allocator,
    });
    try testing.expectEqual(@as(u8, 42), s.foo);
}

not sure what to make of this... 🤔 am i testing the wrong thing? i tried this, which also passed:

test "field alias escapes" {
    const S = struct { foo: u8 };
    const text =
        "{\"__foo\": 42}";
    const s = try parse(S, &TokenStream.init(text), .{
        .field_aliases = &.{.{ "foo", "__f\x6fo" }},
        .allocator = std.testing.allocator,
    });
    try testing.expectEqual(@as(u8, 42), s.foo);
}

can you think of a test case which i should include to make this fail?
@travisstaloch you have 2 layers of escaping: |
Ok, well I was initially using multiline escapes, but ran into this problem. Turns out std.json can't even parse this and is giving the same error i was seeing:

const std = @import("std");
test "field alias escapes" {
    const S = struct { foo: u8 };
    const text =
        \\{"f\x6fo": 42}
    ;
    const s = try std.json.parse(S, &std.json.TokenStream.init(text), .{});
    try std.testing.expectEqual(@as(u8, 42), s.foo);
}

gives error:

$ zig version
0.9.0-dev.21+7462b0e5b
$ zig test /tmp/test.zig
test "field alias escapes"... FAIL (InvalidEscapeCharacter)
~/Documents/Code/zig/zig/build/lib/zig/std/json.zig:753:21: 0x211714 in std.json.StreamingParser.transition (test)
return error.InvalidEscapeCharacter;
^
~/Documents/Code/zig/zig/build/lib/zig/std/json.zig:284:13: 0x20bc86 in std.json.StreamingParser.feed (test)
if (try p.transition(c, token1)) {
^
~/Documents/Code/zig/zig/build/lib/zig/std/json.zig:1122:13: 0x20abf0 in std.json.TokenStream.next (test)
try self.parser.feed(self.slice[self.i], &t1, &t2);
^
~/Documents/Code/zig/zig/build/lib/zig/std/json.zig:1628:26: 0x20b089 in std.json.parseInternal (test)
switch ((try tokens.next()) orelse return error.UnexpectedEndOfJson) {
^
~/Documents/Code/zig/zig/build/lib/zig/std/json.zig:1795:15: 0x20985e in std.json.parse (test)
const r = try parseInternal(T, token, tokens, options);
^
/tmp/test.zig:9:15: 0x207d31 in test "field alias escapes" (test)
const s = try std.json.parse(S, &std.json.TokenStream.init(text), .{});

is this an issue with std.json or an invalid input? maybe i'm missing something?
was just me being wrong :P json doesn't support |
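The "2 layers of escaping" in this exchange are Zig's string-literal escapes and JSON's own escapes. JSON (RFC 8259) defines `\uXXXX` and a handful of single-character escapes but no `\x` form, which is consistent with the InvalidEscapeCharacter error above. A minimal sketch of the Zig side of this, assuming nothing beyond `std.testing`:

```zig
const std = @import("std");

test "two layers of escaping" {
    // Layer 1: in a regular Zig string literal the compiler decodes \x6f,
    // so a JSON parser fed this literal only ever sees the plain bytes "foo".
    const regular = "f\x6fo";
    try std.testing.expectEqualStrings("foo", regular);

    // Layer 2: a multiline literal performs no escape processing, so the
    // backslash sequence reaches the JSON parser byte for byte.
    const raw =
        \\f\u006fo
    ;
    try std.testing.expectEqual(@as(usize, 8), raw.len);
    try std.testing.expectEqual(@as(u8, '\\'), raw[1]);
}
```

This is why the earlier tests using `"\x6f"` in a regular literal passed without exercising any JSON escape handling: the escape was already gone before parsing started.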
Ok good. So does this look like the test i'm trying to make pass? I want to be sure i'm testing the right thing.

test "field alias escapes" {
    const S = struct { foo: u8 };
    const text =
        \\{"__f\u006fo": 42}
    ;
    const s = try parse(S, &TokenStream.init(text), .{
        .field_aliases = &.{.{ .field = "foo", .alias =
            \\__f\u006fo
        }},
        .allocator = std.testing.allocator,
    });
    try testing.expectEqual(@as(u8, 42), s.foo);
}
no you want to set the alias to just

test "field alias escapes" {
    const S = struct { foo: u8 };
    const text =
        \\{"__f\u006fo": 42}
    ;
    const s = try parse(S, &TokenStream.init(text), .{
        .field_aliases = &.{.{ .field = "foo", .alias = "__foo" }},
        .allocator = std.testing.allocator,
    });
    try testing.expectEqual(@as(u8, 42), s.foo);
}
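
In the test above the alias is given in decoded form (`"__foo"`) while the JSON text spells the same key with a `\u` escape, so the match has to be made against the decoded key. A minimal sketch of such a comparison follows. `escapedKeyEquals` is a hypothetical helper written for illustration, not the PR's code and not a std.json function, and it assumes only `\uXXXX` escapes for ASCII code points need decoding.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std.json): reports whether `raw_key`, the
/// still-escaped bytes of a JSON object key, decodes to `alias`.
/// Assumption: only \uXXXX escapes for ASCII code points are decoded; any
/// other backslash sequence is compared literally and so will not match.
fn escapedKeyEquals(raw_key: []const u8, alias: []const u8) bool {
    var i: usize = 0; // read position in raw_key
    var j: usize = 0; // read position in alias
    while (i < raw_key.len) {
        if (j >= alias.len) return false; // raw_key decodes to something longer
        if (raw_key[i] == '\\' and i + 5 < raw_key.len and raw_key[i + 1] == 'u') {
            // Decode the four hex digits that follow "\u".
            const code = std.fmt.parseInt(u16, raw_key[i + 2 .. i + 6], 16) catch return false;
            if (code >= 0x80) return false; // outside the scope of this sketch
            if (@as(u16, alias[j]) != code) return false;
            i += 6;
        } else {
            if (raw_key[i] != alias[j]) return false;
            i += 1;
        }
        j += 1;
    }
    return j == alias.len;
}

test "escaped key matches decoded alias" {
    try std.testing.expect(escapedKeyEquals("__f\\u006fo", "__foo"));
    try std.testing.expect(escapedKeyEquals("__foo", "__foo"));
    try std.testing.expect(!escapedKeyEquals("__f\\u0070o", "__foo"));
}
```

A full implementation would also need the short escapes (`\n`, `\"`, and so on) and non-ASCII code points, which this sketch deliberately leaves out.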
- now using correct escape sequence and key
- attempt to reformat and add a few comments to `inline for (structInfo.fields) |field, i|` loop body

- now maps from field_name => alias_name
- this unifies and improves searching for aliases.
- this also fixes missing check for field.name equality.
- add test coverage: false positive matches
I don't think you need an allocator: you can bound the size of the field aliases by the number of fields in the type, which means it can be on the stack.
lib/std/json.zig
Outdated
@@ -1466,6 +1466,7 @@ pub const ParseOptions = struct {
    ignore_unknown_fields: bool = false,

    allow_trailing_data: bool = false,
    field_aliases: ?[]const struct { field: []const u8, alias: []const u8 } = null,
Doesn't need to be null-able: just use the empty list.
lib/std/json.zig
Outdated
@@ -1466,6 +1466,7 @@ pub const ParseOptions = struct {
    ignore_unknown_fields: bool = false,

    allow_trailing_data: bool = false,
    field_aliases: ?[]const struct { field: []const u8, alias: []const u8 } = null,
Add an empty line between fields here
Addressed all of these in latest commit.
the allocator is only needed for the |
this makes the tests pass. not sure if

var buf: [structInfo.fields.len * (1 << 9)]u8 = undefined;
var fba = std.heap.FixedBufferAllocator.init(&buf);
var field_alias_map = try buildFieldAliasMap(T, options, &fba.allocator);

continuous-integration/drone/pr and ziglang.zig ci jobs both passed before i got tired of waiting to see if the 3rd would pass. going to push my latest commit now.
Force-pushed from 516aae8 to be039ac.
- use heap.FixedBufferAllocator with size structInfo.fields.len * 512 rather than user allocator
- get rid of tests expecting error.AllocatorRequired
- add test "many field aliases"
- make FieldAlias struct pub and add some doc comments
Force-pushed from be039ac to 9d5222c.
lib/std/json.zig
Outdated
// build a map of aliases from field_name => alias_name
fn buildFieldAliasMap(comptime T: type, options: ParseOptions, allocator: *mem.Allocator) !std.StringHashMapUnmanaged([]const u8) {
    var result = std.StringHashMapUnmanaged([]const u8){};
    if (options.field_aliases.len == 0) return result;
no need for this line I think?
lib/std/json.zig
Outdated
@@ -1623,6 +1710,9 @@ fn parseInternal(comptime T: type, token: Token, tokens: *TokenStream, options:
            }
        }
    }
    var buf: [structInfo.fields.len * (1 << 9)]u8 = undefined;
I don't think the limit of 512 is appropriate; I was thinking more of iterating over the fields (with an inline for) and calculating the actual maximum possible size of the hash table (e.g. if there are 5 fields, then the hash table can't have more than 5 entries).
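
The bound described here (at most one alias per struct field) can be sketched without any hash table or allocator. The following is only an illustration under assumptions: `AliasTable`, `set`, and `keyFor` are hypothetical names and not the code that was pushed to this PR. It keeps one optional alias slot per field in an array sized at compile time from `std.meta.fields(T)`.

```zig
const std = @import("std");

/// Hypothetical helper: a per-type alias table with exactly one slot per
/// struct field, so its size is known at compile time and it lives on the stack.
fn AliasTable(comptime T: type) type {
    const fields = std.meta.fields(T);
    return struct {
        const Self = @This();

        // null means "no alias, use the field's own name".
        slots: [fields.len]?[]const u8 = [_]?[]const u8{null} ** fields.len,

        /// Record `alias` as the JSON key for the field named `field_name`.
        fn set(self: *Self, field_name: []const u8, alias: []const u8) error{UnknownField}!void {
            comptime var i = 0;
            inline while (i < fields.len) : (i += 1) {
                if (std.mem.eql(u8, fields[i].name, field_name)) {
                    self.slots[i] = alias;
                    return;
                }
            }
            return error.UnknownField;
        }

        /// The JSON key to look for when parsing the field at `index`.
        fn keyFor(self: Self, comptime index: usize) []const u8 {
            if (self.slots[index]) |alias| return alias;
            return fields[index].name;
        }
    };
}

test "alias table is bounded by the field count" {
    const S = struct { foo: u8, bar: u8 };
    var table = AliasTable(S){};
    try table.set("foo", "__foo");
    try std.testing.expectEqualStrings("__foo", table.keyFor(0));
    try std.testing.expectEqualStrings("bar", table.keyFor(1));
    try std.testing.expectError(error.UnknownField, table.set("baz", "__baz"));
}
```

Indexing by field position trades the generality of a hash map for a fixed footprint of one optional slice per field, which is the same idea as the bound suggested in the review comment.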
ok that's what i was thinking too initially. but i'm not sure how to do it. can you provide an example of how to initialize a hash table with only N entries?
i think i found a solution. will push a commit soon.
- allocates backing data on the stack
- much smaller memory requirement (around 40 bytes per field)
- removes FixedBufferAllocator
- adds error.TooManyFieldAliases check
Force-pushed from 78086e5 to 22049c7.
- 8 is ArrayHashMap's linear_scan_max. if there are more than 8 entries, an index is allocated and used. otherwise a get() does a linear scan and no allocations occur.
- use FixedBufferAllocator with capacity 9 * field_names.len bytes. 9 is the lowest multiple which doesn't OOM. i tested this with up to 2048 struct fields.
no idea why the freebsd build failed. i can't tell from the logs.
Timed out, but it seems like all the relevant tests were run.
is there anything left to do on this or is it just waiting in line? just wanted to ping @daurnimator as there has been no action in a couple of weeks. note that the most recent freebsd ci job seems to have run all relevant tests but just timed out for some reason. also know that it shouldn't be a problem for me to rebase, as happened previously.
Isn't the idea behind this to do the aliasing at compile-time instead of runtime?
is there anything left to do on this or is it just waiting in line?
* if a struct T passed to parse(T, ...) contains a pub const __tags of the form shown below, these aliases will be used as the json keys instead of the field name.
I don't see this in the code
In response to @andrewrk questions:
I'm not sure if this is currently possible. In this PR's first commit i had created the field aliases as a list of from_field, to_field pairs at comptime. However if memory serves, this caused an OOM error. The aliases are now stored as an ArrayHashMapUnmanaged which is created with stack memory. This was done in an attempt to avoid allocations.
The field_aliases were moved from a |
- this is a rework of ziglang#8987. rebasing that old pr proved difficult so it was easier to just start over.
- this patch has the same api and tests as ziglang#8987 but uses an enums.EnumMap rather than StringArrayHashMap. this allows `field_alias_map` to be made entirely on the stack and requires no allocator (not even a FixedBufferAllocator which the previous used for re-indexing its map).
Closing in favor of #10193
Inspired by #1099 #1099 (comment)