Enhancement: Improve chkentry(1) to check the JSON semantics of a winning IOCCC entry #940

Open
2 of 3 tasks
lcn2 opened this issue Aug 26, 2024 · 56 comments
Labels: background priority, enhancement, post-IOCCC28

Comments


lcn2 commented Aug 26, 2024

Is there an existing issue for this?

  • I have searched for existing issues and did not find anything like this

Describe the feature

Currently chkentry(1) validates the .info.json and .auth.json files.

As of commit 7a3907c, chkentry(1) throws a usage error of the following form when given a single argument:

single argument mode is reserved for future use

We propose that, when given a directory of the form YYYY/dir (where YYYY is an IOCCC year and dir is the name of a winning IOCCC entry found under YYYY), chkentry(1) should validate the semantics of the entry's .entry.json and related author/author_handle.json files. This will also include verifying that the files referenced in the .entry.json exist, etc.

The faq.md of the other repo and the FAQ.md of this repo will need to be updated accordingly.

Relevant images, screenshots or other files

Read at least 42 words of humor from the closed issue #2614 as found in the temp-test-ioccc repo for some fun.

Relevant links

From the closed issue #2614 as found in the temp-test-ioccc repo we recommend:

These comments have been duplicated below.

Anything else?

Stay tuned as we transfer important comments from the closed issue #2614 as found in temp-test-ioccc repo below.

Additional TODO

  • Move open_json_dir_file() in chkentry.c into jparse/util.c

  • Write a jprint(1) tool and add it to the jparse repo

See GH-issuecomment-2316164972 for information on the jprint(1) tool and why it is needed for this issue.

@lcn2 lcn2 added the enhancement New feature or request label Aug 26, 2024

lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2267089148:

It seems like the command line parsing should instead have a single arg (directory) still but it should call a stub function for .entry.json and author_handle.json files. I could then update the FAQ to say how to validate a .info.json and/or a .auth.json file.

That makes it easier. What do you wish me to do?

We don't want to imply that if .info.json and/or a .auth.json files exist, that chkentry(1) will be processing them. When a submission is transitioning into an IOCCC winner, it will have .info.json, and .auth.json, as well as possibly a .entry.json being formed (i.e., exists but is incomplete, or may be empty or may be missing).

We suggested a change in behavior on the chkentry(1) command line.

To process an entry's .entry.json file (and its associated author/author_handle.json file(s)):

./chkentry YYYY/dir

For example:

./chkentry 2020/ferguson1

To check just .info.json:

./chkentry some/path/.info.json .

To check just .auth.json:

./chkentry . some/path/.auth.json

To check .info.json, and .auth.json:

./chkentry some/path/.info.json some/path/.auth.json

To have "undocumented fun" 🤩:

./chkentry . .

Remember that a submission directory (with .info.json, and .auth.json files) can be anywhere including NOT under an IOCCC winner repo directory tree because submissions will be judged elsewhere (perhaps far away from any repo tree).

Remember that a submission directory could have .info.json, and .auth.json files along with a .entry.json file, and that the .entry.json file may not be valid JSON because the .entry.json file may be in the process of being formed.

The chkentry(1) tool CANNOT assume that the existence of a .entry.json file means that it is an IOCCC entry within the winner repo with associated author/author_handle.json file(s) (see previous line).

The chkentry(1) tool CANNOT assume that the existence of .info.json, and .auth.json files means that the directory is a submission directory because the .entry.json file may be in the process of being formed.

If the chkentry(1) tool is directed via the command line to validate .info.json and/or a .auth.json files, it cannot assume anything about the paths of those JSON files.

The chkentry(1) tool CANNOT assume the filename form of the .info.json or .auth.json files, as they could be called curds and whey and even be in different directories:

./chkentry /tmp/curds /var/tmp/whey

If the chkentry(1) tool is directed via the command line to validate a .entry.json file, it can assume that the directory path is of the form YYYY/dir AND that the top level author directory exists, AND that the JSON file is called .entry.json and is located in YYYY/dir/.entry.json.

There will be a tool (such as bin/all-run.sh in that other repo) that will want to call the chkentry(1) tool with just a YYYY/dir directory argument. That tool will expect the chkentry(1) tool to process the YYYY/dir/.entry.json file and the associated author/author_handle.json file(s), AND it will call the chkentry(1) tool from the top of the repo directory tree.

The above line is one reason why we suggest that, by default, the chkentry(1) tool print nothing and exit 0 when the JSON file(s) are OK (valid JSON and semantically valid for their context). Error and warning messages, of course, are welcome when problems are found.

So this is why we suggest these command line forms:

./chkentry some/path/.info.json .
./chkentry . some/path/.auth.json
./chkentry some/path/.info.json some/path/.auth.json
./chkentry /tmp/curds /var/tmp/whey
./chkentry YYYY/dir

This is why we are suggesting that a "one arg" chkentry(1) command line imply that the "one arg" is a YYYY/dir directory within a repo, with a YYYY/dir/.entry.json file, and with the associated author/author_handle.json file processing (no .info.json and/or .auth.json processing).

This is why we are suggesting that a "two arg" chkentry(1) command line imply .info.json and/or a .auth.json processing (no .entry.json processing).
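The one arg versus two arg split described above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the `enum chk_mode` values and the function name `classify_args` are hypothetical and are not taken from chkentry.c, and the treatment of `.` as "skip this file" is inferred from the example command lines in this comment. The sketch does not verify that a single arg really has the YYYY/dir form.

```c
#include <string.h>

/* hypothetical modes for illustration, not taken from chkentry.c */
enum chk_mode {
    MODE_USAGE_ERROR,   /* wrong number of args */
    MODE_ENTRY,         /* one arg: YYYY/dir, check .entry.json and author files */
    MODE_INFO_ONLY,     /* two args: info-path . */
    MODE_AUTH_ONLY,     /* two args: . auth-path */
    MODE_INFO_AND_AUTH, /* two args: info-path auth-path */
    MODE_NOTHING        /* two args: . . */
};

/*
 * classify_args - decide what to check from the non-option args
 *
 * nargs is the number of args left after option parsing,
 * args points to the first of those args.
 */
enum chk_mode
classify_args(int nargs, char **args)
{
    int have_info;      /* non-zero ==> first arg names an info file */
    int have_auth;      /* non-zero ==> second arg names an auth file */

    if (args == NULL) {
        return MODE_USAGE_ERROR;
    }
    if (nargs == 1 && args[0] != NULL) {
        return MODE_ENTRY;
    }
    if (nargs != 2 || args[0] == NULL || args[1] == NULL) {
        return MODE_USAGE_ERROR;
    }

    /* in two arg mode, "." means: do not check this file */
    have_info = (strcmp(args[0], ".") != 0);
    have_auth = (strcmp(args[1], ".") != 0);
    if (have_info && have_auth) {
        return MODE_INFO_AND_AUTH;
    }
    if (have_info) {
        return MODE_INFO_ONLY;
    }
    if (have_auth) {
        return MODE_AUTH_ONLY;
    }
    return MODE_NOTHING;
}
```

Note that the two arg mode never looks at the file names themselves, which matches the point above that the files could be called curds and whey.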


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2267092165:

I believe that I have that covered at least as far as parsing the command line.

It also even checks for the author directory and if the .entry.json file can be opened. That one is even parsed, though there is no walking through the tree yet. That's why author/ is only checked as a directory: since the other file has to be analysed before it can come up with the correct filenames.

You are probably correct.

The change to chkentry(1) tool behavior is in the one arg form. Now:

./chkentry arg

will require an arg of the form YYYY/dir and will imply YYYY/dir/.entry.json file processing, along with the associated author/author_handle.json file processing, and with NO .info.json NOR .auth.json processing.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2267616067:

Question

How did you generate the soup/chk.* files? I guess this would be something I need to do for this change. I know there's more to it but it might be helpful if I have an idea how you came up with it. Or if you have an idea that might make it easier to update (as it is indeed a pain as it stands now - I remember having to make fixes before) then I'm definitely willing to read what you have to say.

The last time there was a change, we hand edited the patch. That's not good and so needs to change so that a known reference can be used as a model. Currently one can be generated based on a reference file, and that result is patched so that some things, such as the allowed count range min and max, are adjusted. See

https://github.com/ioccc-src/mkiocccentry/blob/master/soup/chk.auth.ptch.c

That might have been fine for development, but it is a bit of a pain should a format need to change. So a better way to specify the range count (instead of the current soup/chk.* stuff) may be needed.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2267623067:

Hmm ... to help with determining the number of authors in .entry.json files would it be possible (like with the other json files) to have an author_count field in the .entry.json files?

One should be careful when one "over specifies" something: In this case adding a count of authors instead of just counting the authors.

Over specification of something allows for multiple ways for something to be wrong and adds to the number of checks one has to make. So adding or removing an author from the set of authors would be complicated by now also having to adjust the author count number. And then the length of the list of authors now has to be compared with the author count. Adding a count of authors doesn't provide any missing information and creates opportunities for something to be wrong.

Yes, when forming .auth.json we had author count and even author numbers. But that was a very different situation. The mkiocccentry(1) tool has to ask the user for the number of authors and then count them to be sure all the authors have been provided by the user and then verify the author set provided by the user is complete.

Here, with .entry.json we have no such "what did the user tell us" problem. The list of authors is the list. In fact we assume that the contents of the .entry.json file is "authoritative".

It is a similar reason why the manifest in the .entry.json file does not ask for the file size of a given file. We assume that a file such as prog.c is the correct size. Adding a "file size" property to the manifest would complicate the JSON and create more opportunities for something to be wrong.

But you might say: "how do we know if the list of authors is incomplete?" The answer is we let the humans decide. If a winner were posted with an author missing or incorrectly added to an entry, we would count on the humans noticing the problem and correcting it with a pull request. Adding an author count to the JSON wouldn't help detect a missing or extra author to an entry because the overall author count could also be wrong.

It is the same way we don't detect if a prog.c file is the wrong size. We let the humans discover that fact and fix missing or extra code. Adding a file length to the manifest doesn't help detect if a prog.c file is the wrong size because such a "file size" value could also be wrong. And worse still, adding such a "file size" value just creates more opportunities for errors.

So in general, over specifying information is not helpful. It creates opportunities for redundant information to disagree with itself, requires even more consistency checks, complicates the overall JSON structure, and doesn't really help detect when content such as an author is missing or extra.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2267639223:

The checking of .info.json checks the manifest but in that file it has just the id name and the filename. It checks that required files are there, that there are no duplicates of any files (required or not) and that no required files are identified as something else (I think that was done - been a while and haven't checked).

If the manifest of a .entry.json is missing a required file (such as Makefile, README.md, index.html, etc.) then that needs to be flagged as an error.

While a weeeeeee tiny bit near the edge of semantic checking, if the actual file is missing in the directory, or if the directory has a file not listed in the manifest, that should be flagged as an error too.
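A two-way comparison of this kind could be sketched in C as follows. This is a minimal sketch, assuming that the manifest file names and the directory file names have already been collected into two arrays of strings. The function names `in_list` and `manifest_mismatches` are hypothetical and do not come from chkentry.c or jparse.

```c
#include <stddef.h>
#include <string.h>

/* return 1 if name is found among the n strings in list, else 0 */
static int
in_list(const char *name, const char *const *list, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        if (strcmp(name, list[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * manifest_mismatches - count disagreements between manifest and directory
 *
 * Counts files listed in the manifest that are not in the directory,
 * plus files in the directory that are not listed in the manifest.
 * A return of 0 means that the two sets of names agree.
 */
size_t
manifest_mismatches(const char *const *manifest, size_t n_manifest,
                    const char *const *dir_files, size_t n_dir)
{
    size_t bad = 0;     /* number of mismatches found */
    size_t i;

    /* manifest entries with no actual file */
    for (i = 0; i < n_manifest; ++i) {
        if (!in_list(manifest[i], dir_files, n_dir)) {
            ++bad;
        }
    }
    /* actual files that the manifest does not mention */
    for (i = 0; i < n_dir; ++i) {
        if (!in_list(dir_files[i], manifest, n_manifest)) {
            ++bad;
        }
    }
    return bad;
}
```

A real check would presumably report each offending name on stderr rather than just counting, and the quadratic search is only reasonable because an entry directory holds a small number of files.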


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2267663149:

While a weeeeeee tiny bit near the edge of semantic checking, if the actual file is missing in the directory, or if the directory has a file not listed in the manifest, that should be flagged as an error too.
Interesting thought, that.

The need for this can be argued this way: the semantics of a field manifest depends, in part, on matching the files under the YYYY/dir directory.

Assume that make clobber has already been done. If someone forgets to make clobber first, the resulting errors will be obvious, and in the "grand scheme of the Makefile rules" assume make clobber was already done before bin/all-run executes chkentry YYYY/dir for all entries.

FYI: The tool that the judges use to help convert a submission into an entry will use chkentry(1) in the "one argument" (i.e., YYYY/dir) mode to help verify that the winning entry being formed is "sane and semantically valid". If the judges forget to remove either .info.json and/or .auth.json, the file manifest checks will differ from the list of files under the YYYY/dir directory.

It probably wouldn't hurt for chkentry(1), when it is attempting to verify that mandatory files (such as Makefile, README.md, index.html, .path, .entry.json, etc.) exist, to also verify that certain submission files no longer exist. That is, the following files should NOT be found in an entry, neither in the .entry.json manifest nor in the directory:

  • .info.json
  • .auth.json
  • remarks.md

This will catch the case where the judges, when converting a submission into a winning entry, forget to remove those 3 files.
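The "must no longer exist" check could be sketched in C as a small lookup. This is only a sketch: the table name `forbidden_in_entry` and the function name `is_forbidden_in_entry` are hypothetical, and the list holds just the 3 files named above. The same function could be applied both to each manifest file name and to each file name found in the YYYY/dir directory.

```c
#include <stddef.h>
#include <string.h>

/* submission-only files that should not survive in a winning entry */
static const char *const forbidden_in_entry[] = {
    ".info.json",
    ".auth.json",
    "remarks.md",
    NULL                /* end of list marker */
};

/*
 * is_forbidden_in_entry - check a file name against the forbidden list
 *
 * name is a file name without any directory part.
 * Returns 1 if the name must not appear in an entry, else 0.
 */
int
is_forbidden_in_entry(const char *name)
{
    size_t i;

    if (name == NULL) {
        return 0;
    }
    for (i = 0; forbidden_in_entry[i] != NULL; ++i) {
        if (strcmp(name, forbidden_in_entry[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```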


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2269551230:

As far as the patch files what about what I wrote in GH-issuecomment-2267622035?

By that, are you asking to have the code, with its many comments, explained in GitHub comments? While we don't think you mean that, we just want to be sure you don't, as the effort needed to re-explain code comments and algorithms in GitHub comments would be larger than the effort to just write the tool.

In terms of requirements: the single argument chkentry(1) tool needs to determine if the information found in an IOCCC entry's .entry.json file and its related author/author_handle.json file(s) is consistent, complete and reflects the contents of the IOCCC entry.

If you walk thru a file such as 1984/mullender/.entry.json you will see a number of JSON members. The code needs to determine if each of those JSON members is present, has the proper JSON member value type and, in some cases, has a reasonable value. The year is an integer value >= 1984 and <= say 9999. The entry_id is a JSON string whose value is consistent with the YYYY/dir path (i.e., is YYYY_dir). An entry_text is a JSON string whose length is within a specified range (i.e., longer than 0 and shorter than some specified limit). Etc.
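Two of the value checks just mentioned could be sketched in C as follows. This is a minimal sketch, assuming that the year has already been extracted as an integer and the entry_id as a string. The names `MIN_YEAR`, `MAX_YEAR`, `year_in_range` and `entry_id_matches` are hypothetical, and the upper bound simply uses the "say 9999" figure from the text.

```c
#include <stddef.h>
#include <string.h>

#define MIN_YEAR 1984   /* earliest year accepted, per the text above */
#define MAX_YEAR 9999   /* assumed upper bound ("say 9999") */

/* return 1 if year is within the allowed range, else 0 */
int
year_in_range(int year)
{
    return year >= MIN_YEAR && year <= MAX_YEAR;
}

/*
 * entry_id_matches - check that entry_id is YYYY_dir for a path of YYYY/dir
 *
 * Returns 1 if path has exactly one '/' with text on both sides
 * and entry_id is the same text with the '/' replaced by '_', else 0.
 */
int
entry_id_matches(const char *path, const char *entry_id)
{
    const char *slash;  /* the '/' within path */
    size_t year_len;    /* length of the part before the '/' */

    if (path == NULL || entry_id == NULL) {
        return 0;
    }
    slash = strchr(path, '/');
    if (slash == NULL || slash == path || slash[1] == '\0') {
        return 0;       /* not of the form YYYY/dir */
    }
    if (strchr(slash + 1, '/') != NULL) {
        return 0;       /* more than one '/' */
    }
    year_len = (size_t)(slash - path);
    if (strlen(entry_id) != strlen(path)) {
        return 0;
    }
    if (strncmp(entry_id, path, year_len) != 0 || entry_id[year_len] != '_') {
        return 0;
    }
    return strcmp(entry_id + year_len + 1, slash + 1) == 0;
}
```

With these, `entry_id_matches("1984/mullender", "1984_mullender")` returns 1, while a .entry.json file copied into the wrong entry directory would fail the same test.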

We suggest you approach this from a testing perspective. How might someone try to mess up a .entry.json file? Does it contain extra stuff that is unexpected? Does it have something omitted? Is a given JSON member value the wrong type (for example, an integer when it should be a string, a null when it should be an integer, etc.)? Does a given JSON member that does have a JSON value of the correct type have an "insane" value (for example, an empty string, a string that is too long, a negative integer when a positive integer is needed, a string that contains bogus characters when it should be a string value of a well defined type, etc.)? An .entry.json file should describe a given IOCCC entry, but does it describe it correctly (for example, the year doesn't match, the file manifest does not match, the authorship doesn't have a related author/author_handle.json file, etc.)?

We won't go into every JSON member element in GitHub comments. You will have to figure that out yourself. In some cases you may even need to add a #define SOMETHING_MAXLEN if there isn't one that you needed.

BTW: The single argument chkentry(1) tool should start out by determining that the IOCCC entry's .entry.json file and its related author/author_handle.json file(s) are valid JSON. If they are not valid JSON then throw an error and exit. However if they are valid JSON, you need to ask if the files are reasonable and don't contain mistakes, extra unwanted stuff, omissions, bogus values, etc.

Think of ways to mess up a .entry.json file with respect to the entry directory it is in, and be sure that the single argument chkentry(1) tool can detect that. Think about a given .entry.json file that is put into the wrong IOCCC entry and how to detect that. Think about someone who edits a given .entry.json file and makes mistakes that, while valid JSON, don't properly describe the IOCCC entry, or contain values that are out of range, or JSON member values that are the wrong type, and detect that.

Some high level requirements

When the single argument chkentry(1) tool is given a well formed IOCCC entry and its related author/author_handle.json file(s), does it pass (i.e., exit 0)?

When a mistake is made in the contents of the .entry.json file or its related author/author_handle.json file(s), even though they may be valid JSON files, are those mistakes detected?

By "mistakes made" we include things such as the wrong type of JSON value, or a JSON value that is the correct type but whose value is not proper, out of range, unreasonable, or is inconsistent with the IOCCC entry it is within, etc.

When someone modifies an IOCCC entry but fails to update the contents of the .entry.json file or its related author/author_handle.json file(s), is that detected?

And by detected we mean warnings or errors are issued on stderr and the tool does not exit with a 0 exit code.

UPDATE 0a

Think of ways that a .entry.json file (within a given YYYY/dir) and/or its related author/author_handle.json file(s) might be messed up (unintentional or deliberate), or ways that an IOCCC entry might be changed (again, unintentional or deliberate) without making the proper changes to the .entry.json file and its related author/author_handle.json file(s). Detect those situations, issue warnings and errors to stderr as needed, and exit non-zero.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2269571888:

Even so it'd be nice if we did not have to worry about patching OR if we could come up with a way to not require modifying it manually AND/OR a way (perhaps the most ideal?) to generate it more easily (for we do generate it but it might be nice if a new table could be generated more easily).

The patching system was, we admit, a bit of a "bad hack" that might be showing its limits when the code has to be extended to check .entry.json and related author/author_handle.json file(s). If you want to toss out that patching system and come up with a better way, PLEASE DO.

UPDATE 0

If the result is that the requirements in GH-issuecomment-2269551230 are met, AND the tool is able to still do sanity checks for the submission .info.json and .auth.json files (in the 2 arg mode), and you toss out the whole patch system for something better, that is fine.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2272414067:

Hmm .. would having an identifier for the different tables be something to consider instead of having to patch files?

Hmm .. sure? But perhaps a higher level question needs to be considered instead.

While one can generate a table for a given JSON file using jparse/jsemtblgen:

jsemtblgen 2020/endoh1/.entry.json

other semantically valid .entry.json files may have different counts, such as:

jsemtblgen 1984/anonymous/.entry.json

How can you find the range of something? We hacked together what we thought of as a "bare minimum" JSON file for some types of JSON files. For example, for the .info.json case:

jsemtblgen test_ioccc/test_JSON/info.json/good/info.min-manifest.json

With the assumption that other semantically valid .info.json files would have counts that are >= the count for the "good minimum case".

QUESTION

Are jsemtblgen generated tables the wrong approach?

Perhaps jsemtblgen generated tables is the WRONG approach.

What is needed is a way to detect that the required JSON items are present, that they have the required types, and that one is able to determine if the number of a given JSON item is OK or not.

Take the .info.json case. Let us look at a few examples of requirements:

We MUST have 1 and only 1 "empty_override" JSON member name AND that the JSON member value MUST be a boolean (true or false), AND that the "empty_override" JSON member name/value pair be under the root of the JSON file.

We MUST have 1 and only 1 "formed_timestamp" JSON member name AND that the JSON member value MUST be an integer, AND that the "formed_timestamp" JSON member name/value pair be under the root of the JSON file.

We must have 1 and only 1 "c_src" JSON member name, AND that the JSON member value MUST be a string, AND that "c_src" JSON member MUST be under a JSON array, AND that JSON array be a JSON member value associated with the JSON member name "manifest", AND that the JSON member name/value pair be under the root of the JSON file.

We may have 0 or more "extra_file" JSON member names, AND that the JSON member value MUST be a string, AND that "extra_file" JSON member MUST be under a JSON array, AND that JSON array be a JSON member value associated with the JSON member name "manifest", AND that the JSON member name/value pair be under the root of the JSON file.

etc.

All of the above requirements are BEFORE one calls upon validation functions such as chk_empty_override, chk_formed_timestamp, chk_c_src, chk_extra_file, etc.
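One way to write down requirements of this kind directly, rather than deriving them from a generated and patched table, is a small hand-written rule table. The following C sketch is purely illustrative: `struct member_rule`, `enum val_type`, `info_rules`, `find_rule` and `count_ok` are hypothetical names and are NOT the `struct json_sem` from jparse/json_sem.h. It only captures the four example requirements listed above (name, value type, allowed count range, and the member the value must sit under).

```c
#include <stddef.h>
#include <string.h>

#define COUNT_UNLIMITED ((size_t)-1)    /* no upper bound on the count */

/* hypothetical value types, not the jparse JTYPE_* values */
enum val_type { VAL_BOOL, VAL_INT, VAL_STRING };

/* one requirement on a named JSON member */
struct member_rule {
    const char *name;       /* JSON member name */
    enum val_type type;     /* required type of the JSON member value */
    size_t min;             /* minimum number of occurrences */
    size_t max;             /* maximum number of occurrences */
    const char *parent;     /* name of enclosing member, NULL ==> root */
};

/* the example .info.json requirements from the text above */
static const struct member_rule info_rules[] = {
    { "empty_override",   VAL_BOOL,   1, 1,               NULL },
    { "formed_timestamp", VAL_INT,    1, 1,               NULL },
    { "c_src",            VAL_STRING, 1, 1,               "manifest" },
    { "extra_file",       VAL_STRING, 0, COUNT_UNLIMITED, "manifest" },
};

/* return the rule for name, or NULL if the name is not expected */
const struct member_rule *
find_rule(const char *name)
{
    size_t i;

    if (name == NULL) {
        return NULL;
    }
    for (i = 0; i < sizeof(info_rules) / sizeof(info_rules[0]); ++i) {
        if (strcmp(name, info_rules[i].name) == 0) {
            return &info_rules[i];
        }
    }
    return NULL;
}

/* return 1 if seeing name exactly seen times is allowed, else 0 */
int
count_ok(const char *name, size_t seen)
{
    const struct member_rule *rule = find_rule(name);

    if (rule == NULL) {
        return 0;       /* unexpected member name */
    }
    return seen >= rule->min && seen <= rule->max;
}
```

In such a scheme, a per-member validation function such as chk_c_src would only be called after the type, count and location rules had passed.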

Is there a BETTER way to verify what we call the semantics of a given type of JSON file?


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2272443630:

Problem

It is rather difficult to look for questions within a long comment, then guess if something with a "?" is a question that needs to be answered or is later on rendered not important (because perhaps you found the answer, or you decided it was not important, or you were just pondering something out loud), and then to try to determine the context for the question, and then try to reply to the question.

Suggestion

When you find something you need to ask, put a ## Question header followed by an explicit question.

Possible Answer to a previous question

We think in GH-issuecomment-2271322924 you asked a question you wish us to answer.

We will now try to quote the relevant text below:

Ah, here. But it's not a member with a name.

{ 5, JTYPE_STRING,   1,      104,    104,    1,      0,      NULL,   NULL },

is changed to:

  { 5,  JTYPE_STRING,   15,     104,    104,    1,      0,      NULL,   NULL },

which means the min is 15, not 1.

The min is minimum allowed count (I guess that AT LEAST that many are required?) but since it's not named I'm not sure what it indicates.

Do you have an example where there is a difference for a NAMED member? And what are the unnamed members if you have an example?

OK, now we will make a guess that NAMED member might relate, given GH-issuecomment-2272396477, to the struct json_sem in jparse/json_sem.h. There are 3 structs with an element of "name", so we are guessing here.

So now we will make a guess that "where there is a difference" might be asking about something where min had to be changed. But that question followed an example that you provided where the min was changed from 1 to 15, so perhaps the string:

Do you have an example where there is a difference for a NAMED member?

is not a question, since you seem to have provided an example above it.

Maybe the real question lies in the text that immediately follows:

And what are the unnamed members if you have an example?

Here we will guess that (assuming our guess of name related to the struct json_sem in jparse/json_sem.h) "unnamed members" might refer to cases where ((struct json_sem *)p)->name is NULL.

Well the patch file soup/chk.info.ptch.c shows as the first difference:

--- ref/info.reference.json.c   2024-05-19 00:08:13
+++ chk_sem_info.c      2024-05-18 23:59:02
@@ -39,17 +39,17 @@

 struct json_sem sem_info[SEM_INFO_LEN+1] = {
 /* depth    type        min     max   count   index  name_len validate  name */
-  { 5, JTYPE_STRING,   1,      84,     84,     0,      0,      NULL,   NULL },
+  { 5, JTYPE_STRING,   10,     84,     84,     0,      0,      NULL,   NULL },

So there is a case where the min value had to change from 1 to 10 for some JSON string that was NOT a JSON member name.

That example is at level 5 of the JSON parse tree (where 0 is the root). Looking at a minimal example of a .info.json file we see that at level 5 of the JSON tree we find 5 JSON strings that are NOT JSON member names:

		{"info_JSON" : ".info.json"},
		{"auth_JSON" : ".auth.json"},
		{"c_src" : "prog.c"},
		{"Makefile" : "Makefile"},
		{"remarks" : "remarks.md"}

In particular the 5 JSON member value strings:

  1. ".info.json"
  2. ".auth.json"
  3. "prog.c"
  4. "Makefile"
  5. "remarks.md"

are those 5 JSON strings that are not JSON member names and, if you count the levels, are at the JSON parse tree depth level of 5.

QUESTION

Did we answer your question with the above text?

UPDATE 0

Perhaps the high level question raised in GH-issuecomment-2272414067 should be addressed first before trying to determine details about the current "jsemtblgen / patch" mechanism that, while it functioned for .info.json and .auth.json, might be extended to files such as .entry.json and related author/author_handle.json files.

There is a need for JSON semantic checking of the JSON parse tree for tools that link to jparse/jparse.a. The "jsemtblgen / patch" mechanism may not be a useful way to describe what JSON items should be found at various levels of a JSON parse tree.

Perhaps a better and different approach is needed altogether.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2273674353:

What do you think about all the above?

We think that the "## Problem" section of the GH-issuecomment-2272443630 is still a bit of a problem.

Given the "## Suggestion" section of the GH-issuecomment-2272443630 had little effect, we speculate that the above line was the only sentence that was timely in need of answering. 😄 😄 😄

Answer

We will write a new comment with a new suggestion based on the question we raised in GH-issuecomment-2272414067. When we complete our next comment we speculate that we will have provided better content, to move us forward, than if we had responded to all of the direct and implied questions from GH-issuecomment-2273496974.

However for that next comment we are promising above, you will have to wait a bit while we ponder the answer to our question we raised in GH-issuecomment-2272414067.

UPDATE 0

As we are in a hurry to complete bin/cvt-submission.sh so that we can return to working on the submit server, and as this issue is not on the critical path, we will ponder our idea and return later to post the answer to our question.

If we forget to do so in a few days, please remind us, @xexyl. 🤓


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2282227580:

We have a concept for testing JSON semantics by use of JSON.

We are thinking of how to illustrate our idea with a simple tool in the near future, for certain values of near.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2293965292:

In regards to GH-issuecomment-2282227580:

The high level idea is to add a command line option to jparse(1), or more likely to create some new tool, that takes valid JSON input and outputs JSON of an equivalent form but where every JSON value (JSON string, JSON number, JSON boolean, JSON null) is replaced by a JSON string that we will call a "JSON semantic string". The resulting JSON output from such a tool (or command line option to jparse(1)) would be valid JSON, AND the JSON parse tree structure would be identical to the JSON parse tree of the original JSON input, AND each JSON string would be a specially formatted "JSON semantic string" that would help describe the semantics of the original JSON input.

We are working out the details of the format of those "JSON semantic strings", so expect a later comment describing them.

The "JSON semantic string" output could be modified by a developer of a JSON semantic checking tool (such as chkentry(1)) to indicate that a given JSON value was optional, or that the JSON value could be repeated between some number of times (with a possible min and/or max repeat count). Such modifications could be performed by the developer by hand, or with a patch file. This would make it much easier to update JSON semantic checking code when the original format of the JSON needs to change.

In the generic case, semantic checking code (such as chkentry(1)) would JSON parse both the JSON file in question (such as, say, .entry.json) as well as a JSON semantic reference file (such as, say, sem.entry.json). The result in memory will be two JSON parse trees, one for the JSON in question (such as, say, .info.json) and one for the JSON semantic reference file (such as sem.info.json).

Then by "walking over" both JSON parse trees in memory in parallel, one could use the JSON semantic reference parse tree to examine the parse tree of the JSON input in question. We say "walking over" because when one encounters a "JSON semantic string" indicating that the corresponding JSON value is optional or that the corresponding JSON value could be repeated (with a possible min and/or max repeat count), one has to "walk around" the JSON parse tree of the JSON in question (such as the JSON parse tree of .auth.json where the number of authors may vary from 1 to 10) accordingly.

The "JSON semantic string" will also contain an optional reference to a "chk-function" that, when called, will perform various value specific checks (similar to functions found in soup/chk_validate.c). There will be a way for the developer of a JSON semantic checking tool (such as chkentry(1)) to convert a string containing the name of a function into an actual C value checking function (similar to functions found in soup/chk_validate.c).
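Since the format of the "JSON semantic string" is not yet specified, the following C sketch only illustrates the last step: turning a function name held in a string into a callable C function by way of a lookup table. Everything here is hypothetical, including the `chk_func` signature, the two toy checkers `chk_nonempty` and `chk_no_space`, and the `lookup_chk` function. The real functions in soup/chk_validate.c take different arguments and are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical checker signature: return 1 if value is OK, else 0 */
typedef int (*chk_func)(const char *value);

/* toy checker: the string must not be empty */
static int
chk_nonempty(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* toy checker: the string must not contain a space */
static int
chk_no_space(const char *value)
{
    return value != NULL && strchr(value, ' ') == NULL;
}

/* map of function name strings to C functions */
static const struct {
    const char *name;   /* name as it would appear in a semantic string */
    chk_func func;      /* function to call for that name */
} chk_table[] = {
    { "chk_nonempty", chk_nonempty },
    { "chk_no_space", chk_no_space },
    { NULL, NULL }      /* end of table marker */
};

/* return the checker registered under name, or NULL if there is none */
chk_func
lookup_chk(const char *name)
{
    size_t i;

    if (name == NULL) {
        return NULL;
    }
    for (i = 0; chk_table[i].name != NULL; ++i) {
        if (strcmp(name, chk_table[i].name) == 0) {
            return chk_table[i].func;
        }
    }
    return NULL;
}
```

A tool built this way would need to treat an unknown function name in the semantic reference file as an error in the reference file itself, not in the JSON being checked.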

The result of the above architecture will be a more generic way that you and others can check the semantics of any form of JSON.

FYI, More TBD

This is just an FYI for the curious and is not intended to be a complete technical specification.

Please wait as more details relating to the above are developed and modified and deleted and replaced. :-) Please also wait as the specification of the format of the "JSON semantic string" is being developed.

Such detailed info will be posted as comments below. Stay tuned over the next several days. And if you don't see more of the promised above content posted below soon enough (because we may have gotten so busy with work on matters related to GH-issuecomment-2293850933), please remind us after a few days.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2294938385:

QUESTION

Does chkentry(1) need these updates prior to the next IOCCC? I could see how it might be useful but I can also see how it might slow things down esp as there won't be many more entries to inspect: just the next winners. But of course I could also work on it during the judging process (for example). Either way I'll do it when necessary. I would admittedly like to get it done but running the next contest before this if possible would also be great.

The chkentry(1) updates can wait until after IOCCC28.


lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2295015680:

QUESTION

Does chkentry(1) need these updates prior to the next IOCCC? I could see how it might be useful but I can also see how it might slow things down esp as there won't be many more entries to inspect: just the next winners. But of course I could also work on it during the judging process (for example). Either way I'll do it when necessary. I would admittedly like to get it done but running the next contest before this if possible would also be great.

The chkentry(1) updates can wait until after IOCCC28.

That is wonderful.

Working on a python submit server, unfortunately, will not wait.

So does that mean that it does have to be updated first, then? I hope not but if necessary it's necessary.

The python submit server will need to be in an advanced enough state before the Great Fork Merge in order that screen shots may be added to FAQ(s) related to how to submit to the IOCCC.

The "register for the IOCCC" process will need to be in an advanced enough state before the Great Fork Merge in order that screen shots may be added to FAQ(s) related to how to register for the IOCCC.

Before IOCCC28 goes into "pending" state the BETA submit server will need to be up and running so that people may test using the mkiocccentry(1) tool by uploading compressed tarballs to the test submit server.

Before IOCCC28 goes into "pending" state the BETA "register for the IOCCC" process will need to be up and running so that people (including the IOCCC judges) may test the submit server workflow.

The "screen shots" before the Great Fork Merge AND these BETA registration & BETA submit server take the place of what was once called IOCCCMOCK.

@lcn2
Contributor Author

lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2310926005:

QUESTION

Even though this issue will not be resolved until after IOCCC28, it may be best to change chkentry(1) NOW to throw an error if given only one argument. This would clear the way for the future of this issue, when a single arg will mean to process an entry instead of a submission.
We think this (throw an error if given only one argument) needs to be done before the pending next release of the mkiocccentry repo.

I have a number of changes like it already starting to parse the other files (or at least find the files based on the arg). That being said I can make a copy and then edit it to reject one arg only. After that is merged I can then pull and replace it with what I already wrote.

I can do this tomorrow if you wish? If I don't get to it please remind me!

Performed on that other repo with commit 7a3907c.
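For the curious, the single argument rejection could be sketched roughly as follows. This is not the code from commit 7a3907c: the function name `check_arg_count()` is hypothetical, and the sketch assumes that the accepted form takes exactly two arguments after the options. Only the wording of the single argument message is taken from this issue.

```c
#include <stddef.h>

/*
 * Hypothetical helper: given the number of arguments left after option
 * parsing, return NULL if the count is acceptable, or a usage error message.
 *
 * Assumption: exactly two arguments is the accepted form for now.
 */
const char *check_arg_count(int nargs)
{
    if (nargs == 1)
        return "single argument mode is reserved for future use";
    if (nargs != 2)
        return "expected exactly two arguments"; /* hypothetical wording */
    return NULL;
}
```

Keeping the single argument case as a distinct error leaves room to replace that branch later with the code that processes a winning entry directory.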

@lcn2
Contributor Author

lcn2 commented Aug 26, 2024

Copied from GH-issuecomment-2311005389:

We have "cloned" this enhancement request into issue #940 in the mkiocccentry repo.

@xexyl
Contributor

xexyl commented Aug 27, 2024

Please assign.

@xexyl
Contributor

xexyl commented Aug 27, 2024

I just (and that's all I'm doing for now as we have to figure out the semantics - plus other more important things) made a slight improvement over the current design (or so I think).

It is now possible to specify the author/ directory and the .entry.json filename to search/open. I know it's unlikely that it'll ever be run from a directory that's not the website root directory but now it should be possible to do so if one should choose to do so.

I made an improvement to an error message and once that's merged I can pull and then add back the changes (from backup or else git stash) so I have the updates. This is in my fork copy and before the contest goes I'll of course install from the repo that is not my fork but this way I'll be set for after the contest.

I am going to look at the CSS issue over there and then I have other things to do.

Please do make a new release if you need to and again THANK YOU for holding off! I really appreciate it very much and I'm sorry for the delay.

@xexyl
Contributor

xexyl commented Aug 27, 2024

I have a suggestion. It's not necessary to do before the merge, maybe, but it would be good to do before the parser is moved to the jparse repo.

The function open_json_dir_file in chkentry.c seems like a good idea to have in jparse/util.c but maybe not json specific? It seems much more general than just for chkentry.
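To show what a non JSON specific version might look like, here is a minimal sketch. It is not the actual open_json_dir_file code: the name `open_file_in_dir()` is hypothetical, and the behavior (join the directory and file name, open read-only, return NULL on any failure) is an assumption made for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical generic helper: open file under directory dir for reading.
 *
 * If dir is NULL or empty, file is opened relative to the current directory.
 * Returns an open stream, or NULL on error.
 */
FILE *open_file_in_dir(const char *dir, const char *file)
{
    size_t len;
    char *path;
    FILE *fp;

    if (file == NULL || file[0] == '\0')
        return NULL;
    if (dir == NULL || dir[0] == '\0')
        return fopen(file, "r");

    /* room for dir, the '/' separator, file and the trailing NUL */
    len = strlen(dir) + 1 + strlen(file) + 1;
    path = malloc(len);
    if (path == NULL)
        return NULL;
    snprintf(path, len, "%s/%s", dir, file);

    fp = fopen(path, "r");
    free(path);
    return fp;
}
```

Nothing in the helper depends on the file being JSON, which is why it would fit in a general utility file. The caller is responsible for calling fclose(3) on the returned stream.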

What do you think?

@lcn2
Contributor Author

lcn2 commented Aug 27, 2024

I have a suggestion. It's not necessary to do before the merge, maybe, but it would be good to do before the parser is moved to the jparse repo.

The function open_json_dir_file in chkentry.c seems like a good idea to have in jparse/util.c but maybe not json specific? It seems much more general than just for chkentry.

What do you think?

That seems like a good idea.

@xexyl
Contributor

xexyl commented Aug 27, 2024

I have a suggestion. It's not necessary to do before the merge, maybe, but it would be good to do before the parser is moved to the jparse repo.
The function open_json_dir_file in chkentry.c seems like a good idea to have in jparse/util.c but maybe not json specific? It seems much more general than just for chkentry.
What do you think?

That seems like a good idea.

When would you like me to do that?

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

I have a suggestion. It's not necessary to do before the merge, maybe, but it would be good to do before the parser is moved to the jparse repo.
The function open_json_dir_file in chkentry.c seems like a good idea to have in jparse/util.c but maybe not json specific? It seems much more general than just for chkentry.
What do you think?

That seems like a good idea.

When would you like me to do that?

Yes, please. We added this as a TODO so that when you work on this issue, the function can be moved at that time.

@xexyl
Contributor

xexyl commented Aug 28, 2024

I have a suggestion. It's not necessary to do before the merge, maybe, but it would be good to do before the parser is moved to the jparse repo.

The function open_json_dir_file in chkentry.c seems like a good idea to have in jparse/util.c but maybe not json specific? It seems much more general than just for chkentry.

What do you think?

That seems like a good idea.

When would you like me to do that?

Yes, please. We added this as a TODO so that when you work on this issue, the function can be moved at that time.

Well I might even end up doing it today because there's an unfortunate extraneous word in two error messages that I could imagine someone wanting to try and fix which would trample over my changes.

Not at laptop yet but will be in a little while and likely will look at it then. I have to take a shower and do some other things. There will be interruptions here and there but I more than anything today want to tackle 2005.

Hope you and I both have a better day today (I didn't have that great of a day yesterday either: actually it was a poor day)!

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Well I might even end up doing it today because there's an unfortunate extraneous word in two error messages that I could imagine someone wanting to try and fix which would trample over my changes.

Would such a change warrant the making of a new release of the repo? If the answer is no, then we advise against such a change as those picking up the released code won't see it.

@xexyl
Contributor

xexyl commented Aug 28, 2024

Well I might even end up doing it today because there's an unfortunate extraneous word in two error messages that I could imagine someone wanting to try and fix which would trample over my changes.

Is that really needed?

And would such a change warrant a new release of a new version of the repo? If not, then we advise against such a change now.

Fair enough. Then what can we do about people maybe wanting to fix it? If they did it might cause a conflict with my changes.

But I will avoid making the changes for now.

When we have a better idea about semantics I will probably start thinking about the updates.

In a little bit I HOPE to work on 2005 but it probably will be a bit more time. I might take the opportunity to work on those other things that take less focus as I am still waking up.

Then after that I can look at 2005. In the meantime ..

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Fair enough. Then what can we do about people maybe wanting to fix it? If they did it might cause a conflict with my changes

The same thing as if someone, while testing the code, finds an execution flaw or portability issue that needs to be fixed. Such an emergency patch will need to be applied to any other code branch.

UPDATE 0

There's always a risk that code will need to be updated to fix a functional problem: especially a fix that is required for all contestants that forces the use of a new version number of a critical tool. Such a fix would force a new repo release and force everyone to recompile with the new code to produce the correct version in the JSON files.

On the other hand, fixing a typo in a man page, or comment, or even an error message wording (unless that resulted in a significantly misleading set of instructions), while annoying, would not be a change that is necessary for everyone to adopt and recompile.

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

But of course if I make this typo fix would it not be a good idea to move the function over? That might also be a reason for a new release. Or it could be it's merged but not worth a new release? That might also be an option?

Fair point on a 3rd 🥉 option: issue a PR and apply the PR (one that includes an update to CHANGES.md and the value of MKIOCCCENTRY_REPO_VERSION) now but don't issue a new repo release as the PR is not critical.

Would you prefer that?

If you don't want to make a new release: yes. Should I move that function though? It might be the right time to do it if I have to do this. I mean it's also not critical but it would save the time later. And since I don't have any updates to util.c it would be easy to manage too.

Sure.

@xexyl
Contributor

xexyl commented Aug 28, 2024

Will do that shortly (for certain definitions of shortly) and then take care of some other things. If I have time after that I will try working with the other repo.

@xexyl
Contributor

xexyl commented Aug 28, 2024

Done with 03f605c. You can thus mark that todo item complete if you wish.

Once the semantics details (and other things necessary) have been discussed more I can further enhance the tool, getting more insight into and familiarity with the dynamic array facility and some of the lower level json functions that you added.

I have to take care of other things but if I have time after that I will work on 2005. The good news is that if I do not have time it seems that the next thing I have to work on will be 2005 (though I have a couple other html files fixed in format) .. unless of course yet another thing here requires updates!

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Done with 03f605c. You can thus mark that todo item complete if you wish.

See the request to update the PR.

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

We need to describe our model for how to make a "parallel" JSON parse tree that contains specially formatted JSON strings that describe how to evaluate the semantics of a JSON tree.

It is a shame that we don't have a simple jprint(1) tool that parses a JSON file, and if valid, does a simple job of printing JSON in canonical format to standard output in the format identical to what bin/jprint-wrapper.sh does in that other repo.

If we had such a tool, then it would be a straightforward process to add a command line option (such as -S for semantic output) to cause jprint(1) to print JSON values in the form of these JSON semantic strings. The data needed to print JSON semantic strings is right there in the JSON parse tree structures, so it would be just a simple job of formatting and printing such JSON strings (instead of the JSON values).

Perhaps it is time for you to update your jparse repo with the current state of the jparse/ directory from this mkiocccentry repo?

If you did that, then we would populate our jparse.clone/ directory with the contents of your updated jparse repo.

HINT: This is how we plan to make it easier to integrate changes, when needed, from your updated jparse repo back into the mkiocccentry repo .. HINT

Then you could add a simple jprint(1) tool to your updated jparse repo. Such a simple jprint(1) tool could produce output equivalent to bin/jprint-wrapper.sh from that other repo.

HINT: It would be easy for us to install the jprint(1) tool from your updated jparse repo and modify bin/jprint-wrapper.sh to detect the presence of a jprint(1) tool and use that if found.

We could then fork your updated jparse repo with jprint added and issue to you a PR that would modify simple jprint -s to print JSON semantic strings for you to consider.

TL;DR

We recommend:

@xexyl
Contributor

xexyl commented Aug 28, 2024

The above sounds great! I need to get some food soon and then if I have to I will do this (I think I am about done with the other stuff).

I am tempted to do this first but I feel like it might take some time and work so I better eat first. If I don't get to it today I will do it tomorrow.

Or if you want you could issue a pull request.

Assuming that I have finished the other things and I don't have to leave for the day I might be able to work on this in say half an hour or so.

Perhaps after this I can also work out how to integrate the parser into the test suite we added JSON files from.

As for not adding code with functionality changes no worries.

@xexyl
Contributor

xexyl commented Aug 28, 2024

Of course the jprint tool might take a bit of time.

But on that note how does this (in priority) (the jprint tool I mean) compare to the html files in the other repo?

Thanks!

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Of course the jprint tool might take a bit of time.

But on that note how does this (in priority) (the jprint tool I mean) compare to the html files in the other repo?

It is a lower priority. Perhaps something that can be done when the final stages of the Great Fork Merge are underway (and post html files in the other repo are done).

We will hold off on describing the JSON semantic strings until jprint(1) is ready.

@xexyl
Contributor

xexyl commented Aug 28, 2024

Of course the jprint tool might take a bit of time.

But on that note how does this (in priority) (the jprint tool I mean) compare to the html files in the other repo?

It is a lower priority. Perhaps something that can be done when the final stages of the Great Fork Merge are underway (and post html files in the other repo are done).

We will hold off on describing the JSON semantic strings until jprint(1) is ready.

But should the repo be populated now (I know you said it should be but since we are talking about one part waiting I want to be sure)?

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Of course the jprint tool might take a bit of time.

But on that note how does this (in priority) (the jprint tool I mean) compare to the html files in the other repo?

It is a lower priority. Perhaps something that can be done when the final stages of the Great Fork Merge are underway (and post html files in the other repo are done).
We will hold off on describing the JSON semantic strings until jprint(1) is ready.

But should the repo be populated now (I know you said it should be but since we are talking about one part waiting I want to be sure)?

One moment please ...

@xexyl
Contributor

xexyl commented Aug 28, 2024

Of course the jprint tool might take a bit of time.

But on that note how does this (in priority) (the jprint tool I mean) compare to the html files in the other repo?

It is a lower priority. Perhaps something that can be done when the final stages of the Great Fork Merge are underway (and post html files in the other repo are done).

We will hold off on describing the JSON semantic strings until jprint(1) is ready.

But should the repo be populated now (I know you said it should be but since we are talking about one part waiting I want to be sure)?

One moment please ...

Sure. Food still cooking.

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Of course the jprint tool might take a bit of time.

But on that note how does this (in priority) (the jprint tool I mean) compare to the html files in the other repo?

It is a lower priority. Perhaps something that can be done when the final stages of the Great Fork Merge are underway (and post html files in the other repo are done).

We will hold off on describing the JSON semantic strings until jprint(1) is ready.

But should the repo be populated now (I know you said it should be but since we are talking about one part waiting I want to be sure)?

One moment please ...

Sure. Food still cooking.

Hope it is cooking well.

See PR #1 in your jparse repo.

The first pull request for that repo! 🤓

@xexyl
Contributor

xexyl commented Aug 28, 2024

Of course the jprint tool might take a bit of time.

But on that note how does this (in priority) (the jprint tool I mean) compare to the html files in the other repo?

It is a lower priority. Perhaps something that can be done when the final stages of the Great Fork Merge are underway (and post html files in the other repo are done).

We will hold off on describing the JSON semantic strings until jprint(1) is ready.

But should the repo be populated now (I know you said it should be but since we are talking about one part waiting I want to be sure)?

One moment please ...

Sure. Food still cooking.

Hope it is cooking well.

It's already in my alimentary canal, breaking down and so on, the poor little things. Thanks!

See PR #1 in your jparse repo.

The first pull request for that repo! 🤓

I was hoping it would be you actually! Unfortunately some build problems. But I merged so we can address it. One issue (not sure if it's the only issue) is that the json semantic table generator requires iocccsize.h. We have to resolve that one. I have no idea what you suggest but obviously as an IOCCC specific thing that cannot be there.

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Thanks for accepting the PR #1.

We then did for our own tree under this repo:

make jparse.update_clone

Now when we do:

make jparse.diff_jparse_clone

we only see:

diff -u -r --exclude-from=.exclude jparse.clone jparse
Only in jparse.clone: CHANGES.md
make: [jparse.diff_jparse_clone] Error 1 (ignored)

@xexyl
Contributor

xexyl commented Aug 28, 2024

Thanks for accepting the PR #1.

We then did for our own tree under this repo:

make jparse.update_clone

Now when we do:

make jparse.diff_jparse_clone

we only see:

diff -u -r --exclude-from=.exclude jparse.clone jparse
Only in jparse.clone: CHANGES.md
make: [jparse.diff_jparse_clone] Error 1 (ignored)

Well the problem here (I think) can be solved by removing the #include "../iocccsize.h" and running make depend. It at least compiles now. Will commit and see if that solves the problem.

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

See PR #1 in your jparse repo.
The first pull request for that repo! 🤓

I was hoping it would be you actually! Unfortunately some build problems. But I merged so we can address it. One issue (not sure if it's the only issue) is that the json semantic table generator requires iocccsize.h. We have to resolve that one. I have no idea what you suggest but obviously as an IOCCC specific thing that cannot be there.

We should be discussing this in that other repo .. but for now we had to place our clone under njparse/jparse and then setup as follows:

lrwxr-xr-x  1 chongo   19 Aug 28 13:37 dbg -> ../mkiocccentry/dbg
lrwxr-xr-x  1 chongo   25 Aug 28 13:37 dyn_array -> ../mkiocccentry/dyn_array
lrwxr-xr-x  1 chongo   27 Aug 28 13:37 iocccsize.h -> ../mkiocccentry/iocccsize.h
drwxr-xr-x 42 chongo 1344 Aug 28 13:52 jparse

For now, we recommend you try the above.

Eventually in your GitHub workflow, you will need to address this. Currently the jparse tree assumes the above.

@xexyl
Contributor

xexyl commented Aug 28, 2024

See PR #1 in your jparse repo.
The first pull request for that repo! 🤓

I was hoping it would be you actually! Unfortunately some build problems. But I merged so we can address it. One issue (not sure if it's the only issue) is that the json semantic table generator requires iocccsize.h. We have to resolve that one. I have no idea what you suggest but obviously as an IOCCC specific thing that cannot be there.

We should be discussing this in that other repo .. but for now we had to place our clone under njparse/jparse and then setup as follows:

lrwxr-xr-x  1 chongo   19 Aug 28 13:37 dbg -> ../mkiocccentry/dbg
lrwxr-xr-x  1 chongo   25 Aug 28 13:37 dyn_array -> ../mkiocccentry/dyn_array
lrwxr-xr-x  1 chongo   27 Aug 28 13:37 iocccsize.h -> ../mkiocccentry/iocccsize.h
drwxr-xr-x 42 chongo 1344 Aug 28 13:52 jparse

For now, we recommend you try the above.
Eventually in your GitHub workflow, you will need to address this. Currently the jparse tree assumes the above.

Let's discuss it over there then yes.

I added a comment there asking what you think since of course the workflow needs to have it but so does anyone using the parser. It already builds locally but nobody else can unless they have installed the other repos.

@lcn2
Contributor Author

lcn2 commented Aug 28, 2024

Added a TODO about jprint(1) to the top level comment.

@xexyl
Contributor

xexyl commented Aug 28, 2024

Added a TODO about jprint(1) to the top level comment.

Thanks. Perhaps there should be a new issue about this in the jparse repo? Please feel free to open one. That way we can discuss it and there can be a clearer idea of what you are after.

@lcn2
Contributor Author

lcn2 commented Aug 29, 2024

Added a TODO about jprint(1) to the top level comment.

Thanks. Perhaps there should be a new issue about this in the jparse repo? Please feel free to open one. That way we can discuss it and there can be a clearer idea of what you are after.

See issue #5 in the jparse repo.

Please NOTE: The issue is just a draft. We will go to sleep, wake up and if needed, revise it.

@lcn2
Contributor Author

lcn2 commented Sep 7, 2024

We believe we have addressed all of the current questions that still need answering at this time. If we've missed something or something else needs to be clarified, please ask again.

@lcn2
Contributor Author

lcn2 commented Sep 24, 2024

When issue #979 is completed, the chkentry(1) will need to perform the same checks, as noted in GH-issue-2546596062.

@xexyl
Contributor

xexyl commented Sep 24, 2024

When issue #979 is completed, the chkentry(1) will need to perform the same checks, as noted in GH-issue-2546596062.

Yes but this comes after the next IOCCC, right?

Okay off again for the day. Back tomorrow.

@xexyl
Contributor

xexyl commented Oct 1, 2024

I just noticed something useful for this. The json_sem.c file might have some useful things for this issue too wrt the walking of the tree. This is a note to myself mostly, for after IOCCC28.

@lcn2 lcn2 added the background priority While this issue is needs to be solved, it is of a somewhat lower priority. label Oct 18, 2024
@lcn2 lcn2 added the post-IOCCC28 All work and comments delayed until post-IOCCC28 and post IOCCC judge vacation. label Nov 19, 2024
8 participants
@lcn2 @xexyl and others