clarify property names #336

Merged
merged 1 commit into from
Sep 25, 2018

Conversation

mattgarrish
Member

@mattgarrish mattgarrish commented Sep 19, 2018

This PR is an attempt to address issue #312 by adding a quick reference table that lists all the properties and identifies where they are defined. It also adds the properties to the descriptions that precede the infoset/manifest subsections to introduce their names earlier.

One problem it raises is that it leaves the accessibility properties with nothing in their infoset section, because all the section did was list the property names.

I tried various other things, like trying to put the property names in the section headings, but a problem is that some sections are groupings of many properties.

Probably doesn't completely solve the issue, as we're always going to have some disharmony between natural language usage and the odd names schema.org sometimes chooses, but hopefully this helps a bit. If not, feel free to discard.


@iherman
Member

iherman commented Sep 20, 2018

Per accessibility: I think we have two possibilities:

  1. We can simply say that the infoset requirement is that the values should abide by the rules defined for them, as documented on the wiki page. We already have links in the table.

  2. We can extract the allowed terms for each of the accessibility properties and put these into the spec. Editorially, we can put them into a normative appendix and refer to it, rather than putting them into the main section, purely for readability. Note that @HadrienGardeur has already collected those values for each term in the JSON schema, which may be easier to use.

I have a slight preference for (2), although it may not be scalable for the future. However, the Wiki is not really clear, and it was not easy to extract a precise set of terms (kudos to @HadrienGardeur).

@avneeshsingh WDYT?

Member

@iherman iherman left a comment

Apart from the separate comment in #336 (comment) I really like that change, so go for it! Thx

@mattgarrish
Member Author

I hadn't noticed the enums in the schema, but it's questionable whether we should be trying to enforce the values that strictly. Schema.org doesn't have those kinds of controls on the value space. They are the recommended value set, yes, but it's not invalid to use values not in the preferred vocabulary. New values are expected to be added whenever they're needed, too.

@iherman
Member

iherman commented Sep 24, 2018

@mattgarrish you are right, but, on the other hand, it is good to have information in one place if possible. Whilst I do not think we should collect all the possible Schema.org values, for example, these accessibility values are not easy to find and extract from the wiki page.

What about creating that list in the appendix as non-normative, making it clear that the values may change (presumably expand) and that the authoritative place is the wiki page?

@HadrienGardeur

Since the schema is non-normative and can easily be updated, I don't see a good reason not to include such enums.

Unlike other validation tools, we don't have warnings with JSON Schema and a schema is much more valuable when it enforces strong validation.

@mattgarrish
Member Author

Since the schema is non-normative and can easily be updated

It's not so much the schema as the implication that the values are an exclusive set. What you find on the wiki isn't even complete, as there is an allowance for producers to extend their own values based on the primary concepts (see "highContrastAudio/nobackground" as an example, which isn't in the schema). Trying to enumerate all those possibilities in the schema hardly seems worth the effort, if we could even do it.

In other words, defining the value space is not a problem easily solved by JSON Schema, or by our own specification. I personally don't think it's an issue we should invest too much effort into, either. We know what the expected value types are and we can recommend use of the preferred vocabulary, which is all that schema.org does.
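
To illustrate why a closed enum falls short here, the following is a minimal sketch of a check that tolerates producer extensions. It assumes extensions take the `base/refinement` form seen in the "highContrastAudio/nobackground" example; that convention is inferred from this one example only. The function name `classify_value` and the tiny `FEATURES` set are hypothetical and are not part of the schema or the specification.

```python
# Hypothetical, deliberately incomplete set of preferred feature terms.
# The authoritative list lives on the wiki page, not here.
FEATURES = {"largePrint", "displayTransformability", "highContrastAudio"}


def classify_value(token, preferred):
    """Classify a token as 'preferred', 'extension' or 'unrecognized'."""
    if token in preferred:
        return "preferred"
    # Assumption: an extension is "<preferred term>/<refinement>".
    base, separator, refinement = token.partition("/")
    if separator and refinement and base in preferred:
        return "extension"
    return "unrecognized"
```

An enum in the schema can only express the first outcome. The other two need either a pattern for every extensible term or no constraint beyond "string".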

But if we decide we're going to become value gatekeepers and limit what people can add, then we'll need to be clearer about that expectation in the specification.

@iherman
Member

iherman commented Sep 24, 2018

@mattgarrish I think we should really get the opinion of @avneeshsingh on this. Who is the gatekeeper in the first place?

@HadrienGardeur

I would personally prefer that we keep track of such values and validate them in the context of a WP since this is an important part of our mandate (accessibility).

It's very easy to drop requirements in the schema but it also makes it less useful at the same time IMO.

@avneeshsingh

avneeshsingh commented Sep 24, 2018

The values for schema.org accessibility metadata are defined on a W3C wiki page, which can be edited without a strict/formal process. Although it is not a huge page, keeping track of everything may be difficult.

The stable piece is the properties. One way would be to pick up the properties and describe them in the WP specs, and provide a link to the wiki for more details. I do not think that we should copy all the values from the schema.org wiki to the WP specs.

On closer review, it looks like Matt has done something similar!

@mattgarrish
Member Author

So you're going to add entries for every point size that can modify largePrint? Run through all the possible CSS properties that can modify displayTransformability?

Not my idea of fun...

For history, the project to develop the properties was initiated by Benetech, but the metadata is based on IMS's Access for All standard. IMS is generally considered the gatekeeper of the properties, but we are less formal about the value space. We last took discussions up during the development of the EPUB accessibility spec, as most of the key players were there. That's sort of the way it's gone since the project ended.

If it helps put your mind at ease, though, I was the editor of the original project (but can't take credit for the wiki). I kind of have an idea what I'm talking about... ;)

Like I said above, though, if we want to be stricter about what we accept than schema.org allows, we need to be clear about it. But once we shift the values into the specification and enforce them, then we become responsible for any problems it poses, like the inevitable disconnect between schema.org in the wild and what you can do in the manifest.

I'd rather leave it to user agents to determine what to do with values that make no sense, which is what they have to do now anyway, than try to lock down a subset of the vocabulary.

Or, as @avneeshsingh says, create a non-normative appendix of key values for the properties without actually tying validation to them.

@iherman
Member

iherman commented Sep 24, 2018

Or, as @avneeshsingh says, create a non-normative appendix of key values for the properties without actually tying validation to them.

which is fine with me (we will have to look at the processing steps of the current draft, which do include checking the values, just like the JSON schema does; that must be removed).

@avneeshsingh

Thinking about it further, another way would be to provide examples of key values instead of placing non-normative values in an appendix. The values are strings, and the list can be easily extended, so it would be good to avoid the overhead of consistently maintaining a duplicate list on the WP side.

@madeleinerothberg officially maintains the W3C wiki for schema.org. As Matt mentioned, a lot of this work was done in the EPUB 3 working group, where IMS Global, DAISY Consortium, Benetech and other key organizations together moved it forward.

@iherman
Member

iherman commented Sep 25, 2018

@avneeshsingh thanks. Is it correct to say that the current set of values on the wiki is, sort of, indicative, i.e., the list can and will be extended without further ado? If so, then I do not think our spec, or indeed the processing of our manifest, can really do meaningful testing on the values.

I would then propose to

  1. add some general text based on this discussion into the document on the section for the accessibility values and merge the current PR.
  2. In section 6.2, items 4.2 and 4.3 should be removed (processing the manifest cannot and should not test the values, because there is no final list to check against). @mattgarrish you can either do that in this PR, or I can do it after a merge.

The question of the JSON schema is a bit less clear. The problem I have is that the schema is really useful only if there is a way to continuously maintain it (i.e., the enum values) alongside the accessibility terms, and I am not sure how we can ensure that. The current schema reflects the current snapshot only. I may prefer to remove the enums altogether, but I am not fully sure about it.

@mattgarrish
Member Author

the list can and will be extended without further ado

Yes, they cover a broad range of requirements that we identified, but they are not comprehensive. The idea was always to have the community contribute new terms they're using.

It's impossible to say if people are extending the vocabulary or not yet, but we shouldn't prevent it.

@mattgarrish
Member Author

There are varying degrees of likelihood that new additions will be made, too. The access modes aren't likely to be expanded on, for example, and it's unclear what new hazards might be introduced. But the features list, in particular, is potentially more fluid.

Unless we're proposing specific behaviours based on the metadata, which I can't imagine we are, it just doesn't make a lot of sense to be too strict about it.

We can also review the wiki and see how it can be improved if you find it problematic. @clapierre was trying to clean it up earlier this year, as it was put together by another fellow who has since moved on and went untouched for a long time. He, @madeleinerothberg and I can always do more work on it if you just let us know what specific difficulties you have with it (probably best to take that discussion offline, though).

@mattgarrish
Member Author

What if we were just a bit less draconian about removing properties in 6.2? For example:

  1. For all the terms defined in 4.3.1 Accessibility, except for accessModeSufficient and accessibilitySummary, check whether all tokens listed in manifest[term] are defined in the preferred vocabulary (see the list of expected values for each). Issue a warning for each unrecognized value.

  2. For all values in manifest["accessModeSufficient"], check whether each token in each ItemList is defined in the preferred vocabulary (see the list of expected values). Issue a warning for each unrecognized value.

Let's call it an imperfect compromise, in that it still expects a known list of values. But so long as we aren't issuing errors or throwing out values, I can live with the enums. Just because the machine can't make sense of the value doesn't mean a human won't know what the token means. I'd rather the information bubble through than be tossed out. It might also encourage anyone extending the vocabulary to do so in a more formal way.

As minor notes, the link to the expected values should probably go to the wiki table, not to the accessibilityfeature writeup. Also, I rewrote the above to stop mimicking EPUB's workaround use of accessModeSufficient. We should really do the same elsewhere.
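
The two processing steps proposed above can be sketched roughly as follows. This is an illustrative sketch only, not the specification's algorithm. The vocabulary sets are a small, incomplete snapshot (the wiki remains the authoritative source), the function name `check_accessibility_values` is hypothetical, and the assumption that each ItemList carries its tokens in `itemListElement` follows schema.org's ItemList type rather than anything stated in this thread.

```python
# Illustrative, incomplete snapshot of the preferred vocabulary.
# Assumption: a real checker would load these from the wiki table.
PREFERRED_VOCABULARY = {
    "accessMode": {"auditory", "tactile", "textual", "visual"},
    "accessibilityFeature": {"largePrint", "displayTransformability", "highContrastAudio"},
    "accessibilityHazard": {"flashing", "noFlashingHazard"},
}
SUFFICIENT_ACCESS_MODES = {"auditory", "tactile", "textual", "visual"}


def check_accessibility_values(manifest):
    """Return warning messages for unrecognized tokens.

    The manifest is never modified: unrecognized values are reported,
    not stripped, so the information still reaches the user agent.
    """
    messages = []
    # Step 1: every term except accessModeSufficient and accessibilitySummary.
    for term, known in PREFERRED_VOCABULARY.items():
        for token in manifest.get(term, []):
            if token not in known:
                messages.append(f'{term}: unrecognized value "{token}"')
    # Step 2: each token in each ItemList of accessModeSufficient.
    for item_list in manifest.get("accessModeSufficient", []):
        for token in item_list.get("itemListElement", []):
            if token not in SUFFICIENT_ACCESS_MODES:
                messages.append(f'accessModeSufficient: unrecognized value "{token}"')
    return messages
```

Returning messages instead of raising keeps the check at warning level, so a sufficient access mode set is never silently altered by dropping one of its tokens.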

@clapierre

Happy to help Matt and Madeleine make further improvements to the wiki, and agree with Matt's proposal.

@iherman
Member

iherman commented Sep 25, 2018

@mattgarrish I am fine with what you say in #336 (comment)

@avneeshsingh

Just clarifying in my head: Matt's proposal is to allow some validation of values based on the wiki page, and not on the basis of values copied on WP specs, i.e., the W3C wiki remains the main source of reference.
This looks good.

@mattgarrish
Member Author

allow some validation of values based on the wiki page, and not on the basis of values copied on WP specs

Exactly, yes. What we essentially get is that you "should" use the vocabulary defined in the wiki, but user agents should still process values they don't recognize. Stripping values is problematic, as it could change the meaning of the sufficient access modes, for example.

I'd like to address these changes in a new PR, though, so that we can review them separately. To that end, I'll close this one on naming off now.

@mattgarrish mattgarrish merged commit e509194 into master Sep 25, 2018
@mattgarrish mattgarrish deleted the property-naming branch September 25, 2018 14:26
@madeleinerothberg

I haven't been directly involved in this spec process, but it sounds to me like you have arrived at conclusions I agree with. The list of enumerated terms is recommended but not mandatory; the list may be expanded. Most schema.org properties don't have any enumeration, so the accessibility properties are unusual. I expect that if the properties get good uptake, new terms will arise in the wild. But I don't know what the right technical solution for WP is.

@mattgarrish mattgarrish mentioned this pull request Sep 25, 2018