Pre-proposal: standardize setting alt text from `alt` metadata in display_data #120
Comments
Great stuff!
Specifically for images, things get more complex. Having a robust, schema-constrained data model for this content in the Jupyter Messaging Spec would be a great way forward, but ideally it would come from somewhere else that already had tools, documentation, and test suites to run against.
great stuff. i'm glad there is momentum for things like this. i agree this should be standardized, but i don't know if the nbformat is the place for it. like @bollwyvl says, this is a good starting point and, also, a can of worms. i could imagine a supplementary schema that says something like:

```toml
[properties.cells.items.properties.outputs.items.metadata.properties."image/png"]
type = "string"
required = true

[properties.cells.items.properties.outputs.items.metadata.properties."image/jpeg"]
type = "string"
required = true

[properties.cells.items.properties.outputs.items.data.properties."text/markdown"]
description = "check for empty markdown links"
type = "string"
pattern = ".*\[\s*\](\S).*"
```

so now the question is: @blois introduced changes for "specifying alt text for IPython.Image", but it still seems that generally the alt text rarely makes it to html, including in lab and nbconvert. i think this approach is a good starting point and we should try to see that it propagates to the html where it can be measured.
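to make the supplementary-schema idea concrete, here is a rough illustration of how a fragment like this could be run against a notebook's outputs with the `jsonschema` package. the fragment below is an assumption loosely adapted from the sketch above (it requires string `alt` metadata on image/png outputs), not an existing standard, and the notebook path is a placeholder:

```python
# Hypothetical validation pass over a notebook on disk, requiring that every
# image/png output carries string "alt" metadata. The schema fragment is an
# assumption, not an existing Jupyter schema.
import json

import jsonschema

output_schema = {
    "type": "object",
    "required": ["metadata"],
    "properties": {
        "metadata": {
            "type": "object",
            "required": ["image/png"],
            "properties": {
                "image/png": {
                    "type": "object",
                    "required": ["alt"],
                    "properties": {"alt": {"type": "string"}},
                }
            },
        }
    },
}

def check_image_outputs(path):
    """Raise jsonschema.ValidationError for any image output missing alt metadata."""
    with open(path) as f:
        nb = json.load(f)
    for cell in nb.get("cells", []):
        for output in cell.get("outputs", []):
            if "image/png" in output.get("data", {}):
                jsonschema.validate(output, output_schema)

check_image_outputs("example.ipynb")  # hypothetical notebook path
```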
in the context of authoring a notebook, inverse models for inferring alt text could be a distraction. in a notebook, an author is going to know the context of their figures as preliminary objects like arrays and dataframes. when we have live states of data we can create better forward models for generating alt text rather than guessing with inverse models on lossy data representations. matplotlib figures with subplots really stretch the idea of encoding alt text.
Thank you! I thought I had seen something about this before, but was looking at frontends instead of ipykernel. A supplementary schema for accessible documents is a really great and interesting idea. Do you think that's where we should start, or is a smaller scope appropriate for a first pass?

Figure vs Image is a good question, and we could open that here, or we could consider it separately. I think it's a valid choice to say that frontends should or may display image outputs as `<figure>` elements. The scope of what I originally aimed for here was encoding what are typically accessibility attributes on an `<img>` tag, e.g. `alt`.

For the matplotlib plots, I did not mean to pick them to start a discussion of how to compute alt for charts, but rather as an example where the kernel registers the display function that produces an image (the matplotlib inline backend), but the alt text is likely to come from somewhere else (e.g. directly written by the author). i.e. what does it look like in ipykernel to display a figure with user-created alt text? So assuming the author has the alt text and the Figure, how do they publish them together? This is less about matplotlib specifically, and more about attaching author-controlled alt text to something that may already have an image repr, so it stays manageable for authors without needing to re-register custom display formatters for everything.
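As a minimal sketch of that question, attaching author-written alt text on the authoring side could look like the following today, assuming the inline matplotlib backend, the `metadata=` argument of `IPython.display.display`, and the `alt` parameter on `IPython.display.Image`; the alt strings and filename are placeholders:

```python
import matplotlib.pyplot as plt
from IPython.display import Image, display

# Attach author-written alt text to a figure that already has an image repr,
# via the per-mimetype metadata argument of display().
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
display(fig, metadata={"image/png": {"alt": "Line chart of y = x squared for x from 0 to 2."}})
plt.close(fig)  # keep the inline backend from displaying the figure a second time

# For a standalone image file, IPython.display.Image accepts an alt parameter.
display(Image(filename="sales.png", alt="Bar chart of monthly sales for 2023."))
```

Whether a frontend then carries that metadata onto the rendered `<img>` is exactly the part this pre-proposal would standardize.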
right now, software- and accessibility-wise, we are missing this. there is no recommendation for including alt text because there is no implicit way to do it. we really need this. nothing really seems to work right, and it should.

in the near term, i personally would choose trying to extend the schema. we've introduced two JEPs that focus on extensible schemas for the notebook format, and rallying around those would be a good experiment.
i always caution against measuring accessibility outside of html, or removing the image from its context. ARIA10: Using aria-labelledby to provide a text alternative for non-text content is a technique for alt text that relies on the document. i know i keep giving reasons not to, but i think there is scope-creep potential in putting accessibility directly in the notebook format. it is an entire world in itself. i think that nbconvert-a11y could potentially take on some of these schema concerns though.
i'd really love for this to be more explicit; it is still implicit. we should have assistive notebook representations that generate tagged pdf. then we want some accessible epub. our goal is the same nbconvert targets all along, but accessible. that requires retrofitting accessibility, which has proved very challenging in jupyter. i think, outside of immediately improving the quality of html output, we should be adding testing like axe.
Also see the discussion in ipython/ipython#12864. I would also love to see an easy/automatic way to get alt text from matplotlib images.
I don't view this as taking any responsibility from the author. I think computing the accessible info is going to remain their responsibility (and in some cases, where possible, a library's). I mainly want to get to the point where, if an author has the accessibility information, they can publish it and it will be displayed.

I think the one Jupyter-specific point is that all of our displayed images actually do come with text representations in the text/plain output. So a potential default implementation is to use the text/plain representation as the alt text when no explicit alt is provided.
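A rough sketch of the fallback order that implies for a frontend or converter rendering an image output; this is illustrative only (not any frontend's actual code), and the function names are hypothetical:

```python
import html

def alt_for_output(output, mime="image/png"):
    """Pick alt text for an image output: explicit alt metadata first,
    then the text/plain repr, then an empty string."""
    mime_metadata = output.get("metadata", {}).get(mime, {})
    if "alt" in mime_metadata:
        return mime_metadata["alt"]
    text = output.get("data", {}).get("text/plain")
    if text:
        # text/plain is usually stored as a list of lines in on-disk notebooks.
        return "".join(text) if isinstance(text, list) else text
    return ""

def render_image(output, mime="image/png"):
    """Render a display_data/execute_result output as an <img> tag."""
    b64 = output["data"][mime]
    if isinstance(b64, list):
        b64 = "".join(b64)
    alt = html.escape(alt_for_output(output, mime))
    return f'<img src="data:{mime};base64,{b64.strip()}" alt="{alt}">'
```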
To be clear, do you mean adding to the existing nbformat schema, or defining an extended "accessible notebooks" schema, which derives from the existing one? How is this for a proposal:
then we have a place to advance the accessible outputs schema going forward, owned by the experts in the area, and we start with the low-hanging fruit of handling the `alt` metadata on image outputs. My original plan was just the last bullet point, but it seems like having a dedicated place to grow more of these metadata definitions can facilitate things going forward.
I agree. In this context, I mean it mostly in terms of naming fields, e.g. when we need to pick our own key for "text that can be substituted for an image," we use `alt`.
This is where things go way beyond my understanding, but presumably will inform the schema development. How much can an accessible PDF be generated from what is assumed to be already accessible HTML? Or is this only feasible with higher-level structured representations of an image and metadata?
Yes, that would be great!
the proposal you put forth sounds great. we can work on the schema in salient projects with the end goal being the jupyter schemas repo. conveniently, the frontend work on accessible image descriptions can proceed concurrently while these other activities are happening.
nbconvert-a11y just started taking on outputs. it is nascent work with the goal of providing more semantically meaningful outputs for dataframes and arrays. the main contribution of nbconvert-a11y so far is the reference template for the notebook that uses semantic html5 to construct an ideal representation for assistive technology. it could be used as an implementation, but is primarily a reference for testing out accessible designs. here is an example of the lorenz notebook with outputs.
i think another advantage is that schema work can be introduced into testing pipelines of projects trying to adopt this enhancement. this is another potential mode for uptake.
my suggestion for this key with
the theory is that some accessible html could be made into an accessible pdf.
Hi all, I'm the developer of the project mentioned in the first comment, but definitely not an expert on Jupyter. Rather than transferring over complex html attributes, is there a way to render matplotlib outputs or other figures in html by default? (also see #14482) For instance, the way matplotalt renders figures together with alt text is by displaying the base64-encoded image in an `<img>` tag inside an HTML output.
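For reference, a minimal sketch of that pattern as described above (illustrative only, not matplotalt's actual code; the figure and alt string are placeholders):

```python
import base64
import io

import matplotlib.pyplot as plt
from IPython.display import HTML, display

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Encode the rendered figure as base64 PNG and emit it inside an <img> tag
# with the alt attribute already set, as part of an HTML output.
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)
b64 = base64.b64encode(buf.getvalue()).decode("ascii")
display(HTML(f'<img src="data:image/png;base64,{b64}" alt="Line chart of y = x squared for x from 0 to 2.">'))
```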
Problem:

alt text is required for image accessibility, but there's currently no documented way to set alt text on displayed images.

Proposal:

populate the `alt` attribute from the `alt` metadata field in display_data messages, if defined. Alternative: use a non-standard but more specific name, such as `html-alt` or `alt-text`. There are a few places this info can come from, such as ImageDescription EXIF data, but we should have a protocol-level place for users and libraries to be able to easily populate these fields. (A sketch of what such a message could look like is at the end of this post.)

Another alternative comes from the fact that the Jupyter protocol mime-bundles already are multiple representations of the same object, and should always include plain text. So another valid choice is for User Interfaces (JupyterLab) to populate the alt text for images with the `text/plain` representation of the same output. However, in practice that is very rarely a good choice for alt text for an image, since it describes the displayed object, not the image itself. I'd need to hear from accessibility folks whether that's better or worse than no alt text at all. But maybe this should lead us to make better text representations in `text/plain`, too!

While using the `text/plain` output may be a reasonable default when `alt` is unspecified, I still think it makes sense to pass through the `alt` metadata so that users and libraries can more easily set `alt` on the destination element, regardless of what we choose for possible sources of default values in the HTML.

Still to iron out:
Affected packages:
references:
Suggested reviewers:
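For concreteness, a minimal sketch of the display_data message content this proposal has in mind, with placeholder values; whether `alt` belongs in the per-mimetype metadata (as sketched here) or at the top level of `metadata` is one of the details still to iron out:

```python
# Hypothetical display_data message content, for illustration only.
# The exact placement of "alt" (per-mimetype vs. top-level metadata)
# is one of the open questions in this proposal.
display_data_content = {
    "data": {
        "image/png": "iVBORw0KGgoAAAANSUhEUg...",  # base64-encoded PNG (truncated)
        "text/plain": "<Figure size 640x480 with 1 Axes>",
    },
    "metadata": {
        "image/png": {
            "alt": "Line chart of monthly active users, rising from 1k to 12k over 2023.",
        }
    },
}
```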