Introduce AnnotatedIOBuffer #51807

tecosaur · 2023-10-21T11:58:08Z

This allows for styled content to be constructed incrementally, without resorting to repeated concatenation. It operates very similarly to IOContext, just with a special write method and specifically wrapping an IOBuffer.

While I consider this functionality generally useful, I also have a specific use in Markdown in mind for this. For no it's not exported or public API, but I think we'd probably want to talk about making it public API in the future.

For Base itself though, I think it's important to note that this lets us pass an AnnotatedIOBuffer to show calls, and thus truncate printed AnnotatedStrings without the issues of accidentally removing terminating escape codes etc. that we currently have (see #45521 for an example).

With the way printstyled works, this benefit won't be seen immediately, but I see it as part of building the path to a nicer, less headache-inducing future.

LilithHafner · 2023-10-26T01:09:14Z

This CI failure:

ambiguous                                        (3) \|         failed at 2023-10-21T12:07:05.619
  | Test Failed at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-master/julia-b67d93b47d/share/julia/test/ambiguous.jl:174
  | Expression: sig1 === sig2
  | Evaluated: write(io::Base.AnnotatedIOBuffer, x::AbstractString) @ Base strings/annotated.jl:433 === write(io::IO, s::Union{SubString{String}, String}) @ Base strings/io.jl:248
  |  
  | Test Failed at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-master/julia-b67d93b47d/share/julia/test/ambiguous.jl:177
  | Expression: good

Means that write(io::Base.AnnotatedIOBuffer, x::AbstractString) and write(io::IO, s::Union{SubString{String}, String}) are ambiguous.

Specifically, calls to write(io::Base.AnnotatedIOBuffer, s::Union{SubString{String}, String}) will now error.

tecosaur · 2023-10-26T01:33:37Z

Thanks for having a look, in response I've just added these two methods:

write(io::AnnotatedIOBuffer, x::AbstractString) = write(io.io, x)
write(io::AnnotatedIOBuffer, s::Union{SubString{String}, String}) = write(io.io, x)

LilithHafner

This seems like a good idea. However, I think it makes sense to include at least one motivating usage of this type in this PR (or a linked PR that depends on it) because I don't like introducing additional complexity that does not have any impact on publicly accessible functionality.

In addition to avoiding premature addition of complexity (which I doubt this is) the request to add a motivating use case is a nice principle to follow because having a use of AnnotatedIOBuffer gives context for the API design decisions of the type.

base/strings/annotated.jl

tecosaur · 2023-10-26T03:32:27Z

Thanks for the comments Lilith. Regarding adding a usage example, there are three PRs I have in mind currently that need this: improving the join method for annotated strings, using AnnotatedIOBuffer with show/showerror, and a PR to the Markdown stdlib.

The latter two will be more work I suspect, so I'll see if I can whip up the join PR in the near future.

base/show.jl

base/strings/annotated.jl

tecosaur · 2023-10-26T18:07:37Z

For a basic example of this being useful, perhaps JuliaLang/StyledStrings.jl#12 is a decent starting point? The join PR might not come till the weekend.

LilithHafner · 2023-10-26T19:24:21Z

From triage: Jeff is afraid that everyone will have to use annotated things for all IO in Base (and the whole ecosystem?) if we go this route.

This is not something that we can talk about as a single issue. We need to discuss the overarching vision of all IO in Julia. This has far reaching consequences and shouldn't be merged as a small thing when it's really a huge thing.

This is not to say that we don't want to do this, but that if we do decide to do this, it should be a broad sweeping decision. Let's talk about this at next triage 1:30 UTC on November 9.

base/strings/annotated.jl

tecosaur · 2023-11-09T04:01:52Z

Triage didn't get to this, try again in a month's time 😞

tecosaur · 2023-12-07T03:15:29Z

Triage had a chat about this, and while not unanimous was broadly in support of this. We concluded that it would be best to put some of the major point in the discussion here, and give an opportunity for objections/problems we've overlooked to be mentioned. Should nothing (new) come up, this can be merged in a week or so.

Triage conversation points:

AnnotatedIOBuffer acts as a annotated string builder
AnnotatedIOBuffer acts as a way to get structured styling information out of "legacy" code written for ::IO and using print / printstyled. This lets you e.g. safely take a substring of a show(...) invocation's result.
AnnotatedIOBuffer isn't really conceptually purely IO
Worst case seems to be get(::IO, :color, false)-guarded manually printed escape codes will be lost
You can't really get the "swap in and collect structured styling information" behaviour without subtyping IO
Since this is an opt-in where you want it type approach, this doesn't affect or require rewriting of existing code, other than funky shouldn't-really-be-done-in-the-first-place type stuff needing rewriting when combined with this.
As a compromise, it seems worth it

tecosaur · 2024-01-31T09:04:45Z

Alrighty, time for round 2 of

I think that's everything addressed?

😁

vtjnash · 2024-01-31T16:54:20Z

It is pretty straightforward indeed! That's why I've had JuliaLang/StyledStrings.jl#12 sitting around since October, which lets you benefit from this new functionality with code that still uses printstyled. 😄

Ah gotcha. That seems somewhat like type piracy though, no, since you have to load StyledStrings for this AnnotatedIOBuffer to work correctly?

vtjnash · 2024-01-31T16:58:19Z

Seems to have an ambiguity error:

Error in testset ambiguous:
Test Failed at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-master/julia-16a3361820/share/julia/test/ambiguous.jl:174
  Expression: sig1 === sig2
   Evaluated: read(s::IO, T::Type) @ Base io.jl:1179 === read(io::Base.AnnotatedIOBuffer, x) @ Base strings/annotated.jl:506

Also, I see you added the "don't squash" label, but it doesn't appear that each commit here is functionally independent. If not, they would normally be squashed to preserve the one commit == one feature / change expectation.

tecosaur · 2024-01-31T17:10:36Z

It is pretty straightforward indeed! That's why I've had JuliaLang/StyledStrings.jl#12 sitting around since October, which lets you benefit from this new functionality with code that still uses printstyled. 😄

Ah gotcha. That seems somewhat like type piracy though, no, since you have to load StyledStrings for this AnnotatedIOBuffer to work correctly?

It is indeed, type pirate-y. However, I don't really see much of a way around it.

One of the reasons (by a few people, IIRC) given in support of StyledStrings being a stdlib and not part of Base is not adding display/printing-logic-type code to Base when there are active efforts to slim it down (I belive somebody even made the aspirational comment that Base itself should have no printing code at all eventually). However, to work without causing a pile of method invalidations, some bits need to be in Base (e.g. once this PR is merged, there's one that uses this to make join work better with a mix of Strings and AnnotatedStrings).

Now, printstyled is display/printing-logic, and so some of the functions required to convert it to a nice composable styled AnnotatedString are in the StyledStrings stdlib. So we have this split where some styling code is in Base, and some is in StyledStrings, and making the two work nicely together involves StyledStrings committing type piracy with Base … but I don't see any easy way around it.

Do please let me know if you have any better ideas around this.

vtjnash · 2024-01-31T17:17:46Z

I guess that makes sense, maybe StyledStrings is just one of those that needs to be loaded by default, like Logging, FileWatching, and Libdl. They are conceptually separated out of Base, and could be removed during a build if absolutely required, but are not actually separable or "replaceable batteries" in that sense.

vtjnash · 2024-01-31T17:19:03Z

Once the tests are passing, this looks good to merge

tecosaur · 2024-01-31T17:20:06Z

I'm not actually quite sure what's going on with that ambiguity, might you be able to elaborate on it?

tecosaur · 2024-01-31T17:32:54Z

Also, I see you added the "don't squash" label, but it doesn't appear that each commit here is functionally independent. If not, they would normally be squashed to preserve the one commit == one feature / change expectation.

I guess it depends on what you look for in a PR. I get the sense that you're seeing this as a PR that "provides AnnotatedIOBuffer" and so there should be one commit that "introduces AnnotatedIOBuffer".

I tend to think of PRs as a collection of atomic commits that collectively provide a feature/change, and a useful history of how that happened. These commits are that.

* Introduce AnnotatedIOBuffer
  (An outline of the feature, with a commit message explaining the intent.)
* Make AnnotatedIOBuffer behave more like IOBuffer
  (Changing the behaviour of the feature, with justification in the commit message)
* Make AnnotatedIOBuffer reading more generic
  (Another specific change with justification/reasoning in the commit message)
* Also read AnnotatedChars from an AnnotatedIOBuffer
  (A new (sub)feature)
* Writing from one AnnotatedIOBuffer to another
  (A new (sub)feature)
* Tests for AnnotatedIOBuffer
* Update the mention of annotated features in NEWS
* Use insert! not splice! for new annotations

I'd think the history of the development (so long as the commits are atomic, which they are) would be more valuable than dumping the collection of changes fully formed into the repo, no? I guess it depends on how much you see git as a tool to track the development history of a project vs. just a means of distributed work.

Depending on how you look at it, forget being independent of each other, some of the commits here aren't even independent of other PRs 😛 (e.g. the NEWS and insert! commits, affect content from previous PRs unrelated to this PR, but I wouldn't think are worth separating out into separate PRs).

tecosaur · 2024-01-31T17:42:06Z

I guess that makes sense, maybe StyledStrings is just one of those that needs to be loaded by default, like Logging, FileWatching, and Libdl

On this note, once #51869 is sorted out (imminently hopefully 🤞), StyledStrings will be loaded with REPL, which I imagine is pretty similar in effect.

While `String` is the only concrete type for which `read(::IOBuffer, ::Type{<:AbstractString})` is defined, is is entirely conceivable that some other custom string type could define a similar `read` method. Since making reading an `AnnotatedString` from an `AnnotatedIOBuffer` more generic is as easy as replacing the hardcoded `String` with a type parameter, we may as well do so.

The `read(::AnnotatedIOBuffer, AnnotatedString)` method is intended as an approximate analogue to `read(::IOBuffer, String)`. In the same sense, it makes sense to define `read(::AnnotatedIOBuffer, AnnotatedChar)` as an analogue to `read(::IOBuffer, Char)`.

Co-authored-by: Jameson Nash <vtjnash@gmail.com>

vtjnash · 2024-01-31T18:26:22Z

I'd think the history of the development

Commits should not reflect the history of the development, but rather are curated set of changes that each can be viewed independently as being a complete change

tecosaur · 2024-01-31T18:29:23Z

Commits should not reflect the history of the development, but rather are curated set of changes that each can be viewed independently as being a complete change

Right, but I don't see the two as being mutually exclusive — or do you not view commits like Use insert! not splice! for new annotations as being able to be viewed independently as a complete change?

vtjnash · 2024-01-31T18:30:28Z

Seen another way, each separate commit is a statement that individual commits could be reverted and the feature (up to that commit in the sequence) are still a complete implementation. Thus if someone encounters a problem later, we can revert and reland just the problematic commits. That breaks down if some arbitrary, unknown subset of commits are half-broken aspirational in progress work which is only valid at the merge commit point

vtjnash · 2024-01-31T18:31:35Z

If it was in a separate PR, it would be a standalone commit, but as a complete sequence, each change to the API or implementation in the development process should not result in a separate commit to master

tecosaur · 2024-01-31T18:33:57Z

You seem to be describing atomic commits, no?

The commits here are atomic, and can be handled individually. None of them are half-broken "aspirational" commits. 37 changes have been rebased into 8 logical commits, and the codebase is "valid" at each point, CI should pass at each step.

If the whole thing needs to be reverted for some reason, you can just revert the merge commit instead of needing to squash everything into one commit 🤷, and I'd think it would be much easier to pinpoint where/what's gone wrong when less changes are bundled together.

I'm not trying to be argumentative here, I'm just somewhat confused by this as it runs counter to the emphasis on the value of a detailed and useful history I've seen elsewhere.

tecosaur · 2024-01-31T18:45:32Z

Feel free to just go ahead and squash merge this though, I care more about finally getting this PR over with than discussing whether it's worth preserving the commit history.

vtjnash · 2024-01-31T18:55:13Z

A semi-automated way to look at it is that each separate commit should likely have tests if it represents a different change. Not always the case, but not all PRs need to have tests either. And if the later commits are changing the proposed API, then it is even less obvious that you would want someone to roll back this to a version with the older API as that would be a breaking change that wouldn't be allowed.

tecosaur · 2024-01-31T19:14:07Z

I can't say I entirely see it, but this is good to merge from my POV. The CI hasn't completed yet, but the buildkite/julia-master only has a UndefVarError: `A` not defined in `Main.Test100Main_REPL.REPLCompletionsTest` failure so far. Hopefully the ambiguity check will clear this time.

tecosaur · 2024-02-01T04:55:32Z

No ambiguity errors it looks like! 😀

tecosaur · 2024-02-01T15:52:12Z

Thanks for helping this over the line Jameson!

tecosaur added io Involving the I/O subsystem: libuv, read, write, etc. strings "Strings!" feature Indicates new feature / enhancement requests labels Oct 21, 2023

tecosaur force-pushed the annotated-io branch 3 times, most recently from 43f88eb to b67d93b Compare October 21, 2023 18:48

tecosaur force-pushed the annotated-io branch 2 times, most recently from 3c2f0e2 to a3d0329 Compare October 26, 2023 01:33

LilithHafner reviewed Oct 26, 2023

View reviewed changes

base/strings/annotated.jl Outdated Show resolved Hide resolved

base/strings/annotated.jl Outdated Show resolved Hide resolved

base/strings/annotated.jl Outdated Show resolved Hide resolved

base/strings/annotated.jl Outdated Show resolved Hide resolved

LilithHafner added the needs tests Unit tests are required for this change label Oct 26, 2023

tecosaur force-pushed the annotated-io branch from a3d0329 to 912854f Compare October 26, 2023 08:18

tecosaur mentioned this pull request Oct 26, 2023

printstyled specialisation for AnnotatedIOBuffer JuliaLang/StyledStrings.jl#12

Merged

JeffBezanson reviewed Oct 26, 2023

View reviewed changes

base/show.jl Outdated Show resolved Hide resolved

tecosaur force-pushed the annotated-io branch 2 times, most recently from 75f4654 to 1b43f6b Compare October 26, 2023 15:41

LilithHafner reviewed Oct 26, 2023

View reviewed changes

base/strings/annotated.jl Outdated Show resolved Hide resolved

LilithHafner added the triage This should be discussed on a triage call label Oct 26, 2023

tecosaur mentioned this pull request Oct 28, 2023

Improve the annotated join method and dispatch #51914

Merged

tecosaur force-pushed the annotated-io branch from 1b43f6b to f508212 Compare October 28, 2023 15:24

vchuravy reviewed Oct 28, 2023

View reviewed changes

base/strings/annotated.jl Outdated Show resolved Hide resolved

tecosaur force-pushed the annotated-io branch from f508212 to cfa9616 Compare October 29, 2023 05:06

tecosaur mentioned this pull request Oct 29, 2023

Styled markdown, with a few tweaks #51928

Merged

tecosaur force-pushed the annotated-io branch from cfa9616 to 87a832a Compare November 9, 2023 02:36

tecosaur force-pushed the annotated-io branch 2 times, most recently from 1bc3cb9 to 16a3361 Compare January 31, 2024 08:59

tecosaur and others added 6 commits February 1, 2024 01:53

Writing from one AnnotatedIOBuffer to another

d49ba4f

Tests for AnnotatedIOBuffer

2e6ebb9

Update the mention of annotated features in NEWS

39839ac

Use insert! not splice! for new annotations

2b9839b

Co-authored-by: Jameson Nash <vtjnash@gmail.com>

tecosaur force-pushed the annotated-io branch from 16a3361 to 2b9839b Compare January 31, 2024 17:55

vtjnash merged commit f117a50 into JuliaLang:master Feb 1, 2024
5 of 7 checks passed

tecosaur deleted the annotated-io branch February 1, 2024 15:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce AnnotatedIOBuffer #51807

Introduce AnnotatedIOBuffer #51807

tecosaur commented Oct 21, 2023 •

edited

Loading

LilithHafner commented Oct 26, 2023

tecosaur commented Oct 26, 2023

LilithHafner left a comment

tecosaur commented Oct 26, 2023

tecosaur commented Oct 26, 2023

LilithHafner commented Oct 26, 2023

tecosaur commented Nov 9, 2023

tecosaur commented Dec 7, 2023

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

tecosaur commented Jan 31, 2024

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024 •

edited

Loading

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

tecosaur commented Feb 1, 2024

tecosaur commented Feb 1, 2024

Introduce AnnotatedIOBuffer #51807

Introduce AnnotatedIOBuffer #51807

Conversation

tecosaur commented Oct 21, 2023 • edited Loading

LilithHafner commented Oct 26, 2023

tecosaur commented Oct 26, 2023

LilithHafner left a comment

Choose a reason for hiding this comment

tecosaur commented Oct 26, 2023

tecosaur commented Oct 26, 2023

LilithHafner commented Oct 26, 2023

tecosaur commented Nov 9, 2023

tecosaur commented Dec 7, 2023

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

tecosaur commented Jan 31, 2024

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024 • edited Loading

tecosaur commented Jan 31, 2024

vtjnash commented Jan 31, 2024

tecosaur commented Jan 31, 2024

tecosaur commented Feb 1, 2024

tecosaur commented Feb 1, 2024

tecosaur commented Oct 21, 2023 •

edited

Loading

tecosaur commented Jan 31, 2024 •

edited

Loading