-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Verify valid write of statefile to disk #108
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
cc @gdalle |
I also wrote a notebook to explore the Nov statefile a bit: ### A Pluto.jl notebook ###
# v0.19.22
using Markdown
using InteractiveUtils
# ╔═╡ 4e3146c2-280a-4823-8672-3f231bdf2edd
import Pkg
# ╔═╡ 9d1cfd50-4f53-4ce6-a10c-7282afc92055
Pkg.activate()
# ╔═╡ ac54af6d-e9c4-4c26-a5f1-28c1cee68ffa
import Pluto
# ╔═╡ c5843f00-c8c8-11ed-1fae-a59cf7067654
filename = "~/Downloads/broken statefile huh.plutostate" |> expanduser
# ╔═╡ 592df111-aa07-449d-912a-421820a8428b
data = read(filename)
# ╔═╡ f32cb028-544e-43c5-a114-024e8c48f9d4
d = Pluto.unpack(data)
# ╔═╡ 9d20a49e-a8fb-4178-84cd-c9c91e975d35
rk = keys(d["cell_results"])
# ╔═╡ 9b913b7f-97be-4f37-9c3a-31aef4c30eee
Dict(
k => keys(d["cell_results"][k])
for k in keys(d["cell_results"])
)
# ╔═╡ 54cc6550-5139-49e9-9268-bc0fc1a5c1e6
setdiff(rk, d["cell_order"])
# ╔═╡ 5e8b7a2b-becf-4bd4-a634-cd3745659f96
setdiff(d["cell_order"],rk)
# ╔═╡ ebcaa53e-a07a-45df-8890-bbe9fc7a7844
Set(d["cell_order"])
# ╔═╡ b03191f1-4392-4d41-9e49-fdbb911a61a0
broken_cell_id = "29612df6-203f-42bc-b53b-86af618d60ec"
# ╔═╡ 43746f31-a903-4783-9ea3-f1aa323db1a0
d["cell_results"]["4115f7cb-d45f-4cc2-86bf-316c074393f8"]
# ╔═╡ f9c47abd-e78e-412f-b709-d36bd26188b7
d["cell_results"][broken_cell_id]
# ╔═╡ c6c7926f-c217-4d02-a84a-ea9bf6e27f0b
# String(copy(d["cell_results"][broken_cell_id]["output"]["body"])) |> Text
# ╔═╡ 8fcc29f1-52e3-489a-a746-e28a8a9f402c
rawd = copy(data)
# ╔═╡ a56e9ddc-e563-4dca-8505-4b72e56273e5
raws = String(copy(data))
# ╔═╡ 9df36e7c-fdbb-46db-a584-699f8b970262
239343 + 136808
# ╔═╡ 58f37eb2-2327-4079-a332-f74be278a889
raws[376100:378054]
# ╔═╡ 905cbd87-e879-4c4a-aee1-c525604b3213
"""
url(#clip752)\" style=\"stroke:#0000ff; stroke-linecap:round; stroke-linejoin:round; stroke-width:10; stroke-opacity:0.7; fill:none\" points=\"\n 2352.76,1179.34 2352.76,1179.34 \n \"/>\n</svg>\n
\xb0
persist_js_state¤mime\xadimage/svg+xml\xb2last_run_timestamp\xcbA\xd8\xd8N\xe2\x8e\b-\xb7has_pluto_hook_features¬rootassignee
\xc0
\xa7
cell_id
\xd9
\$
29612df6-203f-42bc-b53b-86af618d60ec
\x88\xa4line\xff\xa3msg\x92\xd9WIndices Base.OneTo(4545) of attribute `markercolor`. Legend entries\nmay be suppressed by passing an empty label.\nFor example,\n plot([1:2,1:3], [[4,5],[3,4,5]], label=[\"y\" \"\"], markercolor=[1 2])\n\xaatext/plain\xa7cell_id\xd9\$2"
"""
# ╔═╡ 5624ae35-db37-45bb-b17e-2badba0716c9
cell_id_ranges = findall(broken_cell_id, raws)
# ╔═╡ 76d76270-0701-4e06-813e-f42f38e497ad
findall("queued", raws)
# ╔═╡ dffef0df-922d-4b10-92b9-c627a7e899af
[
raws[first(r)-12:end]
for r in cell_id_ranges
]
# ╔═╡ 6eb236c2-ee15-4c1f-8a62-38a25f213ec5
Pluto.pack(Dict("running" => false, "aesdf" => [1,2,3])) .|> Char
# ╔═╡ b75e6651-89a9-4fbf-94ba-964ed567059a
bitstring(0xa6)
# ╔═╡ 58ee494a-34c3-46f5-bdfb-33dafe44851b
bitstring(0x82)
# ╔═╡ 2b55dbc2-2d49-4003-a04a-9288051a9645
"29612df6-203f-42bc-b53b-86af618d60ec\x88\xa4line\xff\xa3msg\x92\xd9WIndices Base.OneTo(4545) of attribute `markercolor` does not match data indices 1:4544.\xaatext/plain\xa7cell_id\xd9
# ╔═╡ 5513c11c-cae4-4da0-9064-02ac61b1792c
problem = raws[136773:end]
# ╔═╡ fa741ae3-f265-4709-ac68-7db8681fd912
data[1466+length(broken_cell_id):end]
# ╔═╡ 615a398e-3d63-4e9c-9184-db4b2cdf82a9
# ╠═╡ disabled = true
#=╠═╡
Text(problem)
╠═╡ =#
# ╔═╡ a5d6a38f-3793-40e5-a1e6-4e53a20a3c3b
"""
29612df6-203f-42bc-b53b-86af618d60ec
\xb2
upstream_cells_map
\x82
\xae
time_evolution
\x91
\xd9
\$
b48e55b7-4b56-41aa-9796-674d04adf5df
\xa2
p0
\x91
\xd9
\$
6b298184-32c6-412d-a900-b113d6bd3d53
\xd9
\$
b948830f-ead1-4f36-a237-c998f2f7deb8
\x84
\xb4
precedence_heuristic
\t
\xa7
cell_id
\xd9
\$
b948830f-ead1-4f36-a237-c998f2f7deb8
\xb4
downstream_cells_map
\x80
\xb2
upstream_cells_map
\x81
\xa8
getindex
\x90\xd9\$0b26efab-4e93-4d53-9c4d-faea68d12174\x84\xb4precedence_heuristic\t\xa7cell_id\xd9\$0b26efab-4e93-4d53-9c4d-faea68d12174\xb4downstream_cells_map\x81\xb1initial_condition\x91\xd9\$6b298184-32c6-412d-a900-b113d6bd3d53\xb2upstream_cells_map\x81\xa5zeros\x90\xd9\$98994cb9-"
# ╔═╡ Cell order:
# ╠═4e3146c2-280a-4823-8672-3f231bdf2edd
# ╠═9d1cfd50-4f53-4ce6-a10c-7282afc92055
# ╠═ac54af6d-e9c4-4c26-a5f1-28c1cee68ffa
# ╠═c5843f00-c8c8-11ed-1fae-a59cf7067654
# ╠═592df111-aa07-449d-912a-421820a8428b
# ╠═f32cb028-544e-43c5-a114-024e8c48f9d4
# ╠═9d20a49e-a8fb-4178-84cd-c9c91e975d35
# ╠═9b913b7f-97be-4f37-9c3a-31aef4c30eee
# ╠═54cc6550-5139-49e9-9268-bc0fc1a5c1e6
# ╠═5e8b7a2b-becf-4bd4-a634-cd3745659f96
# ╠═ebcaa53e-a07a-45df-8890-bbe9fc7a7844
# ╠═b03191f1-4392-4d41-9e49-fdbb911a61a0
# ╠═43746f31-a903-4783-9ea3-f1aa323db1a0
# ╠═f9c47abd-e78e-412f-b709-d36bd26188b7
# ╠═c6c7926f-c217-4d02-a84a-ea9bf6e27f0b
# ╠═8fcc29f1-52e3-489a-a746-e28a8a9f402c
# ╠═a56e9ddc-e563-4dca-8505-4b72e56273e5
# ╠═9df36e7c-fdbb-46db-a584-699f8b970262
# ╠═58f37eb2-2327-4079-a332-f74be278a889
# ╠═905cbd87-e879-4c4a-aee1-c525604b3213
# ╠═5624ae35-db37-45bb-b17e-2badba0716c9
# ╠═76d76270-0701-4e06-813e-f42f38e497ad
# ╠═dffef0df-922d-4b10-92b9-c627a7e899af
# ╠═6eb236c2-ee15-4c1f-8a62-38a25f213ec5
# ╠═b75e6651-89a9-4fbf-94ba-964ed567059a
# ╠═58ee494a-34c3-46f5-bdfb-33dafe44851b
# ╠═2b55dbc2-2d49-4003-a04a-9288051a9645
# ╠═5513c11c-cae4-4da0-9064-02ac61b1792c
# ╠═fa741ae3-f265-4709-ac68-7db8681fd912
# ╠═615a398e-3d63-4e9c-9184-db4b2cdf82a9
# ╠═a5d6a38f-3793-40e5-a1e6-4e53a20a3c3b |
One more, generated with an old version of PlutoSliderServer (before the fix) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I have seen two instances where the statefile written to disk is glitched/corrupt: it is still readable by MsgPack, but the structure is totally wrong.
Reports
Both reports happened on PlutoSliderServer generating a static HTML site on a GHA runner. Both had
Export_baked_state = false
, i.e. the statefile was a separate.plutostate
file on disk, not embedded as base64 in the HTML file.The first report was on 1 Nov 2022:
(This is supposed to be structured like this: https://github.com/fonsp/Pluto.jl/blob/v0.19.22-src/src/webserver/Dynamic.jl#L100 )
Luckily enough I stored the original statefile!
broken statefile huh.plutostate.zip
Second report was recently on slack. we don't have the statefile anymore, only a screenshot that @pankgeorg took in the chrome devtools, the notebook file and the GHA logs
Notebook file: https://github.com/IdePHICS/EPFLDoctoralLecture2023/blob/606db3cf62f8ee34662d0d1ba2c930f4b8a35073/Resources/Week2/rfim.jl (might have been one of the commits before)
Action run: https://github.com/IdePHICS/EPFLDoctoralLecture2023/actions/runs/4468560473/jobs/7849468519 (nothing out of the ordinary)
logs_71.zip
Observation
You can see that something went wrong while writing
["cell_results"]["<cell id here>"]
. For the cell where it went wrong (29612df6-203f-42bc-b53b-86af618d60ec
in the Nov case), only part of that Dict is available, and the rest is spread out over parent structures. You see that keys and values are being swapped, and that children of "cell_results" became top-level KV pairs.Possible causes
notebook_to_js
cannot return aDict
like this (that would be such a serious Julia bug that it would have been found before). So something must have gone wrong in one of:I find a mistake during
Pluto.pack
unlikely, because this would also have led to bugs in regular Pluto use (with websockets or the static HTML with baked state). I also trust that https://github.com/JuliaIO/MsgPack.jl is well-implemented, also because the spec of MsgPack is not too complex. Still, some possibilities:A mistake in the GHA system is unlikely (lots of people use it), and i only have reports of the
.plutostate
files being corrupted.So I suspect that something went wrong while writing the file (which is something that I know nothing about). This is especially likely because we called
Pluto.pack
directly on the file handler IO:I have later learned that this is not a good idea for stability (e.g. fonsp/Pluto.jl#976 because the program can get interrupted halfway during a write), and performance (because every single write(io, ...) call has overhead).
There might also be a bug in MsgPack where using a file handler as the
io
can lead to incorrect results. This might not have surfaced before if applications and the MsgPack tests mostly use memory buffers as IO.So in this PR, we switch this to first
Pluto.pack
ing to memory, and then writing thatVector{UInt8}
to the file:I suspect that this already fixed the bug. But to also eliminate a problem where
write
just writes the wrong contents, I added aread
to verify the contents.