Post a summary of the flaky tests to the commit #45798
Conversation
Size Change: 0 B. Total Size: 1.32 MB.
I like the idea. Can you provide instructions for testing with the forked repo?
Sure. It's gonna be very tedious though, so bear with me if I miss any detail 😅.
const { test, expect } = require( '@wordpress/e2e-test-utils-playwright' );

test.describe( 'Flaky test', () => {
	test( 'should be flaky', async ( {}, testInfo ) => {
		// Fails on the first attempts and only passes once the test has been
		// retried more than once, so the runner reports it as flaky.
		expect( testInfo.retry ).toBeGreaterThan( 1 );
	} );
} );
This worked as advertised for me on a repo fork. I wonder if it would be better to add a comment to the PR rather than the commit when it is run against a PR, as the comment against the commit can be easily overlooked. E.g. below, the size change comment is much more obvious than the flakey test comment bubble against the commit:
True! I've thought about something similar. Since these comments are actually related to the commits rather than the PR, we would need to update the comment to include the commit hash or link to clarify that. I wonder if we should still keep the comment for older commits too; in that case we might want to keep posting to the commit. I guess posting to the commit is a good enough solution for now, and we can iterate if we find it helpful.
Yeh, adding a comment to the PR could be a follow-up.
@youknowriad Curious about your thoughts 🙈.
Sounds good to me. 👍 Makes me think we're starting to have more and more summaries: flaky tests, bundle size (potentially CodeSandbox later). A great UX would be a single "PR summary" comment with multiple collapsible details :)
Yeah, agree! FWIW, this PR doesn't post comments to PRs, yet. Would be nice if we could build a preview website for more advanced content too. For instance, I've been wanting to deploy the Playwright test report HTML somewhere for easier debuggability.
As a way forward maybe we should:
Agreed! Would love to get some reviews! I'll rebase it to resolve the conflicts later.
Force-pushed from 88c5940 to 5779016.
@youknowriad Seems like we have to remove/change the required checks if this is merged. 🙇
Pretty happy with this code-wise and the idea seems good. I left a few questions.
I couldn't quite get it working on my fork, I'm not sure if I did something wrong - talldan@32d5f92
Thanks for testing it out! It appears that the problem is that the "Report to GitHub" step only runs in the context of
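For illustration, a minimal sketch of the kind of context guard being described, assuming a custom Node-based action script (the hard-coded repository check and the messages are assumptions, not the actual workflow's code):

// Hypothetical sketch of a context guard in a custom action script.
// The repository slug check below is an assumption for illustration only.
const core = require( '@actions/core' );
const { context } = require( '@actions/github' );

const repoSlug = `${ context.repo.owner }/${ context.repo.repo }`;

if ( repoSlug !== 'WordPress/gutenberg' ) {
	// On a fork, the reporting step would bail out here.
	core.info( `Skipping flaky tests report for ${ repoSlug }.` );
	process.exit( 0 );
}

// ...otherwise, continue with the reporting steps.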
Force-pushed from 7311857 to f9cb3fd.
This works nicely. An example below (there were more flaky tests than I expected, looks like I didn't need to make a fake one 😄 ):
talldan@fd0e1a0#commitcomment-93272254
We'll have to figure out what to do about the required checks.
Yeah, I noticed the same thing too 😆. That's why we need this for more visible feedback 😅.
I think here is the doc for that, under step 7.
Nice work, @kevin940726! By the way, I'm trying to fix the font size picker related flaky tests in #46591.
This has already introduced an amplification of noise in my GitHub notifications. Given how flakey our tests are it seems like this risks making the flakiness worse, because now not only do those tests not pass, but they also leave comments we have to ignore or delete, or deal with every time we commit or rebase a branch. That is, not only do we have to re-run the tests but we also have to look past these comments, and spend the time navigating to the PR to realize that they were only a distraction. Would love it if instead of adding more noise we could either update an existing comment or simply mark the flakiness somewhere where it only interferes with the people working on the test-flakiness problem. Even with one comment per PR we're still talking about adding unnecessary comments and notifications on a large percentage of the contributions. I do understand the desire to tell people who aren't familiar with how unreliable our tests are that they shouldn't be bothered by the fact that the tests failed, but if we end up doing that by leaving twenty comments on their work we might just make the problem worse 😉
I received yet another notification for these comments while I was writing the above reply. It's worth observing that we don't know from the GitHub notification screen what is demanding our attention until we click through and navigate to the PR and (in some cases scroll down the page and) find the comment and realize it was reiterating that our test suite has failed us. That's frustrating.
I've been thinking about this a bit. The main reason we have flaky-test retries is to prevent the job from failing so that we don't have to care about these tests. If the goal is to actually make people care about these tests and do something about them, I think we should probably just go back to what we had before:
Feels to me like we tried that for years, and it didn't work. There were still lots of flaky tests. I think there were more people interested in fixing them, but only because it was such a bad situation and a terrible developer experience.
Not entirely true. The goal is to make these failures more visible, as opposed to hiding them in each issue, while also unblocking the contributors. So flaky tests won't block the PR but will just post a comment to the commit. This, however, is getting annoying because we currently have too many flaky tests. I don't expect it to post too often once our tests are more stable. I totally agree that the notification is a bit too much though. It'd be nice if we could prevent the comment from sending notifications to the commit authors, but that doesn't seem possible on GitHub. Would only posting/updating a single comment on PRs, as suggested above, sound like a good enough middle ground? In the meantime, we can continue trying to fix as many highly frequent flaky tests as possible.
Sure, but so far, it doesn't feel like the new situation is any better. At least, that's my impression.
What's the benefit of the comment? Maybe a clean revert would be most prudent to avoid leaving in parts that don't get updated. If anything we could add a comment at the top of the project README telling people not to worry about our poor tests. We already know which ones are flakey (at least hypothetically) so why not make those tests optional for a merge? That is, if the test suite fails, show the results, but don't block the PR? The script I wrote for collecting runtime measurements of our Workflows could be easily modified to rerun failed test suites. If we export the list of individual test cases that fail too, it could watch for new flakiness. We could even theoretically call out to that server and ask, "given these test results, are there failures that aren't known to be flakey right now?" and use the response to determine whether to clear or reject the PR. Any option seems better than repeatedly reminding every one of us, day after day, that our tests are unreliable 🙃
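A very rough sketch of what that "ask the server" check could look like (the endpoint, payload, and response shape are entirely hypothetical; no such service exists in the repository today):

// Hypothetical sketch only; the endpoint and response shape are invented.
async function hasNonFlakyFailures( failedTestNames ) {
	const response = await fetch( 'https://flaky-watcher.example.com/check', {
		method: 'POST',
		headers: { 'Content-Type': 'application/json' },
		body: JSON.stringify( { failed: failedTestNames } ),
	} );
	const { knownFlaky } = await response.json();

	// Reject the PR only if at least one failure is not a known flaky test.
	return failedTestNames.some( ( name ) => ! knownFlaky.includes( name ) );
}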
The motivation is mentioned in the "Why?" section in the PR description. The goal is to make flaky tests more visible during commits/reviews so that we get a chance to fix them. I agree though that the pings seem a bit too much at this stage. We can definitely revert this but I still think we need another way to solve this problem or we will just go back to introducing more flaky tests to the project.
The problem actually lies in the hypothetical part. We don't know if a test is flaky until we successfully retry the failed test. A less flaky one could pass a hundred times but fail the 101st time. We don't know if any given PR solves the flakiness without monitoring it for some time either. As for within a single run, we are already retrying failed tests, and if they pass, we don't block the PR.
This works but rerunning the whole workflow is time-consuming and costly. That's why I went for rerunning only the failed test cases.
I don't think I understand this part. Do we commit the list to the repo? Or store them elsewhere? One thing to consider is that PRs can be branched from any given point of time and a static list stored centrally could be easily out of sync.
FWIW, this system won't post comments if there are no flaky tests in the run. I don't expect this to be a daily reminder if we can improve the stability of our tests. (But still, the notification is kind of annoying 😅)
In my experience, which could be wrong based on my perception, we're not talking about 1 in 101 test runs that fails. It's more like 1 in 2 test runs, or 3 in 5.
Here I think we're talking about the same thing, but re-running the tests isn't really that time-consuming or costly from a scripting point of view. My script as-written monitors tests and every five minutes tries to rerun them. This is how I've been able to get hundreds of runs for a PR without even trying. We could modify my "without optimizations branch" approach and create what is essentially an empty branch that tracks
We have Workflows and test suites and individual test cases within those suites. We could potentially write out into an artifact the results of each test case (probably with some JSON export; see the sketch after this comment). This has actually been on my plans for the performance tests CI job, except it took a lower priority once it was no longer the slowest CI workflow.
In practice I'm worried this will always be the case. I've had maybe ten comments today through rebases while working on trying to understand why all performance tests suddenly stopped working. It's not the biggest problem, but it's the kind of frustration that feels incredibly annoying. It's possibly more prominent for me because I'm trying to fix the root problems behind these issues.
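To make the artifact idea a bit more concrete, Playwright can already emit per-test-case results as JSON through its built-in reporter; a minimal config sketch, with the output path being an assumption:

// playwright.config.js (sketch only; the output path is illustrative)
const { defineConfig } = require( '@playwright/test' );

module.exports = defineConfig( {
	// Write a machine-readable report that a CI step could upload as an
	// artifact and an external watcher could diff against known flaky tests.
	reporter: [ [ 'json', { outputFile: 'artifacts/test-results.json' } ] ],
} );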
@dmsnell I'm failing to understand some details here, and I think maybe we're both missing some context. Let me explain how the flaky tests system currently works so that we can be sure we're on the same page. 🙇
All of the above already worked before this PR. The motivation for retrying failed tests (see the sketch after this comment) is to help unblock the contributors so that they don't have to examine those flaky ones individually and manually rerun the workflow. However, this comes with the cost of hiding the flaky tests in each issue, rendering them less visible, and hence making our project flakier as PR authors won't catch that. What this PR does is add a fifth step to aggregate those flaky tests into a comment. We try to make it clear in the comment that the flaky tests probably aren't related to the PR itself, but the information is still there for examination. The problem now is that the notification is too noisy. Possibly only creating a single comment on the PR (but not the commit) is a better alternative. Hope that this explains the situation a little bit better. Any other solution is always welcome though. For instance, I've been wanting to build an external dashboard to host all this flaky test data elsewhere so that we can do advanced data visualization if possible.
I believe it's still a cost to our CI runner, as we don't have unlimited parallel run durations on GitHub.
Might be, but I still believe we can improve our tests' stability so much more. For instance, the recent Playwright migration tends to have more stable tests than the Puppeteer one (just a feeling, there's no proof yet 😅). Could you explain further, in steps, what you have in mind about the "scripting" approach and how it would work?
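As a concrete reference for the retry mechanism described above, a test counts as flaky when it fails and then passes within the configured retries; a minimal sketch of that setting (the count here is illustrative, not necessarily what Gutenberg uses):

// playwright.config.js (sketch; the retry count is illustrative)
const { defineConfig } = require( '@playwright/test' );

module.exports = defineConfig( {
	// A test that fails and then passes within these retries is reported as
	// "flaky" rather than "failed", which is what the reporter aggregates.
	retries: 2,
} );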
Probably! Just to affirm though, thanks again for working on the testing suites. It's a terrible frustration with the project and I'm glad people are trying to make it better.
For this PR maybe this is my biggest gripe. It feels like we're punishing the wrong people, and making noise where it isn't needed. E.g. if the tests are failing in my PR I'm already going to look at those tests and see if they seem related to my changes or not. As anyone who works in the repo should learn rather quickly, if a test fails, it's most likely caused by broken tests and not by broken code. A simple note in the README or on first contributors' PRs might be a way to better communicate this: it's not you, it's our infrastructure.
This seems like a better direction IMO for this problem than nagging people trying to contribute to the project. This is a framework-level problem, or an education problem, or both. I'd rather have a central place to review and work on fixing the broken tests (though I thought that was the point of the issues already, and if that's correct, then it's another reminder that our process is the problem since we haven't prioritized fixing the tests we know are unreliable).
I have a growing suspicion that what we saw early on was more due to the fact that there were fewer tests written in Playwright than in Puppeteer, and also that we're doing a lot more fixturing and mocking. This is a tradeoff: we write tests that won't fail as often, but in exchange give up testing the behaviors they are there to assert. As we continue to fill out those Playwright tests, let's see how the reliability holds. As we see more and more tests succeed when the app fails, because of that fixturing, we might end up unveiling that faux reliability.
Oh, it's probably not any different than what's already there. The idea was more like what you were talking about with a dashboard, and one that would reach out to re-run the failing flakey tests more often. I don't know if 2 or 3 times is enough to get past the flakiness typically, and I don't know - does our existing watcher track things around the repo or just on individual PRs? I am of the understanding that our flakey test watcher is not monitoring the performance tests, but when something happens as it did yesterday and suddenly all Performance Tests workflow runs fail, it would be nice to be alerted about that.
Concluding, and bringing this back to the first point, I don't follow this argument. We are not likely to catch the introduction of flakey tests in the PR that introduces them. This is because when they are introduced we can't know yet if they are flakey; it's likely that during development of a branch some tests will fail and then be resolved due to actual code problems and not due to infrastructure problems. However, it's only after merging those tests into the mainline that we will learn they are flakey, and at that point it's too late to catch. They aren't hidden in PRs by letting those PRs pass; they just aren't obstructing developers for the problems someone else (or their past self) introduced. If we have known flakey tests in tracking issues already, why do we think that telling random PR authors is going to help? Maybe it cools their anxiety, but only until they learn they should assume failed tests are just that - failed tests.
Thanks! I appreciate your feedback a lot too!
It's not what I experienced though. People are often confused by the broken tests and have no clue whether they're related to their PR or not. I've been asked a few times already. Seeing the big red cross on one's PR is somewhat discouraging, whether or not it's conscious. It also prevents the PR from being merged, so a maintainer has to jump in, double-check that it's unrelated to the PR, and re-run the job until it passes.
The comment is the note, IMO. Instead of claiming upfront that our tests are not stable, and risking people not trusting our tests or even avoiding contributing to them, we only leave a note when flaky tests are caught.
We do have the issues list as a central place to view all the flaky tests. They were reported silently, so we had to routinely go to the list to look for something to fix. This PR makes them more visible during PRs, so hopefully we can fix them faster. The Playwright migration is also targeting the flakiest ones to migrate first, so we are prioritizing it; it's just that there aren't many people working on it right now 😅.
We already have 355 test cases in Playwright, compared to 449 test cases in Puppeteer. From a quick (unverified) search, out of 216 total reported flaky tests, 174 are currently written with Puppeteer. Among the 39 open ones, 29 are in Puppeteer as well. This doesn't prove much though, as I didn't take flakiness into account. We don't do fixturing or mocking on stuff that is the focus of the test. We usually only do them for clearing/resetting state, which IMO is the best practice. For instance, we don't manually go to the posts page to delete posts between each test when we can just test that flow once and call the API everywhere else to do the same thing in a much faster and more reliable way. We had a discussion about this a while ago and we might bring it up again in a more formal manner via an RFC PR.
If it fails too often (more than 2 times in a row), then it's probably a sign that the test is too flaky to be considered valuable, and we should fix it or skip it ASAP. We track it whenever the e2e test job is run, that is, on any commit to trunk/release/wp branches and on PRs.
I'm not sure if it makes sense to monitor the performance test though. If something breaks the performance test then we should just try fixing it instead of letting it pass with warnings. That's what the check is for in the first place, isn't it?
I agree, this is a valid concern. However, without a dashboard, I think this is the best option that I can think of for now 😞. I should clarify though that the comment isn't actually for PR authors, but more for maintainers or experienced contributors who also happen to be PR authors. They are encouraged to review the flaky tests and keep monitoring them if needed. It should be a goal for us to keep improving the stability of our test cases and the comment is just a tool to help make it more visible. I opened #46785 to make it only post the comment on the PR but not on commits. LMK if that's better and we can merge and iterate from there.
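For reference, a hedged sketch of what "a single comment on the PR, updated in place" could look like via Octokit in an Actions context (the marker string and names are assumptions, not the code in #46785):

// Hypothetical sketch: create or update a single flaky tests summary comment
// on the PR. The marker string and names are assumptions for illustration.
const { getOctokit, context } = require( '@actions/github' );

async function upsertSummaryComment( token, prNumber, body ) {
	const octokit = getOctokit( token );
	const marker = '<!-- flaky-tests-report -->';

	// Look for a previous summary comment left by this workflow.
	const { data: comments } = await octokit.rest.issues.listComments( {
		...context.repo,
		issue_number: prNumber,
	} );
	const existing = comments.find( ( comment ) =>
		comment.body?.includes( marker )
	);

	if ( existing ) {
		// Update in place so the PR only ever carries one summary comment.
		await octokit.rest.issues.updateComment( {
			...context.repo,
			comment_id: existing.id,
			body: `${ marker }\n${ body }`,
		} );
	} else {
		await octokit.rest.issues.createComment( {
			...context.repo,
			issue_number: prNumber,
			body: `${ marker }\n${ body }`,
		} );
	}
}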
This makes sense, but I think the problem is fairly widespread and unrelated to the changes in any given PR. I don't have numbers on how many PRs experience flakey tests, but I'm curious if it's normal to have a PR that doesn't. If every PR that I propose gets this alarm then I don't understand how having the comment builds any more trust than a notice up-front. I'm reminded that many open-source projects ship with partially-failing test suites. The reality is that our tests don't warrant trust.
This is where I keep asking who the target audience is. On one hand we're talking about alerting feature developers that they aren't the reason the tests failed, but on the other we're talking about making flakey tests more visible to the infrastructure folks. How are we going to test if this leads to faster response? How does this make these tests more visible by scattering them across individual PRs given that they are already collected in one place? If we haven't seen people go to the issues list and pick up flakey tests, what leads us to believe they will scan through random PRs and look for a comment that may or may not be there, which will potentially show them tests they already came by?
That's the same kind of wishful thinking that I think is the reason that education/lists/comments won't help with our flakey tests. We are fully aware that we have flakey tests but we haven't prioritized fixing them. If perf tests have issues they will have to go through the same prioritization. Right now we have flakey tests in the perf test suite, we just don't monitor them. "Just fixing it" hasn't been working.
Not to my understanding. It's true that if the perf tests fail then the PR is rejected. I have suggested we lift this, because the point of those tests is to monitor the performance of the editor. The other E2E suites are there for asserting correct behaviors. If a perf test fails because of a flakey test, it doesn't mean anything other than that the test suites are broken. I suspect it's more likely that if an E2E test fails it's because of a real failure (compared to the perf tests), since those tests are intended to track potentially risky flows, whereas the perf tests kind of assume things work and never expect to fail because of a real flaw. At a minimum, what are your plans for measuring the impact of these changes? How have you proposed we know if this alert is achieving its goal(s)?
The goal is to minimize the number of flaky tests in the project. After the refactoring, Playwright tests tend to have fewer flaky tests, oftentimes zero, which is proof that it is possible. Tests have less value if they are flaky; we don't know whether a test is flaky because it's poorly written or because the functionality is actually broken sometimes. Note that keeping the tests stable is also an encouragement for contributors to write more tests, which eventually makes our project more stable. If we don't trust our tests, then nobody will, and we'd be better off not writing any tests at all.
This is unfortunately true, but I don't think we have better options. The comment will notify the reviewers, who are often maintainers of the project and should be actively monitoring the overall health of our tests.
I have been actively working on fixing many flaky tests, along with many other contributors. Fixing flaky tests is always a priority, just not that high compared to other important features. This is a maintenance job that we just have to keep doing. Perhaps making the report comment more visible will draw more people to help on this too.
I'm not familiar with the perf test, but I'm sure there are folks actively working on maintaining them, aren't there?
This system so far has been helpful for me and other folks to prioritize the flaky tests that we want to fix. I'd say that it's achieving its goal already. Such things might be difficult to measure, but as someone actively working on writing/fixing/migrating/refactoring/reviewing e2e tests, I'd say this is worth the effort. Of course, if anyone feels strongly that it's still too annoying for them, then we can always revert it as you suggested.
What?
This is an attempt to bring more visibility to flaky tests during reviews. It will post a comment on the commit which has flaky tests.
Why?
The original flaky tests reporter started with the idea of retrying failed tests to unblock contributors in PRs with flaky tests. However, a side-effect is that flaky tests tend to be overlooked and ignored by the author. This PR tries to fix that by posting the summary of flaky tests to the commit without blocking the PR authors.
How?
Tons of improvements are in this PR:
The report-flaky-tests job will be run after all e2e jobs are finished.
The report-flaky-tests job is skipped if there isn't any flaky tests report.
Testing Instructions
I'm not aware of an easy way to test this. It's unfortunately a problem with custom GitHub Actions. The best way I can think of is to fork the repo and test it there.
See below for detailed instructions on how to test it in your own fork: #45798 (comment).
Screenshots or screencast
Here's what it might look like in the commit:
Or you can view it here: kevin940726@b87c01c#commitcomment-90061333
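For readers curious what posting a commit comment like the one above involves, a hedged sketch using Octokit inside a GitHub Actions context (not the PR's actual implementation; the function name and parameters are illustrative):

// Hypothetical sketch: post the flaky tests summary as a commit comment.
const { getOctokit, context } = require( '@actions/github' );

async function postFlakyTestsSummary( token, summaryMarkdown ) {
	const octokit = getOctokit( token );

	// context.sha is the commit the workflow ran against.
	await octokit.rest.repos.createCommitComment( {
		...context.repo,
		commit_sha: context.sha,
		body: summaryMarkdown,
	} );
}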