[Proposal] Rain-cloud plot #21

Open
Generalized opened this issue Feb 3, 2021 · 4 comments

@Generalized

Generalized commented Feb 3, 2021

You might want to consider adding raincloud plots to your package. They are gaining popularity over plain boxplots.

A raincloud plot combines a boxplot, a density (half-violin) plot and the raw data. Additional elements, such as the mean, SD or CI, may be added for convenience (the set of components is not fixed). A raincloud plot:

  1. solves the boxplot's problem of hiding multiple modes
  2. shows quantiles immediately (density plots don't show them unless combined with boxplots)
  3. shows the abundance of data. This is especially important when dealing with discrete (numeric) data, like drug dosage. Such data are poorly represented by density plots and often cause the boxplot to degenerate (it can even collapse to a flat line). For discrete data, the jittered points (which then show false values) should be replaced by a dot-strip chart, which stacks the values.
  4. when the mean is added, lets one immediately see how close it is to the median and other quantiles.
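
The point about discrete data can be checked numerically. The following is only a minimal base R sketch on made-up dose values (the `dose` vector is hypothetical, not taken from any real study): it shows the three quartiles that define the box coinciding, jitter producing values that were never observed, and a simple count table keeping the true values, which is what a stacked dot-strip chart displays.

```r
# Hypothetical discrete dose data: most subjects receive 10 mg.
dose <- c(rep(5, 10), rep(10, 80), rep(20, 10))

# The quartiles that define the box all coincide, so the box collapses to a line.
box_stats <- quantile(dose, probs = c(0.25, 0.5, 0.75), names = FALSE)
box_collapsed <- length(unique(box_stats)) == 1

# Jittering moves points to values that were never observed.
set.seed(1)
jittered <- jitter(dose)
n_false_values <- sum(!jittered %in% dose)

# Stacking keeps the true values and shows how many observations share each one.
dose_counts <- table(dose)
print(dose_counts)
```

With data like these, a boxplot alone would show a single flat line at 10, while the counts make clear that 80 of the 100 observations sit on that value.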

Please find this article about them.
Allen, M., Poggiali, D., Whitaker, K., Marshall, T. R., & Kievit, R. A. (2019). Raincloud plots: a multi-platform tool for robust data visualization. Wellcome open research, 4, 63. https://doi.org/10.12688/wellcomeopenres.15191.1

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6480976/

They can be easily created with the gghalves package.
https://twitter.com/hashtag/gghalves

I also found this:
[image]

@smouksassi
Owner

Thanks. At some point I was thinking about adding geom_sina and/or geom_quasirandom / beeswarm plots.
I will get back to you on this.

@smouksassi
Owner

Hi, can you share a data set that I can use as an example? I am working on an update that will add violin plots plus beeswarm and quasirandom positions, while not exactly using gghalves.

@Generalized
Author

Generalized commented Feb 10, 2021

Please note that a raincloud plot without the boxplot is not very useful, as it doesn't show quantiles. Moreover, raw jittered points are inappropriate for discrete data, like drug doses; that is where they should be replaced by dotplots.

You can use any data set.

response_time <- structure(list(Response = c(6.94105811469065, 3.52954102812241, 
8.44123849702064, 1.33956771585083, 5.35245174852167, 6.96840053631194, 
5.80932514742261, 10.6676332595508, 6.4703965850493, 8.96559096953663, 
7.29088910567057, 8.89279024648916, 8.53320670461069, 10.0396624833181, 
10.6731295613674, 8.23116768429597, 7.31367431306988, 9.79697744434258, 
9.05764474172065, 8.38911345290211, 8.78448217494462, 5.47701069469298, 
12.9651271569658, 17.3643758033415, 10.9822666110311, 16.5248723381264, 
9.11871846447009, 8.81911644915104, 11.4846936979993, 15.8796034237571, 
11.1724991020126, 17.0166996490015, 17.8090763728022, 8.98110103311363, 
16.2310016797805, 10.7584047425627, 16.8519974365423, 10.7830742579277, 
8.57223775546169, 10.5136131754638, 26, 7.25895705669455, 15.8716527885678, 
5, 20.3271302169183, 20.6238575902315, 6.10825801834299, 9.65351686095677, 
12.0035569634042, 13.0072474833103, 9.13630494268584, 7.56278374218666, 
14.6326889238029, 13.8168485641287, 15.0211001568987, 17.6439425846732, 
10.8648004585573, 7.23267465991285, 14.7045460109011, 12.1178097768457, 
9.3939048013323, 8.73887609331093, 18.6916633359407, 19.1620134771203, 
17.161720602849, 15.2756319105639, 23.6514470794482, 20.829498095384, 
15.8716750997116, 19.2546467671376, 18.4642793101234, 15.7266378499761, 
13.2160775597625, 23.3556578745881, 18.4705738585698, 18.2246999167943, 
19.3063904844388, 15.5077891776988, 19.847234959044, 12.5013367858204, 
16.7838490291637, 17.0351784662767, 12.6351735676193, 19.555970889873, 
15.2186087278484, 15.4461873360938, 14.5927341866952, 8.75628018658348, 
21.0070544423412, 21.0468714753563, 16.3119484294647, 17.4544713510641, 
24.0246193105042, 14.858255653037, 18.4287161496008, 32.5834977535751, 
14.9982603313169, 17.4071430817929, 24.4915619932899, 30.4213013980958, 
11.6150149591386, 30.3103670591867, 14.1068208856221, 15.6931404291679, 
16.0467298123304, 20.6753550216723, 24.4681217927389, 28.073254211824, 
19.4261202864192, 14.162165669196, 15.6677231573154, 15.9263273051365, 
23.3545613039988, 14.0977537590875, 15.0741299881899, 15.1410388203248, 
25.7089954347319, 16.3175855537667, 33.3234217469181, 10.8383635689418, 
10, 20.3004344882188, 21.1889441976913, 23.5270859372803, 16.9926637524444, 
16.3673125773663, 19.8879139676685, 17.0365961999078, 17.4738746039858, 
23.8669465532871, 21.2866194202374, 19.8728082335943, 15.9463651297976, 
27.2613462289531, 13.9259422906782, 32.4296363195051, 20.1709169344812, 
26.3023825370941, 21.6284148146921, 24.1881675058874, 18.9083513201797, 
18.0040281816293, 16.6963783249489, 13.0802317108258, 14.8171742117, 
19.6366079971767, 30.1262996526095, 15.364596869443, 10.4120910601315, 
17.8583991262233, 23.3977300553816, 25.7490070104186, 18.3188041858568, 
13.0996243731083, 27.1311337628915, 35, 14.0007094736688, 25.1698370410706, 
17.8862748116895, 24.7391387833379, 29.6207045339983, 33.0846639777505, 
20.3588954295536, 20.1817408751066, 17.3069296808769, 22.4598813656082, 
22.6519987393166, 20.2589694360304, 22.313355114235, 33.0768379237977, 
24.466010111329, 12.3407240195374, 11.0840389330242, 18.9728288449833, 
21.0376266155683, 24.7796765796351, 14.4985337961131, 20.0141652418275, 
14.4364201855426, 26.0543430946129, 31.971581641439, 23.0852154603889, 
31.3868440646259, 26.0204007443285, 30.2301497188274, 32.6437081723286, 
25.8207198631355, 26.7404894722345, 22.9398979865188, 28.7345185151684, 
32.8052946081318, 30.9254814916613, 30.0685079560224, 28.2398995038118, 
20.9223394234401, 29.0794197570588, 25.9940854201729, 34.9663757209427, 
23.2142505818242, 20.1649636667507, 33.458742155392, 32.9573575114733, 
27.0177145894676, 26.3483214837122, 29.9573432316566, 25.1805206422707, 
31.3785304208239, 23.1004057136737, 24.5841866103209, 27.9632773516798, 
29.5659571282678, 27.4962997013747, 25.1509446013167, 33.1824273819824, 
31.726115405194, 20.3368944717519, 24.5048401139325, 31.1132171504031, 
30.8434597859799, 24.4445697751097, 36.4890236006738, 24.8865650842603, 
24.831490418638, 22.1518299110539, 34.1988748755963, 26.098159000857, 
24.3745325667883, 28.9122344140853, 27.9592580047094, 27.9729153174235, 
18.655587903387, 27.5099979772053, 17.2543646063917, 20.8143214074563, 
29.8257479032749, 20.6436651846391, 28.5796842365042, 24.0488627701133, 
29.9938779192839, 26.9545963016118, 26.7691480273568, 25.9575545309811, 
28.7784160260746, 33.0562283874423, 20.1249671133739, 26.0488006150761, 
25.5249229520778, 25.557374437868, 38.5506857278032, 14.6245651702246, 
20.6477161536508, 23.2546443896406, 24.5410657774548, 27.5670165860902, 
25.020450910562, 26.184976749465, 17.7027440816755, 29.4720256261613, 
27.7463993379869, 16.6823782275765, 27.2623862475271, 29.9162289782125, 
24.6230908256822, 25.051587514259, 21.7036873970245, 20.8327775868237, 
22.4649011053264, 39.138088187993, 25.241009218782, 14.1717336690299, 
27.4556719306339, 21.8169750222832, 24.7249630012707, 25.352571618143, 
25.9589907309322, 22.6384740307843, 23.9722441017597, 22.2981075975582, 
20.9656179896212, 22.269842301202, 25.3948598358782, 21.3140834740566, 
26.2850646554798, 20.1284757399584, 24.6004212159253, 23.0778403396023, 
23.7363437008574, 23.0054988921272, 23.2655081777856, 22.8667435992643, 
23.1595044935062, 21.5752755887176, 24.6152959403681, 20, 23.99254610094, 
24.239042091276, 22.636376838146, 24.1625943996995, 22.4053636682921, 
21.2484914584576, 23.3766943121649, 23.8499404047538, 25.0430153527484, 
21.2349576668748, 24.7583102619741, 23.9146805134304, 22.1580451724388, 
24.6992522332068, 23.1513390752852, 24.0968918824751, 39.2078472273421, 
28.8099244717235, 25.013066441343, 35.6252553150798, 37.273312264522, 
33.4551398435178, 35.0463509844323, 35.2626177127978, 35.6183283780144, 
31.8746870536212, 32.4090731891806, 36.8812074128533, 42.0164461155037, 
33.3153297179227, 41.2992014861771, 32.1173999878783, 34.3246310895026, 
28.0934886194896, 27.914798637274, 41.0364553386196, 40.6482499190929, 
46, 35.15180454877, 28.5260896219149, 45.7277923594022, 33.5935615549013, 
32.2014352497874, 37.8412623280909, 35.9583647374357, 27.0580872615683, 
34.9281191403959, 31.7919241447276, 33.9975057119126, 35.6231608435989, 
36.0725395917844, 29.6051066029916, 33.4407577119333, 34.2989264656375, 
35.3672037236702, 33.3188621613895, 32.7105148269777, 23.0490423732541, 
38.0616893913955, 35.3141221089852, 38.5536838132142, 42.2873709419412, 
33.7095280684755, 30.5611333215595, 38.0549074046721, 35.8479887504421
), Timepoint = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("Baseline", 
"Visit 1", "Visit 2", "Visit 3"), class = "factor"), GroupA = c("A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "A", 
"A", "A", "B", "A", "A", "B", "A", "A", "A", "A", "A", "A", "A", 
"B", "B", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", 
"A", "A", "A", "B", "A", "A", "B", "A", "B", "A", "A", "B", "B", 
"A", "B", "A", "A", "A", "B", "A", "A", "A", "A", "A", "A", "B", 
"B", "B", "B", "A", "A", "A", "B", "A", "B", "B", "B", "A", "A", 
"A", "B", "A", "A", "A", "A", "A", "B", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "A", "A", 
"A", "B", "A", "A", "B", "A", "A", "A", "A", "A", "A", "A", "B", 
"B", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "A", 
"A", "A", "B", "A", "A", "B", "A", "B", "A", "A", "B", "B", "A", 
"B", "A", "A", "A", "B", "A", "A", "A", "A", "A", "A", "B", "B", 
"B", "B", "A", "A", "A", "B", "A", "B", "B", "B", "A", "A", "A", 
"B", "A", "A", "A", "A", "A", "B", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "A", "A", "A", 
"B", "A", "A", "B", "A", "A", "A", "A", "A", "A", "A", "B", "B", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "A", "A", 
"A", "B", "A", "A", "B", "A", "B", "A", "A", "B", "B", "A", "B", 
"A", "A", "A", "B", "A", "A", "A", "A", "A", "A", "B", "B", "B", 
"B", "A", "A", "A", "B", "A", "B", "B", "B", "A", "A", "A", "B", 
"A", "A", "A", "A", "A", "B", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "B", "A", "A", "A", "B", 
"A", "A", "B", "A", "A", "A", "A", "A", "A", "A", "B", "B", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "A", "A", "A", 
"B", "A", "A", "B", "A", "B", "A", "A", "B", "B", "A", "B", "A", 
"A", "A", "B", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", 
"A", "A", "A", "B", "A", "B", "B", "B", "A", "A", "A", "B", "A", 
"A", "A", "A", "A", "B", "A", "A", "A"), GroupB = c("X", "X", 
"Y", "Y", "Y", "Y", "X", "X", "Y", "X", "X", "Y", "X", "X", "X", 
"X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y", "Y", "X", "X", 
"Y", "X", "Y", "Y", "X", "X", "Y", "X", "X", "Y", "Y", "Y", "X", 
"Y", "X", "Y", "Y", "Y", "X", "X", "Y", "Y", "Y", "X", "X", "Y", 
"Y", "X", "X", "X", "X", "X", "Y", "Y", "Y", "X", "X", "X", "X", 
"Y", "X", "X", "Y", "Y", "Y", "Y", "Y", "X", "Y", "X", "Y", "X", 
"X", "X", "Y", "X", "X", "X", "X", "X", "X", "X", "X", "X", "Y", 
"Y", "Y", "Y", "X", "X", "Y", "X", "X", "Y", "X", "X", "X", "X", 
"X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y", "Y", "X", "X", "Y", 
"X", "Y", "Y", "X", "X", "Y", "X", "X", "Y", "Y", "Y", "X", "Y", 
"X", "Y", "Y", "Y", "X", "X", "Y", "Y", "Y", "X", "X", "Y", "Y", 
"X", "X", "X", "X", "X", "Y", "Y", "Y", "X", "X", "X", "X", "Y", 
"X", "X", "Y", "Y", "Y", "Y", "Y", "X", "Y", "X", "Y", "X", "X", 
"X", "Y", "X", "X", "X", "X", "X", "X", "X", "X", "X", "Y", "Y", 
"Y", "Y", "X", "X", "Y", "X", "X", "Y", "X", "X", "X", "X", "X", 
"X", "X", "X", "Y", "Y", "Y", "Y", "Y", "Y", "X", "X", "Y", "X", 
"Y", "Y", "X", "X", "Y", "X", "X", "Y", "Y", "Y", "X", "Y", "X", 
"Y", "Y", "Y", "X", "X", "Y", "Y", "Y", "X", "X", "Y", "Y", "X", 
"X", "X", "X", "X", "Y", "Y", "Y", "X", "X", "X", "X", "Y", "X", 
"X", "Y", "Y", "Y", "Y", "Y", "X", "Y", "X", "Y", "X", "X", "X", 
"Y", "X", "X", "X", "X", "X", "X", "X", "X", "X", "Y", "Y", "Y", 
"Y", "X", "X", "Y", "X", "X", "Y", "X", "X", "X", "X", "X", "X", 
"X", "X", "Y", "Y", "Y", "Y", "Y", "Y", "X", "X", "Y", "X", "Y", 
"Y", "X", "X", "Y", "X", "X", "Y", "Y", "Y", "X", "Y", "X", "Y", 
"Y", "Y", "X", "X", "Y", "Y", "Y", "X", "X", "Y", "Y", "X", "X", 
"X", "X", "X", "Y", "Y", "Y", "X", "X", "X", "X", "Y", "X", "X", 
"Y", "Y", "Y", "Y", "Y", "X", "Y", "X", "Y", "X", "X", "X", "Y", 
"X", "X", "X", "X", "X", "X", "X")), row.names = c(NA, -360L), class = "data.frame")
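# Packages used by the examples below; fun.data = "mean_sdl" additionally needs the Hmisc package installed
library(ggplot2)
library(gghalves)
library(dplyr)  # provides %>% and filter()

# complete.cases() isn't necessary here, but it is good practice to add it in case missing data occur
ggplot(data = response_time %>% filter(complete.cases(.)),
       aes(x = Timepoint, y = Response)) + theme_bw() +
    # raw data on the left half
    geom_half_point(color = "orange", side = "l", size = 1, alpha = 1) +
    # half boxplot on the left, outliers hidden because the raw points already show them
    geom_half_boxplot(color = "black", fill = "white", side = "l", width = 0.7, alpha = 0.2, errorbar.length = 0.8, nudge = 0.1, outlier.colour = NA) +
    # density on the right half
    geom_half_violin(side = "r", color = "slateblue", fill = "slateblue", trim = FALSE) +
    ggtitle("Response over time", subtitle = "across A and B") +
    xlab("Timepoint") + ylab("Response") +
    facet_grid(GroupA ~ GroupB) +
    # mult = 0 shrinks the SD band to zero, so the crossbar marks only the mean
    stat_summary(aes(y = Response), fun.data = "mean_sdl", fun.args = list(mult = 0), geom = "crossbar", color = "red", width = .2)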

[image]

and

ggplot(response_time, aes(x=Timepoint, y=Response)) +
    geom_half_boxplot(center = FALSE, errorbar.draw = FALSE,
                      width=0.5, nudge=0.1, outlier.colour = NA) +
    geom_half_violin(side="r", nudge=0, trim=FALSE) +
    geom_half_dotplot(dotsize=0.5, alpha=1, fill="red", color="red",
                      position=position_nudge(x=0.02, y=0), stackratio = .5) +
    facet_grid(GroupA~GroupB) +
    stat_summary(aes(y=Response), fun.data="mean_sdl", fun.args = list(mult=0), geom="crossbar", color = "blue", width=.1) +
    ggtitle("Response over time", subtitle = "across A and B") + 
    theme_bw()

[image]

or

ggplot(response_time, aes(x=Timepoint, y=Response)) +
    geom_half_violin(side = "r", color="grey", fill="grey", trim = FALSE) + 
    geom_half_dotplot(dotsize=1, alpha=1, fill="darkgrey", color="darkgrey",
                      position=position_nudge(x=-0.1, y=0), stackratio = .5, stackdir = "down") +
    facet_grid(GroupA~GroupB) +
    stat_summary(aes(y=Response), 
                 fun.data="mean_sdl", 
                 fun.args = list(mult=0), 
                 geom="crossbar",
                 color = "black", 
                 width=.2) +
    ggtitle("Response over time", subtitle = "across A and B") + 
    theme_bw()

[image]
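
The crossbar in the plots above marks the group mean, so its distance from the box's median line can be read off visually. As a rough numeric companion, here is a base R sketch that computes the same quantities per group. The `demo` data frame and the `summarise_groups()` helper are hypothetical names introduced only for illustration; they are not part of gghalves or of the package discussed in this issue.

```r
# Hypothetical stand-in data; any data frame with a numeric column and a
# grouping column (such as Response and Timepoint above) would work.
set.seed(42)
demo <- data.frame(
  Timepoint = factor(rep(c("Baseline", "Visit 1"), each = 50)),
  Response  = c(rnorm(50, mean = 10, sd = 3), rexp(50, rate = 1 / 15))
)

# Mean, median and quartiles per group, plus the mean-median gap.
summarise_groups <- function(values, groups) {
  stats <- lapply(split(values, groups), function(v) {
    v <- v[!is.na(v)]
    q <- quantile(v, probs = c(0.25, 0.5, 0.75), names = FALSE)
    c(n = length(v), mean = mean(v), q1 = q[1], median = q[2], q3 = q[3])
  })
  out <- as.data.frame(do.call(rbind, stats))
  out$mean_minus_median <- out$mean - out$median
  out
}

group_summary <- summarise_groups(demo$Response, demo$Timepoint)
print(group_summary)
```

Assuming the data set above is loaded, `summarise_groups(response_time$Response, response_time$Timepoint)` should give the per-visit figures; a large `mean_minus_median` hints at skewness that the density half of the plot would also show.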

@smouksassi
Owner

Thanks a lot for these informative examples. For now I have added a violin plot and additional positions; examples are attached.
The standard violin plot in ggplot can show quantiles based on the ecdf (or use a boxplot for sample quantiles). I also added points with quasirandom jittering.

I also added a separate point for the mean and a label for the N.
By pasting GroupA and GroupB together we can also compare the four combinations head to head.
Half-half is elegant, but I will have to make sure it is compatible with plotly and with the auto-rotation of geoms; otherwise it will break a lot of things.
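
Pasting the two grouping columns and deriving the N labels could look roughly like the base R sketch below. It is an illustration on a hypothetical `toy` data frame that only reuses the column names from the example data set; it is not the code that was added to the package.

```r
# Hypothetical toy data using the same column names as the example data set.
toy <- data.frame(
  GroupA   = c("A", "A", "B", "B", "A", "B"),
  GroupB   = c("X", "Y", "X", "Y", "X", "Y"),
  Response = c(6.9, 3.5, 8.4, 1.3, 5.4, 7.0)
)

# Paste the two grouping columns into a single factor with up to four levels.
toy$Group <- factor(paste(toy$GroupA, toy$GroupB, sep = " / "))

# N per combination, formatted as text that could be drawn as a label.
n_per_group <- table(toy$Group)
n_labels <- paste0("N = ", n_per_group)
names(n_labels) <- names(n_per_group)
print(n_labels)
```

Mapping `Group` to the x axis or to colour then puts all four combinations in one panel instead of a 2 x 2 facet grid.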

pic2
pic1
