Should xlim be in coord_cartesian mode? #4

Closed
MarcinKosinski opened this issue Feb 14, 2016 · 5 comments

@MarcinKosinski
Contributor

@kassambara Thanks for your great work in creating survminer.
I have one suggestion about the xlim parameter.
As I understand it, more or less, when a user specifies xlim, it is added to the plot and table parts like this:

resul$plot <- resul$plot + scale_x_continuous()
resul$table <- resul$table + scale_x_continuous()

I assume the only situation in which a user would want to cut the axis is when there is a specific time point after which only a few patients remain, as in the example below:

library(RTCGA) # survivalTCGA is github only function
library(RTCGA.clinical)
library(survminer)
library(survival)


survivalTCGA(BRCA.clinical, OV.clinical, 
             extract.cols = "admin.disease_code") -> BRCAOV.survInfo

fit <- survfit(Surv(times, patient.vital_status)~admin.disease_code,
               data = BRCAOV.survInfo)


# Customize the output and then print
#++++++++++++++++++++++++++++++++++++
ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
                  risk.table = TRUE)

[Image: surv_ex_1]

In such a situation one might not be interested in times beyond 2000 days, so one would like to set limits on the x axis and would (erroneously) use xlim like this:

ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
           risk.table = TRUE, xlim = c(0,200))

[Image: surv_ex_2]

A careful R user might notice the warnings:

Warning messages:
1: Removed 1005 rows containing missing values (geom_path). 
2: Removed 710 rows containing missing values (geom_point). 
3: Removed 1 rows containing missing values (geom_text). 
4: Removed 6 rows containing missing values (geom_text).

and might also notice that the graph does not correspond to the previous one.

What has happened is this: first the rows with higher survival times were removed, and then the survival curves were calculated. That is not what the user wants; one would still like to get the same survival estimates. So I think coord_cartesian is the answer to this problem:

ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
           risk.table = TRUE) -> res

res$plot <- res$plot + coord_cartesian(xlim = c(0,2000))
res$table <- res$table + coord_cartesian(xlim = c(0,2000))
res

[Image: surv_ex_3]

This has the same survival curves as the first graph, but it looks like the risk table is broken.
Could you please look into this issue and change xlim to use coord_cartesian with a proper risk.table? I was unable to fix your code so that the risk.table would look elegant after coord_cartesian.

This is an issue I have also had when working with the survMisc package, which was suddenly removed from CRAN 3 weeks ago.
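
The difference between the two approaches discussed in this comment can be sketched on a toy data set. This is only an illustration with plain ggplot2, not survminer's code: the `toy` data frame and the variable names are made up, and the curve is a simple scatter rather than a real survival estimate. It compares how many x values survive to the drawing stage when limits are set on the scale versus on the coordinate system.

```r
library(ggplot2)

# Toy data (illustrative only): 10 time points with a decreasing value.
toy <- data.frame(time = seq(0, 900, by = 100),
                  surv = seq(1, 0.1, length.out = 10))

base_plot <- ggplot(toy, aes(time, surv)) + geom_point()

# Limits on the scale: values outside the limits are turned into NA
# before the layer is drawn (this is what triggers "Removed ... rows").
p_scale <- base_plot + scale_x_continuous(limits = c(0, 500))

# Limits on the coordinate system: all rows are kept, only the visible
# window changes (a zoom).
p_coord <- base_plot + coord_cartesian(xlim = c(0, 500))

# Count the non-missing x values each layer would actually draw.
n_scale <- sum(!is.na(ggplot_build(p_scale)$data[[1]]$x))
n_coord <- sum(!is.na(ggplot_build(p_coord)$data[[1]]$x))

cat("scale limits keep:", n_scale, "of", nrow(toy), "points\n")
cat("coord limits keep:", n_coord, "of", nrow(toy), "points\n")
```

With coord_cartesian every point is still part of the plot data, which is why the curves match the unrestricted plot.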

@kassambara
Owner

I changed xlim to Cartesian Coordinates mode, which is the most common type of coordinate system. It will zoom the plot, without clipping the data.
Thank you for pointing out this issue!

By default, ggsurvplot adds the exact p-value when pval = TRUE.
Do you think it would be better to use p < 0.0001 (when the p-value is < 0.0001) instead of, for example, 2.2e-50?

@MarcinKosinski
Contributor Author

I've been to a few oncology conferences, and people there preferred p < 0.0001
(when the p-value is < 0.0001) over 2.2e-50.
I think the scientific notation 2.2e-50 is uncomfortable even for me,
even though I am a statistician :)


@kassambara
Owner

Done!
p < 0.0001 is now used (when pvalue < 0.0001)
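
The labelling rule agreed on here could be sketched roughly as follows. This is not survminer's actual implementation; `format_pval_label` and its `threshold` argument are hypothetical names used only for illustration, and rounding to two significant digits is an assumption.

```r
# Hypothetical helper (not part of survminer): build a p-value label that
# switches to "p < 0.0001" below a threshold instead of printing 2.2e-50.
format_pval_label <- function(p, threshold = 1e-4) {
  ifelse(p < threshold,
         paste0("p < ", format(threshold, scientific = FALSE)),
         paste("p =", signif(p, 2)))
}

format_pval_label(2.2e-50)  # "p < 0.0001"
format_pval_label(0.0312)   # "p = 0.031"
```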

@MarcinKosinski
Contributor Author

Thanks. Good idea.
I've seen a few typos in NEWS.md.
You can fix them with this PR: #5

kassambara pushed a commit that referenced this issue May 25, 2016
MarcinKosinski referenced this issue in BioinformaticsFMRP/TCGAbiolinks Dec 17, 2016
kassambara pushed a commit that referenced this issue Feb 2, 2017
@Abhiroop-2924

Can we plot a log cumulative hazard plot? We do have the 'cumhaz' function for cumulative hazard. If both the x and y axes of the cumhaz plot are on a log scale, then we will have a log cumulative hazard plot.
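
One possible way to get such a plot without relying on any particular survminer option is to compute the cumulative hazard from a survfit object and put both axes on a log scale with ggplot2. The sketch below makes some assumptions: it uses the `lung` data set shipped with the survival package rather than the TCGA data from this thread, it estimates the cumulative hazard as -log(S(t)), and the names `haz` and `p` are illustrative.

```r
library(survival)
library(ggplot2)

# Kaplan-Meier fit on the lung data (illustrative data set).
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# One row per time point; fit$strata holds the number of points per group.
haz <- data.frame(
  time   = fit$time,
  surv   = fit$surv,
  strata = rep(names(fit$strata), times = fit$strata)
)

# Cumulative hazard estimated as -log(S(t)).
haz$cumhaz <- -log(haz$surv)

# Drop points where the log of the cumulative hazard is undefined
# (S(t) = 1 gives 0, S(t) = 0 gives Inf).
haz <- haz[haz$cumhaz > 0 & is.finite(haz$cumhaz), ]

# Log scale on both axes gives the log cumulative hazard plot.
p <- ggplot(haz, aes(time, cumhaz, colour = strata)) +
  geom_step() +
  scale_x_log10() +
  scale_y_log10() +
  labs(x = "Time (log scale)", y = "Cumulative hazard (log scale)")

print(p)
```

Roughly parallel curves on this plot are commonly read as support for the proportional hazards assumption.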
