Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration of treatment response data in PlotsTab (incl. new Waterfall plot) #2055

Merged
merged 2 commits into from
Jul 4, 2019

Conversation

pvannierop
Copy link
Contributor

@pvannierop pvannierop commented Feb 20, 2019

What? Why?

This PR will implement handling of the new treatment response (genetic entity) data type by the PlotsTab component. A new plot type Watefall plot is added.

Changes

  • Treatment is shown in data type select boxes of the horz. and vert. axis menu's (when available in cBioPortal db). When selected treatment response genetic profiles are shown in the data source select menu's (see Fig.1). Only treatments selected in the oncoprint heatmap menu are available for selection.
  • Treatment data points can be limited (real value lies beyond a certain value) which can be indicated by > or < prefixes to values (for example >8.00, see this backend PR). Bar, scatter and waterfall (see below) plot types contain a legend option Limit Value when one or more data points in the plot are limited. Enabling this legend option represents limited data points as diamonds (see Fig.1). This option is enabled by default and is re-enabled after each axis menu change. When enabled limited data values are excluded from calculations of correlation coefficients in the scatter plot legend.
  • When Treatment data type is selected a --option is shown in the other data type menu that represents no data type selected for the corresponding axis (see Fig.1). For treatment data the Apply log scale option is displayed in the axis menu's. When the Treatment and -- combination is selected by the user, a new plot type Waterfall plot is displayed.
  • A new plot type Waterfall plot is implemented (see Fig.2). This plot type represents a sorted (asc or desc) bar graph of response values expressed relative to a given threshold value a.k.a. pivot_threshold . The pivot_threshold is set on a profile level scope and is passed from the cbioportal backend. When no pivot_threshold is passed unmodified values are represented. When a waterfall plot is shown Log-scale and Sort Order controls are shown in the corresponding axis menu that control log10-transformation and asc-desc sorting, respectively. The legend options for this plot type include a new gene selection box that controls for which gene the mutation or CNA information is represented in the plot. For the waterfall plot the mutation and CNA styling options are mutually exclusive. The selected gene is represented in the title of the plot. When legend option Limit Value is enabled limited data points are marked by a black diamond above respective bars. Sample and mutation queries are represented by red plus signs below respective bars. The width of the waterfall plot is dynamic with a max-width of 1000px. Download of waterfall plot data includes raw and pivot_threshold-adjusted values and the sort order.

Notes

  • Because PlotsTab only shows treatments selected in Oncoprint functionality in this PR for treatment response data is dependent on this PR that implements functionality for treatment response data in the OncoprintTab. For review of the Plotstab functionality a treatment list must be added to the URL (f.i. http://localhost:8081/cbioportal/results/plots? ... &treatment_list=17-AAG%3BAEW541).

Images

Figure 1

  1. Treatment data type
  2. -- data type option in other menu
  3. Limit Value legend option
  4. Limited data points styled as diamonds
  5. Log scale control for treatment profile values

Screenshot from 2019-04-04 09-32-03

Figure 2

  1. Waterfall plot (at max width)
  2. Limit Value legend option
  3. Limited data points styled as diamonds above bars
  4. Gene selection box for mutation or CNA styling
  5. Selected gene represented in title
  6. Log Scale and Sort Order controls for treatment profile values
  7. Sample search as red plus sign

Screenshot from 2019-04-04 09-32-51

Copy link
Contributor

@oplantalech oplantalech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First thing I tried raised an error: I made a query with the brca_tcga with treatment data, and in plots, I selected "Treatments" in the vertical axis:
captura de pantalla 2019-03-05 a les 9 57 17

The error also happens when selecting the "Treatment" on the horizontal axis.

What does this mean: "Only treatments selected in the oncoprint heatmap menu are available for selection." ? Does this mean that you first need to select something in the Oncoprint to see data in the plots?

The option -- shows up when selected Treatment in the plots.

@pvannierop Can you reproduce this error? Or maybe it is a problem of my rebasing? I'd suggest you to rebase the PRs if you cannot reproduce this error before I continue reviewing.

@pvannierop pvannierop force-pushed the treatment_response_feature_plotstab branch from 623f482 to 5ab0430 Compare March 8, 2019 15:03
Copy link
Contributor

@oplantalech oplantalech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR, just a few things:

  • I don't see "Truncated Value" in the legend or in the "Color Samples By" section, but "Limit Value". Also, in the plot legend, the limit value points are shown with a diamond shape instead of a triangle.
  • I would change the checkboxes of the options Mutation Type and Copy Number Alterations in the "Color Samples By" section, and replace them by radio buttons. In this way, it clarifies that you can only select Mutation or CNA, but not both or none.
  • The waterfall plot is very wide - I cannot see it complete in my screen without scrolling. I hope this can be fixed, maybe making the plot a bit smaller?

@pvannierop pvannierop force-pushed the treatment_response_feature_plotstab branch from c29a6be to 49648f3 Compare March 27, 2019 09:10
@pvannierop pvannierop force-pushed the treatment_response_feature_plotstab branch from 35518a8 to a679c9e Compare April 4, 2019 08:04
@adamabeshouse
Copy link
Contributor

a lot of files in this PR that should not be included - build files, and that yarn-error.log

@adamabeshouse
Copy link
Contributor

adamabeshouse commented Apr 16, 2019

I don't think waterfall plot is a good name, it doesn't match the use of waterfall plot that I find by googling? It's just a bar chart, isn't it?

This symbol isn't in the legend
image

@adamabeshouse
Copy link
Contributor

have you thought about renaming "--" to be more explicit, like "None (List of samples)" or "Samples" or something? I find "--" confusing

@@ -130,24 +129,42 @@ export interface IScatterPlotSampleData {
profiledMutations?:boolean;
mutations: AnnotatedMutation[];
copyNumberAlterations: AnnotatedNumericGeneMolecularData[];
xThresholdType?: '>'|'<'|undefined; // threshold symbol for x coordinate in scatter plot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove xT and yT, move them into a new interface IScatterPlotSampleData that extends IPlotSampleData

Copy link
Contributor Author

@pvannierop pvannierop Apr 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have made two attempts in this direction and both were unsuccessful. The reason for this is that I cannot reconcile 1) specific data types accepted by the different plots and 2) the type requirements of many functions in PlotsTabUtils.

I think the problem lies in the fact that in the current state sample data is the only property on which data points are styled in plots. As such all styling related functions in PlotsTabUtils are typed for the IPlotSampleData interface. With the introduction on a value-dependent qualifier such as the thresholdType symbol, the description of IPlotSampleData as a stand-in for what in effect is a IPlotStylingData interface is not longer sufficient.

I am afraid that splitting up of IPlotSampleData into different types for scatterplot (two-dimensional) and boxscatter/water (one-dimensional) as per your suggestion is something I tried two times, but failed. As stated in the paragraph above the reason for this is that IPlotSampleData is treated as a IPlotStylingData interface. The bad naming of this interface suggests that sample attributes are incorrectly mixed with value attributes in my PR. These attributes seem apples and pears, but are similar in the fact that these are attributes used to provide styling to datapoints. For this reason I added this comment:

// this interface contains all attributes
// needed to provide correct styling of 
// graph elements (points, bars, ...)

I my assessment correct so-far?

What do you think of this proposition:

export interface IPlotSampleData {}

export interface ITwoDimentsionPlotStyling {
    xThresholdType?: '>'|'<'|undefined;
    yThresholdType?: '>'|'<'|undefined;
}

export interface IOneDimensionPlotStyling {
    thresholdType?: '>'|'<'|undefined;
}

export interface IPlotStylingData extends IPlotSampleData, IOneDimensionPlotStyling, ITwoDimentsionPlotStyling {}

After this rework, we can remap all styling related function to the new IPlotStylingData interface. Does this rework make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but I'm confused. Can we talk on the phone about this? Can you speak for a half hour on Monday at 10 AM my time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: after discussion with @adamabeshouse we decided to refactor the interface and add 'thresholdType` type checking in the function in PlotsTabUtils that accept data points that conform to the less specific IPlotsTabData interface.

@@ -130,24 +129,42 @@ export interface IScatterPlotSampleData {
profiledMutations?:boolean;
mutations: AnnotatedMutation[];
copyNumberAlterations: AnnotatedNumericGeneMolecularData[];
xThresholdType?: '>'|'<'|undefined; // threshold symbol for x coordinate in scatter plot
yThresholdType?: '>'|'<'|undefined; // threshold symbol for y coordinate in scatter plot
thresholdType?: '>'|'<'|undefined; // threshold symbol for continuous axis in box-scatter and waterfall plot
Copy link
Contributor

@adamabeshouse adamabeshouse Apr 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please put thresholdType separately into IBoxScatterPlotPoint and IWaterfallPlotSampleData

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See discussion in the comment above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resoved in rework of interfaces.

logY?:boolean;
logX?:IAxisLogScaleParams|undefined;
logY?:IAxisLogScaleParams|undefined;
excludeLimitValuesFromCalculation?:boolean;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a comment for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. now that I understand what it is, I think it would be better to change 'Calculation' to 'Correlation' in the name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name was updated to excludeLimitValuesFromCalculation.

// barRatio={1} // removes spaces between bars
style={{
data: {
fill: (d:D) => d.fill,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's significantly more efficient with Victory to make a separate VictoryBar element for each style settings, rather than using functions. You can see an example of how I do this in ScatterPlot.tsx in the getChart() method with VictoryScatter elements

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of scatterplot the approach is to partition the datapoints into groups of identical styling and then consecutively plot different groups of points with hard coded styling information. For bar plots such as the waterfallplot this is not possible because the bar widths and inter-bar widths should be evaluated with all bars being present. When taking the scatterplot approach leads to malformed graphs. This was in fact my first implementation that I had to change to make it work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you calculate the bar widths using all the data together, then separate them out for input to victory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, I don't know. But how much of a problem is the current implementation? Will this performance difference be noticeable by the user? To be honest we do not have budget left to explore alternative ways that may or may not work, and I suspect that your proposed change is not guaranteed to be successful . Can't we not postpone this mod to a later time point?

@pvannierop pvannierop force-pushed the treatment_response_feature_plotstab branch 11 times, most recently from 45c46ee to a517c06 Compare July 1, 2019 09:18
@pvannierop pvannierop force-pushed the treatment_response_feature_plotstab branch 13 times, most recently from 58a7774 to f18bc1b Compare July 3, 2019 16:05
@pvannierop pvannierop force-pushed the treatment_response_feature_plotstab branch from f18bc1b to 3fb2240 Compare July 4, 2019 06:57
@inodb inodb merged commit 625c182 into cBioPortal:rc Jul 4, 2019
jagnathan pushed a commit to pughlab/cbioportal-frontend that referenced this pull request Jan 25, 2022
…ature_plotstab

Integration of treatment response data in PlotsTab (incl. new Waterfall plot)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants