1.8.7 #167

Merged 1 commit on Apr 9, 2020
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -5,7 +5,7 @@ defaults: &defaults
CIRCLE_ARTIFACTS: /tmp/circleci-artifacts
CIRCLE_TEST_REPORTS: /tmp/circleci-test-results
CODECOV_TOKEN: b0d35139-0a75-427a-907b-2c78a762f8f0
VERSION: 1.8.6
VERSION: 1.8.7
PANDOC_RELEASES_URL: https://github.com/jgm/pandoc/releases
steps:
- checkout
9 changes: 9 additions & 0 deletions CHANGES.md
@@ -1,5 +1,14 @@
## Changelog

### 1.8.7 (2020-4-8)
* [#137](https://github.com/man-group/dtale/issues/137)
* [#141](https://github.com/man-group/dtale/issues/141)
* [#156](https://github.com/man-group/dtale/issues/156)
* [#160](https://github.com/man-group/dtale/issues/160)
* [#161](https://github.com/man-group/dtale/issues/161)
* [#162](https://github.com/man-group/dtale/issues/162)
* [#163](https://github.com/man-group/dtale/issues/163)

### 1.8.6 [hotfix] (2020-4-5)
* updates to setup.py to include images

66 changes: 53 additions & 13 deletions README.md
@@ -1,6 +1,9 @@
[![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Title.png)](https://github.com/man-group/dtale)

[Live Demo](http://alphatechadmin.pythonanywhere.com)
* [Live Demo](http://alphatechadmin.pythonanywhere.com)
* [Animated US COVID-19 Deaths By State](http://alphatechadmin.pythonanywhere.com/charts/3?chart_type=maps&query=date+%3E+%2720200301%27&agg=raw&map_type=choropleth&loc_mode=USA-states&loc=state_code&map_val=deaths&colorscale=Reds&cpg=false&animate_by=date)
* [3D Scatter Chart](http://alphatechadmin.pythonanywhere.com/charts/4?chart_type=3d_scatter&query=&x=date&z=Col0&agg=raw&cpg=false&y=%5B%22security_id%22%5D)
* [Surface Chart](http://alphatechadmin.pythonanywhere.com/charts/4?chart_type=surface&query=&x=date&z=Col0&agg=raw&cpg=false&y=%5B%22security_id%22%5D)

-----------------

@@ -41,9 +44,9 @@ D-Tale was the product of a SAS to Python conversion. What was originally a per
- [Dimensions/Main Menu](#dimensionsmain-menu)
- [Header](#header)
- [Main Menu Functions](#main-menu-functions)
- [Describe](#describe), [Custom Filter](#custom-filter), [Building Columns](#building-columns), [Summarize Data](#summarize-data), [Charts](#charts), [Coverage (Deprecated)](#coverage-deprecated), [Correlations](#correlations), [Heat Map](#heat-map), [Instances](#instances), [Code Exports](#code-exports), [About](#about), [Resize](#resize), [Shutdown](#shutdown)
- [Describe](#describe), [Outlier Detection](#outlier-detection), [Custom Filter](#custom-filter), [Building Columns](#building-columns), [Summarize Data](#summarize-data), [Charts](#charts), [Coverage (Deprecated)](#coverage-deprecated), [Correlations](#correlations), [Heat Map](#heat-map), [Highlight Dtypes](#highlight-dtypes), [Instances](#instances), [Code Exports](#code-exports), [About](#about), [Resize](#resize), [Shutdown](#shutdown)
- [Column Menu Functions](#column-menu-functions)
- [Filtering](#filtering), [Moving Columns](#moving-columns), [Hiding Columns](#hiding-columns), [Lock](#lock), [Unlock](#unlock), [Sorting](#sorting), [Formats](#formats), [Column Analysis](#column-analysis)
- [Filtering](#filtering), [Moving Columns](#moving-columns), [Hiding Columns](#hiding-columns), [Delete](#delete), [Lock](#lock), [Unlock](#unlock), [Sorting](#sorting), [Formats](#formats), [Column Analysis](#column-analysis)
- [Menu Functions Depending on Browser Dimensions](#menu-functions-depending-on-browser-dimensions)
- [For Developers](#for-developers)
- [Cloning](#cloning)
@@ -333,9 +336,34 @@ View all the columns & their data types as well as individual details of each co
|int|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Describe_int.png)|Anything with standard numeric classifications (min, max, 25%, 50%, 75%) will have a nice boxplot with the mean (if it exists) displayed as an outlier if you look closely.|
|float|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Describe_float.png)||

#### Outlier Detection
When viewing integer & float columns in the ["Describe" popup](#describe), you will see a toggle for Uniques & Outliers in the lower right-hand corner.

|Outliers|Filter|
|--------|------|
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/outliers.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/outlier_filter.png)|

Clicking the "Outliers" toggle loads the top 100 outliers in your column, computed with the following snippet:
```python
s = df[column]
q1 = s.quantile(0.25)
q3 = s.quantile(0.75)
iqr = q3 - q1
iqr_lower = q1 - 1.5 * iqr
iqr_upper = q3 + 1.5 * iqr
outliers = s[(s < iqr_lower) | (s > iqr_upper)]
```
Clicking the "Apply outlier filter" link adds an additional "outlier" filter for this column, which can be removed from the [header](#header) or the [custom filter](#custom-filter) editor (shown above on the right).
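The applied filter is a plain pandas `query` built from the IQR bounds computed above. Here is a sketch of what that looks like end-to-end (the exact filter string D-Tale generates may differ; the column name `col` is made up for illustration):

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 100], name='col')
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
# hypothetical shape of the generated outlier filter
outlier_filter = '(col < {lower}) or (col > {upper})'.format(lower=lower, upper=upper)
filtered = s.to_frame().query(outlier_filter)  # rows outside the IQR fences
```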

#### Custom Filter
Apply a custom pandas `query` to your data (link to pandas documentation included in popup)

|Editing|Result|
|--------|:------:|
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Filter_apply.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Post_filter.png)|

You can also see any outlier or column filters you've applied (which will be included in addition to your custom query) and remove them if you'd like.

Context Variables are user-defined values passed in via the `context_variables` argument to dtale.show(); they can be referenced in filters by prefixing the variable name with '@'.

For example, here is how you can use context variables in a pandas query:
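Since the collapsed example is not visible on this page, here is a minimal plain-pandas sketch of the same idea (column names `bid`/`offer` and the variable `max_spread` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame(dict(bid=[100, 101, 102], offer=[105, 102, 103]))
max_spread = 3  # the value you would pass via context_variables
# in a plain pandas query, '@' pulls a variable from the enclosing scope;
# D-Tale resolves it from the context_variables dict instead
filtered = df.query('(offer - bid) <= @max_spread')
```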
@@ -355,12 +383,6 @@ And here is how you would pass that context variable to D-Tale: `dtale.show(df, 

Here's some nice documentation on the performance of [pandas queries](https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html#pandas-eval-performance)


|Editing|Result|
|--------|:------:|
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Filter_apply.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Post_filter.png)|


#### Building Columns

[![](http://img.youtube.com/vi/G6wNS9-lG04/0.jpg)](http://www.youtube.com/watch?v=G6wNS9-lG04 "Build Columns in D-Tale")
@@ -531,13 +553,25 @@ When the data being viewed in D-Tale has date or timestamp columns but for each
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/rolling_corr_data.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/rolling_corr.png)|

#### Heat Map
This will hide any non-float columns (with the exception of the index on the right) and apply a color to the background of each cell
This will hide any non-float or non-int columns (with the exception of the index on the right) and apply a color to the background of each cell.

- Each float is renormalized to be a value between 0 and 1.0
- You have two options for the renormalization
- **By Col**: each value is calculated based on the min/max of its column
- **Overall**: each value is calculated by the overall min/max of all the non-hidden float/int columns in the dataset
- Each renormalized value is passed to a color scale of red(0) - yellow(0.5) - green(1.0)
![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Heatmap.png)
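The two renormalization options can be sketched as follows (a simplified illustration, not D-Tale's internal code):

```python
import pandas as pd

df = pd.DataFrame(dict(a=[0.0, 5.0, 10.0], b=[10.0, 20.0, 30.0]))

# "By Col": each value is scaled by its own column's min/max
by_col = (df - df.min()) / (df.max() - df.min())

# "Overall": every value is scaled by the global min/max across all shown columns
overall_min, overall_max = df.min().min(), df.max().max()
overall = (df - overall_min) / (overall_max - overall_min)
```

Each resulting 0–1 value is then mapped onto the red/yellow/green color scale.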

Turn off Heat Map by clicking menu option again
![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Heatmap_toggle.png)
Turn off the Heat Map by clicking the previously selected menu option again

#### Highlight Dtypes
This is a quick way to check whether your data has been categorized correctly. Clicking this menu option assigns a specific background color to each column of a given data type:

|category|timedelta|float|int|date|string|
|--------|---------|-----|---|----|------|
|#E1BEE7|#FFCC80|#B2DFDB|#BBDEFB|#F8BBD0|white|

![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/highlight_dtypes.png)


#### Code Exports
*Code Exports* are small snippets of code representing the current state of the grid you're viewing including things like:
@@ -639,6 +673,10 @@ All column movements are saved on the server so refreshing your browser won't lo

All column movements are saved on the server so refreshing your browser won't lose them :ok_hand:

#### Delete

As simple as it sounds: click this button to delete the column from your dataframe. (Warning: this cannot be undone!)
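The server-side effect is roughly equivalent to a pandas column drop (a sketch, not D-Tale's internal code):

```python
import pandas as pd

df = pd.DataFrame(dict(a=[1, 2], b=[3, 4]))
# dropping a column this way rebinds df; the original column is gone for good
df = df.drop(columns=['b'])
```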

#### Lock
Adds your column to "locked" columns
- "locked" means that if you scroll horizontally these columns will stay pinned to the right-hand side
@@ -894,6 +932,7 @@ D-Tale works with:
* Flask
* Flask-Compress
* Pandas
* plotly
* scipy
* six
* Front-end
@@ -909,11 +948,12 @@ Original concept and implementation: [Andrew Schonfeld](https://github.com/ascho
Contributors:

* [Phillip Dupuis](https://github.com/phillipdupuis)
* [Fernando Saravia Rajal](https://github.com/fersarr)
* [Dominik Christ](https://github.com/DominikMChrist)
* [Reza Moshkzar](https://github.com/reza1615)
* [Chris Boddy](https://github.com/cboddy)
* [Jason Holden](https://github.com/jasonkholden)
* [Tom Taylor](https://github.com/TomTaylorLondon)
* [Fernando Saravia Rajal](https://github.com/fersarr)
* [Wilfred Hughes](https://github.com/Wilfred)
* Mike Kelly
* [Vincent Riemer](https://github.com/vincentriemer)
2 changes: 1 addition & 1 deletion docker/2_7/Dockerfile
@@ -44,4 +44,4 @@ WORKDIR /app

RUN set -eux \
; . /root/.bashrc \
; easy_install dtale-1.8.6-py2.7.egg
; easy_install dtale-1.8.7-py2.7.egg
2 changes: 1 addition & 1 deletion docker/3_6/Dockerfile
@@ -44,4 +44,4 @@ WORKDIR /app

RUN set -eux \
; . /root/.bashrc \
; easy_install dtale-1.8.6-py3.7.egg
; easy_install dtale-1.8.7-py3.7.egg
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -64,9 +64,9 @@
# built documents.
#
# The short X.Y version.
version = u'1.8.6'
version = u'1.8.7'
# The full version, including alpha/beta/rc tags.
release = u'1.8.6'
release = u'1.8.7'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
110 changes: 77 additions & 33 deletions dtale/charts/utils.py
@@ -1,8 +1,11 @@
import copy

import pandas as pd

from dtale.utils import (ChartBuildingError, classify_type,
from dtale.utils import (ChartBuildingError, classify_type, find_dtype,
find_dtype_formatter, flatten_lists, get_dtypes,
grid_columns, grid_formatter, json_int, make_list)
grid_columns, grid_formatter, json_int, make_list,
run_query)

YAXIS_CHARTS = ['line', 'bar', 'scatter']
ZAXIS_CHARTS = ['heatmap', '3d_scatter', 'surface']
@@ -187,7 +190,7 @@ def retrieve_chart_data(df, *args, **kwargs):
all_code = ["chart_data = pd.concat(["] + all_code + ["], axis=1)"]
if len(make_list(kwargs.get('group_val'))):
filters = build_group_inputs_filter(all_data, kwargs['group_val'])
all_data = all_data.query(filters)
all_data = run_query(all_data, filters)
all_code.append('chart_data = chart_data.query({})'.format(filters))
return all_data, all_code

@@ -236,7 +239,7 @@ def check_exceptions(df, allow_duplicates, unlimited_data=False, data_limit=1500
raise ChartBuildingError(limit_msg.format(data_limit))


def build_agg_data(df, x, y, inputs, agg, z=None):
def build_agg_data(df, x, y, inputs, agg, z=None, animate_by=None):
"""
Builds aggregated data when an aggregation (sum, mean, max, min...) is selected from the front-end.
@@ -280,22 +283,32 @@ def build_agg_data(df, x, y, inputs, agg, z=None):
return agg_df, code

if z_exists:
groups = df.groupby([x] + make_list(y))
return getattr(groups[make_list(z)], agg)().reset_index(), [
idx_cols = make_list(animate_by) + [x] + make_list(y)
groups = df.groupby(idx_cols)
groups = getattr(groups[make_list(z)], agg)()
if animate_by is not None:
full_idx = pd.MultiIndex.from_product([df[c].unique() for c in idx_cols], names=idx_cols)
groups = groups.reindex(full_idx).fillna(0)
return groups.reset_index(), [
"chart_data = chart_data.groupby(['{cols}'])[['{z}']].{agg}().reset_index()".format(
cols="', '".join([x] + make_list(y)), z=z, agg=agg
)
]
groups = df.groupby(x)
return getattr(groups[y], agg)().reset_index(), [
idx_cols = make_list(animate_by) + [x]
groups = df.groupby(idx_cols)
groups = getattr(groups[y], agg)()
if animate_by is not None:
full_idx = pd.MultiIndex.from_product([df[c].unique() for c in idx_cols], names=idx_cols)
groups = groups.reindex(full_idx).fillna(0)
return groups.reset_index(), [
"chart_data = chart_data.groupby('{x}')[['{y}']].{agg}().reset_index()".format(
x=x, y=make_list(y)[0], agg=agg
)
]
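The reindex-and-fill step added above guarantees that every animation frame carries a point for every x value. Illustrated in isolation (a sketch with made-up data, not the library code):

```python
import pandas as pd

df = pd.DataFrame(dict(
    frame=[1, 1, 2],
    x=['a', 'b', 'a'],
    val=[10, 20, 30],
))
idx_cols = ['frame', 'x']
grouped = df.groupby(idx_cols)['val'].sum()
# build the full cartesian product of index values so every animation
# frame has a row for every x value, filling the gaps with 0
full_idx = pd.MultiIndex.from_product([df[c].unique() for c in idx_cols], names=idx_cols)
grouped = grouped.reindex(full_idx).fillna(0).reset_index()
```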


def build_base_chart(raw_data, x, y, group_col=None, group_val=None, agg=None, allow_duplicates=False, return_raw=False,
unlimited_data=False, **kwargs):
unlimited_data=False, animate_by=None, **kwargs):
"""
Helper function to return data for 'chart-data' & 'correlations-ts' endpoints. Will return a dictionary of
dictionaries (one for each series) which contain the data for the x & y axes of the chart as well as the minimum &
@@ -318,29 +331,32 @@ def build_base_chart(raw_data, x, y, group_col=None, group_val=None, agg=None, a
:type allow_duplicates: bool, optional
:return: dict
"""

data, code = retrieve_chart_data(raw_data, x, y, kwargs.get('z'), group_col, group_val=group_val)
group_fmt_overrides = {'I': lambda v, as_string: json_int(v, as_string=as_string, fmt='{}')}
data, code = retrieve_chart_data(raw_data, x, y, kwargs.get('z'), group_col, animate_by, group_val=group_val)
x_col = str('x')
y_cols = make_list(y)
z_col = kwargs.get('z')
z_cols = make_list(z_col)
if group_col is not None and len(group_col):
data = data.sort_values(group_col + [x])
code.append("chart_data = chart_data.sort_values(['{cols}'])".format(cols="', '".join(group_col + [x])))
main_group = group_col
if animate_by is not None:
main_group = [animate_by] + main_group
sort_cols = main_group + [x]
data = data.sort_values(sort_cols)
code.append("chart_data = chart_data.sort_values(['{cols}'])".format(cols="', '".join(sort_cols)))
check_all_nan(data, [x] + y_cols)
data = data.rename(columns={x: x_col})
code.append("chart_data = chart_data.rename(columns={'" + x + "': '" + x_col + "'})")
if agg is not None and agg != 'raw':
data = data.groupby(group_col + [x_col])
data = data.groupby(main_group + [x_col])
data = getattr(data, agg)().reset_index()
code.append("chart_data = chart_data.groupby(['{cols}']).{agg}().reset_index()".format(
cols="', '".join(group_col + [x]), agg=agg
cols="', '".join(main_group + [x]), agg=agg
))
MAX_GROUPS = 30
group_vals = data[group_col].drop_duplicates()
if len(group_vals) > MAX_GROUPS:
dtypes = get_dtypes(group_vals)
group_fmt_overrides = {'I': lambda v, as_string: json_int(v, as_string=as_string, fmt='{}')}
group_fmts = {c: find_dtype_formatter(dtypes[c], overrides=group_fmt_overrides) for c in group_col}

group_f, _ = build_formatters(group_vals)
@@ -365,45 +381,73 @@ def build_base_chart(raw_data, x, y, group_col=None, group_val=None, agg=None, a
)

dtypes = get_dtypes(data)
group_fmt_overrides = {'I': lambda v, as_string: json_int(v, as_string=as_string, fmt='{}')}
group_fmts = {c: find_dtype_formatter(dtypes[c], overrides=group_fmt_overrides) for c in group_col}
for group_val, grp in data.groupby(group_col):

def _group_filter():
for gv, gc in zip(make_list(group_val), group_col):
classifier = classify_type(dtypes[gc])
yield group_filter_handler(gc, group_fmts[gc](gv, as_string=True), classifier)
group_filter = ' and '.join(list(_group_filter()))
ret_data['data'][group_filter] = data_f.format_lists(grp)

def _load_groups(df):
for group_val, grp in df.groupby(group_col):

def _group_filter():
for gv, gc in zip(make_list(group_val), group_col):
classifier = classify_type(dtypes[gc])
yield group_filter_handler(gc, group_fmts[gc](gv, as_string=True), classifier)

group_filter = ' and '.join(list(_group_filter()))
yield group_filter, data_f.format_lists(grp)

if animate_by is not None:
frame_fmt = find_dtype_formatter(dtypes[animate_by], overrides=group_fmt_overrides)
ret_data['frames'] = []
for frame_key, frame in data.sort_values(animate_by).groupby(animate_by):
ret_data['frames'].append(
dict(data=dict(_load_groups(frame)), name=frame_fmt(frame_key, as_string=True))
)
ret_data['data'] = copy.deepcopy(ret_data['frames'][-1]['data'])
else:
ret_data['data'] = dict(_load_groups(data))
return ret_data, code
sort_cols = [x] + (y_cols if len(z_cols) else [])
main_group = [x]
if animate_by is not None:
main_group = [animate_by] + main_group
sort_cols = main_group + (y_cols if len(z_cols) else [])
data = data.sort_values(sort_cols)
code.append("chart_data = chart_data.sort_values(['{cols}'])".format(cols="', '".join(sort_cols)))
check_all_nan(data, [x] + y_cols + z_cols)
check_all_nan(data, main_group + y_cols + z_cols)
y_cols = [str(y_col) for y_col in y_cols]
data.columns = [x_col] + y_cols + z_cols
code.append("chart_data.columns = ['{cols}']".format(cols="', '".join([x_col] + y_cols + z_cols)))
data = data[main_group + y_cols + z_cols]
main_group[-1] = x_col
data.columns = main_group + y_cols + z_cols
code.append("chart_data.columns = ['{cols}']".format(cols="', '".join(main_group + y_cols + z_cols)))
if agg is not None:
data, agg_code = build_agg_data(data, x_col, y_cols, kwargs, agg, z=z_col)
data, agg_code = build_agg_data(data, x_col, y_cols, kwargs, agg, z=z_col, animate_by=animate_by)
code += agg_code
data = data.dropna()
if return_raw:
return data.rename(columns={x_col: x})
code.append("chart_data = chart_data.dropna()")

dupe_cols = [x_col] + (y_cols if len(z_cols) else [])
dupe_cols = main_group + (y_cols if len(z_cols) else [])
check_exceptions(
data[dupe_cols].rename(columns={'x': x}),
allow_duplicates or agg == 'raw',
unlimited_data=unlimited_data,
data_limit=40000 if len(z_cols) else 15000
data_limit=40000 if len(z_cols) or animate_by is not None else 15000
)
data_f, range_f = build_formatters(data)

ret_data = dict(
data={str('all'): data_f.format_lists(data)},
min={col: fmt(data[col].min(), None) for _, col, fmt in range_f.fmts if col in [x_col] + y_cols + z_cols},
max={col: fmt(data[col].max(), None) for _, col, fmt in range_f.fmts if col in [x_col] + y_cols + z_cols}
)
if animate_by is not None:
frame_fmt = find_dtype_formatter(find_dtype(data[animate_by]), overrides=group_fmt_overrides)
ret_data['frames'] = []
for frame_key, frame in data.sort_values(animate_by).groupby(animate_by):
ret_data['frames'].append(
dict(data={str('all'): data_f.format_lists(frame)}, name=frame_fmt(frame_key, as_string=True))
)
ret_data['data'] = copy.deepcopy(ret_data['frames'][-1]['data'])
else:
ret_data['data'] = {str('all'): data_f.format_lists(data)}
return ret_data, code

