1.8.7 #167

Merged 1 commit on Apr 9, 2020
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -5,7 +5,7 @@ defaults: &defaults
CIRCLE_ARTIFACTS: /tmp/circleci-artifacts
CIRCLE_TEST_REPORTS: /tmp/circleci-test-results
CODECOV_TOKEN: b0d35139-0a75-427a-907b-2c78a762f8f0
VERSION: 1.8.6
VERSION: 1.8.7
PANDOC_RELEASES_URL: https://github.com/jgm/pandoc/releases
steps:
- checkout
9 changes: 9 additions & 0 deletions CHANGES.md
@@ -1,5 +1,14 @@
## Changelog

### 1.8.7 (2020-4-8)
* [#137](https://github.com/man-group/dtale/issues/137)
* [#141](https://github.com/man-group/dtale/issues/141)
* [#156](https://github.com/man-group/dtale/issues/156)
* [#160](https://github.com/man-group/dtale/issues/160)
* [#161](https://github.com/man-group/dtale/issues/161)
* [#162](https://github.com/man-group/dtale/issues/162)
* [#163](https://github.com/man-group/dtale/issues/163)

### 1.8.6 [hotfix] (2020-4-5)
* updates to setup.py to include images

66 changes: 53 additions & 13 deletions README.md
@@ -1,6 +1,9 @@
[![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Title.png)](https://github.com/man-group/dtale)

[Live Demo](http://alphatechadmin.pythonanywhere.com)
* [Live Demo](http://alphatechadmin.pythonanywhere.com)
* [Animated US COVID-19 Deaths By State](http://alphatechadmin.pythonanywhere.com/charts/3?chart_type=maps&query=date+%3E+%2720200301%27&agg=raw&map_type=choropleth&loc_mode=USA-states&loc=state_code&map_val=deaths&colorscale=Reds&cpg=false&animate_by=date)
* [3D Scatter Chart](http://alphatechadmin.pythonanywhere.com/charts/4?chart_type=3d_scatter&query=&x=date&z=Col0&agg=raw&cpg=false&y=%5B%22security_id%22%5D)
* [Surface Chart](http://alphatechadmin.pythonanywhere.com/charts/4?chart_type=surface&query=&x=date&z=Col0&agg=raw&cpg=false&y=%5B%22security_id%22%5D)

-----------------

@@ -41,9 +44,9 @@ D-Tale was the product of a SAS to Python conversion. What was originally a per
- [Dimensions/Main Menu](#dimensionsmain-menu)
- [Header](#header)
- [Main Menu Functions](#main-menu-functions)
- [Describe](#describe), [Custom Filter](#custom-filter), [Building Columns](#building-columns), [Summarize Data](#summarize-data), [Charts](#charts), [Coverage (Deprecated)](#coverage-deprecated), [Correlations](#correlations), [Heat Map](#heat-map), [Instances](#instances), [Code Exports](#code-exports), [About](#about), [Resize](#resize), [Shutdown](#shutdown)
- [Describe](#describe), [Outlier Detection](#outlier-detection), [Custom Filter](#custom-filter), [Building Columns](#building-columns), [Summarize Data](#summarize-data), [Charts](#charts), [Coverage (Deprecated)](#coverage-deprecated), [Correlations](#correlations), [Heat Map](#heat-map), [Highlight Dtypes](#highlight-dtypes), [Instances](#instances), [Code Exports](#code-exports), [About](#about), [Resize](#resize), [Shutdown](#shutdown)
- [Column Menu Functions](#column-menu-functions)
- [Filtering](#filtering), [Moving Columns](#moving-columns), [Hiding Columns](#hiding-columns), [Lock](#lock), [Unlock](#unlock), [Sorting](#sorting), [Formats](#formats), [Column Analysis](#column-analysis)
- [Filtering](#filtering), [Moving Columns](#moving-columns), [Hiding Columns](#hiding-columns), [Delete](#delete), [Lock](#lock), [Unlock](#unlock), [Sorting](#sorting), [Formats](#formats), [Column Analysis](#column-analysis)
- [Menu Functions Depending on Browser Dimensions](#menu-functions-depending-on-browser-dimensions)
- [For Developers](#for-developers)
- [Cloning](#cloning)
@@ -333,9 +336,34 @@ View all the columns & their data types as well as individual details of each co
|int|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Describe_int.png)|Anything with standard numeric classifications (min, max, 25%, 50%, 75%) will have a nice boxplot with the mean (if it exists) displayed as an outlier if you look closely.|
|float|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Describe_float.png)||

#### Outlier Detection
When viewing integer & float columns in the ["Describe" popup](#describe), you will see a toggle for Uniques & Outliers in the lower right-hand corner.

|Outliers|Filter|
|--------|------|
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/outliers.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/outlier_filter.png)|

Clicking the "Outliers" toggle loads the top 100 outliers in your column, computed with the following snippet:
```python
s = df[column]
q1 = s.quantile(0.25)
q3 = s.quantile(0.75)
iqr = q3 - q1
iqr_lower = q1 - 1.5 * iqr
iqr_upper = q3 + 1.5 * iqr
outliers = s[(s < iqr_lower) | (s > iqr_upper)]
```
Clicking the "Apply outlier filter" link adds an additional "outlier" filter for this column, which can be removed from the [header](#header) or the [custom filter](#custom-filter) editor (shown above on the right).
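The applied filter is a plain pandas `query` built from the IQR bounds computed above. Here is a sketch of what that looks like end-to-end (the exact filter string D-Tale generates may differ; the column name `col` is made up for illustration):

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 100], name='col')
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
# hypothetical shape of the generated outlier filter
outlier_filter = '(col < {lower}) or (col > {upper})'.format(lower=lower, upper=upper)
filtered = s.to_frame().query(outlier_filter)  # rows outside the IQR fences
```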

#### Custom Filter
Apply a custom pandas `query` to your data (link to pandas documentation included in popup)

|Editing|Result|
|--------|:------:|
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Filter_apply.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Post_filter.png)|

You can also see any outlier or column filters you've applied (which will be included in addition to your custom query) and remove them if you'd like.

Context Variables are user-defined values passed in via the `context_variables` argument to dtale.show(); they can be referenced in filters by prefixing the variable name with '@'.

For example, here is how you can use context variables in a pandas query:
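Since the collapsed example is not visible on this page, here is a minimal plain-pandas sketch of the same idea (column names `bid`/`offer` and the variable `max_spread` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame(dict(bid=[100, 101, 102], offer=[105, 102, 103]))
max_spread = 3  # the value you would pass via context_variables
# in a plain pandas query, '@' pulls a variable from the enclosing scope;
# D-Tale resolves it from the context_variables dict instead
filtered = df.query('(offer - bid) <= @max_spread')
```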
@@ -355,12 +383,6 @@ And here is how you would pass that context variable to D-Tale: `dtale.show(df, 

Here's some nice documentation on the performance of [pandas queries](https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html#pandas-eval-performance)


|Editing|Result|
|--------|:------:|
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Filter_apply.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Post_filter.png)|


#### Building Columns

[![](http://img.youtube.com/vi/G6wNS9-lG04/0.jpg)](http://www.youtube.com/watch?v=G6wNS9-lG04 "Build Columns in D-Tale")
@@ -531,13 +553,25 @@ When the data being viewed in D-Tale has date or timestamp columns but for each
|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/rolling_corr_data.png)|![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/rolling_corr.png)|

#### Heat Map
This will hide any non-float columns (with the exception of the index on the right) and apply a color to the background of each cell
This will hide any non-float or non-int columns (with the exception of the index on the right) and apply a color to the background of each cell.

- Each float is renormalized to be a value between 0 and 1.0
- You have two options for the renormalization
- **By Col**: each value is calculated based on the min/max of its column
- **Overall**: each value is calculated by the overall min/max of all the non-hidden float/int columns in the dataset
- Each renormalized value is passed to a color scale of red(0) - yellow(0.5) - green(1.0)
![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Heatmap.png)
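The two renormalization options can be sketched as follows (a simplified illustration, not D-Tale's internal code):

```python
import pandas as pd

df = pd.DataFrame(dict(a=[0.0, 5.0, 10.0], b=[10.0, 20.0, 30.0]))

# "By Col": each value is scaled by its own column's min/max
by_col = (df - df.min()) / (df.max() - df.min())

# "Overall": every value is scaled by the global min/max across all shown columns
overall_min, overall_max = df.min().min(), df.max().max()
overall = (df - overall_min) / (overall_max - overall_min)
```

Each resulting 0–1 value is then mapped onto the red/yellow/green color scale.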

Turn off Heat Map by clicking menu option again
![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/Heatmap_toggle.png)
Turn off the Heat Map by clicking the previously selected menu option again

#### Highlight Dtypes
This is a quick way to check whether your data has been categorized correctly. Clicking this menu option assigns a specific background color to each column of a given data type:

|category|timedelta|float|int|date|string|
|--------|---------|-----|---|----|------|
|#E1BEE7|#FFCC80|#B2DFDB|#BBDEFB|#F8BBD0|white|

![](https://raw.githubusercontent.com/aschonfeld/dtale-media/master/images/highlight_dtypes.png)


#### Code Exports
*Code Exports* are small snippets of code representing the current state of the grid you're viewing including things like:
@@ -639,6 +673,10 @@ All column movements are saved on the server so refreshing your browser won't lo

All column movements are saved on the server so refreshing your browser won't lose them :ok_hand:

#### Delete

As simple as it sounds: click this button to delete the column from your dataframe. (Warning: this cannot be undone!)
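The server-side effect is roughly equivalent to a pandas column drop (a sketch, not D-Tale's internal code):

```python
import pandas as pd

df = pd.DataFrame(dict(a=[1, 2], b=[3, 4]))
# dropping a column this way rebinds df; the original column is gone for good
df = df.drop(columns=['b'])
```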

#### Lock
Adds your column to "locked" columns
- "locked" means that if you scroll horizontally these columns will stay pinned to the right-hand side
@@ -894,6 +932,7 @@ D-Tale works with:
* Flask
* Flask-Compress
* Pandas
* plotly
* scipy
* six
* Front-end
@@ -909,11 +948,12 @@ Original concept and implementation: [Andrew Schonfeld](https://github.com/ascho
Contributors:

* [Phillip Dupuis](https://github.com/phillipdupuis)
* [Fernando Saravia Rajal](https://github.com/fersarr)
* [Dominik Christ](https://github.com/DominikMChrist)
* [Reza Moshkzar](https://github.com/reza1615)
* [Chris Boddy](https://github.com/cboddy)
* [Jason Holden](https://github.com/jasonkholden)
* [Tom Taylor](https://github.com/TomTaylorLondon)
* [Fernando Saravia Rajal](https://github.com/fersarr)
* [Wilfred Hughes](https://github.com/Wilfred)
* Mike Kelly
* [Vincent Riemer](https://github.com/vincentriemer)
2 changes: 1 addition & 1 deletion docker/2_7/Dockerfile
@@ -44,4 +44,4 @@ WORKDIR /app

RUN set -eux \
; . /root/.bashrc \
; easy_install dtale-1.8.6-py2.7.egg
; easy_install dtale-1.8.7-py2.7.egg
2 changes: 1 addition & 1 deletion docker/3_6/Dockerfile
@@ -44,4 +44,4 @@ WORKDIR /app

RUN set -eux \
; . /root/.bashrc \
; easy_install dtale-1.8.6-py3.7.egg
; easy_install dtale-1.8.7-py3.7.egg
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -64,9 +64,9 @@
# built documents.
#
# The short X.Y version.
version = u'1.8.6'
version = u'1.8.7'
# The full version, including alpha/beta/rc tags.
release = u'1.8.6'
release = u'1.8.7'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
110 changes: 77 additions & 33 deletions dtale/charts/utils.py
@@ -1,8 +1,11 @@
import copy

import pandas as pd

from dtale.utils import (ChartBuildingError, classify_type,
from dtale.utils import (ChartBuildingError, classify_type, find_dtype,
find_dtype_formatter, flatten_lists, get_dtypes,
grid_columns, grid_formatter, json_int, make_list)
grid_columns, grid_formatter, json_int, make_list,
run_query)

YAXIS_CHARTS = ['line', 'bar', 'scatter']
ZAXIS_CHARTS = ['heatmap', '3d_scatter', 'surface']
@@ -187,7 +190,7 @@ def retrieve_chart_data(df, *args, **kwargs):
all_code = ["chart_data = pd.concat(["] + all_code + ["], axis=1)"]
if len(make_list(kwargs.get('group_val'))):
filters = build_group_inputs_filter(all_data, kwargs['group_val'])
all_data = all_data.query(filters)
all_data = run_query(all_data, filters)
all_code.append('chart_data = chart_data.query({})'.format(filters))
return all_data, all_code

@@ -236,7 +239,7 @@ def check_exceptions(df, allow_duplicates, unlimited_data=False, data_limit=1500
raise ChartBuildingError(limit_msg.format(data_limit))


def build_agg_data(df, x, y, inputs, agg, z=None):
def build_agg_data(df, x, y, inputs, agg, z=None, animate_by=None):
"""
Builds aggregated data when an aggregation (sum, mean, max, min...) is selected from the front-end.
@@ -280,22 +283,32 @@ def build_agg_data(df, x, y, inputs, agg, z=None):
return agg_df, code

if z_exists:
groups = df.groupby([x] + make_list(y))
return getattr(groups[make_list(z)], agg)().reset_index(), [
idx_cols = make_list(animate_by) + [x] + make_list(y)
groups = df.groupby(idx_cols)
groups = getattr(groups[make_list(z)], agg)()
if animate_by is not None:
full_idx = pd.MultiIndex.from_product([df[c].unique() for c in idx_cols], names=idx_cols)
groups = groups.reindex(full_idx).fillna(0)
return groups.reset_index(), [
"chart_data = chart_data.groupby(['{cols}'])[['{z}']].{agg}().reset_index()".format(
cols="', '".join([x] + make_list(y)), z=z, agg=agg
)
]
groups = df.groupby(x)
return getattr(groups[y], agg)().reset_index(), [
idx_cols = make_list(animate_by) + [x]
groups = df.groupby(idx_cols)
groups = getattr(groups[y], agg)()
if animate_by is not None:
full_idx = pd.MultiIndex.from_product([df[c].unique() for c in idx_cols], names=idx_cols)
groups = groups.reindex(full_idx).fillna(0)
return groups.reset_index(), [
"chart_data = chart_data.groupby('{x}')[['{y}']].{agg}().reset_index()".format(
x=x, y=make_list(y)[0], agg=agg
)
]
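The reindex-and-fill step added above guarantees that every animation frame carries a point for every x value. Illustrated in isolation (a sketch with made-up data, not the library code):

```python
import pandas as pd

df = pd.DataFrame(dict(
    frame=[1, 1, 2],
    x=['a', 'b', 'a'],
    val=[10, 20, 30],
))
idx_cols = ['frame', 'x']
grouped = df.groupby(idx_cols)['val'].sum()
# build the full cartesian product of index values so every animation
# frame has a row for every x value, filling the gaps with 0
full_idx = pd.MultiIndex.from_product([df[c].unique() for c in idx_cols], names=idx_cols)
grouped = grouped.reindex(full_idx).fillna(0).reset_index()
```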


def build_base_chart(raw_data, x, y, group_col=None, group_val=None, agg=None, allow_duplicates=False, return_raw=False,
unlimited_data=False, **kwargs):
unlimited_data=False, animate_by=None, **kwargs):
"""
Helper function to return data for 'chart-data' & 'correlations-ts' endpoints. Will return a dictionary of
dictionaries (one for each series) which contain the data for the x & y axes of the chart as well as the minimum &
@@ -318,29 +331,32 @@ def build_base_chart(raw_data, x, y, group_col=None, group_val=None, agg=None, a
:type allow_duplicates: bool, optional
:return: dict
"""

data, code = retrieve_chart_data(raw_data, x, y, kwargs.get('z'), group_col, group_val=group_val)
group_fmt_overrides = {'I': lambda v, as_string: json_int(v, as_string=as_string, fmt='{}')}
data, code = retrieve_chart_data(raw_data, x, y, kwargs.get('z'), group_col, animate_by, group_val=group_val)
x_col = str('x')
y_cols = make_list(y)
z_col = kwargs.get('z')
z_cols = make_list(z_col)
if group_col is not None and len(group_col):
data = data.sort_values(group_col + [x])
code.append("chart_data = chart_data.sort_values(['{cols}'])".format(cols="', '".join(group_col + [x])))
main_group = group_col
if animate_by is not None:
main_group = [animate_by] + main_group
sort_cols = main_group + [x]
data = data.sort_values(sort_cols)
code.append("chart_data = chart_data.sort_values(['{cols}'])".format(cols="', '".join(sort_cols)))
check_all_nan(data, [x] + y_cols)
data = data.rename(columns={x: x_col})
code.append("chart_data = chart_data.rename(columns={'" + x + "': '" + x_col + "'})")
if agg is not None and agg != 'raw':
data = data.groupby(group_col + [x_col])
data = data.groupby(main_group + [x_col])
data = getattr(data, agg)().reset_index()
code.append("chart_data = chart_data.groupby(['{cols}']).{agg}().reset_index()".format(
cols="', '".join(group_col + [x]), agg=agg
cols="', '".join(main_group + [x]), agg=agg
))
MAX_GROUPS = 30
group_vals = data[group_col].drop_duplicates()
if len(group_vals) > MAX_GROUPS:
dtypes = get_dtypes(group_vals)
group_fmt_overrides = {'I': lambda v, as_string: json_int(v, as_string=as_string, fmt='{}')}
group_fmts = {c: find_dtype_formatter(dtypes[c], overrides=group_fmt_overrides) for c in group_col}

group_f, _ = build_formatters(group_vals)
@@ -365,45 +381,73 @@ def build_base_chart(raw_data, x, y, group_col=None, group_val=None, agg=None, a
)

dtypes = get_dtypes(data)
group_fmt_overrides = {'I': lambda v, as_string: json_int(v, as_string=as_string, fmt='{}')}
group_fmts = {c: find_dtype_formatter(dtypes[c], overrides=group_fmt_overrides) for c in group_col}
for group_val, grp in data.groupby(group_col):

def _group_filter():
for gv, gc in zip(make_list(group_val), group_col):
classifier = classify_type(dtypes[gc])
yield group_filter_handler(gc, group_fmts[gc](gv, as_string=True), classifier)
group_filter = ' and '.join(list(_group_filter()))
ret_data['data'][group_filter] = data_f.format_lists(grp)

def _load_groups(df):
for group_val, grp in df.groupby(group_col):

def _group_filter():
for gv, gc in zip(make_list(group_val), group_col):
classifier = classify_type(dtypes[gc])
yield group_filter_handler(gc, group_fmts[gc](gv, as_string=True), classifier)

group_filter = ' and '.join(list(_group_filter()))
yield group_filter, data_f.format_lists(grp)

if animate_by is not None:
frame_fmt = find_dtype_formatter(dtypes[animate_by], overrides=group_fmt_overrides)
ret_data['frames'] = []
for frame_key, frame in data.sort_values(animate_by).groupby(animate_by):
ret_data['frames'].append(
dict(data=dict(_load_groups(frame)), name=frame_fmt(frame_key, as_string=True))
)
ret_data['data'] = copy.deepcopy(ret_data['frames'][-1]['data'])
else:
ret_data['data'] = dict(_load_groups(data))
return ret_data, code
sort_cols = [x] + (y_cols if len(z_cols) else [])
main_group = [x]
if animate_by is not None:
main_group = [animate_by] + main_group
sort_cols = main_group + (y_cols if len(z_cols) else [])
data = data.sort_values(sort_cols)
code.append("chart_data = chart_data.sort_values(['{cols}'])".format(cols="', '".join(sort_cols)))
check_all_nan(data, [x] + y_cols + z_cols)
check_all_nan(data, main_group + y_cols + z_cols)
y_cols = [str(y_col) for y_col in y_cols]
data.columns = [x_col] + y_cols + z_cols
code.append("chart_data.columns = ['{cols}']".format(cols="', '".join([x_col] + y_cols + z_cols)))
data = data[main_group + y_cols + z_cols]
main_group[-1] = x_col
data.columns = main_group + y_cols + z_cols
code.append("chart_data.columns = ['{cols}']".format(cols="', '".join(main_group + y_cols + z_cols)))
if agg is not None:
data, agg_code = build_agg_data(data, x_col, y_cols, kwargs, agg, z=z_col)
data, agg_code = build_agg_data(data, x_col, y_cols, kwargs, agg, z=z_col, animate_by=animate_by)
code += agg_code
data = data.dropna()
if return_raw:
return data.rename(columns={x_col: x})
code.append("chart_data = chart_data.dropna()")

dupe_cols = [x_col] + (y_cols if len(z_cols) else [])
dupe_cols = main_group + (y_cols if len(z_cols) else [])
check_exceptions(
data[dupe_cols].rename(columns={'x': x}),
allow_duplicates or agg == 'raw',
unlimited_data=unlimited_data,
data_limit=40000 if len(z_cols) else 15000
data_limit=40000 if len(z_cols) or animate_by is not None else 15000
)
data_f, range_f = build_formatters(data)

ret_data = dict(
data={str('all'): data_f.format_lists(data)},
min={col: fmt(data[col].min(), None) for _, col, fmt in range_f.fmts if col in [x_col] + y_cols + z_cols},
max={col: fmt(data[col].max(), None) for _, col, fmt in range_f.fmts if col in [x_col] + y_cols + z_cols}
)
if animate_by is not None:
frame_fmt = find_dtype_formatter(find_dtype(data[animate_by]), overrides=group_fmt_overrides)
ret_data['frames'] = []
for frame_key, frame in data.sort_values(animate_by).groupby(animate_by):
ret_data['frames'].append(
dict(data={str('all'): data_f.format_lists(frame)}, name=frame_fmt(frame_key, as_string=True))
)
ret_data['data'] = copy.deepcopy(ret_data['frames'][-1]['data'])
else:
ret_data['data'] = {str('all'): data_f.format_lists(data)}
return ret_data, code

