Bug: AttributeError: 'MultiIndex' object has no attribute '_data' #640

Open
DRMPN opened this issue Sep 17, 2024 · 4 comments

Comments

DRMPN commented Sep 17, 2024

Hello.

Thank you for your work!

I'm using the reports/old/reports.ipynb notebook. It produced all tables and some plots as intended.
However, running the Strip plots cells in the Visualizations section results in a runtime error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[37], line 2
      1 if 'binary' in problem_types:
----> 2     fig = draw_score_stripplot('result', 
      3                                results=all_res.sort_values(by=['framework']),
      4                                type_filter='binary', 
      5                                metadata=metadata,
      6                                xlabel=binary_result_label,
      7                                y_sort_by=tasks_sort_by,
      8                                hue_sort_by=frameworks_sort_key,
      9                                title=f"Results ({binary_result_label}) on {results_group} binary classification problems{title_extra}",
     10                                legend_labels=frameworks_labels,
     11                               );
     12     savefig(fig, create_file(output_dir, "visualizations", "binary_result_stripplot.png"))

File c:\Users\nnikitin-user\Desktop\automlbenchmark\amlb_report\visualizations\stripplot.py:72, in draw_score_stripplot(col, results, type_filter, metadata, y_sort_by, hue_sort_by, filename, **kwargs)
     69 hue = 'framework'
     70 hues = sorted(df[hue].unique(), key=hue_sort_by)
---> 72 fig = draw_stripplot(
     73     df,
     74     x=col,
     75     y=df.index,
     76     hue=hue,
     77     # ylabel='Task',
     78     y_labels=task_labels(df.index.unique()),
     79     hue_order=hues,
     80     legend_title="Framework",
     81     **kwargs
     82 )
     83 if filename:
     84     savefig(fig, create_file("graphics", config.results_group, filename))

File c:\Users\nnikitin-user\Desktop\automlbenchmark\amlb_report\visualizations\stripplot.py:27, in draw_stripplot(df, x, y, hue, xscale, xbound, hue_order, xlabel, ylabel, y_labels, title, legend_title, legend_loc, legend_labels, colormap, size)
     24 sb.despine(bottom=True, left=True)
     26 # Show each observation with a scatterplot
---> 27 sb.stripplot(data=df,
     28              x=x, y=y, hue=hue,
     29              hue_order=hue_order,
     30              palette=colormap,
     31              dodge=True, jitter=True,
     32              alpha=.25, zorder=1)
     34 # Show the conditional means
     35 sb.pointplot(data=df,
     36              x=x, y=y, hue=hue,
     37              hue_order=hue_order,
     38              palette=colormap,
     39              dodge=.5, join=False,
     40              markers='d', scale=.75, ci=None)

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\seaborn\categorical.py:2082, in stripplot(data, x, y, hue, order, hue_order, jitter, dodge, orient, color, palette, size, edgecolor, linewidth, hue_norm, log_scale, native_scale, formatter, legend, ax, **kwargs)
   2074 def stripplot(
   2075     data=None, *, x=None, y=None, hue=None, order=None, hue_order=None,
   2076     jitter=True, dodge=False, orient=None, color=None, palette=None,
   (...)
   2079     ax=None, **kwargs
   2080 ):
-> 2082     p = _CategoricalPlotter(
   2083         data=data,
   2084         variables=dict(x=x, y=y, hue=hue),
   2085         order=order,
   2086         orient=orient,
   2087         color=color,
   2088         legend=legend,
   2089     )
   2091     if ax is None:
   2092         ax = plt.gca()

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\seaborn\categorical.py:67, in _CategoricalPlotter.__init__(self, data, variables, order, orient, require_numeric, color, legend)
     56 def __init__(
     57     self,
     58     data=None,
   (...)
     64     legend="auto",
     65 ):
---> 67     super().__init__(data=data, variables=variables)
     69     # This method takes care of some bookkeeping that is necessary because the
     70     # original categorical plots (prior to the 2021 refactor) had some rules that
     71     # don't fit exactly into VectorPlotter logic. It may be wise to have a second
   (...)
     76     # default VectorPlotter rules. If we do decide to make orient part of the
     77     # _base variable assignment, we'll want to figure out how to express that.
     78     if self.input_format == "wide" and orient in ["h", "y"]:

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\seaborn\_base.py:634, in VectorPlotter.__init__(self, data, variables)
    629 # var_ordered is relevant only for categorical axis variables, and may
    630 # be better handled by an internal axis information object that tracks
    631 # such information and is set up by the scale_* methods. The analogous
    632 # information for numeric axes would be information about log scales.
    633 self._var_ordered = {"x": False, "y": False}  # alt., used DefaultDict
--> 634 self.assign_variables(data, variables)
    636 # TODO Lots of tests assume that these are called to initialize the
    637 # mappings to default values on class initialization. I'd prefer to
    638 # move away from that and only have a mapping when explicitly called.
    639 for var in ["hue", "size", "style"]:

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\seaborn\_base.py:679, in VectorPlotter.assign_variables(self, data, variables)
    674 else:
    675     # When dealing with long-form input, use the newer PlotData
    676     # object (internal but introduced for the objects interface)
    677     # to centralize / standardize data consumption logic.
    678     self.input_format = "long"
--> 679     plot_data = PlotData(data, variables)
    680     frame = plot_data.frame
    681     names = plot_data.names

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\seaborn\_core\data.py:58, in PlotData.__init__(self, data, variables)
     51 def __init__(
     52     self,
     53     data: DataSource,
     54     variables: dict[str, VariableSpec],
     55 ):
     57     data = handle_data_source(data)
---> 58     frame, names, ids = self._assign_variables(data, variables)
     60     self.frame = frame
     61     self.names = names

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\seaborn\_core\data.py:265, in PlotData._assign_variables(self, data, variables)
    260             ids[key] = id(val)
    262 # Construct a tidy plot DataFrame. This will convert a number of
    263 # types automatically, aligning on index in case of pandas objects
    264 # TODO Note: this fails when variable specs *only* have scalars!
--> 265 frame = pd.DataFrame(plot_data)
    267 return frame, names, ids

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\frame.py:664, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    658     mgr = self._init_mgr(
    659         data, axes={"index": index, "columns": columns}, dtype=dtype, copy=copy
    660     )
    662 elif isinstance(data, dict):
    663     # GH#38939 de facto copy defaults to False only in non-dict cases
--> 664     mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
    665 elif isinstance(data, ma.MaskedArray):
    666     import numpy.ma.mrecords as mrecords

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py:482, in dict_to_mgr(data, index, columns, dtype, typ, copy)
    480     columns = Index(keys)
    481     arrays = [com.maybe_iterable_to_list(data[k]) for k in keys]
--> 482     arrays = [arr if not isinstance(arr, Index) else arr._data for arr in arrays]
    484 if copy:
    485     if typ == "block":
    486         # We only need to copy arrays that will not get consolidated, i.e.
    487         #  only EA arrays

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\construction.py:482, in <listcomp>(.0)
    480     columns = Index(keys)
    481     arrays = [com.maybe_iterable_to_list(data[k]) for k in keys]
--> 482     arrays = [arr if not isinstance(arr, Index) else arr._data for arr in arrays]
    484 if copy:
    485     if typ == "block":
    486         # We only need to copy arrays that will not get consolidated, i.e.
    487         #  only EA arrays

AttributeError: 'MultiIndex' object has no attribute '_data'
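
Reading the traceback bottom-up: draw_score_stripplot passes y=df.index to seaborn, seaborn puts that value into a dict and calls pd.DataFrame(plot_data), and pandas then reads arr._data on the MultiIndex, which does not have that attribute. One possible workaround, sketched below, is to turn the MultiIndex into an ordinary string column before plotting and pass the column name as y. This is only a sketch under assumptions: the helper flatten_index_for_plot, the column name task_label, and the toy index levels are hypothetical and not part of amlb_report, and it has not been run against the notebook. The y_labels=task_labels(df.index.unique()) argument would also need adapting.

```python
import pandas as pd


def flatten_index_for_plot(df: pd.DataFrame, name: str = "task_label") -> pd.DataFrame:
    """Return a copy of df whose (Multi)Index is stored as one plain string column.

    Hypothetical helper for illustration; not part of amlb_report.
    """
    out = df.copy()
    if isinstance(out.index, pd.MultiIndex):
        # Each MultiIndex entry is a tuple; join its levels into a single label.
        labels = [" / ".join(str(level) for level in key) for key in out.index]
    else:
        labels = [str(key) for key in out.index]
    out[name] = labels
    # Drop the original index so nothing downstream sees a MultiIndex.
    return out.reset_index(drop=True)


# Toy data with made-up index levels; the real notebook's levels may differ.
idx = pd.MultiIndex.from_tuples(
    [("binary", "adult"), ("binary", "adult"), ("binary", "credit-g")],
    names=["type", "task"],
)
toy = pd.DataFrame(
    {"framework": ["A", "B", "A"], "result": [0.91, 0.93, 0.78]},
    index=idx,
)

flat = flatten_index_for_plot(toy)
print(flat)
```

With a frame like flat, the plotting call would name the column instead of passing the index object, for example sb.stripplot(data=flat, x="result", y="task_label", hue="framework").
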
@PGijsbers
Collaborator

The notebooks are in the old directory because they are intended to work with old (version ~1) result files. Newer versions are not completely backwards compatible. When working with newer versions, please try the notebooks from https://github.com/pgijsbers/amlb-results for now; those are the ones used for the JMLR paper. Sorry about the confusion.

@PGijsbers PGijsbers reopened this Sep 18, 2024
@PGijsbers
Collaborator

@Innixma you also have some visualization code, where can people find that?

@Innixma
Collaborator

Innixma commented Sep 18, 2024

@PGijsbers visualization code exists here: https://github.com/Innixma/autogluon-benchmark

Example: https://github.com/Innixma/autogluon-benchmark/blob/master/v1_results/run_eval_tabrepo_v1.py

Running the above code generates the tables and figures shown here (roughly): https://github.com/Innixma/automl-arena

I plan to clean this up and make it more easily available as part of TabRepo 2.0.

@DRMPN
Author

DRMPN commented Sep 18, 2024

That looks good, I will try that instead.
Thank you both ❤
