ENH: MultiIndex.from_frame #23141

Merged: 52 commits, Dec 9, 2018

Conversation

ms7463
Contributor

@ms7463 ms7463 commented Oct 13, 2018

This pull request adds the from_frame method for creating multiindexes. Along with this feature, the helper method "squeeze" has been added for squeezing single-level multiindexes to a standard index (analogous to df.squeeze).

Additionally the to_frame method was updated to guarantee that the order of the labels is preserved when converting a multiindex to a dataframe. Currently this cannot be guaranteed in Python 2.7 due to the use of a dictionary for creating the frame. With this change from_frame and to_frame are perfectly complementary.
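As a minimal sketch of the round trip described above (the airport data is borrowed from the later examples in this thread, and the behaviour shown assumes a pandas version that includes this PR):

```python
import pandas as pd

# Small MultiIndex whose level names are deliberately not in sorted order.
mi = pd.MultiIndex.from_tuples(
    [("NY", "KJFK"), ("NY", "KLGA"), ("CA", "KLAX")],
    names=["state", "ICAO"],
)

# to_frame(index=False) yields one column per level, in level order.
frame = mi.to_frame(index=False)

# from_frame rebuilds the index, using the column names as level names.
round_tripped = pd.MultiIndex.from_frame(frame)

print(list(frame.columns))       # column labels follow mi.names
print(round_tripped.equals(mi))  # True when the round trip is lossless
```

If the column order of to_frame were not guaranteed, the rebuilt index could come back with its levels swapped, which is why the two changes are bundled together.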

@pep8speaks

pep8speaks commented Oct 13, 2018

Hello @ArtinSarraf! Thanks for updating the PR.

Comment last updated on November 19, 2018 at 04:25 Hours UTC

@jreback
Contributor

jreback commented Oct 13, 2018

huh? pls show a real usecase for this

in the future pls open an issue for discussion first

@ms7463
Contributor Author

ms7463 commented Oct 13, 2018

Sorry should have tagged this issue opened ~2 months ago. The issue contains an example as well.

#22420

@ms7463
Contributor Author

ms7463 commented Oct 13, 2018

Here are some more concrete examples of how these can be used as well.

Adding metadata.

df = pd.DataFrame(np.random.randn(10, 6), columns=pd.MultiIndex.from_tuples([('NY', 'KJFK'), ('NY', 'KLGA'), ('CA', 'KLAX'), ('CA', 'KBUR'), ('AB', 'CYEG'), ('AB', 'CYYC')], names=['state', 'ICAO']))
country_state_mapping = pd.DataFrame([['US', 'NY'], ['US', 'CA'], ['CDN', 'AB'], ['CDN', 'ON']], columns=['country', 'state'])
meta = df.columns.to_frame(index=False)
new_meta = meta.merge(country_state_mapping, on='state', how='left')[['country', 'state', 'ICAO']]
df.columns = pd.MultiIndex.from_frame(new_meta)  # this can only work if to_frame guarantees label ordering, which it currently does not in 2.7, but does with this PR.

Filtering

# continuing from previous
meta = df.columns.to_frame(index=False)
new_meta = meta.loc[meta.state.isin(['CA', 'AB'])]
df = df.reindex(columns=pd.MultiIndex.from_frame(new_meta))

The examples here are extremely simplified, but reflect how my coworkers and I use pandas for working with multiindices.

@codecov

codecov bot commented Oct 13, 2018

Codecov Report

Merging #23141 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #23141      +/-   ##
==========================================
+ Coverage    92.2%    92.2%   +<.01%     
==========================================
  Files         162      162              
  Lines       51700    51709       +9     
==========================================
+ Hits        47670    47679       +9     
  Misses       4030     4030
Flag        Coverage Δ
#multiple   90.6% <100%> (ø) ⬆️
#single     43.02% <40%> (ø) ⬆️

Impacted Files                  Coverage Δ
pandas/core/indexes/multi.py    95.58% <100%> (+0.03%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fc64ca8...9159b2d. Read the comment docs.

@jreback
Contributor

jreback commented Oct 14, 2018

you can do the same thing if everything is in long form.

In [93]: (pd.merge(df.unstack().reset_index(), country_state_mapping, on='state', how='left')
    ...:    .set_index(['country', 'state', 'ICAO', 'level_2'])
    ...:    .sort_index()
    ...:    .unstack(level=[0,1,2])
    ...:    .sort_index(axis=1)
    ...: )
Out[93]: 
                0                                                  
country       CDN                  US                              
state          AB                  CA                  NY          
ICAO         CYEG      CYYC      KBUR      KLAX      KJFK      KLGA
level_2                                                            
0        0.368735  0.428429 -1.173573  0.256197  1.792880  0.414737
1        0.993952 -0.313126 -1.814687 -0.025160 -0.775963 -0.665919
2       -0.635185  2.063095  0.063650 -0.939815  0.219432  1.489314
3       -0.156358  0.201501  0.344305 -1.562817 -0.103385  0.482686
4        0.551272 -1.628747 -1.387776 -1.400052 -1.493635  2.323896
5       -1.487852  0.466714  1.195675  0.006298  0.277463  1.170253
6       -0.453253  1.850580  0.240695  1.375471  0.144580  0.012171
7       -0.079981 -1.139398  0.855885  2.687475 -1.374322  0.543441
8       -1.432287  0.100697 -0.554310 -0.306952  1.011671 -0.723025
9       -0.235468  0.893887  0.638946  0.916163  0.149475  1.121286

What you are proposing is wide-form manipulation, which, though a nice idea that makes sense in this particular case, is not general purpose, nor easy for people to do (and requires sorting guarantees).

So why should this be added?

@ms7463
Contributor Author

ms7463 commented Oct 14, 2018

One of the nice features of multiindexes is that they act as meta-dataframes for the associated numeric data, allowing a frame to be much more concise and memory efficient. Long form manipulations require multiple pivots (which can be time consuming) and can require significant extra memory. The relationship between multiindexes and a standard dataframe can also be demonstrated through the following example. One can substitute a multiindex (and I have seen this done before) with a separate metaframe like so:

df = pd.DataFrame(np.random.randn(10, 6))
meta = pd.DataFrame([('NY', 'KJFK'), ('NY', 'KLGA'), ('CA', 'KLAX'), ('CA', 'KBUR'), ('AB', 'CYEG'), ('AB', 'CYYC')], columns=['state', 'ICAO'])
df.loc[:, meta.loc[meta.state.isin(['NY', 'AB'])].index]  # equivalent to above filtering example

I believe that this, along with the fact that to_frame already exists shows that from_frame is a reasonable and logical complementary method to be added as a way to instantiate multiindexes.

Regarding the sorting guarantees of to_frame, I think that this is something that any user would reasonably expect, considering the similarities between a mi and a df. Not only does it keep manipulations that involve to_frame consistent, but all other related casting methods (e.g. tolist) already ensure that the ordering of the resulting type matches that of the original multiindex. At the very least I think adding a parameter to to_frame for preserving the original ordering would make sense. It would be very easy to add to the existing implementation.
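A short sketch of the consistency argument, using made-up level names and values: the rows of to_frame can be compared against tolist, and the columns against names. The column comparison assumes the ordering guarantee this PR proposes.

```python
import pandas as pd

# Hypothetical index; names and values are chosen so that no order is alphabetical.
mi = pd.MultiIndex.from_arrays(
    [["z", "a", "m"], [3, 1, 2], ["q", "r", "s"]],
    names=["zeta", "alpha", "mid"],
)

frame = mi.to_frame(index=False)

# Each row of the frame, read as a plain tuple, should match the tuples from tolist().
rows_from_frame = [tuple(row) for row in frame.itertuples(index=False, name=None)]
rows_match = rows_from_frame == mi.tolist()

# The column labels should come out in the same order as the level names.
columns_match = list(frame.columns) == list(mi.names)

print(rows_match, columns_match)
```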

Contributor

@jreback jreback left a comment

so need some discussions

and of course would need much testing - having some examples would improve the possibility of acceptance

@ms7463
Contributor Author

ms7463 commented Oct 14, 2018

I would be happy to contribute to any further discussions. Is there a more appropriate forum for that?

Also I will add tests to the PR this week.

Are there any types of examples in particular that would help move along the addition of this feature?

Thanks.

@jreback
Contributor

jreback commented Oct 15, 2018

Are there any types of examples in particular that would help move along the addition of this feature?

some usecases, and showing how it's possible now, but how the 'new' method makes it easier.

@ms7463
Contributor Author

ms7463 commented Oct 16, 2018

Here is a demo to show the benefits of the from_frame method. Thanks for taking the time to look.
Some tests have been added as well.

# for convenience of this demo
pd.MultiIndex.from_frame = staticmethod(lambda df: pd.MultiIndex.from_tuples(list(df.values), names=list(df)))

Setup df

# memory usage of below frame ~82000000
#    multiindexes this large are not uncommon when dealing with things such as weather data for many cities
#    with their associated meta location info (i.e. country, state, county, etc.)
df = pd.DataFrame(
    np.random.randn(10000, 1024), 
    index=pd.date_range('19910905', periods=10000, name='date'), 
    columns=pd.MultiIndex.from_product(
        [list('abcd'), list('efgh'), list('ijkl'), list('mnop'), list('qrst')], 
        names=['L1', 'L2', 'L3', 'L4', 'L5']
    )
)

meta_mapping = pd.DataFrame([['x', 'a'], ['x', 'b'], ['y', 'c'], ['z', 'd']], columns=['L0', 'L1'])

Adding meta data

# Long Form
# peak memory usage ~573440080  --  ~7x larger than original df
# execution time: ~13.7 secs

(
    df.unstack().reset_index().merge(meta_mapping, on='L1', how='left')
    .set_index(['L0', 'L1', 'L2', 'L3', 'L4', 'L5', 'date'])[0].unstack([0, 1, 2, 3, 4, 5])
)


# Wide Form
# peak memory usage ~82000000  --  no difference from original df
# execution time: ~0.271 sec  --  ~50x faster than long form
# cleaner and does not require explicitly writing out positional arguments or axis 0 names
# requires guaranteed ordering of to_frame - easy to guarantee/implement

(
    df.T.set_index(pd.MultiIndex.from_frame(
        df.columns.to_frame(index=False)
        .merge(meta_mapping, on='L1', how='left')[['L0', 'L1', 'L2', 'L3', 'L4', 'L5']]
    )).T
)
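The two forms can be checked against each other on a scaled-down frame. This is only a sketch: the sizes and the two-level columns are stand-ins for the larger setup above, it assigns the new columns directly instead of using the `.T.set_index(...).T` chain, and it says nothing about the timing or memory figures.

```python
import numpy as np
import pandas as pd

# Scaled-down stand-in for the large frame above (hypothetical sizes).
df = pd.DataFrame(
    np.random.randn(5, 8),
    index=pd.date_range("19910905", periods=5, name="date"),
    columns=pd.MultiIndex.from_product(
        [list("abcd"), list("ef")], names=["L1", "L2"]
    ),
)
meta_mapping = pd.DataFrame(
    [["x", "a"], ["x", "b"], ["y", "c"], ["z", "d"]], columns=["L0", "L1"]
)

# Long form: move everything into rows, merge, then pivot back.
long_form = (
    df.unstack().reset_index().merge(meta_mapping, on="L1", how="left")
    .set_index(["L0", "L1", "L2", "date"])[0].unstack([0, 1, 2])
)

# Wide form: only the column metadata is touched.
new_columns = pd.MultiIndex.from_frame(
    df.columns.to_frame(index=False)
    .merge(meta_mapping, on="L1", how="left")[["L0", "L1", "L2"]]
)
wide_form = df.copy()
wide_form.columns = new_columns

# Sort columns on both sides so the comparison does not depend on pivot order.
long_sorted = long_form.sort_index(axis=1)
wide_sorted = wide_form.sort_index(axis=1)
same_labels = list(long_sorted.columns) == list(wide_sorted.columns)
same_values = np.allclose(long_sorted.to_numpy(), wide_sorted.to_numpy())
print(same_labels, same_values)
```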

Quick Prototyping

Use this df for the example below

# Often find it easier to prototype meta frames in excel. I then need to use that frame as a multiindex.
# Assume the below snippet was copied from excel rather than pyperclip
pyperclip.copy('''
country   state   county        city       
USA       ct      fairfield     Greenwich  
USA       ct      fairfield     Stamford   
USA       ny      westchester   Bronxville 
USA       ny      bronx         Bronx      
USA       nj      monmouth      Rumson     
USA       nj      monmouth      Middletown 
USA       nj      ocean         Ocean      
Canada    on      york          Toronto    
''')
meta = pd.read_clipboard()
df = pd.DataFrame(np.random.randn(10, 8), columns=pd.MultiIndex.from_frame(meta))

# As far as I know there is no good and quick way to do this currently. (aside from the from_tuples method I showed above)
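For anyone who wants to try the prototyping flow without a clipboard, here is a sketch that parses the same kind of whitespace-delimited text from a string. The `io.StringIO` and `read_csv` combination is a substitute for `read_clipboard`, not what the demo above uses, and only a subset of the table is included.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for text pasted from a spreadsheet (subset of the table above).
pasted = """\
country state county city
USA ct fairfield Greenwich
USA ct fairfield Stamford
USA ny westchester Bronxville
Canada on york Toronto
"""

# Parse the whitespace-delimited text into an ordinary metadata frame.
meta = pd.read_csv(io.StringIO(pasted), sep=r"\s+")

# One column of random data per metadata row, labelled by the new MultiIndex.
columns = pd.MultiIndex.from_frame(meta)
df = pd.DataFrame(np.random.randn(3, len(meta)), columns=columns)

print(list(df.columns.names))
```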

Formatting MultiIndex

Use this formatted df for remainder of demo

# With from_frame
meta = df.columns.to_frame(index=False)
meta = meta.assign(state=meta.state.str.upper(), county=meta.county.str.title())
df.columns = pd.MultiIndex.from_frame(meta)

# Without from_frame
# More verbose, not as clear and requires the use of positional levels (as far as I know there is no complementary set_level_values method for get_level_values, which would also be nice)
df.columns = df.columns.set_levels(df.columns.levels[1].str.upper(), level=1).set_levels(df.columns.levels[2].str.title(), level=2)

Complex Filtering/Slicing

meta = df.columns.to_frame(index=False)  # used for all from_frame examples below

Keep only counties with > 1 city

# With from_frame
df.reindex(columns=pd.MultiIndex.from_frame(meta.groupby('county').filter(lambda df: len(df)>1)))

# Long Form - same disadvantages as stated above
(
    df.unstack().reset_index().groupby('county').filter(lambda dfx: len(dfx.city.unique())>1)
    .set_index(['country', 'state', 'county', 'city', 'level_4']).unstack([0,1,2,3])[0]
)

Admittedly the 2 remaining examples below can be done without from_frame as well, like so:

# Requires to_frame to be ordered
mask = (meta...) & (meta...) | (meta...)
df.loc[:, mask.values]  # pass the underlying array; a boolean Series would be aligned on column labels
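A concrete version of the mask approach, as a sketch with a made-up four-column frame. The mask is converted to a NumPy array before indexing because, as far as I can tell, `.loc` tries to align a boolean Series on labels, and the mask's integer index does not match the MultiIndex columns.

```python
import numpy as np
import pandas as pd

# Hypothetical metadata; row i describes column i of df.
meta = pd.DataFrame(
    [["USA", "CT", "Greenwich"], ["USA", "CT", "Stamford"],
     ["USA", "NJ", "Rumson"], ["Canada", "ON", "Toronto"]],
    columns=["country", "state", "city"],
)
df = pd.DataFrame(np.random.randn(3, 4), columns=pd.MultiIndex.from_frame(meta))

# Rebuild the metadata frame from the columns (relies on ordered to_frame).
meta = df.columns.to_frame(index=False)

# Boolean mask over columns, built with ordinary DataFrame logic.
mask = meta.state.isin(["CT", "NJ"]) & (meta.city != "Greenwich")

# Apply the mask by position rather than by label.
selected = df.loc[:, mask.to_numpy()]
print(selected.columns.get_level_values("city").tolist())  # ['Stamford', 'Rumson']
```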

Get CT + NJ without Greenwich

# With from_frame
filtered_df = meta.loc[(meta.state.isin(['CT', 'NJ'])) & (meta.city != 'Greenwich')]
df.reindex(columns=pd.MultiIndex.from_frame(filtered_df))

# Without from_frame
# While this is more concise, it requires two different chained methods, while the from_frame
# method simplifies the filter to a standard dataframe filter
df.loc[:, pd.IndexSlice['USA', ['CT', 'NJ'], :]].drop('Greenwich', axis=1, level='city')

Get all cities that don't start with "Bron"

# With from_frame
filtered_df = meta.loc[~meta.city.str.startswith('Bron')]
df.reindex(columns=pd.MultiIndex.from_frame(filtered_df))

# Without from_frame
# Not sure of a good way without going to long form
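One possible way to do this without from_frame or long form, offered as a sketch on a small made-up frame: `get_level_values` returns a flat Index for one level, and its `.str` accessor can produce the positional mask directly.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level columns; only the city level matters here.
columns = pd.MultiIndex.from_tuples(
    [("ny", "Bronxville"), ("ny", "Bronx"), ("ct", "Stamford"), ("on", "Toronto")],
    names=["state", "city"],
)
df = pd.DataFrame(np.random.randn(3, 4), columns=columns)

# One boolean per column, in column order.
starts_with_bron = df.columns.get_level_values("city").str.startswith("Bron")

# Keep only the columns whose city does not start with "Bron".
filtered = df.loc[:, ~starts_with_bron]
print(filtered.columns.get_level_values("city").tolist())  # ['Stamford', 'Toronto']
```

This covers a single-level condition; conditions that combine several levels still read more naturally against the metadata frame.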

Custom Sorting

This is nice for formatting report output

Given the following custom ordering

order = {'state': ['NJ', 'CT', 'NY', 'ON'], 'country': ['USA', 'Canada']}

# With from_frame
def custom_sort(df, order):  # would be nice to see something like this as a df method (with some edge case handling)
    df = df.copy()
    orig_dtypes = df.dtypes.to_dict()
    for col in df:
        if col in order:
            df.loc[:, col] = pd.Categorical(df.loc[:, col], order[col])
    df = df.sort_values(by=df.columns.tolist())
    for col in df:
        df[col] = df[col].astype(orig_dtypes[col])
    return df

meta = df.columns.to_frame(index=False)
df.reindex(columns=pd.MultiIndex.from_frame(custom_sort(meta, order)))

# Without from_frame
# However, this requires explicitly listing out all the columns
# Not ideal/practical if you have many more items
df.iloc[:, [5,4,6,0,1,3,2,7]]

@ms7463
Contributor Author

ms7463 commented Oct 16, 2018

Is there a way to rerun the Azure pipeline tests? The tests only failed for “Windows py36_np14” environment and the failed tests don’t seem to be related to my commits at all. I’m guessing it was a blip in the testing framework.

@ms7463
Contributor Author

ms7463 commented Oct 20, 2018

@jreback

Please see additions to demo (Formatting and Custom Sorting):
#23141 (comment)

Also any ideas on the tests? I don't see anything in the logs that point to any changes I've made.

@ms7463
Contributor Author

ms7463 commented Oct 27, 2018

@jreback. Any thoughts on the examples provided above?

@jreback
Contributor

jreback commented Oct 28, 2018

@pandas-dev/pandas-core if anyone has thoughts here

Member

@datapythonista datapythonista left a comment

Some issues with the docstrings. Also would be nice to add See Also sections. If you can take a look at the documentation for the docstrings that would be great.

Parameters
----------
df : pd.DataFrame
DataFrame to be converted to MultiIndex
Member

Can you finish with a period.

----------
df : pd.DataFrame
DataFrame to be converted to MultiIndex
squeeze : bool
Member

Can you add the default.

@classmethod
def from_frame(cls, df, squeeze=True):
"""
Make a MultiIndex from a dataframe
Member

Finish with period. Run ./scripts/validate_docstrings.py pandas.MultiIndex.from_frame to make sure the docstring follows all the standards we validate.

Use DataFrame instead of dataframe.


Returns
-------
index : MultiIndex
Member

no need for giving a name, just leave the type. Add a description in the next line (indented).

--------
>>> df = pd.DataFrame([[0, u'green'], [0, u'purple'], [1, u'green'],
[1, u'purple'], [2, u'green'], [2, u'purple']],
columns=[u'number', u'color'])
Member

this is missing the ... for multiline commands. Also show the content of df after creating it.
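A sketch of what the requested form could look like: continuation lines carry the `...` prompt and the frame is echoed afterwards. The wrapper function is hypothetical and exists only so the docstring can be run through `doctest`; the `u''` prefixes from the hunk are dropped here.

```python
import doctest

import pandas as pd


def color_frame_example():
    """
    Hypothetical holder for the docstring example under review.

    Examples
    --------
    >>> df = pd.DataFrame([[0, 'green'], [0, 'purple'], [1, 'green'],
    ...                    [1, 'purple'], [2, 'green'], [2, 'purple']],
    ...                   columns=['number', 'color'])
    >>> df
       number   color
    0       0   green
    1       0  purple
    2       1   green
    3       1  purple
    4       2   green
    5       2  purple
    """


# Parse and run the docstring as a doctest to confirm the prompts are well formed.
parser = doctest.DocTestParser()
parsed = parser.get_doctest(
    color_frame_example.__doc__, {"pd": pd}, "color_frame_example", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(parsed)
print(results)
```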

"""
Squeeze a single level multiindex to be a regular Index instane. If
the MultiIndex is more than a single level, return a copy of the
MultiIndex.
Member

Can you run validate_docstrings.py for this docstring, and also read the contributing docstring documentation too. There are several issues.

@ms7463
Contributor Author

ms7463 commented Oct 28, 2018

@datapythonista. Changes have been made and pushed. I've also added from_frame to the "See Also" of the other 3 constructor methods. Thanks.

Contributor

@jreback jreback left a comment

some doc comments. can you update advanced.rst (see where we use .from_arrays and add an example using .from_frame). ping on green.

@pytest.mark.parametrize('names_in,names_out', [
(None, [('L1', 'x'), ('L2', 'y')]),
(['x', 'y'], ['x', 'y']),
('bad_input', ValueError("Names should be list-like for a MultiIndex")),
Contributor

why don't you split this to 2 tests, with 1 the working cases, and 1 the error cases, easier to read

Contributor Author

done.

@jreback
Contributor

jreback commented Dec 4, 2018

looks pretty good to me, just some doc comments remain. @toobaz @datapythonista can you have a look and comment or approve.

Contributor

@jreback jreback left a comment

small doc-comments. ping on green.

@@ -378,6 +378,7 @@ Backwards incompatible API changes
- Passing scalar values to :class:`DatetimeIndex` or :class:`TimedeltaIndex` will now raise ``TypeError`` instead of ``ValueError`` (:issue:`23539`)
- ``max_rows`` and ``max_cols`` parameters removed from :class:`HTMLFormatter` since truncation is handled by :class:`DataFrameFormatter` (:issue:`23818`)
- :meth:`read_csv` will now raise a ``ValueError`` if a column with missing values is declared as having dtype ``bool`` (:issue:`20591`)
- The column order of the resultant ``DataFrame`` from ``MultiIndex.to_frame()`` is now guaranteed to match the ``MultiIndex.names`` order. (:issue:`22420`)
Contributor

can you use a :meth: ref for MultiIndex.to_frame() and :attr: for .names

Contributor Author

done.

3 1 jolly
4 2 joy
5 2 joy
>>> pd.MultiIndex.from_frame(df)
Contributor

can you add a blank line between cases

Contributor Author

done.

labels=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 2, 2]],
names=['will_be', 'used'])

>>> df = pd.DataFrame([['ahc', 'iam'], ['ahc', 'wim'], ['boh', 'amg'],
Contributor

can you add a 1-line explanation here (I think the first one is self-explanatory)

Contributor Author

done.

np.array([[1, 2], [3, 4], [5, 6]]),
27
])
def test_from_frame_non_frame(non_frame):
Contributor

rename to test_from_frame_error

Contributor Author

done.

# GH 22420
mi = pd.MultiIndex.from_arrays([
pd.date_range('19910905', periods=6, tz='US/Eastern'),
[1, 1, 1, 2, 2, 2],
Contributor

is this a repeated test of the above, if so, then not necessary here.

Contributor Author

This test was at the suggestion of @TomAugspurger
#23141 (comment)
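For context, a sketch of the kind of dtype-fidelity check the hunk above appears to be building: round-trip an index with a timezone-aware level and compare level dtypes. Only the first two arrays are visible in the hunk; the level names here are assumptions.

```python
import pandas as pd

# Two levels as shown in the hunk; the names are hypothetical.
mi = pd.MultiIndex.from_arrays(
    [
        pd.date_range("19910905", periods=6, tz="US/Eastern"),
        [1, 1, 1, 2, 2, 2],
    ],
    names=["dates", "a"],
)

# Round trip through a DataFrame and back.
frame = mi.to_frame()
result = pd.MultiIndex.from_frame(frame)

# Compare per-level dtypes, so a silently dropped timezone would be caught.
original_dtypes = [level.dtype for level in mi.levels]
result_dtypes = [level.dtype for level in result.levels]
print(result.equals(mi), original_dtypes == result_dtypes)
```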

@jreback jreback added this to the 0.24.0 milestone Dec 6, 2018
@ms7463
Contributor Author

ms7463 commented Dec 6, 2018

@jreback - any reason why I would get this linting error in pandas-dev.pandas tests ?

Bash exited with code '1'.

Nothing else accompanying it (I fixed the other ones that were there before). The local pep8 checks don't show anything, and the branch is up to date with master.

Same thing is happening on this PR (#23538)

@datapythonista
Member

To see the error you need to click into the log. For many of them it's easy to make them red, and they are highlighted, but in this case it's an "error" with the ordering of the imports:

2018-12-06T03:44:33.5517264Z ERROR: /home/vsts/work/1/s/pandas/core/indexes/multi.py Imports are incorrectly sorted.
2018-12-06T03:44:33.5518052Z ERROR: /home/vsts/work/1/s/pandas/tests/indexes/multi/test_constructor.py Imports are incorrectly sorted.

Member

@datapythonista datapythonista left a comment

Looks good, the docstring has some format errors and can be improved, but good work.


Parameters
----------
df : pd.DataFrame
Member

Suggested change
- df : pd.DataFrame
+ df : DataFrame

Contributor Author

fixed

----------
df : pd.DataFrame
DataFrame to be converted to MultiIndex.
sortorder : int or None
Member

Suggested change
- sortorder : int or None
+ sortorder : int, optional

And please explain what it means to not be provided (None)

Contributor Author

done

labels=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 2, 2]],
names=['a', 'b'])

# Use explicit names, instead of column names
Member

Better as a paragraph than as a code comment (remove the # and leave a blank line after this line).

Contributor Author

done.

labels=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 2, 2]],
names=['X', 'Y'])

See Also
Member

This goes before the Examples section. ./scripts/validate_docstrings.py pandas.MultiIndex.from_frame should report it as an error. Make sure nothing is reported by the script.

Contributor Author

Fixed this docstring and all the other constructor methods docstrings (since I had to modify them to update the See Alsos). I also fixed the pd.MultiIndex docstring to the best of my abilities (since I had to make some small modifications to that too). However, there were still some issues with the MultiIndex docstring:

2 Errors found:
    Parameters {_set_identity, name, dtype} not documented
    Unknown parameters {labels}

Any ideas on how I should address these? Looks like labels is there to serve as a deprecation reminder.

@ms7463
Contributor Author

ms7463 commented Dec 7, 2018

@jreback - tests are clean.

@jreback
Contributor

jreback commented Dec 7, 2018

@ArtinSarraf one more merge master and looks good.

@jreback
Contributor

jreback commented Dec 7, 2018

@datapythonista @toobaz if you have further comments.

Member

@datapythonista datapythonista left a comment

lgtm, thanks for all the fixes to docstrings @ArtinSarraf

@ms7463
Contributor Author

ms7463 commented Dec 8, 2018

@jreback - master has been merged and tests are green.

@toobaz
Member

toobaz commented Dec 8, 2018

Looks good to me!

@jreback jreback merged commit d8fd5a6 into pandas-dev:master Dec 9, 2018
@jreback
Contributor

jreback commented Dec 9, 2018

thanks @ArtinSarraf

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Successfully merging this pull request may close these issues.

Feature Request: pd.MultiIndex.from_frame(). Complement to pd.MultiIndex.to_frame().
9 participants