isin fails with large series/lists of tuples #17910

#### Code Sample

```python
from random import randint

import pandas as pd

def bug_report(n=2000000, idmax=22750, prodmax=3414341):
    ids = [randint(1, idmax) for _ in range(n)]
    r = lambda: randint(1, prodmax)
    prods = [(-1,-2,-3), (-1,-2,-3)] + [(r(), r(), r()) for _ in range(n-2)]
    
    df = pd.DataFrame({'ids': ids, 'products': prods})
    counts = df['products'].value_counts()
    counts_idxs = counts[counts >= 2].index
    idxs = df['products'].isin(counts_idxs)
    return df[idxs]
```
#### Problem description

There are several ways to trigger the bug, each of them resulting in `isin` returning all `False` where some entries should be `True`.

Take the example above: the tuple `(-1,-2,-3)` is repeated twice, and it can be checked that `counts` gives it a count of `2` and that `counts_idxs` contains `(-1,-2,-3)`. So, independently of the rest of `products`, the dataframe selected with the `idxs` returned by `isin` should have at least 2 rows. Calling the function as is, it does not. An explanation, likely causes and possible solutions follow.

Manually importing `from pandas.core.algorithms import isin` and setting `idxs = isin(df['products'], counts[counts >= 2].index)` results in exactly the same behaviour.

I've tried to reproduce this behaviour without using tuples at all and have not been able to.
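As a reference point for what `isin` is expected to return here, the same selection can be sketched in plain Python with a `Counter` and set membership. This is only an illustration of the expected semantics on a tiny input, not how pandas implements it; `expected_mask` is a hypothetical helper name.

```python
from collections import Counter


def expected_mask(products, min_count=2):
    """Return True for every tuple that occurs at least min_count times."""
    counts = Counter(products)
    # Tuples are hashable, so they can be stored in a set for O(1) lookups.
    repeated = {p for p, c in counts.items() if c >= min_count}
    return [p in repeated for p in products]


# Tiny stand-in for the 'products' column: one tuple repeated twice.
products = [(-1, -2, -3), (-1, -2, -3), (5, 6, 7), (8, 9, 10)]
mask = expected_mask(products)
print(mask)
```

Whatever the size of the input, the two rows holding `(-1,-2,-3)` should be selected.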

#### Proposed solution

This seems to be a regression in `0.20.x`, as the latest `0.19.x` release (0.19.2) works perfectly fine. Indeed, manually copying `isin` from `0.19.x` and using it in place of the `0.20.x` version works. Comparing the two versions, a particular `if` appears to have been reversed or removed in
https://github.com/pandas-dev/pandas/blob/master/pandas/core/algorithms.py#L414
and
https://github.com/pandas-dev/pandas/blob/v0.19.2/pandas/core/algorithms.py#L144
https://github.com/pandas-dev/pandas/blob/v0.19.2/pandas/core/algorithms.py#L161

This results in `0.20.x` relying on `numpy.in1d`, whereas `0.19.x` used `lib.ismember`, which is equivalent to `htable.ismember_object` in `0.20.x`. One can confirm this because:

```python
import numpy as np
import pandas

htable = pandas._libs.hashtable
idxs = htable.ismember_object(df['products'].values, np.asarray(counts[counts >= 2].index))
df[idxs]
```

works fine, whereas

```python
idxs = np.in1d(df['products'].values, np.asarray(counts[counts >= 2].index))
df[idxs]
```

silently fails.

Now, either this is temporarily fixed in pandas by not relying on `in1d`, or an issue is submitted to numpy (which I will do once I have had a look at `in1d` to see what is happening). One can also work around it by not using tuples at all, for example by applying `hash` beforehand.
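A minimal sketch of the `hash` workaround mentioned above, assuming the `products` column holds hashable tuples. `isin_via_hash` is a hypothetical helper name, not a pandas function, and the small dataframe is made up for illustration.

```python
import pandas as pd


def isin_via_hash(series, wanted):
    """Membership test on tuple values by comparing their integer hashes."""
    # Hash every tuple once, giving an integer Series instead of an object one.
    hashed = series.map(hash)
    # Hash the lookup values the same way.
    wanted_hashes = [hash(t) for t in wanted]
    # Distinct tuples can share a hash (in CPython, hash(-1) == hash(-2)),
    # so a hit means "very likely equal", not a guarantee.
    return hashed.isin(wanted_hashes)


df = pd.DataFrame({
    "ids": [1, 2, 3, 4],
    "products": [(-1, -2, -3), (-1, -2, -3), (5, 6, 7), (8, 9, 10)],
})
mask = isin_via_hash(df["products"], [(-1, -2, -3)])
print(df[mask])
```

Because hashes can collide, a stricter variant would re-check the selected rows against the original tuples before using the result.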

I've narrowed the problem down a bit more, and it is related not only to `n` but also to `prodmax`:

Any combination with `n > 1000000` and `prodmax > 1986` produces an empty dataframe:
```python
bug_report(n=1000001, prodmax=1987)
bug_report(n=1000001)
bug_report()
```

In contrast, having `n <= 1000000` or `prodmax <= 1986` works just fine. The parameter values have been deduced from:

* `n` from https://github.com/pandas-dev/pandas/blob/master/pandas/core/algorithms.py#L414
* `prodmax` by binary search:
```python
def narrow():
    start = 256
    end = 2048
    while start + 1 < end:
        print(start, end)
        df = bug_report(n=1000001, prodmax=(start + end) // 2)
        if df.empty:
            end = (start + end) // 2
        else:
            start = (start + end) // 2
    
    return start, df.empty

narrow()
# (1896, False)
```
#### Output of ``pd.show_versions()``

<details>
INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-72-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.0.5
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
s3fs: None
pandas_gbq: None
pandas_datareader: None

</details>

This has been confirmed and tested on multiple PCs and environments, always with Python 3.x.
