DataFrame.to_json(orient='table') emits data:str instead of data:[dict,] after a number of requests under mod-wsgi  #20728

Closed
@akrherz

Sadly, I don't have an SSCCE for this, but my production setup reproduces the bug easily. I am using the current conda-forge pandas (0.22.0) on Python 2.7 within a single-threaded mod-wsgi daemon process. My code is essentially:

df.to_json(orient='table', default_handler=str)

This works for some number of sequential requests under mod-wsgi. By "works", I mean the emitted JSON object has a data attribute holding an array of dict objects, one per row:

"data": [{"col":"val", "col2": "val2"},{"col":"val3", "col2": "val4"}...]

After some number of requests, though, the emitted JSON looks like this:

"data":"val    val2     ...\nval3   val4  ...\n"

Restarting Apache or mod-wsgi makes to_json emit the proper "data":[dict] for the same data frame again.
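To tell the two output shapes apart at request time, a small check like the one below might help. It is only a sketch: the helper name `data_is_row_list` is hypothetical, the sample strings are modelled on the two outputs quoted above rather than captured from production, and it uses only the standard-library `json` module.

```python
import json


def data_is_row_list(table_json):
    """Return True if the "data" member is a list of dicts (one per row)."""
    payload = json.loads(table_json)
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    return isinstance(data, list) and all(isinstance(row, dict) for row in data)


# Illustrative payloads shaped like the two outputs described in this report.
good = '{"data": [{"col": "val", "col2": "val2"}, {"col": "val3", "col2": "val4"}]}'
bad = '{"data": "val    val2\\nval3   val4\\n"}'

print(data_is_row_list(good))
print(data_is_row_list(bad))
```

In the WSGI handler, the string returned by `df.to_json(orient='table', default_handler=str)` could be passed to this check and the request logged whenever it returns False, which would at least pin down when the switch happens.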

Output of pd.show_versions()

>>> pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.14.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-862.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: None.None

pandas: 0.22.0
pytest: 3.5.0
pip: 9.0.3
setuptools: 39.0.1
Cython: 0.28.2
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 5.6.0
sphinx: 1.7.2
patsy: 0.5.0
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.2
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.6
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

I have been fighting mod-wsgi for many moons now with various libraries like numpy, matplotlib, and pandas, so I suspect that running them under it just isn't a good idea. If you can suggest a good long-running web process to run pandas within, I would be grateful to know. Thank you for your time!

Labels: IO JSON (read_json, to_json, json_normalize)