BUG: Error when using CyLP #1022

Open
tanelv opened this issue Nov 23, 2021 · 36 comments
Assignees
Labels
Bug Issue in the Code

Comments

@tanelv

tanelv commented Nov 23, 2021

After finally managing to successfully install CyLP, using it in phase_proc_lp (pyart.correct.phase_proc_lp(radar, 2.0, self_const = 12000.0, low_z=0.0, high_z=53.0, min_phidp=0.01, min_ncp=0.3, min_rhv=0.8, LP_solver='cylp_mp', proc=15)) does not work. The error seems to be
"Error in `python': free(): invalid pointer: 0x00005597c77d6c98"

A long list of messages and a memory map is printed out (cylp_messages.txt), and then the script just hangs.

I installed CyLP following these instructions https://github.com/coin-or/CyLP

I also tried installing CyLP following the instructions provided in the Py-ART documentation https://arm-doe.github.io/pyart/setting_up_an_environment.html, but without success. I got what looked like compilation issues even after installing additional conda compilers. So the original CyLP installation instructions worked, but for some reason the phase_proc_lp function is still not working.

@zssherman
Collaborator

Hmmm, haven't seen that error before. How large are the files you are trying to process? What OS are you using? I'll try to install using their method to see if I can reproduce it. I usually use the pip install of the Python branch of jjhelmus, as it has usually been more stable, but I'm confused about why the compilers wouldn't have helped.

@zssherman
Collaborator

Also, while I try to see if I can reproduce it, I recommend opening an issue on their issue tracker as well. Maybe someone else has experienced the same issue there.

@kmuehlbauer
Contributor

kmuehlbauer commented Nov 23, 2021

@tanelv I'm wondering why there are two different environments involved (cbc and wradlib_xr) in the traceback? If things get picked up from another environment this is usually a big source of problems.

If you can provide any additional details, this would help very much for diagnosing.
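One way to check whether a package is being picked up from a different environment is to compare where each module resolves against the active interpreter prefix. The following is a minimal sketch, not something posted in the thread; the helper names `module_origin` and `outside_active_env` are hypothetical, and the module list is only an example.

```python
import importlib.util
import os
import sys


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


def outside_active_env(name, prefix=sys.prefix):
    """True if `name` resolves outside `prefix`; None if it cannot be located."""
    origin = module_origin(name)
    if origin is None or origin in ("built-in", "frozen"):
        return None
    prefix = os.path.realpath(prefix)
    origin = os.path.realpath(origin)
    return os.path.commonpath([prefix, origin]) != prefix


# Example report for the packages discussed in this issue.
print("active prefix:", sys.prefix)
for mod in ("cylp", "pyart", "numpy"):
    print(mod, module_origin(mod), outside_active_env(mod))
```

A `True` in the last column for `cylp` or `pyart` would suggest that the module comes from another environment (for example via `PYTHONPATH` or a user site directory) and is worth ruling out before debugging further.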

@zssherman
Collaborator

Ah good catch @kmuehlbauer ! Yeah I second that as well.

@tanelv
Author

tanelv commented Nov 23, 2021

Hm, good points. The files are IRIS RAW files, around 5-10 MB each. I use CentOS 7:

(cbc) [a93859@stage63 ~]$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

The files should be OK, I used the phase_proc_lp function with CyLP on the same files on an older server (running on Scientific Linux 6.9) successfully, but our university is moving to the new HPC system and I would need to migrate to this new system.

Yes, there are two environments. I first tried to get CyLP working under the wradlib_xr environment (where there are both wradlib and pyart installations), but as I could not get it working there I decided to try to make a new environment only for pyart (the cbc env). Actually I managed to get to the same point in wradlib_xr env. CyLP installation finally succeeded, but the script hangs with the same error. When running the script in wradlib_xr env, the traceback does not refer to the other env. But if the two environments still might cause troubles, should I delete both and make a new environment and try to install there?

@kmuehlbauer
Contributor

But if the two environments still might cause troubles, should I delete both and make a new environment and try to install there?

Just to be on the safe side. It might not solve the issue, but we would know for sure then.

@tanelv
Author

tanelv commented Nov 24, 2021

Sorry for taking so long to answer. I first tried to update the current Anaconda installation, but as it was stuck on solving the environment for more than 6 hours, I stopped it, removed Anaconda completely and installed a new Anaconda from scratch using the current latest version (https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh).
These are the steps I took:

wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh
bash Anaconda3-2021.11-Linux-x86_64.sh
conda create -n pyart_py38 python=3.8 arm_pyart coin-or-cbc numba gdal -c conda-forge
conda activate pyart_py38
conda install -c conda-forge pkg-config
pip install cylp

To install CyLP I followed these instructions https://github.com/coin-or/CyLP

And it still hangs with the same error as before (*** Error in `python': free(): invalid pointer: 0x000055a16c1efc68 ***):
cylp_log2.txt (in the log you can also see all the steps I took starting from creating the new environment)

@zssherman
Collaborator

zssherman commented Nov 29, 2021

Sorry for the late response. I'm wondering if trying an older version would work. I'm not familiar with the new CyLP version, so I'm seeing what I can find out about it, but maybe install coincbc with:
conda install -c conda-forge coincbc
then for cylp:
pip install git+https://github.com/jjhelmus/CyLP.git@py3
has worked for our package cmac. I would still raise an issue on the CyLP issue tracker as well.

@tanelv
Author

tanelv commented Nov 30, 2021

Thanks for the suggestion. To be safe, I removed the old pyart_py38 env and created a new one with the same name. I tried to install the jjhelmus version as suggested above, but I get compilation errors no matter which compiler I use:
error: command '/gpfs/space/home/a93859/anaconda3/envs/pyart_py38/bin/x86_64-conda_cos6-linux-gnu-gcc' failed with exit status 1
Full error log: cylp_compile_error_log.txt
I have now also raised an issue on the CyLP issue tracker: coin-or/CyLP#136

@tkralphs

tkralphs commented Nov 30, 2021

The fork https://github.com/jjhelmus/CyLP/tree/py3 of CyLP has been merged into master (see coin-or/CyLP#28) and other things have been fixed since then, so I doubt that rolling back to that version will help. Also, coincbc now just installs coin-or-cbc (see conda-forge/coin-or-cbc-feedstock#11), so that also shouldn't make a difference. If you can replicate this in stand-alone CyLP (or even better in stand-alone Cbc), I can take a look, but there's not enough information in coin-or/CyLP#136 to even start to debug.

@zssherman
Collaborator

Thanks @tkralphs for the response! Yeah, makes sense, I'll keep trying to see if I can reproduce the error. @tanelv Are you able to share one of the files that you're using?

@tanelv
Author

tanelv commented Nov 30, 2021

Here is one of the files SUR190511130002.zip (IRIS raw)

@kmuehlbauer
Contributor

@zssherman It would be great if we could join forces on this one. I'm interested in getting this working too.

@zssherman
Collaborator

@kmuehlbauer Awesome, yeah that sounds like a great idea to me! I haven't been able to reproduce the specific error yet, but the code is hanging on these files, so I've been digging through the code to see why. I have also tried not using the multiprocessing version of the code, to try to isolate the problem.

@kmuehlbauer
Contributor

@zssherman my idea is to start from the last working environment, if we could identify such. Then we could increase versions and see which one breaks. Ideally we would set this up using CI in a dedicated branch in our pyart forks. I'll try to get something running, but this might take some time.
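The "start from the last working environment and increase versions" idea amounts to a search over an ordered list of versions. A minimal sketch is shown below; `first_breaking` is a hypothetical helper, and the `works` predicate is a placeholder for whatever a CI job would actually do (build an environment and run the phase_proc_lp script). It assumes that once a version breaks, all later versions break too, which may not hold in practice.

```python
def first_breaking(versions, works):
    """Binary-search an ordered list for the first entry where works() is False.

    Assumes every entry before the first failure works and every entry after
    it fails. Returns None if all entries work.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            lo = mid + 1  # the failure, if any, is to the right of mid
        else:
            hi = mid  # mid fails, so the first failure is at mid or earlier
    return versions[lo] if lo < len(versions) else None


# Placeholder predicate: pretend only the 3.6 environment runs cleanly.
pythons = ["3.6", "3.7", "3.8", "3.9"]
print(first_breaking(pythons, lambda v: v == "3.6"))
```

The same helper could be applied to a list of CyLP commits or Cython versions instead of Python versions, as long as the list is ordered and only one component changes at a time.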

@zssherman
Collaborator

zssherman commented Nov 30, 2021

@kmuehlbauer Gotcha, sounds good! So I did try the coincbc conda-forge install with the py3 branch in Python 3.6, just to try anything, and I was able to run the cylp code. Once I updated Python and cylp, it started to hang on every file I tried, including the user's file above. The py3 branch of cylp only works with Python 3.6. Python 3.6 is far back, so I'm not sure how useful that is, but something changed between then and now. Either the current KDP processing code doesn't handle the changes and needs to be updated, or something else is causing memory issues. I'm trying to check coin-or-cbc as well.

@zssherman
Collaborator

@kmuehlbauer The environment I used was:
conda create -n cylp_test -c conda-forge python=3.6 numpy netCDF4 coin-or-cbc scipy matplotlib cython gcc_linux-64 gxx_linux-64
with a development install of pyart and github install of the python 3 branch of cylp

@tanelv
Author

tanelv commented Dec 1, 2021

Thanks @zssherman for the Python 3.6 reference. I also managed to get CyLP installed in Python 3.6 and my script now runs as it should. These are the steps I took (I removed the old environment before that)

conda create -n pyart_py36 -c conda-forge python=3.6 numpy netCDF4 scipy matplotlib cython gcc_linux-64 gxx_linux-64 arm_pyart coincbc gdal
conda activate pyart_py36
pip install git+https://github.com/jjhelmus/CyLP.git@py3

@kmuehlbauer
Contributor

@zssherman Just FYI, I've recreated the Python 3.6 environment as suggested, and it worked. I then created other environments for Python 3.7, 3.8 and 3.9. It looked promising at first, but now nothing works; even the Python 3.6 environment doesn't work. I have to restart from scratch.

I've found those interesting issues over at CyLP, which might be connected:

I also found that we have to be careful with the Cython version, and that we would need to regenerate the cpp in any case.

@zssherman
Collaborator

@kmuehlbauer Sorry for the late response, was on vacation. And makes sense, yeah that is helpful, thanks for finding those! Trying to think how to go about this next because it almost seems like a memory leak issue.

@zssherman
Collaborator

As a side note, we will be having assistance on this soon and will most likely do an overhaul of the kdp processing code.

@mgrover1
Collaborator

So it looks like Google has an or-tools package that has the ability to access the same linear program solvers we use in cylp.

For example, check out this walkthrough of a mixed-integer programming problem... here is a list of the solvers available:

  • CLP_LINEAR_PROGRAMMING or CLP
  • CBC_MIXED_INTEGER_PROGRAMMING or CBC
  • GLOP_LINEAR_PROGRAMMING or GLOP
  • BOP_INTEGER_PROGRAMMING or BOP
  • SAT_INTEGER_PROGRAMMING or SAT or CP_SAT
  • SCIP_MIXED_INTEGER_PROGRAMMING or SCIP
  • GUROBI_LINEAR_PROGRAMMING or GUROBI_LP
  • GUROBI_MIXED_INTEGER_PROGRAMMING or GUROBI or GUROBI_MIP
  • CPLEX_LINEAR_PROGRAMMING or CPLEX_LP
  • CPLEX_MIXED_INTEGER_PROGRAMMING or CPLEX or CPLEX_MIP
  • XPRESS_LINEAR_PROGRAMMING or XPRESS_LP
  • XPRESS_MIXED_INTEGER_PROGRAMMING or XPRESS or XPRESS_MIP
  • GLPK_LINEAR_PROGRAMMING or GLPK_LP
  • GLPK_MIXED_INTEGER_PROGRAMMING or GLPK or GLPK_MIP

This package is pip installable, and works with the most recent Python versions
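For orientation, a small linear program solved through or-tools with the GLOP backend from the list above might look like the sketch below. This is a toy problem, not the Py-ART KDP formulation, and `solve_small_lp` is a hypothetical name; the function returns None when or-tools is not installed.

```python
def solve_small_lp():
    """Maximize 3x + 4y subject to three linear constraints, using GLOP."""
    try:
        from ortools.linear_solver import pywraplp
    except ImportError:
        return None  # or-tools is not installed in this environment

    solver = pywraplp.Solver.CreateSolver("GLOP")
    if solver is None:
        return None  # the GLOP backend is unavailable in this build

    # Two non-negative continuous variables.
    x = solver.NumVar(0, solver.infinity(), "x")
    y = solver.NumVar(0, solver.infinity(), "y")

    solver.Add(x + 2 * y <= 14)
    solver.Add(3 * x - y >= 0)
    solver.Add(x - y <= 2)
    solver.Maximize(3 * x + 4 * y)

    if solver.Solve() != pywraplp.Solver.OPTIMAL:
        return None
    return x.solution_value(), y.solution_value(), solver.Objective().Value()


print(solve_small_lp())
```

The optimum of this toy problem is at x = 6, y = 4 with objective value 34, which makes it a quick check that an installation works before porting any real model.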

@kmuehlbauer
Contributor

@mgrover1 That's available from within conda-forge (ortools-python), too.

@mgrover1
Collaborator

@mgrover1 That's available from within conda-forge (ortools-python), too.

Awesome - yeah, it looks like they have a Simplex option, which is what is currently used...

@scollis
Member

scollis commented Mar 28, 2022 via email

@tkralphs

There is no shortage of Python interfaces to MIP solvers. @mkoeppe recently compiled a nice list of all the options, which it would probably be useful to have somewhere other than a ticket in Sage, but here it is:

https://trac.sagemath.org/ticket/26511#comment:56

I believe or-tools uses file I/O to pass instances to a stand-alone Cbc solver and also uses pure Python to build the model, so it's going to be much slower than CyLP. I'm not sure if speed is important for you but if not, then or-tools would probably serve your purpose. python-mip calls the Cbc library directly through cffi so it's passing the instance to Cbc in memory, but would still be slower than CyLP because it also builds the model in Python. I realize that you guys have struggled a lot with CyLP and it makes sense to look at alternatives, but just wanted to make you aware of the tradeoffs.

By the way, I'm not sure if you guys saw it, but @mkoeppe and I just finished some major improvements to CyLP and there are now binary wheels for all platforms, dramatically simplifying installation (no need to install Cbc, see here).

Whether you continue with CyLP or not, I'm still interested in tracking down this bug.

@mgrover1
Collaborator

@tkralphs thanks for your response - as someone who is new to MIP solvers, I appreciate your insight on the Python MIP interfaces and your work on CyLP.

The main reason for looking into alternatives was the requirement to use Python 3.6, which was causing issues with installing the rest of the environment we use for PyART.

That is fantastic news about the improved installation steps! I just tried it out with a Python 3.9 environment, and it worked beautifully. Happy to provide feedback where we can, and again, thanks for all your work with CyLP.

@tkralphs

Just to be clear, CyLP works with any version of Python. I am using it with Python 3.10. I think the Python 3.6 "requirement" came from the fact that installing it in 3.6 seemed to overcome the particular bug reported here for some reason, but I think the situation is not at all clear at this point. Some more digging is needed. If someone could try to replicate this issue with the new wheels, that would be helpful. Perhaps that will fix the bug somehow.
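When replicating with the new wheels, it helps to record exactly which interpreter and package versions were used. A minimal sketch using only the standard library follows; `collect_versions` is a hypothetical helper, and the distribution names `cylp`, `numpy` and `arm_pyart` are assumptions about how the packages are registered in the environment.

```python
import platform
import sys
from importlib import metadata


def collect_versions(distributions=("cylp", "numpy", "arm_pyart")):
    """Return the Python version, prefix, and installed distribution versions."""
    info = {"python": platform.python_version(), "prefix": sys.prefix}
    for name in distributions:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None  # not installed under this distribution name
    return info


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

Pasting this output alongside the error makes it easier to compare a failing environment with a working one.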

@mgrover1
Collaborator

Using the new build files, I am still seeing the following when running our example using CyLP

Processing Code:

import numpy as np
import matplotlib.pyplot as plt
import pyart
from pyart.testing import get_test_data

file = get_test_data('095636.mdv')

# perform LP phase processing (this takes a while)
radar = pyart.io.read_mdv(file)

# the next line forces only the first sweep to be processed, this
# significantly speeds up the calculation but should be commented out
# in production so that the entire volume is processed
radar = radar.extract_sweeps([0])

phidp, kdp = pyart.correct.phase_proc_lp(radar, 0.0, debug=True)

Error:

Exec time:  0.5900969505310059
Doing  0
python(43345,0x117afc600) malloc: *** error for object 0x7f7fad9c6660: pointer being freed was not allocated
python(43345,0x117afc600) malloc: *** set a breakpoint in malloc_error_break to debug

@mgrover1
Collaborator

@tkralphs we are running into the same issue described in coin-or/CyLP#138 I believe...

@tkralphs

OK, thanks, I will try to find some time to build a version of CyLP and Cbc with debugging symbols, so that I can see exactly where this error is occurring.
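Until a debug build is available, a cheaper first step is to run the failing call in a child interpreter with Python's `faulthandler` enabled, so that an abort prints the Python-level traceback and a hang is cut off by a timeout. This is a minimal sketch with a hypothetical helper name (`run_isolated`); it only shows Python frames, not the C++ frames inside Cbc or CyLP, so it complements a build with debugging symbols rather than replacing it.

```python
import subprocess
import sys


def run_isolated(code, timeout=600):
    """Run `code` in a child interpreter with faulthandler enabled.

    Returns (returncode, stderr); returncode is None if the child timed out.
    """
    cmd = [sys.executable, "-X", "faulthandler", "-c", code]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # The child was killed after hanging; keep whatever it wrote so far.
        stderr = exc.stderr or ""
        if isinstance(stderr, bytes):
            stderr = stderr.decode(errors="replace")
        return None, stderr
    return proc.returncode, proc.stderr


# Stand-in for the real reproduction script: a child that aborts immediately.
returncode, stderr = run_isolated("import os; os.abort()", timeout=60)
print(returncode)
print(stderr)
```

In practice the `code` string would be the reproduction script (reading the file and calling `pyart.correct.phase_proc_lp`), and the captured stderr would show which Python line was executing when the process aborted.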

@mgrover1
Collaborator

mgrover1 commented Mar 28, 2022

It looks like printing the array returned by CyLP works:

print(solution)
[[ 2.34766422  2.43061544  3.18696968 ... 34.95957546 34.96160985
  34.96285309]
...
 [ 0.66390759  0.88923667  1.24627207 ... 27.49033571 27.46688439
  27.41206319]
 [ 0.90856286  1.30960192  1.81392097 ... 32.86643274 32.86900836
  32.87058235]

It is a numpy.ndarray:

<class 'numpy.ndarray'>

and we can take the mean of this array

37.23997104342368

but when we assign some variable to this solution in the function phase_proc_lp, we run into the malloc error:

python(63892,0x10f9fa600) malloc: *** error for object 0x7fe6d3f50660: pointer being freed was not allocated
python(63892,0x10f9fa600) malloc: *** set a breakpoint in malloc_error_break to debug

@mgrover1 changed the title from "Error when using CyLP" to "BUG: Error when using CyLP" on Mar 29, 2022
@mgrover1
Collaborator

mgrover1 commented Apr 5, 2022

@tkralphs following up - have you had a chance to look at the build error here?

@mole-bai

mole-bai commented Jun 22, 2022

Thanks @zssherman for the Python 3.6 reference. I also managed to get CyLP installed in Python 3.6 and my script now runs as it should. These are the steps I took (I removed the old environment before that)

conda create -n pyart_py36 -c conda-forge python=3.6 numpy netCDF4 scipy matplotlib cython gcc_linux-64 gxx_linux-64 arm_pyart coincbc gdal
conda activate pyart_py36
pip install git+https://github.com/jjhelmus/CyLP.git@py3

@tanelv I have the same problem as you, but after following your steps to install CyLP, I get an error when running the test code:

Processing Code:
https://github.com/coin-or/CyLP#modeling-example

Error:
undefined symbol: _ZN17CoinIndexedVectorD2Ev
image

Besides using pip install git+https://github.com/jjhelmus/CyLP.git@py3 to install cylp-0.7.4, did you do any other configuration?
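The symbol in the error above follows the Itanium C++ name-mangling scheme and decodes to the destructor `CoinIndexedVector::~CoinIndexedVector()`, a class from CoinUtils. An undefined symbol of this kind usually suggests that the CyLP extension was built against a different CoinUtils/Cbc library than the one found at run time, though that is an inference rather than something confirmed in this thread. A minimal sketch for decoding such symbols, assuming the `c++filt` tool from binutils is on the PATH (the `demangle` helper is hypothetical):

```python
import shutil
import subprocess


def demangle(symbol):
    """Demangle a C++ symbol with c++filt; return None if the tool is missing."""
    tool = shutil.which("c++filt")
    if tool is None:
        return None
    out = subprocess.run([tool, symbol], capture_output=True, text=True, check=True)
    return out.stdout.strip()


print(demangle("_ZN17CoinIndexedVectorD2Ev"))
```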

@mole-bai

@zssherman I encountered the same problem and rolled back to CyLP 0.7.4 as follows:

conda create -n pyart_py36 -c conda-forge python=3.6 numpy netCDF4 scipy matplotlib cython gcc_linux-64 gxx_linux-64 arm_pyart coincbc gdal 
conda activate pyart_py36
pip install git+https://github.com/jjhelmus/CyLP.git@py3

but in CyLP 0.7.4, even the most basic imports report an error:
image

So, if I want to run this function successfully now:

pyart.correct.phase_proc_lp(radar, 2.0, self_const = 12000.0, low_z=0.0, high_z=53.0, min_phidp=0.01, min_ncp=0.3, min_rhv=0.8, LP_solver='cylp_mp', proc=15)

How should I configure my CyLP and Py-ART environments? Looking forward to your reply!

@mgrover1
Collaborator

@mole-bai - we are working on replacing the CyLP solver in Py-ART. You can use one of the other solvers (LP_solver = "pyglpk" or LP_solver = "cvxopt")

We apologize that we are not able to support solving this CyLP issue.
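Switching solvers as suggested above can be done by checking which backend is importable and passing its name as `LP_solver`. The sketch below is illustrative only: `pick_lp_solver` is a hypothetical helper, and the mapping from solver name to importable module (`glpk` for pyglpk, `cvxopt` for cvxopt) is an assumption about the installed packages.

```python
import importlib.util

# Assumed mapping from Py-ART LP_solver names to the module each one needs.
SOLVER_MODULES = {"pyglpk": "glpk", "cvxopt": "cvxopt"}


def _installed(module_name):
    """True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_lp_solver(preferred=("pyglpk", "cvxopt"), installed=_installed):
    """Return the first solver name whose backing module is available."""
    for solver in preferred:
        if installed(SOLVER_MODULES[solver]):
            return solver
    return None


solver = pick_lp_solver()
print("LP_solver to use:", solver)
# With Py-ART and a radar object loaded, the call would then be along the
# lines of the example earlier in this thread:
#   phidp, kdp = pyart.correct.phase_proc_lp(radar, 0.0, LP_solver=solver)
```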

@mgrover1 mgrover1 added this to the Py-ART Version 2.0 milestone Sep 27, 2022
@mgrover1 mgrover1 added the Bug Issue in the Code label Oct 3, 2022
8 participants