Object put data segmentation fault #5

Closed
QiaoK opened this issue Jan 21, 2021 · 4 comments

@QiaoK
Contributor

QiaoK commented Jan 21, 2021

The test https://github.com/hpc-io/pdc/blob/qiao_develop/src/tests/obj_put_data.c fails with a segmentation fault in sequential mode because the put-data function does not work. See the line “ret = (perr_t) PDCobj_put_data("o2", (void*)data, 128, cont);”. There is no reason for this call to fail: the object “o2” and the container “cont” are both properly created, and the data buffer is valid.

@houjun houjun self-assigned this Jan 22, 2021
@QiaoK
Contributor Author

QiaoK commented Jan 30, 2021

The sequential test passed, but the MPI test failed with a segmentation fault. To reproduce, run "./mpi_test.sh ./obj_put_data mpiexec 2 4".

@QiaoK
Contributor Author

QiaoK commented Jan 30, 2021

I just tested again, and it looks like the sequential test also fails intermittently. Here is an example run that ends in a segmentation fault.

(base) qkang@data6:~/pdc_develop/pdc/src/build/bin$ ./run_test.sh ./obj_put_data
testing: ./obj_put_data

==PDC_SERVER[0]: using [./pdc_tmp/] as tmp dir. 0 OSTs per data file, 0% to BB
==PDC_SERVER[0]: using ofi+tcp
==PDC_SERVER[0]: without multi-thread!
==PDC_SERVER[0]: Read cache enabled!
==PDC_SERVER[0]: Successfully established connection to 0 other PDC servers
==PDC_SERVER[0]: Server ready!

./obj_put_data
==PDC_CLIENT: PDC_DEBUG set to 0!
==PDC_CLIENT[0]: Found 1 PDC Metadata servers, running with 1 PDC clients
==PDC_CLIENT: using ofi+tcp
==PDC_CLIENT[0]: Client lookup all servers at start time!
==PDC_CLIENT[0]: using [./pdc_tmp] as tmp dir, 1 clients per server
create a new pdc
Create a container property
Rank 0 Create a container c0
./run_test.sh: line 20: 25487 Segmentation fault (core dumped) $test_exe $test_args
==PDC_CLIENT: PDC_DEBUG set to 0!
==PDC_CLIENT[0]: Found 1 PDC Metadata servers, running with 1 PDC clients
==PDC_CLIENT: using ofi+tcp
==PDC_CLIENT[0]: Client lookup all servers at start time!
==PDC_CLIENT[0]: using [./pdc_tmp] as tmp dir, 1 clients per server
==PDC_SERVER[0]: error with HG_Finalize

@houjun
Member

houjun commented Feb 10, 2021

I am not able to reproduce this error on Cori. It would be helpful if you could get the stack trace when the segfault happens.

@QiaoK
Contributor Author

QiaoK commented Feb 18, 2021

This problem was resolved by removing some analysis code inside the client functions.

@QiaoK QiaoK closed this as completed Feb 18, 2021