
root4j should not create new instances of the Hadoop Configuration #3

Closed
superdude264 opened this issue Mar 21, 2017 · 4 comments · Fixed by #5

Comments

@superdude264

For some operations, root4j uses the same Hadoop Configuration instance as the one found in Spark, but in other cases it creates a new Hadoop Configuration and therefore loses the configuration values that Spark has set.

Here's an example of when it creates a new instance:
https://github.com/diana-hep/root4j/blob/master/src/main/java/org/dianahep/root4j/RootFileReader.java#L179
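For illustration, here is a minimal sketch of the two patterns (not the exact root4j code; the class, method, and variable names are illustrative):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ConfigurationReuse {
    // Problematic pattern: a fresh Configuration only picks up the XML
    // defaults on the classpath, not the values Spark injected at runtime
    // (e.g. stocator/Swift settings supplied through the Spark session).
    static FileSystem openWithFreshConf(URI uri) throws Exception {
        Configuration fresh = new Configuration();
        return FileSystem.get(uri, fresh);
    }

    // Preferred pattern: accept and reuse the Configuration that Spark
    // already populated (sc.hadoopConfiguration() on the driver).
    static FileSystem openWithSparkConf(URI uri, Configuration sparkHadoopConf) throws Exception {
        return FileSystem.get(uri, sparkHadoopConf);
    }
}
```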

This issue was originally reported as a bug against the stocator program, which IBM Data Science Experience uses to read files from Swift ("Object Storage service"). See: CODAIT/stocator#122

@vkhristenko
Collaborator

@superdude264 thanks for pointing this out - this is indeed an issue. I actually didn't know what the right way of doing it was; as you can see, there is another version commented out...

I will test it. However, another issue could be that Hadoop Configuration is not Serializable - I have to check that.

VK

@vkhristenko
Collaborator

Hadoop Configuration is not serializable - http://stackoverflow.com/questions/38224132/use-sparkcontext-hadoop-configuration-within-rdd-methods-closures-like-foreachp

What can be done?

  • wrap it inside something that is serializable (see the sketch below)
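One standard way to do that (a minimal sketch, not part of root4j; the class name is illustrative, and Spark ships a similar internal utility, org.apache.spark.util.SerializableConfiguration) is a thin wrapper that delegates Java serialization to Configuration's Writable methods:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import org.apache.hadoop.conf.Configuration;

public class SerializableHadoopConfiguration implements Serializable {
    // Configuration itself is not Serializable, so keep it transient and
    // handle it by hand in writeObject/readObject below.
    private transient Configuration conf;

    public SerializableHadoopConfiguration(Configuration conf) {
        this.conf = conf;
    }

    public Configuration get() {
        return conf;
    }

    // Configuration implements Hadoop's Writable, and ObjectOutputStream /
    // ObjectInputStream implement DataOutput / DataInput, so we can delegate
    // directly to write() and readFields().
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        conf.write(out);
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        conf = new Configuration(false); // start empty, then read the fields back
        conf.readFields(in);
    }
}
```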

VK

@mrow4a

mrow4a commented Aug 30, 2018

@vkhristenko any plans for adding support for this? The XRootDConnector relies on this feature to set readAhead as a flag via the Spark config. https://gitlab.cern.ch/awg/hadoop-xrootd-connector#available-hadoop-flags
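As context, a short sketch of how such a flag travels from the Spark config into the Hadoop Configuration: Spark copies any property prefixed with spark.hadoop. into the Hadoop Configuration it builds, minus the prefix. The key name fs.xrootd.readahead below is hypothetical; see the linked page for the connector's actual flags.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class HadoopFlagsViaSpark {
    public static void main(String[] args) {
        // "spark.hadoop.<key>" is stripped of its prefix and set as "<key>"
        // in the Hadoop Configuration. The flag name here is hypothetical.
        SparkConf conf = new SparkConf()
            .setMaster("local[*]") // for a self-contained run
            .setAppName("xrootd-flags-example")
            .set("spark.hadoop.fs.xrootd.readahead", "8388608");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // The value is only visible to code that reuses this instance, which
        // is exactly why creating fresh Configuration objects loses it.
        System.out.println(sc.hadoopConfiguration().get("fs.xrootd.readahead"));
        sc.stop();
    }
}
```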

@vkhristenko
Collaborator

I will check what I can do...
