Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Dataverse to work with secondary reads from a scaled database #4765

Closed
danmcp opened this issue Jun 20, 2018 · 13 comments
Closed

Allow Dataverse to work with secondary reads from a scaled database #4765

danmcp opened this issue Jun 20, 2018 · 13 comments

Comments

@danmcp
Copy link
Contributor

danmcp commented Jun 20, 2018

With the recent containerization work, the postgres component added the ability to scale:

The problem now is that Dataverse isn't written to handle a scaled database. So in the above implementation, Dataverse is configured to only talk to the master instance of postgres. More work will need to be done in Dataverse to take advantage of the master slave setup to get any performance benefits. Namely handling primary vs. secondary reads. The implications could be widespread so using two datasources as a first step might be appropriate and switching over DB calls 1 by 1 to allow secondary reads will probably be the best approach.

@tkmonson tkmonson self-assigned this Jul 5, 2018
@pdurbin
Copy link
Member

pdurbin commented Jul 17, 2018

@thaorell @pameyer and I have been discussing this issue at http://irclog.iq.harvard.edu/dataverse/2018-07-17 and heard that @tkmonson is working on it, so I dragged it to the new "Community Dev" column at https://waffle.io/IQSS/dataverse

@thaorell emailed me some slides from a recent talk he and @tkmonson gave and said it's ok to upload them to this issue: Dataverse Lunch Talk.pdf

Here are the most relevant slides for this issue (looks like great stuff!):

7
8
9

@pdurbin
Copy link
Member

pdurbin commented Jul 19, 2018

I got on 48c59f5 on my laptop but to test I was hoping to use the new ./build.sh internal feature I added to #4805. Otherwise I have to push images to Docker Hub, which is slow.

The error that @tkmonson reported in Slack is a "broken pipe" exception in Glassfish's server.log.

@tkmonson at some point we need to bring your code and @thaorell 's code together. I will very likely ask you to resolve merge conflicts soon so if you want you can start to work on merging your code into his since his is the bigger change (#4805, above) and yours is smaller (#4827). So you can start looking at that if you want.

@pdurbin
Copy link
Member

pdurbin commented Jul 20, 2018

@tkmonson now that pull request #4805 has been merged can you please base your changes to this issue on the latest from "develop" or #4827 (now in QA) and send me a link to your branch? I'd like to see if I can replicate the "broken pipe" error you were seeing yesterday. Thanks!

@pdurbin
Copy link
Member

pdurbin commented Jul 23, 2018

Pull request #4827 has been merged as well so the branch for this pull request should definitely be based on "develop". Afterward, please make noise if you're still seeing that broken pipe error, @tkmonson

@pdurbin
Copy link
Member

pdurbin commented Aug 7, 2018

@tkmonson hi! I'm catching up after a week long vacation. I see you made pull request #4916 changing VDCNet-ejbPU to masterPU and a variety of other changes but the pull request says "[WIP -- DO NOT MERGE]". Should this issue be dragged from "Community Dev" to "Code Review" at https://waffle.io/IQSS/dataverse ? That is to say, are you ready for code review? Thanks!

@tkmonson
Copy link
Contributor

tkmonson commented Aug 7, 2018

@pdurbin Yes, it's ready for code review at this point. I just forgot to change the title of the PR after finishing the changes yesterday.

@scolapasta
Copy link
Contributor

scolapasta commented Aug 13, 2018

@tkmonson One suggestion here. While this doesn't minimize the work here, it could help for future changes:

Rather than embed the Persistence Context in each EJB, add a new EJB called something like EntityManagerBean and have that embed the Persistence Context. Have all other EJBs embed this new EJB and have all persistence unit related methods in this new EJB. This new EJB could have methods like getEntityManager() that always returns the main, and then have you can move this method from DatasetServiceBean onto the new EJB:
public EntityManager getEntityManager(User user) {
if(user.isAuthenticated()) {
return em;
}
return em2;
}

That way, if other EJBs need this logic, we don't need to duplicate it, plus we can add more complex EntityManager logic, as needed.

@michbarsinai thoughts on this?

@tkmonson
Copy link
Contributor

@scolapasta Way ahead of you! I'm almost finished making those exact changes. I'll try to push them tomorrow.

@pdurbin
Copy link
Member

pdurbin commented Aug 20, 2018

In Slack @tkmonson asked us to move this issue to code review so I just did.

@djbrooke
Copy link
Contributor

@scolapasta has been tasked with working with the MOC and RH folks to put together a plan containing a few different options for moving Harvard Dataverse to MOC, one of which would leverage this code. As it stands now, we’ll put this PR on ice and not merge it, as it will potentially slow down development (developers will need to determine which entity manager to use - read vs write) and solves a problem that does not yet exist. If we decide to go down a path that will leverage this code, the work done here will prove valuable. I'll move this to the inbox for now.

@pdurbin
Copy link
Member

pdurbin commented Oct 10, 2022

we’ll put this PR on ice and not merge it

This is the PR that was put on ice:

@danmcp are you still interested in this issue? If not, let's at least grab a beer sometime. 😄 🍻

@danmcp
Copy link
Contributor Author

danmcp commented Oct 10, 2022

@pdurbin I don't think any use cases are pushing it at the moment. No concerns if we want to close. Can always reopen later if the use case does pop up again.

And it would be great to connect sometime!

@pdurbin
Copy link
Member

pdurbin commented Oct 12, 2022

@danmcp exactly! Once I close this (and I will, thanks), there's a reopen button right there!

I'll shoot you an email about that beer. 😄

@pdurbin pdurbin closed this as not planned Won't fix, can't repro, duplicate, stale Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants