Scale-Out Architecture? #21
Comments
Hi! Scale-out is certainly something we have in mind and was in fact one of my primary motivations for using UIMA at all (DKPro being the other half of that motivation). However, when I first seriously looked into using UIMA-AS and DUCC, it turned out that I would have to rewrite the whole pipeline code of YodaQA (which uses uimaFIT heavily right now), and that the non-scaled-out setup of YodaQA would probably get significantly more complicated. I may have misjudged something here, though; maybe it's not as bad as it seems. Anyhow, I don't plan to work on scale-out in the near future, personally. Patches are welcome, but writing them will probably involve some learning curve within the UIMA ecosystem. A technology that might be most suitable for all this is Leo:
It seems like something like uimaFIT for UIMA-AS (i.e. simple Java code instead of XML madness). But we use uimaFIT heavily now, and I think there are few to no existing examples that combine uimaFIT and Leo, though its author mentioned on uima-users a few months ago that it should in principle be possible. (P.S.: For the web interface, there is an extra-UIMA "question dashboard" that various parts of the pipeline access. It is mainly for reporting extra metadata to the web interface while the pipeline is still running; this code would have to be made scale-out/network aware, or just temporarily disabled for starters. Dealing with this is not a major effort, just a heads-up that it's there.)
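For readers unfamiliar with the programming model under discussion, here is a minimal, self-contained sketch of the uimaFIT style of pipeline assembly. The annotator and parameter names are toy stand-ins, not actual YodaQA components; the point is only that descriptors are built in plain Java, which is exactly what would have to be mapped onto UIMA-AS/Leo deployment descriptors for scale-out.

```java
import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.ConfigurationParameter;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;

public class UimaFitSketch {
    /** Toy annotator standing in for a real analysis component. */
    public static class LoggingAnnotator extends JCasAnnotator_ImplBase {
        public static final String PARAM_LABEL = "label";
        @ConfigurationParameter(name = PARAM_LABEL, defaultValue = "analysis")
        private String label;

        @Override
        public void process(JCas jcas) {
            System.out.println(label + ": " + jcas.getDocumentText());
        }
    }

    public static void main(String[] args) throws Exception {
        // Assemble the pipeline step purely in Java -- no XML descriptors.
        AnalysisEngineDescription step = AnalysisEngineFactory.createEngineDescription(
                LoggingAnnotator.class, LoggingAnnotator.PARAM_LABEL, "question analysis");

        JCas jcas = JCasFactory.createJCas();
        jcas.setDocumentText("What is the capital of France?");
        SimplePipeline.runPipeline(jcas, step);
    }
}
```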
Just for future reference, the post Petr referred to should be: http://permalink.gmane.org/gmane.comp.apache.uima.general/6270 Thanks for the answer; interesting problem. As you know, I also have my hands full with an important integration, but this is yet another item on my future work list. I'll revisit this once my knowledge of UIMA has advanced.
I'll keep this open as it's a general YodaQA issue that will need to be solved sooner or later.
Regarding DUCC vs. Mesos, here are my 2 cents: DUCC should be easier to integrate because it is developed for UIMA. Mesos is widely deployed, proven to be highly scalable, and also supports containers/Docker.
Hi! It seems jbauer's comments did not appear in the GitHub issue, just …
Overall, cluster processing is one area where my personal experience … However, in general, I think the right way to do scale-out is to look at (i) using the computing power to do preprocessing, e.g. preparsing, (ii) decoupling a lot of the computationally intensive tasks like parsing, and (iii) a lot of our bottleneck is currently in database lookup, so …
Very interesting. a) Maybe we can already start with the JoBim Text (JBT) integration? What if I write a little REST interface for it and just throw this into YodaQA so it's generally available, together with a tutorial on how to set it up (create models and DBs)? Its first processing step is based on Hadoop and hence should already scale rather well with Mesosphere. b) I could easily package the data backends, including JBT, into Docker containers just like I did with YodaQA itself. Again, this should help if we follow the Mesos route (in addition to DUCC)... [For the DUCC part I still need some time to really know what I'm doing.]
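To make the REST idea concrete, here is a minimal sketch of such a lookup service using only the JDK's built-in HTTP server. The /similar endpoint, the port, and the JSON shape are assumptions for illustration, not an actual JBT API; a real handler would query the precomputed JBT model/DB.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class JbtRestSketch {
    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(5050), 0);
        // Hypothetical endpoint: /similar?term=... would be backed by a
        // JBT distributional-similarity lookup.
        server.createContext("/similar", JbtRestSketch::handleSimilar);
        server.start();
        System.out.println("JBT lookup service listening on :5050");
    }

    private static void handleSimilar(HttpExchange exchange) throws IOException {
        String query = exchange.getRequestURI().getQuery(); // e.g. "term=bank"
        // Placeholder response; the real lookup against the JBT backend goes here.
        byte[] body = ("{\"query\":\"" + (query == null ? "" : query) + "\",\"expansions\":[]}")
                .getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().add("Content-Type", "application/json");
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream os = exchange.getResponseBody()) {
            os.write(body);
        }
    }
}
```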
I hope to answer re JBT tonight. :) Sorry, I was doing some focused work over the weekend. But it'd be best to have that tracked in a different GitHub issue, I guess...
jbauer's comments did not appear in the GitHub issue. Strange! I don't have experience working with the technologies (DUCC, Mesos, etc.) mentioned previously, so my comments are based on whatever I have read about them so far. I see the problem of scaling as consisting of two unrelated problems:
(1) How to scale within an instance of YodaQA? This problem is solved by UIMA-AS. It seems we don't need UIMA-AS if the YodaQA code is reentrant, so that multiple user inputs (i.e. questions) can be processed simultaneously.
(2) How to scale to multiple instances of YodaQA? As an example of a requirement: YodaQA must scale to thousands of instances. This problem is solved by DUCC, Mesos, etc., and also includes connecting thousands of nodes/servers in a cluster.
To me, preprocessing is separate from normal operation. Normal operation means executing user input and producing a result for the user. Preprocessing can always be done on the nodes where the data (corpus, etc.) resides. During normal operation, the same preprocessed data should be available to all YodaQA instances, and some of these instances may reside on the same nodes that hold the preprocessed data.
Regarding (1) above: if preprocessing is NOT done, and during normal operation YodaQA is still heavily I/O-bound (database lookups), then I/O is a major problem. Running multiple questions simultaneously would somewhat alleviate this, and running multiple questions implies reentrant code. (A sketch of that idea follows below.)
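As a sketch of what "reentrant, multiple questions in flight" means in practice: assuming a hypothetical answerQuestion entry point (this is not YodaQA's actual API), question processing could simply be fanned out over a thread pool, which only works if the pipeline keeps no mutable shared state between calls.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentQuestionsSketch {
    // Stand-in for the real pipeline invocation; assumed to be reentrant.
    static String answerQuestion(String question) {
        return "answer to: " + question;
    }

    public static void main(String[] args) throws Exception {
        List<String> questions = List.of(
                "When did Nikola Tesla die?",
                "What is the capital of France?");

        // Process several questions concurrently on a fixed-size pool.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<String>> results = pool.invokeAll(
                questions.stream()
                        .map(q -> (Callable<String>) () -> answerQuestion(q))
                        .toList());
        for (Future<String> f : results) {
            System.out.println(f.get());
        }
        pool.shutdown();
    }
}
```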
Well, from a practical POV, it's important that the QA system is realtime, so I think that should be the focus of any scale-out work. Not processing multiple user inputs in parallel, but minimizing the end-to-end time of processing a single input, i.e. possibly spreading the different sub-paths of the pipeline flow over multiple machines (DUCC and UIMA-AS might become relevant here), but more importantly reducing the latency points.
(For experiments, it's fine to parallelize answering many questions at once. But for experiments, an even better way to speed up is to cache the results of the database lookups, something that's long overdue in YodaQA.)
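Since the lookup-caching idea keeps coming up, here is a minimal in-memory memoization sketch. The class and method names are made up for illustration; a real cache for YodaQA's database lookups would likely also want persistence across runs and size or TTL limits.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

public class LookupCacheSketch {
    /** Memoizes an expensive lookup (e.g. a DBpedia/Freebase query) in memory. */
    public static class CachingLookup<K, V> {
        private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
        private final Function<K, V> backend;

        public CachingLookup(Function<K, V> backend) {
            this.backend = backend;
        }

        public V get(K key) {
            // Repeated identical queries (common when re-running experiments)
            // hit the cache instead of the remote database.
            return cache.computeIfAbsent(key, backend);
        }
    }

    // Stand-in for a real enwiki/DBpedia lookup; only illustrates the call shape.
    static List<String> slowTitleLookup(String clue) {
        return List.of(clue + " (article)");
    }

    public static void main(String[] args) {
        CachingLookup<String, List<String>> titles =
                new CachingLookup<>(LookupCacheSketch::slowTitleLookup);
        System.out.println(titles.get("Nikola Tesla")); // goes to the "backend"
        System.out.println(titles.get("Nikola Tesla")); // served from the cache
    }
}
```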
In case I haven't mentioned this: I did some measurements and also found I/O to be a bottleneck, so I added an SSD and additional Ethernet adapters to my systems. During my evaluation (scheduled in 5 weeks) I will run YodaQA and its backends in several configurations (SSD vs. HDD, backends on one system vs. backends on different systems, an octocore with 32 GB RAM vs. a quadcore with 8 GB RAM) so I get a feeling for what gets results and how close to real-time I can get with simple tricks. I'll keep you posted.
Yes, the focus should be on minimizing the end-to-end delay.
I'm not sure what you mean by that, Vineet. At any rate, almost everything is reentrant; YodaQA already runs heavily multithreaded (when all the threads aren't blocking on database I/O).
When a thread blocks, the processor runs another thread/process that is runnable. If nothing is runnable, then the processor sits idle.
Looking forward to your results.
The containerization part is now discussed in a dedicated thread: #41. Update: Done. I have built Dockerfiles for all data backends as well as Yoda itself and have orchestrated them via Docker Compose such that all of Yoda can be launched fully automatically.
Hi,
I just finished building another physical system, so I could run YodaQA on 24 cores and 60 GB of RAM in parallel (even though 16 cores / 48 GB is more realistic; I have some other stuff to do). Executing Yoda with all its backends on an octocore with 24 GB RAM currently takes 7 seconds, so with a cluster of 24 cores real-time might become feasible. [OK, it doesn't, but don't burst my bubble.]
I know that performance is not a core goal right now, but I'm still interested: is there anything in place or planned to do this? UIMA-AS, OpenStack, MPI / OpenMP, Docker + CoreOS? Which direction do you want to take, and how far away is it?
Best,
Joe