
New framework: Tests for Spark #151

Merged · 3 commits · Apr 17, 2013

Conversation

trautonen
Contributor

Tests for the Spark Java framework. I couldn't get --next-sort working on the first try, so the correct sort number probably still needs to be fixed.

The test supports several configuration options: standalone, the Maven Tomcat plugin, and a packaged WAR. The default setup is the WAR (Resin) with a JNDI datasource for MySQL. I might add setups for the other configurations later.

@bhauer
Contributor

bhauer commented Apr 12, 2013

Excellent! Thank you for the contribution.

@pfalls1 pfalls1 merged commit 609ba22 into TechEmpower:master Apr 17, 2013
@pfalls1
Contributor

pfalls1 commented Apr 17, 2013

@trautonen Thanks for this!

@perwendel

A bit too late, but it looks like you're using it correctly!

@bhauer
Contributor

bhauer commented May 2, 2013

@trautonen @perwendel We've included Spark in the Round 4 test and will be posting the data today. Just a heads up that the database performance seems to be bottlenecked on something, but we have not investigated what that might be. Have either of you been able to run a quick test on your own hardware to see if you observe the same thing? The behavior is unfortunate because Spark does extremely well at request-routing and the raw JSON serialization test.

@trautonen
Contributor Author

The database performance bottleneck is most likely caused by the open-session-in-view (OSIV) implementation. I'm not sure if it's an issue with Spark's filters or with the way I implemented it using thread locals. If I have some time, I'll create another version of the DB access that uses Hibernate the same way as, for example, the Wicket tests.

@pfalls1
Contributor

pfalls1 commented May 2, 2013

@trautonen Thanks for taking a look!

@perwendel

I don't think Spark's filters should be an issue, since their implementation is almost exactly the same as the routes'. Looking more thoroughly at the code now, I saw the use of thread locals; I didn't notice that before since I had only checked the Spark parts.

@trautonen Thread locals could be avoided by using attributes:

in the /db route
request.attribute("session", session); // sets attribute

in the /db filter
request.attribute("session"); // gets attribute

(or just close the session in the route)
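The attribute-based handoff suggested above can be sketched in plain Java. This is illustrative only: a HashMap stands in for Spark's per-request attribute store, and the Session class is a dummy, not Hibernate's.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of handing the session over via request attributes
// instead of a ThreadLocal. "Session" is a dummy stand-in for Hibernate's
// session; the Map stands in for Spark's request.attribute(...) storage.
public class AttributeHandoff {
    static class Session {
        boolean open = true;
        void close() { open = false; }
    }

    // Simulates one request: the route opens and stores the session,
    // the after-filter fetches and closes it.
    static Session handleRequest() {
        Map<String, Object> requestAttributes = new HashMap<>();

        // /db route: request.attribute("session", session)
        Session session = new Session();
        requestAttributes.put("session", session);

        // after filter: request.attribute("session"), then close
        ((Session) requestAttributes.get("session")).close();
        return session;
    }

    public static void main(String[] args) {
        System.out.println(handleRequest().open ? "leaked" : "closed");
    }
}
```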

@perwendel

Hi, I'm not sure how you conduct your testing (I guess it's more sophisticated :-)), and I don't know what a good result would be, but I did some testing against the /db resource, both with and without thread locals in the Spark application. A local MySQL database was used, and a new connection was opened for each request. I also ran the tests against the Apache Wicket TechEmpower app for comparison. These are the results I obtained (100 requests per test):

With thread locals:
mean time per request:  2 ms for ?queries=5
mean time per request: 10 ms for ?queries=50
mean time per request: 62 ms for ?queries=500
--------------------------------------------------------------------
No thread locals and no filters:
mean time per request:  1 ms for ?queries=5
mean time per request:  7 ms for ?queries=50
mean time per request: 57 ms for ?queries=500
--------------------------------------------------------------------
Wicket:
mean time per request:  3 ms for ?queries=5
mean time per request:  9 ms for ?queries=50
mean time per request: 65 ms for ?queries=500

@trautonen trautonen deleted the spark branch May 2, 2013 20:21
@trautonen
Contributor Author

The difference cannot be seen without running a lot of concurrent requests. I don't have the full test environment set up, so I cannot tell how it performs in the real configuration, but I did some local tests with JMeter.

I used Spark's embedded Jetty with the c3p0 DB pool, started from the SparkApplication class's main method. With 10 concurrent threads and 20 queries per request, I got the following results:

  • Current implementation with thread-local-based OSIV: 985 requests/second
  • OSIV storing the Hibernate session on the request in Spark's before filter and clearing it in Spark's after filter: 985 requests/second
  • Plain session usage (the same way the Wicket test uses it): 1020 requests/second

This indicates that using thread locals is no slower than using the request as the holder for the session. Thread locals are also how Spring does the same thing (OpenSessionInViewFilter, TransactionSynchronizationManager).
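The thread-local variant being compared can be sketched as a minimal holder in the shape of Spring's OpenSessionInViewFilter. Again, the Session class here is a dummy stand-in, not Hibernate's.

```java
// Minimal sketch of a thread-local session holder, the pattern discussed
// above (same shape as Spring's OpenSessionInViewFilter). "Session" is a
// dummy stand-in for Hibernate's session.
public class SessionHolder {
    static class Session {
        boolean open = true;
        void close() { open = false; }
    }

    private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();

    // before filter: bind a fresh session to the current worker thread
    static Session bind() {
        Session s = new Session();
        CURRENT.set(s);
        return s;
    }

    // route handler: fetch the bound session without passing it around
    static Session current() {
        return CURRENT.get();
    }

    // after filter: close and unbind, so pooled worker threads
    // don't leak a stale session into the next request
    static void unbind() {
        Session s = CURRENT.get();
        if (s != null) s.close();
        CURRENT.remove();
    }
}
```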

My tests indicate that we could get a 3.5% performance improvement by changing the code to plain session usage. @bhauer, if you compare, for example, the Wicket and Spark tests, do you see more than a 3.5% difference in the DB tests? If so, there must be some other issue with the Spark setup.
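As a quick sanity check on the quoted figure, using the numbers above (985 vs 1020 requests/second):

```java
// Quick arithmetic check on the quoted improvement figure.
public class ImprovementCheck {
    static double improvementPercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // (1020 - 985) / 985 * 100 ≈ 3.55%, matching the roughly 3.5% above
        System.out.printf("%.2f%%%n", improvementPercent(985, 1020));
    }
}
```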

@trautonen trautonen restored the spark branch May 2, 2013 22:01
@perwendel

I see!
What would be the desirable performance in terms of requests/second?
Are the results similar when running Spark on Tomcat?

@perwendel

I also did some local testing against the /db resource with JMeter, both for Spark and Wicket.

The test configuration was the same as @trautonen's, i.e. 10 threads and 20 queries per request.
I changed the Wicket application's Hibernate config to use a real MySQL database, of course the same one used by the Spark application.

First I let the servers warm up by firing requests until the throughput had stabilized, then I ran the tests.
300,000 requests were executed in each test.

The results:

Spark (Started in Eclipse, embedded Jetty, Plain session usage):
--------------------------------------------
Throughput: 78426 req / minute, 1307 req / second
average: 4 ms
median: 5 ms

Wicket (Started with mvn jetty:run):
--------------------------------------------
Throughput: 63994 req / minute, 1067 req / second
average: 6 ms
median: 7 ms

My results show that Spark is faster than Wicket. However, I'm not sure I've used the correct setup. @bhauer, your comments?

Edit: The OS used was Windows 7
Edit: I also ran the tests with ThreadLocals and got almost exactly the same result as for plain session usage.

Edit: Just for fun I also did a test with Spark using prepared statements instead of Hibernate and got a result of 86,000 req/minute. Is there a requirement that Hibernate be used, or could one use e.g. prepared statements instead?
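A prepared-statement variant along those lines might look like the sketch below. The World table and its id/randomNumber columns come from the benchmark's schema, but the code itself is an assumption, not the code that was benchmarked; connection acquisition (e.g. from a c3p0 pool) is left to the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.concurrent.ThreadLocalRandom;

// Sketch of a raw JDBC version of the /db query. The World table and its
// id/randomNumber columns follow the benchmark's schema; obtaining the
// Connection (pool, URL, credentials) is out of scope here.
public class RawDbQuery {
    static final String QUERY = "SELECT id, randomNumber FROM World WHERE id = ?";

    // The benchmark draws ids uniformly from [1, 10000]
    static int randomId() {
        return ThreadLocalRandom.current().nextInt(1, 10001);
    }

    static int fetchRandomNumber(Connection conn) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setInt(1, randomId());
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt("randomNumber");
            }
        }
    }
}
```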

@bhauer
Contributor

bhauer commented May 4, 2013

@perwendel Your data looks much more like what I would expect to see from Spark. This is why I raised the concern earlier: the data we gathered in Round 4 does not seem like an accurate representation. Given what I've seen of Spark, I would expect it to perform similarly to Gemini.

Based on that, I conjecture that the discrepancy in measured performance is due to a configuration glitch in our Spark test case, to the pairing of Spark and wrk (our load tool), or to Resin, the application server we're using.

Are you able to run the Python tester that @pfalls1 put together for this project? If you can reproduce the behavior we're seeing, I think you'd be able to isolate the problem more quickly.

Failing that, can you test with Resin?

Assuming we can find some time next week, we'll take a closer look on our side.

To answer your question about Hibernate: we are happy to accept tests that both do and do not use an ORM such as Hibernate. We identify tests that do not use an ORM as "raw" tests, referring to the use of raw database connectivity. Put another way, the "standard" configuration for a test would use an ORM, but we are happy to include tests with raw database connectivity as well.

@perwendel

@bhauer

I have almost no experience with Python and I'm not sure exactly how to run the Python tester. I installed Python 2.7 for Windows and tried to run setup.py but got an error:

ImportError: No module named setup_util

The same error was raised even after I installed Python setuptools (I think that was the correct package). Any thoughts?

I did test deploying Spark on Resin and the following results were obtained:

Raw:
74770 req/minute
ORM:
67273 req/minute

So I guess the problem is not with Resin.

@bhauer
Contributor

bhauer commented May 5, 2013

@perwendel Thanks for testing on Resin. I'll chat with @pfalls1 to see what we can figure out. I'd like to get this fixed for the next round.

@pfalls1
Contributor

pfalls1 commented May 6, 2013

@perwendel @trautonen I did a test using hibernate-local.cfg.xml rather than hibernate-jndi.cfg.xml and the results looked much better. I don't yet have an explanation for why the -jndi version would perform so poorly, but I'm OK with simply using the c3p0 connection configuration for Round 5 if that's what you're all comfortable with.

@trautonen
Contributor Author

@pfalls1 I'm fine with using the c3p0 DB pool for Round 5. I noticed a possible issue with the different DB pools and commented on it here: #134 (comment)

But if the Wicket tests are using exactly the same DB pool configured via JNDI, it's strange that there is such a big performance difference.
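For reference, the difference between the two configurations being compared is roughly the following. The Hibernate property names are real, but the values and the JNDI name are illustrative; the actual hibernate-local.cfg.xml and hibernate-jndi.cfg.xml in the repo may differ.

```xml
<!-- hibernate-local.cfg.xml style: Hibernate manages the pool via c3p0 -->
<property name="hibernate.connection.url">jdbc:mysql://localhost:3306/hello_world</property>
<property name="hibernate.c3p0.min_size">5</property>
<property name="hibernate.c3p0.max_size">100</property>

<!-- hibernate-jndi.cfg.xml style: the pool is whatever the container
     (Resin) binds under the JNDI name -->
<property name="hibernate.connection.datasource">java:comp/env/jdbc/hello_world</property>
```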

@bhauer
Contributor

bhauer commented May 7, 2013

@trautonen With respect to connection pools generally, for each framework we'd prefer to use the connection pool that is most commonly used: either the one that ships with the framework or the one recommended by its documentation.

We'll use c3p0 for Spark in Round 5 and ongoing rounds unless you tell us otherwise. Nevertheless, as an academic matter, it would be nice to investigate further why the JNDI configuration runs into a performance bottleneck.

Thanks for the help!

@bhauer
Contributor

bhauer commented Oct 10, 2013

@trautonen The following is not directly related to any of the above conversation, but this nevertheless seems like a proper place to bring this up.

We are preparing for Round 7 presently. I don't think Spark's database tests were working for us in previous rounds, but I wanted to run this by you in case you have time to take a look. When validating the URLs, the Spark server is returning an HTTP 500 response for the database tests.

-----------------------------------------------------
  Verifying URLs for spark
-----------------------------------------------------

VERIFYING JSON (/spark/json) ...
{"message":"Hello world"}
VERIFYING DB (/spark/db) ...
  [curl progress meter elided: no data received for ~30 seconds]
curl: (22) The requested URL returned error: 500
VERIFYING Query (/spark/db?queries=2) ...
  [curl progress meter elided]
curl: (22) The requested URL returned error: 500

Any thoughts? If not, don't worry. We will just continue to omit the database tests, but at some point it would be nice to get these resolved.

Thanks!

@trautonen
Contributor Author

@bhauer I tested the Spark setup locally and it worked fine. Can you see a stack trace, or is there anything in the application server logs that could help resolve the issue? I've made some minor updates to the project since the Spark framework has been updated too, but it would be great to fix the DB issues at the same time.
