Initializing one Connection cost ~300MB heap utilization and it reaches ~600+MB when I have 10 connections #16

Open
sssinghsyr opened this issue Jun 26, 2018 · 13 comments

Comments

@sssinghsyr

I was comparing blocking JDBC with this async-pgsql2 API.

Design:
Blocking JDBC - 1000 threads, each making its own connection to the DbServer. In this case, heap utilization was <100 MB.

Async-pgsql2 - 1000 async SQL submissions spread over 10 different connections. Heap utilization reaches >600 MB.

Any suggestions for improving this design?
I guess pgsql2 will only work efficiently when connections are re-used. Is a ConnectionPool feature available here? If not, is one planned?

How can I delete a connection? Each connection initializes multiple containers, which eat a huge amount of heap space.
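
For anyone reproducing this comparison, heap figures like the ones above can be sampled from inside the JVM. The following is a minimal sketch, not the code used for the measurements reported here; the class name `HeapUsage` and the 16 MB test allocation are illustrative assumptions. Values read from `Runtime` are approximate, because `System.gc()` is only a hint to the JVM.

```java
/** Illustrative helper for sampling heap usage; not part of pgsql2. */
final class HeapUsage {

    /** Returns the approximate number of heap bytes currently in use. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.gc(); // only a hint; the JVM may ignore it
        long before = usedHeapBytes();

        // Hold on to 16 x 1 MB so the allocation shows up in the second sample.
        byte[][] retained = new byte[16][];
        for (int i = 0; i < retained.length; i++) {
            retained[i] = new byte[1 << 20];
        }

        long after = usedHeapBytes();
        System.out.printf("before=%d MB, after=%d MB, retained arrays=%d%n",
                before >> 20, after >> 20, retained.length);
    }
}
```

Sampling before and after opening connections gives a rough per-connection cost; a profiler or heap dump is still needed to see which objects are responsible.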

@davecramer
Member

Can you elaborate on what you mean by reusing connections? I presume you are aware that connections cannot be shared?

@sssinghsyr
Author

I have a database connector service, built using this async-pgsql2 API, which connects to the DbServer.
1000 clients connect to the connector service, and it routes their SQL to the DbServer.
Having more connections eats a lot of heap space.

What does "connections cannot be shared" mean? I can use the same connection to submit different types of submission (different SQL queries).

@alexanderkjall
Collaborator

I think I will need to do a bit of profiling in order to have an informed opinion on how to do this better. Do you have any code to share on how you did your test?

My uninformed opinion is that the ByteBuffer usage in the network communication layer has potential to be optimized: we don't clear and reuse the buffers.

Regarding connection pools, we don't have that available, but it's clear that it's needed in order to use the driver efficiently, same as the normal JDBC driver. But it should maybe be its own project? Building a connection pool has its own challenges, and I guess that it should be database agnostic.

@davecramer
Member

> Regarding connection pools, we don't have that available, but it's clear that it's needed in order to use the driver efficiently, same as the normal JDBC driver. But it should maybe be its own project? Building a connection pool has its own challenges, and I guess that it should be database agnostic.

Wow, this is not a light undertaking. There are many good poolers out there, and they are significant projects in their own right. I would strongly advise that this is way out of scope.

@alexanderkjall
Collaborator

I totally agree. I tend to insert maybes when I'm still thinking things through; I will try to communicate more clearly.

@sssinghsyr
Author

@alexanderkjall
This is my project: async-db-connector.
I am testing with 1000 clients sending queries in parallel. Async-db-connector uses I/O multiplexing to receive all queries on a single thread and calls the async-pgsql2 method multipleRowOperation using a newly created CONNECTION.
Problem: each client query requires a new CONNECTION, which is added to the DataSource.connections LinkedQueue. There is no method to remove it from the list and free its resources once the CONNECTION has been used. Connections keep being added, and all of them are then iterated over unnecessarily.
This forces me to re-use connections to limit the overhead of creating them.

Please correct me if my understanding of CONNECTION is wrong.
Also, why can't I re-use a CONNECTION? @davecramer

@davecramer
Member

@sssinghsyr you can re-use connections. What you can't do is use the same connection from multiple threads concurrently.
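
To illustrate the distinction, here is a minimal sketch of re-using one connection for many submissions while keeping every use of it on a single thread. `FakeConnection` and `ConfinedConnection` are hypothetical stand-ins invented for this example, not part of the pgsql2 or ADBA API, and a single-thread executor is only one possible way to enforce the constraint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a driver connection; NOT the pgsql2 API. */
final class FakeConnection {
    // Deliberately not thread-safe, like a real connection's internal state.
    private final List<String> executed = new ArrayList<>();

    String execute(String sql) {
        executed.add(sql);
        return "ok:" + sql;
    }

    int executedCount() {
        return executed.size();
    }
}

/** Re-uses one connection, but only ever touches it from one owner thread. */
final class ConfinedConnection implements AutoCloseable {
    private final FakeConnection connection = new FakeConnection();
    private final ExecutorService owner = Executors.newSingleThreadExecutor();

    /** Callable from any thread; the work itself runs on the owner thread. */
    CompletableFuture<String> submit(String sql) {
        return CompletableFuture.supplyAsync(() -> connection.execute(sql), owner);
    }

    CompletableFuture<Integer> executedCount() {
        return CompletableFuture.supplyAsync(connection::executedCount, owner);
    }

    @Override
    public void close() {
        owner.shutdown(); // already-submitted work still completes
    }

    public static void main(String[] args) {
        try (ConfinedConnection c = new ConfinedConnection()) {
            List<CompletableFuture<String>> results = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                results.add(c.submit("SELECT " + i));
            }
            results.forEach(CompletableFuture::join);
            System.out.println(c.executedCount().join()); // prints 1000
        }
    }
}
```

One connection serves all 1000 submissions here, so re-use is fine; what the executor prevents is two threads touching the connection at the same time.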

@alexanderkjall
Collaborator

@sssinghsyr Thanks, I'll look into it and run it through a profiler. It sounds to me like there might be a lingering reference somewhere that doesn't get cleaned up when the connection is closed.

@sssinghsyr
Author

@alexanderkjall
Could you please help me set up this project's test suite? Can I run the test cases without Docker?
A link to setup instructions would be appreciated!

@alexanderkjall
Collaborator

@sssinghsyr How to install Docker depends greatly on what OS you are running, but maybe this can be a start: https://docs.docker.com/install/

Regarding the profiler, it turns out to be a bit trickier to run VisualVM with Java 10 than it was with Java 8, so I haven't had time to set that up yet.

@sagenschneider
Contributor

I would like to get #18 merged to see whether this is still an issue. The current thread, which runs at 100%, allocates a new 1 KB ByteBuffer on every loop iteration. This can quickly fill the heap.

Question: are there any OutOfMemoryErrors? If not, this is likely just the heap growing because of those allocations.
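
As a rough illustration of the difference between allocating a buffer on every loop iteration and clearing and re-using one, here is a minimal sketch. It is not the driver's actual loop; the class `BufferReuse`, its method names, and the use of strings as stand-in messages are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative comparison of two buffer strategies; not pgsql2 code. */
final class BufferReuse {
    static final int BUFFER_SIZE = 1024; // each message must fit in 1 KB

    /** Allocates a fresh heap buffer per message: one garbage object per loop. */
    static int allocatePerIteration(String[] messages) {
        int totalBytes = 0;
        for (String message : messages) {
            ByteBuffer buf = ByteBuffer.allocate(BUFFER_SIZE);
            buf.put(message.getBytes(StandardCharsets.UTF_8));
            buf.flip();
            totalBytes += consume(buf);
        }
        return totalBytes;
    }

    /** Allocates once, then clears and re-uses the same buffer for every message. */
    static int reuseSingleBuffer(String[] messages) {
        ByteBuffer buf = ByteBuffer.allocate(BUFFER_SIZE);
        int totalBytes = 0;
        for (String message : messages) {
            buf.clear(); // resets position and limit; does not zero the contents
            buf.put(message.getBytes(StandardCharsets.UTF_8));
            buf.flip();
            totalBytes += consume(buf);
        }
        return totalBytes;
    }

    /** Pretends to send the buffer and reports how many bytes it held. */
    private static int consume(ByteBuffer buf) {
        int n = buf.remaining();
        buf.position(buf.limit());
        return n;
    }

    public static void main(String[] args) {
        String[] messages = {"SELECT 1", "SELECT 2", "SELECT 3"};
        System.out.println(allocatePerIteration(messages)); // prints 24
        System.out.println(reuseSingleBuffer(messages));    // prints 24
    }
}
```

Both methods process the same bytes, but the first leaves one unreachable 1 KB buffer behind per iteration for the garbage collector, which in a loop that never idles shows up as a steadily growing heap rather than an error.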

@sssinghsyr
Author

@sagenschneider There were no OutOfMemoryErrors. I was comparing application memory usage between JDBC and ADBA.

@sagenschneider
Contributor

@sssinghsyr Yes, that is the likely cause. I'm close to finishing the Selector work. It is running for my NioLoop tests; I'm just trying to get it running for the remaining tests before merging it in to complete PR #19.

An early look is available here: https://github.com/sagenschneider/pgsql2/tree/NetworkLayer/src/main/java/org/postgresql/sql2/communication/network (note: I'm looking to read/write directly to ByteBuffers to avoid unnecessary copies of data and object creation).

With a Selector and reads from re-used direct ByteBuffers, it will consume a little more memory than the JDBC driver (though mostly in direct memory space and not on the heap). This, however, should not be in the hundreds of megabytes (and the buffers could potentially be shared with HTTP handling ByteBuffers to reduce the number of unused buffers in pools).
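
The general shape of that approach can be sketched as follows. This is not the code from the NetworkLayer branch: the class `SelectorLoopSketch` is invented for illustration, and a `java.nio.channels.Pipe` stands in for the socket to the database so the example runs on its own. It assumes each message is small enough to fit in the pipe's internal buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

/** Illustrative Selector loop with one re-used direct buffer; not pgsql2 code. */
final class SelectorLoopSketch {

    /** Writes each message into a pipe and reads it back via a Selector. */
    static String roundTrip(String[] messages) {
        // Allocated once, outside the heap, and re-used for every read.
        ByteBuffer readBuffer = ByteBuffer.allocateDirect(1024);
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open(); // stand-in for a socket (assumption)
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);
                for (String message : messages) {
                    ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                    int pending = out.remaining();
                    while (out.hasRemaining()) {
                        sink.write(out);
                    }
                    while (pending > 0) {
                        selector.select(); // blocks until the source is readable
                        Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                        while (keys.hasNext()) {
                            SelectionKey key = keys.next();
                            keys.remove();
                            if (key.isReadable()) {
                                pending -= drain((ReadableByteChannel) key.channel(),
                                        readBuffer, received);
                            }
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(received.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Reads once into the re-used buffer and copies the bytes out. */
    private static int drain(ReadableByteChannel channel, ByteBuffer buf,
                             ByteArrayOutputStream out) throws IOException {
        buf.clear();
        int n = channel.read(buf);
        if (n <= 0) {
            return 0;
        }
        buf.flip();
        while (buf.hasRemaining()) {
            out.write(buf.get());
        }
        return n;
    }

    public static void main(String[] args) {
        // prints "SELECT 1;SELECT 2;"
        System.out.println(roundTrip(new String[] {"SELECT 1;", "SELECT 2;"}));
    }
}
```

The thread blocks in `select()` instead of spinning, and the only per-read allocation is whatever the consumer of the bytes chooses to make, since the direct buffer itself is allocated once.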

4 participants