Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WeightedConnectionPool.getConnection returns null that leads to NoLivingConnectionsError error #53

Open
jodevsa opened this issue Jul 7, 2022 · 8 comments
Labels
bug Something isn't working

Comments

@jodevsa
Copy link

jodevsa commented Jul 7, 2022

🐛 Bug Report

requests sent after 6 requests that failed due to a timeout with a weightedConnectionPool that has 2 upstreams will fail with NoLivingConnectionsError even if the upstreams is running again and not timing out anymore.

This is mostly due to the wrong assumption mentioned here:
https://github.com/elastic/elastic-transport-js/blob/main/src/pool/WeightedConnectionPool.ts#L54

if the GCD is 1 and both weights reached 1 after the subtraction we would need about 800 loops for current weight to reach 1 to be able to choose one of the weights

To Reproduce

Steps to reproduce the behavior:
1- clone my branch #54 which has a new test that should reproduce the bug
2- npm run build && node_modules/tap/bin/run.js test/unit/transport.test.ts

Paste your code here:

test('upstreams are down for the first 7 requests', async t => {


  const pool = new WeightedConnectionPool({ Connection: MockConnectionTimeoutForThefirstSevenRequests })

  pool.addConnection('https://localhost:9200')
  pool.addConnection('https://localhost:9201')

  const transport = new Transport({ connectionPool: pool, maxRetries: 0 })
  for(let i=0;i<20;i++){
  try {
    await transport.request({
      method: 'GET',
      path: '/hello'
    })
  } catch (err: any) {
    console.log(err.name)
    t.equal(err.name, 'TimeoutError')
  }
}
})




export class MockConnectionTimeoutForThefirstSevenRequests extends BaseConnection {
  requestCount = -1
  async request (params: ConnectionRequestParams, options: ConnectionRequestOptions): Promise<ConnectionRequestResponse>
  async request (params: ConnectionRequestParams, options: ConnectionRequestOptionsAsStream): Promise<ConnectionRequestResponseAsStream>
  async request (params: ConnectionRequestParams, options: any): Promise<any> {
    
    return new Promise((resolve, reject) => {

      this.requestCount++;
      if( this.requestCount<=6){
      process.nextTick(reject, new TimeoutError('Request timed out'))
      }
  


      const body = JSON.stringify({ hello: 'world' })
      const statusCode = setStatusCode(params.path)
      const headers = {
        'content-type': 'application/json;utf=8',
        date: new Date().toISOString(),
        connection: 'keep-alive',
        'content-length': '17',
        'x-elastic-product': 'Elasticsearch'
      }
      process.nextTick(resolve, { body, statusCode, headers })
    })
  }
}

Expected behavior

we are able to use the pool after the upstreams are up again

Your Environment

  • node version: v16.0.0
  • master branch of this repo
  • os: Mac
@jodevsa
Copy link
Author

jodevsa commented Jul 7, 2022

detected while working on nodejs/undici#1065 and taking inspiration from the logic in this repo

jodevsa added a commit to jodevsa/elastic-transport-js that referenced this issue Jul 7, 2022
@jodevsa jodevsa changed the title WeightedConnectionPool returns null/NoLivingConnectionsError on getRequest WeightedConnectionPool.getConnection returns null and leads to NoLivingConnectionsError error Jul 7, 2022
@jodevsa
Copy link
Author

jodevsa commented Jul 7, 2022

I'd like to open a PR with a fix after someone verifies my findings

@jodevsa jodevsa changed the title WeightedConnectionPool.getConnection returns null and leads to NoLivingConnectionsError error WeightedConnectionPool.getConnection returns null that leads to NoLivingConnectionsError error Jul 7, 2022
@rudolf
Copy link

rudolf commented Jul 29, 2022

@jodevsa
Copy link
Author

jodevsa commented Jul 29, 2022

@rudolf can I work on a fix?

@rudolf
Copy link

rudolf commented Aug 1, 2022

@jodevsa As far as I can tell we've got no one internally working on this so a fix would be very welcome!

//cc @delvedor

@technige
Copy link
Contributor

technige commented Aug 1, 2022

Hi @jodevsa. We would be delighted to receive a pull request for this issue. Please read the contributing guidelines, and let us know when your fix is ready to review. Thanks :)

@rudolf
Copy link

rudolf commented Aug 1, 2022

The algorithm seems to be inspired by https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.863.3985&rep=rep1&type=pdf

@jodevsa
Copy link
Author

jodevsa commented Oct 14, 2022

I just saw this :(

HI @technige ,

Thanks for your kind message. I'll start working on a fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants