
Bulk helper semaphore handling results in hanging await for long-running requests (exceeding flushInterval) #1562

Closed
brentmjohnson opened this issue Oct 5, 2021 · 9 comments

@brentmjohnson

🐛 Bug Report

Bulk helper hangs forever when flushInterval is exceeded while iterator is already awaiting semaphore.

To Reproduce

Steps to reproduce the behavior:

  1. Run the code below against a test cluster - it should complete successfully.
  2. Simulate a long-running server-side operation that exceeds the configured flushInterval. There are multiple ways to do this, but one is to modify the compiled Helpers.js with the following:
client.bulk(Object.assign({}, bulkOptions, { body: bulkBody }), reqOptions, async (err, { body }) => {
await new Promise(resolve => setTimeout(resolve, flushInterval)); // artificial delay added for the repro
if (err) return callback(err, null)
  3. Re-run the code below and watch the hanging await caused by onFlushTimeout() being invoked on a payload that is already awaiting semaphore(), as sketched below.
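A heavily simplified sketch of that race (illustration only; it assumes a semaphore that keeps a single pending resolver, which is not necessarily the helper's exact implementation):

// Illustration only: a semaphore that keeps a single pending resolver.
// While the one concurrency slot is taken, each new awaiter overwrites the
// previous resolver, so one of the two concurrent awaiters never resumes.
const concurrency = 1
let running = 0
let resolveSemaphore = null

function semaphore () {
    if (running >= concurrency) {
        return new Promise(resolve => { resolveSemaphore = resolve })
    }
    running += 1
    return Promise.resolve()
}

function release () {
    if (resolveSemaphore != null) {
        const resolve = resolveSemaphore
        resolveSemaphore = null
        resolve()
    } else {
        running -= 1
    }
}

semaphore().then(() => console.log('iterator acquired the slot'))      // takes the only slot
semaphore().then(() => console.log('iterator waiting for next slot'))  // stores a resolver
semaphore().then(() => console.log('flush timer acquired the slot'))   // overwrites that resolver
setTimeout(release, 100) // releasing once resumes only the flush timer; the other await never settles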

Code to reproduce:

'use strict'
const util = require("util")
const { Readable } = require('stream');
const { Client } = require('@elastic/elasticsearch');

async function* generator() {
    let i = 0
    while (i < 10) {
        await new Promise(resolve => setTimeout(resolve, 1 * 1000));
        yield { i: i++ };
    }
};

const readableStream = Readable.from(generator());

const elasticClient = new Client({
    [TESTCLUSTER] // connection options redacted
});

(async () => {
    const bulkHelper = elasticClient.helpers.bulk({
        flushBytes: 43, // tiny threshold so every document triggers a flush
        concurrency: 1,
        datasource: readableStream,
        onDocument(doc) {
            console.log(doc)
            return {
                index: { _index: 'semaphoretest' }
            }
        },
        onDrop(doc) {
            console.log(doc);
        }
    }).catch((err) => {
        console.error(err);
    });

    while (util.inspect(bulkHelper).includes('pending')) { // poll the promise state to show it never settles
        await new Promise(resolve => setTimeout(resolve, 1 * 1000));
        console.log('...waiting');
    }

    console.log(await bulkHelper);
})();

Expected behavior

Bulk helper awaits gracefully for queued requests to complete, error, or timeout.

Observed results:

{ i: 0 }
...waiting
{ i: 1 }
...waiting
...waiting
...waiting
...[forever]

Your Environment

  • node version: v14.17.6
  • @elastic/elasticsearch version: >=7.15.0
  • os: Linux
@p1x44r

p1x44r commented Dec 31, 2021

Yes, I can confirm this. In my case it tends to happen only when I push for larger concurrency and/or flushBytes values (which the process's performance should allow for). I'm streaming data from large MySQL tables concurrently, and the bulk helper hangs seemingly at random during the 40+ minute indexing process.

node version: v16.13.1
@elastic/elasticsearch version: 7.13.0
OS: macOS/RHEL 8 (confirmed on both)

@YoannMa

YoannMa commented Mar 30, 2022

A workaround that's working for me is to set the flushInterval to Math.pow(2, 31) - 1 and rely only on the flushBytes threshold. It's not pretty, though.
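Applied to the repro options from the top of this thread, that workaround would look roughly like this (a sketch only; option values other than flushInterval are illustrative):

const bulkHelper = elasticClient.helpers.bulk({
    datasource: readableStream,
    flushBytes: 5000000,                // rely on the byte threshold to trigger flushes
    flushInterval: Math.pow(2, 31) - 1, // largest delay setTimeout accepts, so the timer effectively never fires
    concurrency: 1,
    onDocument (doc) {
        return { index: { _index: 'semaphoretest' } }
    },
    onDrop (doc) {
        console.error(doc)
    }
})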

@rprovodenko
Contributor

We seem to be experiencing the same issue

@stale

stale bot commented Apr 27, 2022

We understand that this might be important for you, but this issue has been automatically marked as stale because it has not had recent activity either from our end or yours.
It will be closed if no further activity occurs, please write a comment if you would like to keep this going.

Note: in the past months we have built a new client, which has just landed in master. If you want to open an issue or a PR for the legacy client, you should do that in https://github.com/elastic/elasticsearch-js-legacy

@stale stale bot added the stale label Apr 27, 2022
@YoannMa

YoannMa commented Apr 27, 2022

(comment to avoid stale)

@stale stale bot removed the stale label Apr 27, 2022
@delvedor delvedor added the bug label May 23, 2022
@delvedor
Member

delvedor commented May 23, 2022

Heya, I wrote this to reproduce the issue:

// imports assumed to match the client's test suite (same helpers used in the script later in this thread)
import { test } from 'tap'
import { Readable } from 'stream'
import { Client } from '../'
import { buildServer } from './utils'
import { sleep } from './integration/helper'

const dataset = [
  { user: 'jon', age: 23 },
  { user: 'arya', age: 18 },
  { user: 'tyrion', age: 39 }
]

test('issue #1562', async t => {
  async function handler (req, res) {
    console.log(req.url)
    // respond after 1000ms, the same duration as the flushInterval below
    setTimeout(() => {
      res.writeHead(200, { 'content-type': 'application/json' })
      res.end(JSON.stringify({ errors: false, items: [{}] }))
    }, 1000)
  }

  const [{ port }, server] = await buildServer(handler)
  const client = new Client({ node: `http://localhost:${port}` })

  async function * generator () {
    const data = dataset.slice()
    for (const doc of data) {
      await sleep(1000)
      yield doc
    }
  }

  const result = await client.helpers.bulk({
    datasource: Readable.from(generator()),
    flushBytes: 1,
    flushInterval: 1000,
    concurrency: 1,
    onDocument (doc) {
      return {
        index: { _index: 'test' }
      }
    },
    onDrop (doc) {
      t.fail('This should never be called')
    }
  })

  t.type(result.time, 'number')
  t.type(result.bytes, 'number')
  t.match(result, {
    total: 3,
    successful: 3,
    retry: 0,
    failed: 0,
    aborted: false
  })

  server.stop()
})

My observations:

  • if the flushInterval is exactly the same as the server timeout, the third request is never sent
  • if the flushInterval is different from the server timeout (it doesn't matter whether it's higher or lower), the code works as expected

It happens in both v7 and v8.
This is weird; we'll investigate. Thank you for reporting!

@JoshMock
Member

JoshMock commented Oct 4, 2023

@brentmjohnson I have a potential fix for this in #2027. Details in the comments there for how to test it out, if this is still affecting you. Barring any surprises during testing, the fix should be ready to go out in the next patch or minor release.

@JoshMock
Member

JoshMock commented Dec 6, 2023

Writing a note for future-me, in case I need to test this again: the code below was useful for testing bulk flushInterval issues like the one described here. Not quite clean enough to add to any test suite, but still nice to hold on to. 😄

import * as http from 'http'
import { Readable } from 'stream'
import { Client } from '../'
import { buildServer } from './utils'
import { sleep } from './integration/helper'

const flushInterval = 1000

const dataset = [
  { user: 'jon', age: 23 },
  { user: 'arya', age: 18 },
  { user: 'tyrion', age: 39 }
]

async function handler (req: http.IncomingMessage, res: http.ServerResponse) {
  // respond after 1400ms, exceeding the 1000ms flushInterval below
  setTimeout(() => {
    res.writeHead(200, { 'content-type': 'application/json' })
    res.end(JSON.stringify({ errors: false, items: [{}] }))
  }, 1400)
}

async function main() {
  const [{ port }, server] = await buildServer(handler)
  const client = new Client({ node: `http://localhost:${port}` })

  console.log('one')
  await sleep(10000)
  console.log('two')
  await sleep(10000)
  console.log('three')
  await sleep(10000)

  const result = await client.helpers.bulk({
    datasource: Readable.from(generator()),
    flushBytes: 1,
    flushInterval: flushInterval,
    concurrency: 1,
    onDocument (_) {
      return {
        index: { _index: 'test' }
      }
    },
    onDrop (_) {
      throw new Error('onDrop')
    }
  })

  console.log(result)

  server.stop()
}

let generated = 0
async function * generator () {
  const data = dataset.slice()
  for (const doc of data) {
    await sleep(flushInterval)
    generated++
    console.log(`generated ${generated}`)
    yield doc
  }
}

main()
  .then(() => console.log('then'))
  .catch((err) => console.error('catch', err))
  .finally(() => console.log('finally'))

@JoshMock
Member

Merged and released in 8.12.1. Thanks to @pquentin for handling the last steps while I was out on leave. 🙏
