This repository was archived by the owner on Mar 20, 2023. It is now read-only.

RuntimeError: Task got bad yield: 200 #61

Closed

Description

@bisoldi

I'm getting a Task got bad yield: 200 error while using the bulk method. I've tried this several different ways (e.g., without my own generator function), and I've also tried an unofficial Helpers class someone wrote; I get the same error each time.

Here is the stack trace:

File "/Users/brooks.isoldi/git/Futures/css/ingest/PythonElasticsearchIngest/venv/lib/python3.6/site-packages/elasticsearch_async/transport.py", line 150, in main_loop
    method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
RuntimeError: Task got bad yield: 200
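
For context, asyncio raises "Task got bad yield" when a generator-based coroutine yields a value that is not a Future, so the 200 in the message is a plain value (presumably an HTTP status code) reaching the event loop directly. A minimal sketch, unrelated to Elasticsearch, that reproduces the same error on Python 3.6:

import asyncio

@asyncio.coroutine
def bad():
    # Yielding a bare value instead of a Future is what asyncio rejects
    # with "RuntimeError: Task got bad yield: 200".
    yield 200

asyncio.get_event_loop().run_until_complete(bad())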

Below is the relevant code:

import asyncio
import gzip
import json
import logging
import traceback

import boto3
from elasticsearch import RequestsHttpConnection
from elasticsearch_async import AsyncElasticsearch
from assume_role_aws4auth import AssumeRoleAWS4Auth

logger = logging.getLogger(__name__)

credentials = boto3.Session().get_credentials()
awsauth = AssumeRoleAWS4Auth(credentials, 'us-east-1', 'es')
event_loop = asyncio.get_event_loop()
es_client = AsyncElasticsearch(hosts=['https://MY-ES_HOST'], http_compress=True, http_auth=awsauth, use_ssl=True,
                               verify_certs=True, connection_class=RequestsHttpConnection, loop=event_loop)


def read_chunk(file_path: str, max_batch_size: int, max_records: int):
    actions: str = ''
    actions_size: int = 0
    num_actions: int = 0
    with gzip.open(file_path, 'rt') as f:
        for line in f:
            # Pair each record with an empty `index` action line; strip the newline
            # already present on `line` so the bulk body contains no blank lines.
            request = json.dumps({'index': {}}) + '\n' + line.rstrip('\n') + '\n'
            request_size = len(request.encode('utf-8'))

            # Check to see if this record will put us over the limits
            if (actions_size + request_size) > max_batch_size or num_actions == max_records:
                yield actions
                actions = ''
                num_actions = 0
                actions_size = 0

            # Add the record
            actions += request
            num_actions += 1
            actions_size += request_size

    if actions != '':
        yield actions


async def process(filename: str):
    for action_chunk in read_chunk(filename, 10000000, 500):
        try:
            resp = await es_client.bulk(body=action_chunk, index='logs', doc_type='doc', _source=False)
            logger.info(resp)  # a successful bulk response; info rather than error
        except Exception as ex:
            logger.error('Found an exception')
            logger.error(''.join(traceback.format_exception(etype=type(ex), value=ex, tb=ex.__traceback__)))
        await asyncio.sleep(.1)


event_loop.run_until_complete(process(filename))
pending = asyncio.Task.all_tasks()
event_loop.run_until_complete(asyncio.gather(*pending))
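
(For comparison: RequestsHttpConnection is the synchronous connection class from the plain elasticsearch package, while elasticsearch_async drives its connections as coroutines and defaults to AIOHttpConnection. Below is a sketch of the same client left on the async default; whether AssumeRoleAWS4Auth is compatible with the aiohttp-based connection is an assumption I have not verified.)

es_client = AsyncElasticsearch(hosts=['https://MY-ES_HOST'], http_auth=awsauth, use_ssl=True,
                               verify_certs=True, loop=event_loop)  # default AIOHttpConnection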

Any thoughts?
