Fix splitlines bug. #1

Closed
wants to merge 1 commit into from

Conversation


@unsub42 unsub42 commented Jun 21, 2023

Description

A number of connectors (exactly which ones is still under investigation, but at least Mandiant and Domaintools contribute to this issue) provide description fields containing non-ASCII characters, including U+0085 (NEL, "Next Line"). Python's str.splitlines function treats this character as a line terminator (it is an old IBM mainframe line terminator). When the SSEclient within the pycti library encounters it, the JSON bundle is split at that character, and the connector raises an exception when it calls json.loads on the truncated data.
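The splitting behaviour described above can be checked in isolation. The following is a minimal sketch using a made-up payload (the field name and text are illustrative assumptions, not data from an actual bundle). It compares str.splitlines, which treats U+0085 as a line boundary, with bytes.splitlines, which only splits on LF, CR and CRLF.

```python
# Illustrative payload only; real bundles are much larger.
payload = '{"description": "before\x85after"}'

# str.splitlines() treats U+0085 (NEL) as a line boundary,
# so the JSON text is cut into two fragments.
str_parts = payload.splitlines()
print(str_parts)

# bytes.splitlines() only splits on \n, \r and \r\n,
# so the UTF-8 encoded payload stays in one piece.
byte_parts = payload.encode("utf-8").splitlines()
print(len(byte_parts))
```

Neither fragment produced by the str variant is valid JSON on its own, which is consistent with the decode error reported below.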

Environment

  1. OS (where OpenCTI server runs): Ubuntu 22.02
  2. OpenCTI version: 5.7.6 and earlier
  3. OpenCTI client: python
  4. Other environment details: dockerized version

Reproducible Steps

Steps to create the smallest reproducible scenario:

  1. Run the backup-files stream connector.
  2. Wait for the offending data to be processed through the stream.
  3. The backup-files connector raises an exception:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/site-packages/pycti/connector/opencti_connector_helper.py", line 460, in run
        self.callback(msg)
      File "/opt/opencti-highside-sync/connectors-master/stream/backup-files/src/backup-files.py", line 78, in _process_message
        data = json.loads(msg.data)
    ujson.JSONDecodeError: Unmatched '"' when decoding 'string'
    Terminated
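The same kind of failure can probably be reproduced without a running stream. The sketch below is an assumption-laden stand-in: it uses the standard library json module rather than ujson, so the exception message differs from the one in the traceback, and the event payload is invented for illustration.

```python
import json

# Hypothetical event payload; ensure_ascii=False keeps U+0085 as a raw character.
event_data = json.dumps({"description": "before\x85after"}, ensure_ascii=False)

# Splitting with str.splitlines() leaves a truncated JSON fragment.
first_fragment = event_data.splitlines()[0]

try:
    json.loads(first_fragment)
    error = None
except json.JSONDecodeError as exc:
    # The string literal is never closed, so decoding fails.
    error = exc

print(type(error).__name__, error)
```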

Expected Output

The bundle in question written to an output file.

Actual Output

The bundle is not written. The backup-files connector restarts at the last saved timestamp,
re-processes files until it reaches the bundle in question, and then dies again. The cycle repeats
until the connector is stopped.

Additional information

N/A

Screenshots (optional)

N/A

@unsub42
Author

unsub42 commented Jun 21, 2023

@richard-julien This is the issue I discussed in email.

@unsub42
Author

unsub42 commented Nov 13, 2023

Bumping this up again. The problem has recurred (and my local workaround was not in place at the time). A Shodan enrichment for the IP address 149.210.166.136 contains the bytes 0xc2 0x85 (the UTF-8 encoding of U+0085) in the description. This causes the SSEclient to split on that character, disrupting processing of stream data.
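For anyone needing a stopgap, one possible approach is to split stream text only on the terminators that the server-sent events format defines (CRLF, LF, CR). The sketch below is not the local workaround mentioned above, nor pycti's actual code, nor whatever fix is eventually merged; `split_sse_lines` is a hypothetical helper name and the sample event is invented.

```python
import re

# The two bytes quoted above decode, as UTF-8, to the single code point U+0085.
nel = b"\xc2\x85".decode("utf-8")

# Only the line terminators defined for event streams: CRLF, CR, LF.
_SSE_LINE_END = re.compile(r"\r\n|\r|\n")


def split_sse_lines(text: str) -> list[str]:
    """Hypothetical helper: split on CR/LF only, leaving U+0085 untouched."""
    lines = _SSE_LINE_END.split(text)
    # Drop the empty trailing element left by a final terminator,
    # mirroring what str.splitlines() does.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


# Invented sample event containing U+0085 inside the JSON data field.
raw = 'data: {"description": "before' + nel + 'after"}\nid: 1\n'
lines = split_sse_lines(raw)
print(len(lines))
```

With this splitter the data line keeps the U+0085 character, so the JSON payload reaches json.loads intact.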

@flavienSindou

Thanks for your contribution. We fixed this with #2 and will soon release a new package version.

2 participants