404 returned when querying nodes that contain URL Encoded characters #1761

Open
LyndonArmitage opened this issue Nov 11, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@LyndonArmitage

With an event similar to the one attached below, three namespaces are created: two dataset namespaces and one job namespace.

When trying to access the lineage for one of these datasets, a 404 is returned:

URL: http://localhost:3000/api/v1/lineage/?nodeId=dataset:jdbc%3Ah2%3Amem%3Asql_tests_like:HBMOFA.ORDDETP
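For reference, the namespace in that URL is the percent-encoded form of the input namespace from the event. A minimal Java sketch of the round trip, using only the JDK (Java 10 or later); the class and method names are illustrative, and whether Marquez decodes the node ID this way is an assumption, not something confirmed in this issue:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class NodeIdEncodingDemo {
  // Percent-encode a value the way a browser form or query string would.
  static String encode(String raw) {
    return URLEncoder.encode(raw, StandardCharsets.UTF_8);
  }

  // Reverse the percent-encoding.
  static String decode(String encoded) {
    return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Namespace taken from the "inputs" entry of the event.
    System.out.println(encode("jdbc:h2:mem:sql_tests_like"));
    // prints jdbc%3Ah2%3Amem%3Asql_tests_like

    // nodeId query parameter from the failing request.
    System.out.println(decode("dataset:jdbc%3Ah2%3Amem%3Asql_tests_like:HBMOFA.ORDDETP"));
    // prints dataset:jdbc:h2:mem:sql_tests_like:HBMOFA.ORDDETP
  }
}
```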

{
  "eventType": "COMPLETE",
  "eventTime": 1636646662.687894,
  "run": {
    "runId": "d3968e5f-84ac-48c1-954c-f999ff27ef3a",
    "facets": null
  },
  "job": {
    "namespace": "sql-runner-dev",
    "name": "ORDDETP - ORDDETP.avro",
    "facets": {
      "sourceCodeLocation": null,
      "sql": {
        "_producer": "lyndon-thinkpad/127.0.1.1",
        "_schemaURL": "https://openlineage.io/spec/facets/1-0-0/SQLJobFacet.json#/$defs/SQLJobFacet",
        "query": "SELECT t.*, CURRENT_DATE AS ingest_date FROM HBMOFA.ORDDETP t WHERE (BAYY >= 93 AND BAMMDD >= 801 AND BAMMDD < 1111) OR (BAYY = 92 AND BAMMDD >= 1301)"
      },
      "documentation": {
        "_producer": "lyndon-thinkpad/127.0.1.1",
        "_schemaURL": "https://openlineage.io/spec/facets/1-0-0/DocumentationJobFacet.json#/$defs/DocumentationJobFacet",
        "description": "SQL Runner Job for /tmp/sql_runner_tests4560779590026189736/config.conf"
      }
    }
  },
  "inputs": [
    {
      "namespace": "jdbc:h2:mem:sql_tests_like",
      "name": "HBMOFA.ORDDETP",
      "facets": {
        "documentation": null,
        "dataSource": {
          "_producer": "lyndon-thinkpad/127.0.1.1",
          "_schemaURL": "https://openlineage.io/spec/facets/1-0-0/DatasourceDatasetFacet.json#/$defs/DatasourceDatasetFacet",
          "name": "jdbc:h2:mem:sql_tests_like",
          "uri": "jdbc:h2:mem:sql_tests_like"
        },
        "schema": null
      },
      "inputFacets": null
    }
  ],
  "outputs": [
    {
      "namespace": "s3://sql-runner",
      "name": "2021-11-11/incremental/ORDDETP.avro",
      "facets": {
        "documentation": null,
        "dataSource": {
          "_producer": "lyndon-thinkpad/127.0.1.1",
          "_schemaURL": "https://openlineage.io/spec/facets/1-0-0/DatasourceDatasetFacet.json#/$defs/DatasourceDatasetFacet",
          "name": "s3://sql-runner",
          "uri": "s3://sql-runner"
        },
        "schema": null
      },
      "outputFacets": null
    }
  ],
  "producer": "lyndon-thinkpad/127.0.1.1",
  "schemaURL": "https://openlineage.io/spec/1-0-2/OpenLineage.json#/$defs/RunEvent"
}

A partial stack trace from the Docker container logs follows:

marquez-api  | 172.21.0.4 - - [11/Nov/2021:16:49:56 +0000] "GET /api/v1/namespaces/jdbc%3Ah2%3Amem%3Asql_tests_like/datasets/HBMOFA.ORDDETP/versions?limit=100&offset=0 HTTP/1.1" 200 648 "http://localhost:3000/lineage/dataset/jdbc%3Ah2%3Amem%3Asql_tests_like/HBMOFA.ORDDETP" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36" 27
marquez-api  | ERROR [2021-11-11 16:49:59,168] io.dropwizard.jersey.errors.IllegalStateExceptionMapper: Error handling a request: 7b955941cb928966
marquez-api  | ! java.lang.IllegalStateException: No match available
marquez-api  | ! at java.base/java.util.regex.Matcher.start(Unknown Source)
marquez-api  | ! at marquez.service.models.NodeId.parts(NodeId.java:190)
marquez-api  | ! at marquez.service.models.NodeId.asDatasetId(NodeId.java:214)
marquez-api  | ! at marquez.service.LineageService.getJobUuid(LineageService.java:191)
marquez-api  | ! at marquez.service.LineageService.lineage(LineageService.java:40)
marquez-api  | ! at marquez.api.OpenLineageResource.getLineage(OpenLineageResource.java:96)
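The "No match available" message at the top of the trace is what java.util.regex.Matcher.start() throws when it is called without a preceding successful match. The sketch below reproduces that behaviour in isolation; it does not use any Marquez code, and the class and method names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NoMatchDemo {
  // Calls find() and then start() unconditionally, ignoring whether find() succeeded.
  // Returns the match position as text, or the exception message if start() throws.
  static String findThenStart(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    m.find(); // result deliberately ignored
    try {
      return "match at " + m.start();
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(findThenStart(":", "a:b")); // prints match at 1
    System.out.println(findThenStart(":", "ab"));  // prints No match available
  }
}
```

Checking the boolean returned by find() before calling start() avoids the exception.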

Note that the original OpenLineage POST request succeeds.

@LyndonArmitage
Author

For more detail and the original discussion see the OpenLineage Slack here: https://openlineage.slack.com/archives/C01CK9T7HKR/p1636625611097100

@LyndonArmitage
Author

Presumably the issue is in src/main/java/marquez/service/models/NodeId.java, specifically in this method:

private String[] parts(int expectedParts, String expectedType)

Perhaps there is an issue with the regex pattern used?

public static final String ID_DELIM = ":"; // line 34
Pattern p = Pattern.compile("(?:" + ID_DELIM + "(?!//|\\d+))"); // line 182
// means the pattern is equal to:
Pattern p = Pattern.compile("(?::(?!//|\\d+))");

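I think this means the pattern matches any ":" that is not immediately followed by "//" or by a digit. (The original comment illustrated this with a regex diagram image.)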
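To check what the quoted pattern accepts as a delimiter, the following Java sketch runs it against two node IDs. Assumptions: the first ID is the URL-decoded form of the nodeId from the failing request (the issue does not confirm where decoding happens), and the second ID is a hypothetical one built from the output dataset in the event using the same type:namespace:name layout. The class and method names are illustrative and are not part of Marquez.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NodeIdPatternDemo {
  // Delimiter and pattern as quoted from NodeId.java in this issue.
  static final String ID_DELIM = ":";
  static final Pattern P = Pattern.compile("(?:" + ID_DELIM + "(?!//|\\d+))");

  // Returns the index of every ':' that the pattern treats as a delimiter.
  static List<Integer> delimiterPositions(String value) {
    List<Integer> positions = new ArrayList<>();
    Matcher m = P.matcher(value);
    while (m.find()) {
      positions.add(m.start());
    }
    return positions;
  }

  public static void main(String[] args) {
    // Assumed decoded form of the nodeId in the failing request.
    String input = "dataset:jdbc:h2:mem:sql_tests_like:HBMOFA.ORDDETP";
    System.out.println(delimiterPositions(input)); // prints [7, 12, 15, 19, 34]
    System.out.println(P.split(input).length);     // prints 6

    // Hypothetical node ID for the output dataset of the event.
    String output = "dataset:s3://sql-runner:2021-11-11/incremental/ORDDETP.avro";
    System.out.println(delimiterPositions(output)); // prints [7]
  }
}
```

Under these assumptions the pattern accepts all five colons in the first ID, so the colons inside the JDBC namespace are indistinguishable from the ID delimiters. In the second ID it accepts only the first colon: the one after s3 is skipped because of "//", and the one before the dataset name is skipped because the name starts with a digit.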
