URL encode dataset names to support multibyte characters #1198

Merged
merged 7 commits into from
Sep 26, 2024
Changes from 2 commits
3 changes: 2 additions & 1 deletion cosmos/operators/local.py
@@ -3,6 +3,7 @@
 import json
 import os
 import tempfile
+import urllib.parse
 import warnings
 from abc import ABC, abstractmethod
 from functools import cached_property
@@ -450,7 +451,7 @@ def get_datasets(self, source: Literal["inputs", "outputs"]) -> list[Dataset]:
         uris = []
         for completed in self.openlineage_events_completes:
             for output in getattr(completed, source):
-                dataset_uri = output.namespace + "/" + output.name
+                dataset_uri = output.namespace + "/" + urllib.parse.quote(output.name)
                 uris.append(dataset_uri)
         self.log.debug("URIs to be converted to Dataset: %s", uris)
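
As a rough illustration of what the changed line does, the sketch below applies `urllib.parse.quote` to a dataset name containing multibyte characters. The helper name `build_dataset_uri`, the namespace, and the table name are hypothetical and not taken from the PR; only the concatenation and the `quote` call mirror the diff. With its default arguments, `quote` encodes the name as UTF-8, percent-escapes every non-ASCII byte, and leaves `/` untouched.

```python
import urllib.parse


def build_dataset_uri(namespace: str, name: str) -> str:
    # Hypothetical helper mirroring the changed line in get_datasets():
    # the namespace is left as-is, only the dataset name is percent-encoded.
    return namespace + "/" + urllib.parse.quote(name)


# Illustrative values only; not taken from the PR.
namespace = "postgres://host:5432"
name = "db.schema.テーブル"

uri = build_dataset_uri(namespace, name)
print(uri)
# postgres://host:5432/db.schema.%E3%83%86%E3%83%BC%E3%83%96%E3%83%AB

# The encoded URI is pure ASCII, and the original name can be recovered.
encoded_name = uri[len(namespace) + 1:]
print(uri.isascii())                               # True
print(urllib.parse.unquote(encoded_name) == name)  # True
```

ASCII-only names made of letters, digits, and `_.-~` pass through `quote` unchanged, so existing URIs built from such names should not be affected by the change.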
