Add support for text type #204

Closed
13 tasks done
Tracked by #241 ...
sirex opened this issue Mar 9, 2022 · 1 comment · Fixed by #304
sirex commented Mar 9, 2022

The manifest makes it possible to separate natural-language text properties from other string types.

Specification: https://atviriduomenys.readthedocs.io/dsa/duomenu-tipai.html#tekstiniai-duomenys

The text type should be implemented only on the PostgreSQL backend.

On PostgreSQL, text type properties should be saved as JSONB objects, like this:

{
  "lt": "Pavyzdys",
  "en": "Example",
  "": "Example"
}

Here the "" key is used when the language code for a property is not specified.

  • Add support for text type.
  • Parse name@lang string and store language code as type parameter.
  • Implement write operation on text types.
  • Implement read operation on text types.
    • Select language by Accept-Language.
    • Fall back to default language specified in configuration.
  • Test if patching and changelog works properly.
  • Test if reading from external sources works properly.
  • Test if pushing works properly.
  • Add support for the case where no language code is given: the value should be saved under an empty language code, and level must be less than or equal to 3 (level 2 if multiple languages are mixed).
  • Test if select(prop@lang) works.
  • Test if prop@lang=value works.
  • Test if sort(prop@lang) works.
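
For the "parse name@lang" task above, a minimal sketch of the splitting step might look like the following. The function name split_lang_tag is hypothetical and is not taken from the Spinta code base; it only illustrates separating the property name from the language code.

```python
from typing import Optional, Tuple


def split_lang_tag(prop: str) -> Tuple[str, Optional[str]]:
    """Split 'name@lt' into ('name', 'lt').

    Returns (prop, None) when the property carries no language tag.
    Hypothetical helper, not part of Spinta.
    """
    name, sep, lang = prop.partition("@")
    if not sep:
        # No '@' at all: plain property without a language tag.
        return prop, None
    if not name or not lang:
        # '@lt' or 'name@' are treated as malformed in this sketch.
        raise ValueError(f"Malformed language-tagged property: {prop!r}")
    return name, lang
```

The language code returned here could then be stored as a type parameter, as the task list suggests.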

Links

@sirex sirex moved this to Tikrinama in II duomenų atvėrimo etapas Mar 9, 2022
@sirex sirex self-assigned this Mar 9, 2022
@sirex sirex moved this to Todo in Portalo plėtra Mar 9, 2022
@sirex sirex removed the status in Portalo plėtra Jun 7, 2022
@sirex sirex moved this to Todo in Portalo plėtra Jun 8, 2022
@sirex sirex removed the status in Portalo plėtra Jun 8, 2022
@sirex sirex removed their assignment Jun 13, 2022
@sirex sirex linked a pull request Oct 2, 2022 that will close this issue
@sirex sirex added this to Palaikymas Nov 8, 2022
@adp-atea adp-atea moved this to In Progress in Portalo plėtra Feb 8, 2023
@adp-atea adp-atea moved this from In Progress to Review in Portalo plėtra Feb 13, 2023
@sirex sirex moved this from Review to Todo in Portalo plėtra Feb 17, 2023
@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra Feb 20, 2023
@adp-atea adp-atea moved this from In Progress to Todo in Portalo plėtra Feb 21, 2023
@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra Feb 23, 2023
@adp-atea adp-atea moved this from In Progress to Todo in Portalo plėtra Mar 1, 2023
@sirex sirex removed this from Palaikymas Mar 6, 2023
@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra Mar 23, 2023
@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra Apr 18, 2023
@adp-atea adp-atea moved this from In Progress to Todo in Portalo plėtra May 8, 2023
@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra May 11, 2023
@adp-atea adp-atea moved this from In Progress to Review in Portalo plėtra May 17, 2023
@sirex sirex moved this from Review to Todo in Portalo plėtra Jul 26, 2023
@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra Aug 2, 2023
@adp-atea adp-atea moved this from In Progress to Review in Portalo plėtra Aug 11, 2023
@adp-atea adp-atea moved this from Review to In Progress in Portalo plėtra Aug 11, 2023
@adp-atea adp-atea moved this from In Progress to Review in Portalo plėtra Aug 15, 2023
@adp-atea adp-atea moved this from Review to In Progress in Portalo plėtra Oct 19, 2023
@adp-atea adp-atea moved this from In Progress to Todo in Portalo plėtra Oct 19, 2023

sirex commented Oct 25, 2023

A suggestion: maybe we should separate text from string types, like this:

d | r | b | m | property | type   | ref | source  | uri
example                  |        |     |         |
                         | prefix | dct |         | http://purl.org/dc/terms/
  |   |   | City         |        |     | CITIES  |
  |   |   |   | name     | text   |     |         | dct:title
  |   |   |   | name@en  | string |     | NAME_EN |
  |   |   |   | name@lt  | string |     | NAME_LT |

Here name has the text type, and we can specify a uri for name that also covers all of its language tags. We only have to specify uri on the text type, because RDF has built-in support for languages.

name@en and name@lt are defined as string, but since the property name carries a language tag, we know that these strings are part of name.

The text type could also be implied, without explicitly declaring it, like this:

d | r | b | m | property | type   | ref | source
example                  |        |     |
  |   |   | City         |        |     | CITIES
  |   |   |   | name@en  | string |     | NAME_EN
  |   |   |   | name@lt  | string |     | NAME_LT

Here we know that name is a text, because the property names have language tags.
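
As a rough illustration of how an implied text property could be detected, the sketch below groups language-tagged property names by their base name. The function implied_text_props is a hypothetical name used only for this example; the real manifest loader may work quite differently.

```python
from collections import defaultdict
from typing import Dict, List


def implied_text_props(prop_names: List[str]) -> Dict[str, List[str]]:
    """Map each implied text property to its declared language codes.

    Hypothetical helper: a property such as 'name@en' implies a text
    property 'name' with language 'en'. Untagged properties are ignored.
    """
    langs: Dict[str, List[str]] = defaultdict(list)
    for prop in prop_names:
        name, sep, lang = prop.partition("@")
        if sep and name and lang:
            langs[name].append(lang)
    return dict(langs)
```

For the table above, implied_text_props(["name@en", "name@lt"]) would report name as a text property declared for en and lt.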

Regarding source, suppose we have the following manifest table:

d | r | b | m | property | type   | ref | source | level
example                  |        |     |        |
  |   |   | City         |        |     | CITIES |
  |   |   |   | name     | text   |     | NAME   | 2

If we don't know the language, or the data has mixed languages, then we can read data directly into the text type. This would result in:

{"name": {"": "Vilnius"}}

An empty language tag does not mean a default language; it means that the language is unknown.

When reading or writing, the text type should accept two forms:

Implicit form:

{"name": "Vilnius"}

Explicit form:

{"name": {"lt": "Vilnius"}}

When querying data, for example here:

/example/City?select(name)&name.startswith("V")&sort(name)

we use the implicit form, since no language tag is specified. When no language tag is specified, we always detect the language from:

  • The Accept-Language header, to pick the client's preferred language.
  • If the client does not provide Accept-Language, or if the preferred language is not available (is not declared in the manifest), then fall back to the default language specified in the Spinta languages configuration and pick the first language declared in the manifest.
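
The detection order above could be sketched roughly as follows. This is one possible reading of the rules, not the actual Spinta implementation: parse_accept_language and pick_language are hypothetical names, the header parsing is deliberately simplified (only primary subtags and q-values), and declared is assumed to be a non-empty list of languages from the manifest.

```python
from typing import List, Optional, Sequence


def parse_accept_language(header: str) -> List[str]:
    """Return primary language codes from Accept-Language, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        tag = tag.strip().lower()
        if tag == "*" or quality <= 0:
            continue
        # Keep only the primary subtag: 'en-US' -> 'en'.
        weighted.append((-quality, position, tag.split("-")[0]))
    return [lang for _, _, lang in sorted(weighted)]


def pick_language(
    accept_language: Optional[str],
    declared: Sequence[str],
    defaults: Sequence[str],
) -> str:
    """Pick a language for an untagged text value (assumed order)."""
    # 1. Client preference, if it is declared in the manifest.
    for lang in parse_accept_language(accept_language or ""):
        if lang in declared:
            return lang
    # 2. Configured default languages, if declared in the manifest.
    for lang in defaults:
        if lang in declared:
            return lang
    # 3. Otherwise the first language declared in the manifest.
    return declared[0]
```

With declared languages ["lt", "en"] and a header of "en-US,en;q=0.9,lt;q=0.8", this sketch picks en; with no header and no matching default it picks lt.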

For example if we write data like this:

{"name": "Vilnius"}

Then we use the language detection described above and set the client's preferred language or the system default language. If the client prefers en and en is declared in the manifest, then we convert the above value into:

{"name": {"en": "Vilnius"}}

But if the explicit form is given, then we do not do any detection and use the specified language. If the specified language is not declared, then we raise an error.
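
A minimal sketch of this write-side normalization, under stated assumptions: normalize_text_value is a hypothetical name, detected is the language already chosen by the detection step, and declared is the list of language codes allowed for the property (the caller would include "" there if values with an unknown language are permitted).

```python
from typing import Dict, Sequence, Union


def normalize_text_value(
    value: Union[str, Dict[str, str]],
    declared: Sequence[str],
    detected: str,
) -> Dict[str, str]:
    """Convert implicit or explicit text input to the explicit form."""
    if isinstance(value, str):
        # Implicit form: store under the detected language.
        return {detected: value}
    # Explicit form: no detection, but every language must be declared.
    undeclared = [lang for lang in value if lang not in declared]
    if undeclared:
        raise ValueError(f"Undeclared language(s): {undeclared}")
    return dict(value)
```

For example, normalize_text_value("Vilnius", ["lt", "en"], "en") gives {"en": "Vilnius"}, while an explicit {"de": "Wilna"} raises an error because de is not declared.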

If we want to refer to an unknown language, we use C as the language code in @ expressions, for example:

/example/City?select(name@C)&name@C.startswith("V")&sort(name@C)

Here we explicitly refer to the unknown language, which is represented internally as:

{"name": {"": "Vilnius"}}
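
The mapping between the C code used in queries and the empty key used in storage could be sketched like this. The names UNKNOWN_LANG_CODE, storage_key and get_text are hypothetical and only illustrate the idea.

```python
from typing import Dict, Optional

# 'C' in an @ expression refers to the unknown language (from the proposal).
UNKNOWN_LANG_CODE = "C"


def storage_key(lang: str) -> str:
    """Translate a query language code to the key stored in the JSONB object."""
    return "" if lang == UNKNOWN_LANG_CODE else lang


def get_text(value: Dict[str, str], lang: str) -> Optional[str]:
    """Look up a text value by query language code; None if absent."""
    return value.get(storage_key(lang))
```

So a query on name@C would read the "" key, while name@lt reads the "lt" key unchanged.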

@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra Oct 27, 2023
@adp-atea adp-atea moved this from In Progress to Todo in Portalo plėtra Nov 6, 2023
@adp-atea adp-atea moved this from Todo to In Progress in Portalo plėtra Nov 7, 2023
@adp-atea adp-atea moved this from In Progress to Review in Portalo plėtra Nov 14, 2023
@github-project-automation github-project-automation bot moved this from Review to Done in Portalo plėtra Nov 21, 2023
@sirex sirex moved this from Done to Deploy in Portalo plėtra Nov 21, 2023
@sirex sirex moved this from Deploy to Test in Portalo plėtra Nov 21, 2023
@ATEAanalyst ATEAanalyst added the U5 label Aug 27, 2024