Linking attributes to data sources #903

mfrictionless · 2021-06-18T01:07:31Z

Hey Everyone,

I'm looking for some help connecting attributes to their data sources using the Consumer API.

It's not immediately clear to me how the attributes in the input and output schemas are grouped and connected to their data sources. Any help here would be appreciated. This is the execution plan response I'm working from:

{
    "executionPlan": {
        "_id": "0dc6ba88-d557-48a7-9814-30ef8760aa87",
        "name": null,
        "systemInfo": {
            "name": "spark",
            "version": "2.4.2"
        },
        "agentInfo": {
            "name": "spline",
            "version": "0.6.1"
        },
        "extra": {
            "appName": "Codeless Init Example Job",
            "attributes": [
                {
                    "id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:10",
                    "name": "date",
                    "dataTypeId": "481b9029-ba21-4503-b1bc-45817d671071"
                },
                {
                    "id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:11",
                    "name": "domain_code",
                    "dataTypeId": "6c8001b8-6b0f-4918-8831-787dbf079e76"
                },
                {
                    "id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:12",
                    "name": "page_title",
                    "dataTypeId": "6c8001b8-6b0f-4918-8831-787dbf079e76"
                },
                {
                    "id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:13",
                    "name": "count_views",
                    "dataTypeId": "85cd2984-74af-46ad-8619-977dd8c0ee10"
                },
                {
                    "id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:14",
                    "name": "total_response_size",
                    "dataTypeId": "85cd2984-74af-46ad-8619-977dd8c0ee10"
                },
                {
                    "id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:20",
                    "name": "page",
                    "dataTypeId": "6c8001b8-6b0f-4918-8831-787dbf079e76"
                }
            ],
            "dataTypes": [
                {
                    "_typeHint": "dt.Simple",
                    "id": "481b9029-ba21-4503-b1bc-45817d671071",
                    "name": "timestamp",
                    "nullable": true
                },
                {
                    "_typeHint": "dt.Simple",
                    "id": "6c8001b8-6b0f-4918-8831-787dbf079e76",
                    "name": "string",
                    "nullable": true
                },
                {
                    "_typeHint": "dt.Simple",
                    "id": "85cd2984-74af-46ad-8619-977dd8c0ee10",
                    "name": "integer",
                    "nullable": true
                },
                {
                    "_typeHint": "dt.Simple",
                    "id": "6f6a7098-dadc-4064-8e9c-197c3a558d80",
                    "name": "boolean",
                    "nullable": true
                },
                {
                    "_typeHint": "dt.Simple",
                    "id": "25b2d491-13e1-4790-8979-8857a4266f29",
                    "name": "integer",
                    "nullable": false
                }
            ]
        },
        "inputs": [
            {
                "sourceType": "csv",
                "source": "file:/opt/spline-spark-agent/examples/data/input/batch/wikidata.csv"
            }
        ],
        "output": {
            "sourceType": "parquet",
            "source": "file:/opt/spline-spark-agent/examples/data/output/batch/codeless_init_job_results"
        }
    },
    "graph": {
        "nodes": [
            {
                "_id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:5",
                "_type": "Read",
                "name": "LogicalRelation",
                "properties": null
            },
            {
                "_id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:4",
                "_type": "Transformation",
                "name": "SubqueryAlias",
                "properties": null
            },
            {
                "_id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:3",
                "_type": "Transformation",
                "name": "Filter",
                "properties": null
            },
            {
                "_id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:2",
                "_type": "Transformation",
                "name": "Filter",
                "properties": null
            },
            {
                "_id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:1",
                "_type": "Transformation",
                "name": "Project",
                "properties": null
            },
            {
                "_id": "0dc6ba88-d557-48a7-9814-30ef8760aa87:0",
                "_type": "Write",
                "name": "InsertIntoHadoopFsRelationCommand",
                "properties": null
            }
        ],
        "edges": [
            {
                "source": "0dc6ba88-d557-48a7-9814-30ef8760aa87:5",
                "target": "0dc6ba88-d557-48a7-9814-30ef8760aa87:4"
            },
            {
                "source": "0dc6ba88-d557-48a7-9814-30ef8760aa87:4",
                "target": "0dc6ba88-d557-48a7-9814-30ef8760aa87:3"
            },
            {
                "source": "0dc6ba88-d557-48a7-9814-30ef8760aa87:3",
                "target": "0dc6ba88-d557-48a7-9814-30ef8760aa87:2"
            },
            {
                "source": "0dc6ba88-d557-48a7-9814-30ef8760aa87:2",
                "target": "0dc6ba88-d557-48a7-9814-30ef8760aa87:1"
            },
            {
                "source": "0dc6ba88-d557-48a7-9814-30ef8760aa87:1",
                "target": "0dc6ba88-d557-48a7-9814-30ef8760aa87:0"
            }
        ]
    }
}

Thanks,

M

Answered by wajda

wajda · 2021-06-18T15:19:36Z

Maintainer

If I understand your question correctly, you are asking how the Spline UI knows which attribute is an input or output of which operation, and which data source it belongs to. This data is taken from another REST request: /consumer/operations/61f93234-149b-4565-af7a-6a6b1bfe5107:14
The JSON that you posted contains the graph structure and the list of all attributes and data types that appear anywhere in the given execution plan. When you click on a node, the UI fetches the operation details JSON, which carries the schema info. Each operation can have multiple schemas: one is the output and the rest are inputs.
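
As a rough illustration of the two steps above, here is a minimal Python sketch that works on a trimmed copy of the JSON from the question. It resolves each attribute to its data type name and builds the operation details path for every graph node, following the /consumer/operations/<node id> pattern mentioned above. The function names `attribute_types` and `operation_paths` are hypothetical helpers, not part of Spline. The shape of the operation details response is not shown in this thread, so the sketch does not try to parse it, and the server base URL is left out because it depends on your deployment.

```python
# Trimmed-down copy of the execution plan response posted in the question.
PLAN_ID = "0dc6ba88-d557-48a7-9814-30ef8760aa87"
plan = {
    "executionPlan": {
        "_id": PLAN_ID,
        "extra": {
            "attributes": [
                {"id": PLAN_ID + ":10", "name": "date",
                 "dataTypeId": "481b9029-ba21-4503-b1bc-45817d671071"},
                {"id": PLAN_ID + ":11", "name": "domain_code",
                 "dataTypeId": "6c8001b8-6b0f-4918-8831-787dbf079e76"},
            ],
            "dataTypes": [
                {"id": "481b9029-ba21-4503-b1bc-45817d671071",
                 "name": "timestamp", "nullable": True},
                {"id": "6c8001b8-6b0f-4918-8831-787dbf079e76",
                 "name": "string", "nullable": True},
            ],
        },
    },
    "graph": {
        "nodes": [
            {"_id": PLAN_ID + ":5", "_type": "Read", "name": "LogicalRelation"},
            {"_id": PLAN_ID + ":0", "_type": "Write",
             "name": "InsertIntoHadoopFsRelationCommand"},
        ],
    },
}


def attribute_types(plan):
    """Map attribute id -> (attribute name, data type name)."""
    extra = plan["executionPlan"]["extra"]
    type_names = {t["id"]: t["name"] for t in extra["dataTypes"]}
    return {
        a["id"]: (a["name"], type_names[a["dataTypeId"]])
        for a in extra["attributes"]
    }


def operation_paths(plan):
    """Map graph node id -> path of its operation details request."""
    return {
        node["_id"]: "/consumer/operations/" + node["_id"]
        for node in plan["graph"]["nodes"]
    }


if __name__ == "__main__":
    for attr_id, (name, type_name) in attribute_types(plan).items():
        print(attr_id, name, type_name)
    for node_id, path in operation_paths(plan).items():
        print(node_id, path)
```

Requesting each of those paths from your Consumer API server should return the per-operation schema information described above, which you can then join back to the attribute lookup by attribute id.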

1 reply

mfrictionless Jun 18, 2021
Author

Thank you. I had not explored the operations output when I first ran through this. It works perfectly.
