varun-dc/databricks-nodejs-duplicate-column-select-bug-reproduction

This repo demonstrates a bug where aliasing two columns to the same name causes the query result to lose data: only one of the two columns is returned.

For example, this query returns the expected data:

SELECT carat as a, color as b FROM default.diamonds LIMIT 2;

-- Result
┌─────────┬────────┬─────┐
│ (index) │   a    │  b  │
├─────────┼────────┼─────┤
│    0    │ '0.23' │ 'E' │
│    1    │ '0.21' │ 'E' │
└─────────┴────────┴─────┘

Whereas this query, which aliases both columns as a, returns a single column and the carat values are missing:

SELECT carat as a, color as a FROM default.diamonds LIMIT 2;

-- Result
┌─────────┬─────┐
│ (index) │  a  │
├─────────┼─────┤
│    0    │ 'E' │
│    1    │ 'E' │
└─────────┴─────┘
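
One plausible explanation for the missing column, offered as an assumption rather than a confirmed diagnosis of the driver, is that each row is turned into a plain JavaScript object keyed by column name. A repeated name then overwrites the earlier value. The sketch below reproduces that effect with hard-coded rows; `rowsToObjects` is a hypothetical helper, not part of any Databricks API.

```javascript
// Hypothetical helper: builds one plain object per row, keyed by column name.
// This is an assumption about how rows might be assembled, not the driver's actual code.
function rowsToObjects(columnNames, rows) {
  return rows.map((values) => {
    const record = {};
    columnNames.forEach((name, i) => {
      // A repeated column name overwrites the value stored earlier under that key.
      record[name] = values[i];
    });
    return record;
  });
}

// The two rows shown in the results above, as [carat, color] pairs.
const rows = [
  ['0.23', 'E'],
  ['0.21', 'E'],
];

const distinctAliases = rowsToObjects(['a', 'b'], rows);
const duplicateAliases = rowsToObjects(['a', 'a'], rows);

console.table(distinctAliases);
console.table(duplicateAliases);
```

With distinct aliases each object keeps both values. With the duplicated alias only the last value written under `a` (the color) survives, which matches the second result table.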

To run this example:

# Install deps
$ npm install

# Export the necessary configuration/authentication values
# (replace each <...> placeholder with your own value)
$ export DATABRICKS_TOKEN=<your Databricks account access token>
$ export DATABRICKS_SERVER_HOSTNAME=<your Databricks server host name>
$ export DATABRICKS_HTTP_PATH=<your Databricks cluster's http path>

# Run the example showing the bug
$ node main.js
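
If the script fails to connect, it may help to confirm that the three variables are visible to Node. The following is a minimal sketch, assuming `main.js` reads its configuration from `process.env`. `REQUIRED_VARS` and `missingEnvVars` are hypothetical names used only for illustration.

```javascript
// Names of the variables listed in the run instructions above.
const REQUIRED_VARS = [
  'DATABRICKS_TOKEN',
  'DATABRICKS_SERVER_HOSTNAME',
  'DATABRICKS_HTTP_PATH',
];

// Hypothetical helper: returns the required variable names that are unset or empty.
function missingEnvVars(env = process.env) {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

const missing = missingEnvVars();
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(', ')}`);
}
```

A variable assigned in the shell without `export` is not passed to child processes, so it would be reported as missing here.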
