Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbt docs generate fails when a column name is " " #2564

Closed
1 of 5 tasks
clrcrl opened this issue Jun 17, 2020 · 13 comments
Closed
1 of 5 tasks

dbt docs generate fails when a column name is " " #2564

clrcrl opened this issue Jun 17, 2020 · 13 comments
Labels
bug Something isn't working

Comments

@clrcrl
Copy link
Contributor

clrcrl commented Jun 17, 2020

Describe the bug

A clear and concise description of what the bug is. What command did you run? What happened?

Steps to reproduce

  1. Create a model with a column name " " (I know)
$ cat models/customers.sql
select 1 as " "
  1. Run dbt docs generate

(Note this also applies to source tables with this column name, but easiest to reproduce by creating a model)

Expected behavior

An error, but with a good message

Actual behaviour

Encountered an error:
None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None

Screenshots and log output

2020-06-17 19:22:49.606297 (MainThread): Encountered an error:
2020-06-17 19:22:49.607069 (MainThread): None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None
2020-06-17 19:22:49.611463 (MainThread): jsonschema.exceptions.ValidationError: None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 81, in main
    results, succeeded = handle_and_check(args)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 159, in handle_and_check
    task, res = run_from_args(parsed)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 212, in run_from_args
    results = task.run()
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 249, in run
    catalog = Catalog(catalog_data)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 62, in __init__
    self.add_column(col)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 97, in add_column
    column = ColumnMetadata.from_dict(column_data)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/hologram/__init__.py", line 594, in from_dict
    cls.validate(data)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/hologram/__init__.py", line 937, in validate
    raise ValidationError.create_from(error) from error
hologram.ValidationError: None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

dbt --version
installed version: 0.18.0-b1
   latest version: 0.17.0

Your version of dbt is ahead of the latest release!

Plugins:
  - bigquery: 0.18.0b1
  - snowflake: 0.18.0b1
  - redshift: 0.18.0b1
  - postgres: 0.18.0b1

Additional context

  • Might also happen if you have a column named 'null'.
  • Here is the line of code

Workaround

Rename the column!

This often crops up if an existing relation (rather than a dbt-generated relation) has a strange column name. To find the column, use the information schema. For example, on Snowflake this would look like:

select * from <YOUR_DATABASE>.information_schema.columns
where (
    trim(column_name) = ''
    or column_name is null
    or lower(column_name) = 'null'
    or lower(column_name) = 'none'
) 

Make sure you check all databases that your dbt project references!

@clrcrl clrcrl added bug Something isn't working triage labels Jun 17, 2020
@drewbanin drewbanin removed the triage label Jun 17, 2020
@drewbanin drewbanin added this to the Marian Anderson milestone Jun 17, 2020
@jtcohen6 jtcohen6 removed this from the Marian Anderson milestone Jul 6, 2020
@ramgopalan
Copy link

Is there any update on this issue?

@clrcrl
Copy link
Contributor Author

clrcrl commented Sep 2, 2020

No update

Fortunately there's a good workaround — you can rename the column (do you really want a column named ' ' in your database? 👀 )

@ramgopalan
Copy link

Aah okay i don't have a column named ' ' but I give null to a column as I need it for validations at later point. eg: null as 'abc' .. this is resulting in same error

@clrcrl
Copy link
Contributor Author

clrcrl commented Sep 2, 2020

I feel that any column named null is going to be a bad idea 😬

@ramgopalan
Copy link

ramgopalan commented Sep 2, 2020

Yeah true that, we just want to use it so that the union doesn't break. Hence populating that column with null values. The column name given in the syntax is not NULL. but in the statement we are using it as NULL as 'abc' (select null as 'abc') - this is also resulting in this error.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Sep 2, 2020

Just to clarify, you're defining a model like:

select null as 'abc'
abc
null

I'm surprised to hear that you're seeing this error, since abc is a reasonable (not-null) column name...

@ramgopalan
Copy link

@jtcohen6 Yes, i use ephimeral models the column names are not null or ' '. They are all unique, but still I get the same error as I am populating it with null

@jtcohen6
Copy link
Contributor

jtcohen6 commented Sep 2, 2020

You're getting this error?

Encountered an error:
None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None

Would you mind providing a few more details so that I could replicate it?

@ramgopalan
Copy link

ramgopalan commented Sep 3, 2020

Hi @jtcohen6 , I tried to replicate it in my colleague's system. It worked well, seems there is an issue with my system. I tried uninstalling dbt, it's dependency packages then installing it from scratch. I also uninstalled python and checked but still it fails. Do you recommend any options other than what I tried? PFB the error and error log:

image

Log File:

image

I initialized a new dbt project and ran dbt docs generate on top of it, which still resulted in the above errors.

@clrcrl
Copy link
Contributor Author

clrcrl commented Sep 3, 2020

As the original bug report states, this can also happen when an existing table in your data warehouse has a poorly-named column! (The above instructions were to replicate the bug in the easiest way possible)

I'd suggest using the information schema to find any columns with a name that would get coerced to a null value.

If you're using snowflake, this might look like:

select * from <YOUR_DATABASE>.information_schema.columns
where (
    trim(column_name) = ''
    or column_name is null
    or lower(column_name) = 'null'
    or lower(column_name) = 'none'
) 

Make sure you check it for all Snowflake databases that you use.

@ramgopalan
Copy link

ramgopalan commented Sep 3, 2020

Thanks Claire, You are a Savior! There was some irrelevant columns with null already existing which was leading to this error. @clrcrl I spent literally hours but still had no clue, thanks once again for you time and inputs :)

@clrcrl
Copy link
Contributor Author

clrcrl commented Sep 3, 2020

Ha, yes the first time we saw this bug it took a while to figure out what was going on!!!

@clrcrl
Copy link
Contributor Author

clrcrl commented Jul 8, 2021

Resolved by #3499 (tested with the develop version of dbt)

Screen Shot 2021-07-08 at 6 12 05 PM

@clrcrl clrcrl closed this as completed Jul 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants