Parse docs blocks when they appear in .sql files #979

ghost · 2018-09-06T22:42:36Z

Feature

Feature description

It's my belief that documentation being in a separate file from the code it's documenting will result in the documentation falling behind model development in all but the most disciplined organizations. Therefore, I believe that {% docs %} blocks should be parsed from .sql files in addition to .md files.

Granted, syntax highlighting is unlikely to be happy about the presence of markdown in a .sql file, but this is something we already deal with, having Jinja in .sql files. In the same way that we can nest a SQL comment within a Jinja comment, i.e. {% /* comment fo' reals, yo! */ #} so that both SQL and Jinja syntax highlighters will ignore the comment, users could conceivably wrap a docs block in a SQL block comment to the same effect, e.g.:

/* {% docs foo %}
## Bar!
{% enddocs %} */
select ...

Who will this benefit?

This would benefit people who wish to include documentation in the model itself, and would not negatively impact people who choose to document in the .yml file, or people who choose not to document at all (the heathens!)

The text was updated successfully, but these errors were encountered:

drewbanin · 2018-10-01T23:39:16Z

We could probably implement this by adding an option for docs blocks file extensions. Right now, dbt looks for them in .md files, but we could add a config option to also look in .sql files (or really, extensions of your choosing).

What happens if there are multiple docs blocks in a model file? This could happen if you wanted to document the model itself, and some of the resulting columns. Should dbt automatically infer that eg. the first docs block is the documentation for the model? Or, should we add some other specifier to the {% docs %} signature that indicates that it's for the current model/a specific column/etc?

I'm into this idea, and I think the next step is the flesh out the syntax that will ultimately be used here. Once we have a good grasp on the syntax and semantics, I think the implementation shouldn't be too tricky.

jakebiesinger · 2018-10-01T23:53:00Z

One possible convention:

All docs blocks in .sql files are required to have the model name as a prefix.
We document the model itself by looking for a block w/a name that matches the model name.
We document columns by looking for blocks w/name model_name.column_name.
No restrictions are placed on the positioning of any doc blocks inside the .sql file

Documenting models would be super-simple, and column docs could live right next to their column definitions. E.g., my_model.sql:

{% docs my_model %}
This model is sooo slick. Learn more [here](example.com)
{% enddocs %}

SELECT
  {% docs my_model.foo %}
    The number of `Foo`s we should create.
  {% enddocs %} 
  1 AS foo,
  2 AS bar

I don't think I'd mind having to explicitly say the model name or the column names. Doing so still beats defining this stuff in .md or .yml files IMHO.

ghost · 2018-10-02T06:18:04Z

The simplest solution might be to only parse a single docs block out of the .sql file and assume it to be the model documentation. Not the cleanest or most intuitive, but it would work.

Or it could be irrelevant entirely; since under the current implementation the docs block needs to be referenced in the schema.yml file, that behavior could be retained requiring the user to make that link once for each model, but thereafter only need to change the docs block in the model to update the documentation.

this_model.sql

{% docs something %}
...
{% enddocs %}

select
...

schema.yml

version: 2

models:
  - name: this_model
    description: {{ doc('something') }}
    ...

Lastly, maybe it's possible to have an unnamed docs block, only valid in a model file, which is always model documentation?

{% docs %}
...
{% enddocs %}

select
...

jakebiesinger · 2018-10-02T17:51:06Z

I like the anonymous, single-block idea, and you're likely right that the implementation would be easier if we stick to single-blocks.

I'm not a big fan of having to go to the schema.yml to explicitly tie the docs to the model. IMHO, the user already told us that it's a docstring for the model by putting it in the model's .sql file!

On the other hand, it's a one-time cost for each model and would be better than what we have now.

But I also really like the idea of being able to document models and columns inline, without any other configuration. People might actually document stuff if we made it that easy.

Since the .sql files have macros that aren't available to the docs parser (e.g., `ref`), we can instead get the AST for the docs macros and directly read out the contents without rendering them. This assumes that when people write `docs` blocks, they don't contain to any other macros or other Jinja tomfoolery. It's hard for me to tell if that's a reasonable assumption or not. Relevant to dbt-labs#979

tayloramurphy · 2018-10-12T15:39:08Z

Turns out I'm basically asking for the same thing here #1051 but for custom tests!

MartinGuindon · 2019-03-21T00:44:39Z

Yes please. 😁

github-actions · 2022-01-07T01:55:24Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

martinspaeth · 2022-01-07T22:57:59Z

We are also using dbt more and more and the project says a lot to me in many aspects. The documentation is very clear and the API is very straight forward. However, one point that also bothers me is the artificial separation of model definition and sql. I likewise think that the schema yml are not really necessary.

Background:

Currently we are thinking about splitting our schema yml into several yml's - one per model. A single yml file quickly becomes confusing.

Our folder structure would then look like this.


models
 - model_a.yml 
 - model_a.sql

 - model_b.yml 
 - model_b.sql

This would make the structure clearer with the current tool, but it would require a lot of typing. The model name as well as the schema yml structure are not necessary with an approach as it is described here.

Proposal

Here my 3 ct to the topic Documentation in the sql file.

The first anonymous docs block found describes the documentation of the model associated with the file:

Content of model_a.sql:

{% docs %}
This model is sooo slick. Learn more [here](example.com)
{% enddocs %}

SELECT
  {% docs .foo %}
    The number of `Foo`s we should create.
  {% enddocs %} 
  1 AS foo,
  2 AS bar

The parser could automatically enrich the anonymous docs block with the name of the file it was found in respectively the model name - in this case "docs model_a".

{% docs %}
This model is sooo slick. Learn more [here](example.com)
{% enddocs %}

Inline documentation of columns could be implemented using a leading "." :

  {% docs .foo %}
    The number of `Foo`s we should create.
  {% enddocs %}

Again, the parser could extrapolize the model name: {% docs model_a.foo %}

The schema. yml handling in dbt docs could then default to "description: {{ doc("model_name") }}" if no model description was explicitly specified. Analogous for columns
then description: {{ doc("model_name.columnName") }} would be possible.

This approach would be, from what I can see so far, backwards compatible and implementable with manageable effort.

What this approach does not yet cover: how to deal with tests. if a readable format can also be found to include them directly in sql, it would be possible to describe all the information belonging to the model directly in the model.

github-actions · 2022-07-07T02:14:47Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

github-actions · 2022-07-14T02:15:34Z

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

alexscott-ff · 2023-11-20T22:50:51Z

Just wanted to get a read here from maintainers if possible - is this something that has been discussed and nixed, or is it of interest? We are now in the position of documenting both the .sql files AND putting documentation in schema.yml and it leads to documentation getting out of sync.

The documentation within the model file itself is preferable when reviewing PRs changing a model, updating an existing model, and/or updating documentation about an existing model. schema.yml often gets forgotten about so our documentation tends to get stale.

visserp · 2023-11-24T09:49:35Z

We're also dealing with this issue. Mostly writing docs in separate .md files but it creates file clutter and having the documentation separated from the SQL makes it harder to make changes as the goal and intentions of the model are detached from the code.

Having documentation with the code would be best as everything the model is trying to achieve is all in the same context

Phil-T1 · 2023-11-29T11:24:43Z

This is definitely an issue.

As a developer you really want the same information in both places without the unnecessary duplication and maintenance.

Being able to add the doc block before the SQL makes so much more sense.

ktuft-cbh · 2023-11-29T14:13:31Z

Agree with the recent comments; this problem is more acute than ever as our projects & docs grow in size & complexity. Multi-project instances, Python support, Semantic layer, metrics - it's not easy for a developer to keep track of, especially the why of certain things. SQL-embedded doc blocks feel like a natural solution that would fit very well within the modern dbt paradigm.

nicodp-leap · 2023-12-04T14:58:58Z

Totally agree with recent comments.
Having the info/documentation in the same file where the model is defined is something needed to avoid unnecessary duplication.

drewbanin modified the milestone: 0.11.1 - Lucretia Mott Sep 7, 2018

This was referenced Oct 3, 2018

Updates DocumentParser to look for docs in .sql files #1042

Closed

Attaches matching docs to model names and columns. #1043

Closed

drewbanin mentioned this issue Oct 3, 2018

Add ability to document macros #1041

Closed

drewbanin added dbt-docs [dbt feature] documentation site, powered by metadata artifacts enhancement New feature or request labels Oct 13, 2018

github-actions bot added the stale Issues that have gone stale label Jan 7, 2022

github-actions bot removed the stale Issues that have gone stale label Jan 8, 2022

github-actions bot added the stale Issues that have gone stale label Jul 7, 2022

github-actions bot closed this as completed Jul 14, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parse docs blocks when they appear in .sql files #979

Parse docs blocks when they appear in .sql files #979

ghost commented Sep 6, 2018 •

edited by ghost

Loading

drewbanin commented Oct 1, 2018

jakebiesinger commented Oct 1, 2018

ghost commented Oct 2, 2018

jakebiesinger commented Oct 2, 2018 •

edited

Loading

tayloramurphy commented Oct 12, 2018

MartinGuindon commented Mar 21, 2019

github-actions bot commented Jan 7, 2022

martinspaeth commented Jan 7, 2022 •

edited

Loading

github-actions bot commented Jul 7, 2022

github-actions bot commented Jul 14, 2022

alexscott-ff commented Nov 20, 2023

visserp commented Nov 24, 2023

Phil-T1 commented Nov 29, 2023

ktuft-cbh commented Nov 29, 2023

nicodp-leap commented Dec 4, 2023

Parse docs blocks when they appear in .sql files #979

Parse docs blocks when they appear in .sql files #979

Comments

ghost commented Sep 6, 2018 • edited by ghost Loading

Feature

Feature description

Who will this benefit?

drewbanin commented Oct 1, 2018

jakebiesinger commented Oct 1, 2018

ghost commented Oct 2, 2018

jakebiesinger commented Oct 2, 2018 • edited Loading

tayloramurphy commented Oct 12, 2018

MartinGuindon commented Mar 21, 2019

github-actions bot commented Jan 7, 2022

martinspaeth commented Jan 7, 2022 • edited Loading

Background:

Related topics

Proposal

github-actions bot commented Jul 7, 2022

github-actions bot commented Jul 14, 2022

alexscott-ff commented Nov 20, 2023

visserp commented Nov 24, 2023

Phil-T1 commented Nov 29, 2023

ktuft-cbh commented Nov 29, 2023

nicodp-leap commented Dec 4, 2023

ghost commented Sep 6, 2018 •

edited by ghost

Loading

jakebiesinger commented Oct 2, 2018 •

edited

Loading

martinspaeth commented Jan 7, 2022 •

edited

Loading