-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse docs blocks when they appear in .sql files #979
Comments
We could probably implement this by adding an option for docs blocks file extensions. Right now, dbt looks for them in What happens if there are multiple docs blocks in a model file? This could happen if you wanted to document the model itself, and some of the resulting columns. Should dbt automatically infer that eg. the first docs block is the documentation for the model? Or, should we add some other specifier to the I'm into this idea, and I think the next step is the flesh out the syntax that will ultimately be used here. Once we have a good grasp on the syntax and semantics, I think the implementation shouldn't be too tricky. |
One possible convention:
Documenting models would be super-simple, and column docs could live right next to their column definitions. E.g., my_model.sql: {% docs my_model %}
This model is sooo slick. Learn more [here](example.com)
{% enddocs %}
SELECT
{% docs my_model.foo %}
The number of `Foo`s we should create.
{% enddocs %}
1 AS foo,
2 AS bar I don't think I'd mind having to explicitly say the model name or the column names. Doing so still beats defining this stuff in .md or .yml files IMHO. |
The simplest solution might be to only parse a single docs block out of the .sql file and assume it to be the model documentation. Not the cleanest or most intuitive, but it would work. Or it could be irrelevant entirely; since under the current implementation the docs block needs to be referenced in the schema.yml file, that behavior could be retained requiring the user to make that link once for each model, but thereafter only need to change the docs block in the model to update the documentation. this_model.sql {% docs something %}
...
{% enddocs %}
select
... schema.yml version: 2
models:
- name: this_model
description: {{ doc('something') }}
... Lastly, maybe it's possible to have an unnamed docs block, only valid in a model file, which is always model documentation? {% docs %}
...
{% enddocs %}
select
... |
I like the anonymous, single-block idea, and you're likely right that the implementation would be easier if we stick to single-blocks. I'm not a big fan of having to go to the On the other hand, it's a one-time cost for each model and would be better than what we have now. But I also really like the idea of being able to document models and columns inline, without any other configuration. People might actually document stuff if we made it that easy. |
Since the .sql files have macros that aren't available to the docs parser (e.g., `ref`), we can instead get the AST for the docs macros and directly read out the contents without rendering them. This assumes that when people write `docs` blocks, they don't contain to any other macros or other Jinja tomfoolery. It's hard for me to tell if that's a reasonable assumption or not. Relevant to dbt-labs#979
Turns out I'm basically asking for the same thing here #1051 but for custom tests! |
Yes please. 😁 |
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days. |
We are also using dbt more and more and the project says a lot to me in many aspects. The documentation is very clear and the API is very straight forward. However, one point that also bothers me is the artificial separation of model definition and sql. I likewise think that the schema yml are not really necessary. Background:Currently we are thinking about splitting our schema yml into several yml's - one per model. A single yml file quickly becomes confusing. Our folder structure would then look like this.
This would make the structure clearer with the current tool, but it would require a lot of typing. The model name as well as the schema yml structure are not necessary with an approach as it is described here. Related topicsIn the forum I came across a scipt that can add similar functionality: https://discourse.getdbt.com/t/here-is-a-way-to-write-dbt-docs-as-sql-comment/1658 The approach described here is quite nice, but preprocessing outside the standard is not an option for us. First MR's that would allow a development in this direction natively are unfortunately not continued: #1042 (comment) Also the described approach to integrate sql files via a config was unfortunately not pursued further. I think the activities as well as the many likes of the issue deserve some attention :-) ProposalHere my 3 ct to the topic Documentation in the sql file. The first anonymous docs block found describes the documentation of the model associated with the file: Content of model_a.sql:
The parser could automatically enrich the anonymous docs block with the name of the file it was found in respectively the model name - in this case "docs model_a".
Inline documentation of columns could be implemented using a leading "." :
Again, the parser could extrapolize the model name: The schema. yml handling in dbt docs could then default to " This approach would be, from what I can see so far, backwards compatible and implementable with manageable effort. What this approach does not yet cover: how to deal with tests. if a readable format can also be found to include them directly in sql, it would be possible to describe all the information belonging to the model directly in the model. |
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days. |
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers. |
Just wanted to get a read here from maintainers if possible - is this something that has been discussed and nixed, or is it of interest? We are now in the position of documenting both the The documentation within the model file itself is preferable when reviewing PRs changing a model, updating an existing model, and/or updating documentation about an existing model. |
We're also dealing with this issue. Mostly writing docs in separate .md files but it creates file clutter and having the documentation separated from the SQL makes it harder to make changes as the goal and intentions of the model are detached from the code. Having documentation with the code would be best as everything the model is trying to achieve is all in the same context |
This is definitely an issue. As a developer you really want the same information in both places without the unnecessary duplication and maintenance. Being able to add the doc block before the SQL makes so much more sense. |
Agree with the recent comments; this problem is more acute than ever as our projects & docs grow in size & complexity. Multi-project instances, Python support, Semantic layer, metrics - it's not easy for a developer to keep track of, especially the why of certain things. SQL-embedded doc blocks feel like a natural solution that would fit very well within the modern dbt paradigm. |
Totally agree with recent comments. |
Feature
Feature description
It's my belief that documentation being in a separate file from the code it's documenting will result in the documentation falling behind model development in all but the most disciplined organizations. Therefore, I believe that
{% docs %}
blocks should be parsed from .sql files in addition to .md files.Granted, syntax highlighting is unlikely to be happy about the presence of markdown in a .sql file, but this is something we already deal with, having Jinja in .sql files. In the same way that we can nest a SQL comment within a Jinja comment, i.e.
{% /* comment fo' reals, yo! */ #}
so that both SQL and Jinja syntax highlighters will ignore the comment, users could conceivably wrap a docs block in a SQL block comment to the same effect, e.g.:Who will this benefit?
This would benefit people who wish to include documentation in the model itself, and would not negatively impact people who choose to document in the .yml file, or people who choose not to document at all (the heathens!)
The text was updated successfully, but these errors were encountered: