[BUG] state.apply fails if pillar uses custom grain from _grains #65027

Open
3 tasks
bdrx312 opened this issue Aug 22, 2023 · 19 comments
Labels
Bug broken, incorrect, or confusing behavior Confirmed Salt engineer has confirmed bug/feature - often including a MCVE

Comments

@bdrx312
Contributor

bdrx312 commented Aug 22, 2023

Description

state.apply fails, and does not sync the custom grain and module scripts from the /srv/salt/_grains and /srv/salt/_modules directories, if a pillar uses a custom grain or custom execution module.

Setup

  • VM (KVM) running on AWS ec2 instance.
  • onedir packaging
  • masterless

Steps to Reproduce the behavior

  • Set up the directories and files: set the minion to masterless, create a custom grain script, and create a pillar that uses the custom grain returned by the script:

    cat > /etc/salt/minion <<'EOF'
    file_client: local
    master_type: disable
    EOF
    
    mkdir /srv/salt /srv/pillar /srv/salt/_grains
    cat > /srv/salt/top.sls <<'EOF'
    base:
      '*':
        - test
    EOF
    
    cat > /srv/salt/test.sls <<'EOF'
    "do nothing":
      test.nop: []
    EOF
    
    cat > /srv/salt/_grains/custom_grain.py <<'EOF'
    def main():
        return {'custom_grain': 'test_value'}
    EOF
    
    cat > /srv/pillar/top.sls <<'EOF'
    base:
      '*':
        - defaults
    EOF
    
    cat > /srv/pillar/defaults.sls <<'EOF'
    mypillar: "{{ grains['custom_grain'] }}"
    EOF
    
  • Run state.apply, which results in an error:

salt-call --local state.apply
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'custom_grain'; line 1

---
mypillar: "{{ grains['custom_grain'] }}"    <======================

---
[CRITICAL] Pillar render error: Rendering SLS 'defaults' failed. Please see master log for details.
local:
    Data failed to compile:
--------
    Pillar failed to render with the following messages:
--------
    Rendering SLS 'defaults' failed. Please see master log for details.

Expected behavior
The custom _grains and _modules should be synced before rendering the pillars and state.apply should run successfully to completion.

Versions Report

salt --versions-report
Salt Version:
    Salt: 3006.2

Python Version:
    Python: 3.10.12 (main, Aug 3 2023, 21:47:10) [GCC 11.2.0]

Dependency Versions:
    cffi: 1.14.6
    cherrypy: 18.6.1
    dateutil: 2.8.2
    Jinja2: 3.1.2
    msgpack: 1.0.2
    packaging: 22.0
    pycparser: 2.21
    pycryptodome: 3.9.8
    python-gnupg: 0.4.8
    PyYAML: 6.0.1
    PyZMQ: 23.2.0
    relenv: 0.13.3
    timelib: 0.2.4
    Tornado: 4.5.3
    ZMQ: 4.3.4

System Versions:
    dist: rhel 8.8 Ootpa
    locale: utf-8
    machine: x86_64
    release: 4.18.0-477.13.1.el8_8.x86_64
    system: Linux
    version: Red Hat Enterprise Linux 8.8 Ootpa

Additional context
Running salt-call --local saltutil.sync_all --pillar-root=/dev/null before running state.apply syncs the _grains and _modules correctly and allows the state.apply to run correctly.
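
For anyone who wants to script the workaround, here is a minimal sketch of the two commands above run in order. The function names and the injectable `runner` parameter are my own illustration, not part of Salt; the command lines are the ones from this report.

```python
import subprocess

# Commands taken from the report: sync custom modules with the pillar root
# pointed at /dev/null, then apply the state.
SYNC_CMD = ["salt-call", "--local", "saltutil.sync_all", "--pillar-root=/dev/null"]
APPLY_CMD = ["salt-call", "--local", "state.apply"]


def workaround_commands():
    """Return the two commands in the order they have to run."""
    return [list(SYNC_CMD), list(APPLY_CMD)]


def run_workaround(runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure.

    `runner` is a hypothetical hook so the sequence can be exercised
    on a machine without Salt installed.
    """
    for cmd in workaround_commands():
        runner(cmd, check=True)
```

Calling `run_workaround()` on the minion should behave like typing the two commands by hand; if the sync step fails, state.apply is never attempted.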

@bdrx312 bdrx312 added Bug broken, incorrect, or confusing behavior needs-triage labels Aug 22, 2023
@bdrx312 bdrx312 changed the title [BUG] [BUG] state.apply fails if pillar uses custom grain Aug 22, 2023
@bdrx312 bdrx312 changed the title [BUG] state.apply fails if pillar uses custom grain [BUG] state.apply fails if pillar uses custom grain from _grains Aug 22, 2023
@dmurphy18
Contributor

@bdrx312 Please retry the issue with the latest 3006.2; a number of fixes have been made since 3006.0 was released.

@dmurphy18 dmurphy18 self-assigned this Aug 22, 2023
@bdrx312
Contributor Author

bdrx312 commented Aug 22, 2023

@bdrx312 Please retry the issue with the latest 3006.2, a number of fixes have been made since 3006.0 was released.

I did not realize that 3006.2 was out already. I updated and tried again but got the same error. I will edit the post to include the updated version info.

@dmurphy18
Contributor

@bdrx312 I tried this with Salt 3005.1 classic packaging and it failed; I wonder if something is missing from the instructions:

local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Specified SLS 'defaults' in environment 'base' is not available on the salt master

I am trying to confirm that it used to work and is now broken.

@dmurphy18
Contributor

dmurphy18 commented Aug 25, 2023

Also from a salt-master

root@Unknown:/srv/pillar# salt td11 pillar.items
td11:
    ----------
    _errors:
        - Specified SLS 'defaults' in environment 'base' is not available on the salt master
root@Unknown:/srv/pillar#

The salt-master

root@Unknown:/srv/pillar# l
total 8.0K
drwxr-xr-x. 4 root root 30 Aug 3 2020 ..
-rw-r--r--. 1 root root 28 Aug 25 16:23 top.sls
-rw-r--r--. 1 root root 41 Aug 25 16:23 default.sls
drwxr-xr-x. 2 root root 88 Aug 25 16:25 arch
drwxr-xr-x. 3 root root 49 Aug 25 16:25 .
root@Unknown:/srv/pillar#
root@Unknown:/srv/pillar# cat top.sls
base:
  '*':
    - defaults
root@Unknown:/srv/pillar# cat default.sls
mypillar: "{{ grains['custom_grain'] }}"
root@Unknown:/srv/pillar#

@dmurphy18 dmurphy18 added cannot-reproduce cannot be replicated with info/context provided and removed needs-triage labels Aug 25, 2023
@bdrx312
Contributor Author

bdrx312 commented Aug 26, 2023

@bdrx312 Tried this with Salt 3005.1 classic packaging and it failed, wondering if there is something missing in the instructions

I added the salt minion configuration /etc/salt/minion to set it to masterless mode. I think all that is needed is the master_type: disable, and I went ahead and added file_client: local. I will have to check at work on Monday to see if any other settings are needed, but I believe that should be all to get it working.

@dmurphy18
Contributor

@bdrx312 Even in masterless mode with file_client: local, I am still unable to reproduce this with Salt 3006.1, and I do not see a SaltRenderError in the logs:

[DEBUG   ] Gathering pillar data for state run
[DEBUG   ] Finished gathering pillar data for state run
[INFO    ] Loading fresh modules for state activity
[DEBUG   ] The functions from module 'jinja' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] The functions from module 'yaml' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] The functions from module 'highstate' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded highstate.output
local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Specified SLS 'defaults' in environment 'base' is not available on the salt master
root@tdeb11:/srv/pillar# 

Can you re-examine the configuration on the minion so that the issue you are experiencing can be reproduced? So far I have been unable to reproduce the rendering error.

@bdrx312
Contributor Author

bdrx312 commented Sep 5, 2023

@bdrx312 Even in masterless mode with file_client: local, I am still unable to reproduce this with Salt 3006.1, and I do not see a SaltRenderError in the logs:

[DEBUG   ] Gathering pillar data for state run
[DEBUG   ] Finished gathering pillar data for state run
[INFO    ] Loading fresh modules for state activity
[DEBUG   ] The functions from module 'jinja' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] The functions from module 'yaml' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] The functions from module 'highstate' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded highstate.output
local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Specified SLS 'defaults' in environment 'base' is not available on the salt master
root@tdeb11:/srv/pillar# 

Can you re-examine the configuration on the minion so that the issue you are experiencing can be reproduced? So far I have been unable to reproduce the rendering error.

I had a typo/mismatch in the file name of the pillar defaults file. In the pillar top file I specified defaults, but the file-creation step wrote /srv/pillar/default.sls (no s). I have corrected the original post to add the s, making it /srv/pillar/defaults.sls.

@dmurphy18
Contributor

dmurphy18 commented Sep 5, 2023

So, after renaming default.sls to defaults.sls and re-running salt-call --local state.apply,
it appears to work for me with Salt 3006.1:

[DEBUG   ] File /var/cache/salt/minion/accumulator/139644020863808 does not exist, no need to cleanup
[DEBUG   ] The functions from module 'state' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded state.check_result
[DEBUG   ] The functions from module 'highstate' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded highstate.output
local:
----------
          ID: do nothing
    Function: test.nop
      Result: True
     Comment: Success!
     Started: 12:52:45.047935
    Duration: 1.222 ms
     Changes:   

Summary for local
------------
Succeeded: 1
Failed:    0
------------
Total states run:     1
Total run time:   1.222 ms
root@tdeb11:/srv/pillar# l
total 16K
drwxr-xr-x 2 root root 4.0K Sep  5 12:52 .
drwxr-xr-x 4 root root 4.0K Aug 25 16:14 ..
-rw-r--r-- 1 root root   41 Aug 25 16:14 defaults.sls
-rw-r--r-- 1 root root   28 Aug 25 16:14 top.sls
root@tdeb11:/srv/pillar#

I updated to Salt 3006.2 and got the same result: it works for me with Salt 3006.2 too.
From salt-call --local grains.items:

    cpuarch:
        x86_64
    custom_grain:
        test_value

@bdrx312
Contributor Author

bdrx312 commented Sep 6, 2023

I just tried this on a fresh bento/ubuntu-22.04 Vagrant VM as well and saw the same behavior. Try manually clearing your cache with salt-call --local saltutil.clear_cache and then re-run state.apply:

root@salt-test-box:~# salt-call --local state.apply
[ERROR   ] Rendering exception occurred
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 502, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1291, in render
    self.environment.handle_exception()
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'custom_grain'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 261, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 509, in render_jinja_tmpl
    raise SaltRenderError("Jinja variable {}{}".format(exc, out), buf=tmplstr)
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'custom_grain'
[CRITICAL] Rendering SLS 'defaults' failed, render error:
Jinja variable 'dict object' has no attribute 'custom_grain'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 502, in render_jinja_tmpl
    output = template.render(**decoded_context)
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1291, in render
    self.environment.handle_exception()
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'custom_grain'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/salt/pillar/__init__.py", line 887, in render_pstate
    state = compile_template(
  File "/usr/lib/python3/dist-packages/salt/template.py", line 99, in compile_template
    ret = render(input_data, saltenv, sls, **render_kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 149, in __call__
    return self.loader.run(run_func, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 1201, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/loader/lazy.py", line 1216, in _run_as
    return _func_or_method(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/salt/renderers/jinja.py", line 62, in render
    tmp_data = salt.utils.templates.JINJA(
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 261, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python3/dist-packages/salt/utils/templates.py", line 509, in render_jinja_tmpl
    raise SaltRenderError("Jinja variable {}{}".format(exc, out), buf=tmplstr)
salt.exceptions.SaltRenderError: Jinja variable 'dict object' has no attribute 'custom_grain'
[CRITICAL] Pillar render error: Rendering SLS 'defaults' failed. Please see master log for details.
local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS 'defaults' failed. Please see master log for details.

@dmurphy18
Contributor

@bdrx312 Will try that, but I was using a VirtualBox Debian 11 amd64, from cold start.

And got

local:
    Data failed to compile:
----------
    Pillar failed to render with the following messages:
----------
    Rendering SLS 'defaults' failed. Please see master log for details.
root@tdeb11:/srv/salt#

And the SaltRenderError in the logs.
Thanks for the clear_cache
Will dig in

@dmurphy18 dmurphy18 added Confirmed Salt engineer has confirmed bug/feature - often including a MCVE and removed cannot-reproduce cannot be replicated with info/context provided labels Sep 6, 2023
@dmurphy18 dmurphy18 added this to the Sulfur v3006.4 milestone Sep 6, 2023
@dmurphy18
Contributor

dmurphy18 commented Sep 7, 2023

Running saltutil.clear_cache removes the custom grain; however, saltutil.sync_grains brings it back and everything then works as it should.

However, the documentation at https://docs.saltproject.io/en/latest/topics/grains/index.html#syncing-grains states that state.highstate should automatically sync the grains. That is not happening, and I get the same failure as with state.apply. Also, the custom grain is not rebuilt when the minion is restarted (e.g. systemctl restart salt-minion); it is missing when running grains.ls.

This problem only affects _grains; custom grains in /etc/salt/grains appear fine.

@dmurphy18
Contributor

dmurphy18 commented Sep 7, 2023

This appears to be a corner case with a masterless minion: with salt://_grains on a master the problem does not happen, except after the following commands:

  • salt tc7 saltutil.clear_cache
  • systemctl restart salt-minion
  • salt tc7 grains.ls (custom_grains is not listed)
  • salt tc7 saltutil.refresh_grains (doesn't sync _grains as per doc)
  • salt tc7 grains.ls (custom_grains is not listed)
  • salt tc7 saltutil.sync_grains
  • salt tc7 grains.ls (custom_grains is not listed)
  • custom_grains still not listed after this
  • salt tc7 saltutil.refresh_grains
  • salt tc7 grains.ls
  • salt tc7 saltutil.sync_all
  • salt tc7 grains.ls

Even restarting the salt-master does not bring custom_grains back, so there is a hole with a salt-master too.

It appears I misread the doc, and a minion restart doesn't sync _grains.
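
To make the "is the grain listed?" checks in the sequence above less error-prone, a small helper along these lines could be used. This is only a sketch: it assumes the command is run as `salt tc7 grains.ls --out=json` and that the output for a single minion is one JSON object keyed by minion ID. The helper name and the sample data are made up for illustration.

```python
import json


def grain_is_listed(grains_ls_json, minion_id, grain):
    """Return True if `grain` appears in the JSON output of grains.ls."""
    data = json.loads(grains_ls_json)
    return grain in data.get(minion_id, [])


# Illustrative sample only, not captured from a real run.
sample = '{"tc7": ["cpuarch", "os"]}'
print(grain_is_listed(sample, "tc7", "custom_grain"))  # False
```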

@dmurphy18
Contributor

dmurphy18 commented Sep 12, 2023

Well, the issue appears to be that the rendering error is hit while loading the grains and pillar, before we get to execute call_highstate (which would run sync_all); we error out due to pillar errors just before we call highstate. See lines https://github.com/saltstack/salt/blob/master/salt/modules/state.py#L1173-L1192

The problem is with class Minion and SProxyMinion too: after these classes have loaded the grains (which does not load the custom grains from _grains), they immediately compile the pillar, which has a file that uses the custom grain from _grains, and then hit the render error. The SProxyMinion method gen_modules even does a sync_all, but it comes too late, after the call to compile_pillar that produces the render error.

So we have a chicken-and-egg issue.
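
The ordering problem described above can be illustrated with a toy model. To be clear, this is not Salt's code: every name here is hypothetical and only stands in for the real steps (load grains, compile pillar, sync custom modules).

```python
# Toy model only: hypothetical names, not Salt's internals.
CORE_GRAINS = {"os": "Debian"}  # illustrative value
CUSTOM_GRAINS_ON_DISK = {"custom_grain": "test_value"}  # stands in for /srv/salt/_grains


def load_grains(synced):
    """Return core grains, plus the custom grains only if they were synced."""
    grains = dict(CORE_GRAINS)
    if synced:
        grains.update(CUSTOM_GRAINS_ON_DISK)
    return grains


def compile_pillar(grains):
    """Stand-in for rendering defaults.sls; fails if the grain is missing."""
    return {"mypillar": grains["custom_grain"]}


def current_order():
    """Grains, then pillar, then sync: the render fails before the sync runs."""
    grains = load_grains(synced=False)
    return compile_pillar(grains)  # raises KeyError


def expected_order():
    """Sync, then grains, then pillar: the grain exists when the pillar renders."""
    grains = load_grains(synced=True)
    return compile_pillar(grains)
```

In this model `current_order()` raises `KeyError`, which plays the role of the SaltRenderError, while `expected_order()` returns the rendered pillar.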

@dmurphy18
Contributor

dmurphy18 commented Sep 26, 2023

I found a chicken-and-egg problem in this change by @Ch3LL:
https://github.com/saltstack/salt/pull/65186/files/f1de5431aac91d64d321c3ef31a5e970c9fd3ffe

In the __init__ of class SMinion, opts["master_uri"] is not filled in until after the call to ioloop.run_sync; see https://github.com/saltstack/salt/blob/master/salt/minion.py#L928-L932

But salt.utils.extmods.sync(opts, "grains") leads AsyncReqChannel, called from SyncWrapper, to try to use opts["master_uri"] for the remote client.
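
A rough sketch of why the sync cannot simply be moved earlier, again as a toy model with hypothetical names and an illustrative master_uri value, not the code in salt/minion.py:

```python
# Toy model only: hypothetical names, not Salt's internals.

def sync_custom_grains(opts):
    """Stand-in for a sync that contacts the master and so needs master_uri."""
    return "synced from " + opts["master_uri"]


def resolve_master(opts):
    """Stand-in for the step that fills in master_uri."""
    opts["master_uri"] = "tcp://127.0.0.1:4506"  # illustrative value


def init_sync_too_early():
    """Sync before the master is resolved: master_uri is not set yet."""
    opts = {}
    result = sync_custom_grains(opts)  # raises KeyError
    resolve_master(opts)
    return result


def init_sync_after_resolve():
    """Resolve the master first, then sync."""
    opts = {}
    resolve_master(opts)
    return sync_custom_grains(opts)
```

The point is only that the sync has a dependency of its own, so "sync first" also needs the master to be resolved first.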

@dmurphy18
Contributor

Closing since associated PR is merged

@dmurphy18
Contributor

Re-opening this and reverting the changes in #65186, since it is an incomplete fix and, although merged and released in Salt 3006.5, it is causing problems.
See Issue #65692 and PR #65738

@dmurphy18 dmurphy18 reopened this Dec 20, 2023
@dmurphy18 dmurphy18 removed this from the Sulfur v3006.4 milestone Dec 20, 2023
@ixs
Contributor

ixs commented Jul 1, 2024

Is it possible this is present in the current 3007.1 as well?
I just upgraded to 3007.1 and can reproduce this, while downgrading to e.g. 3006.1 works fine.

@dmurphy18
Contributor

The problem will exist on 3007.1 and 3006.8. I am working on it, but higher-priority issues are taking precedence. I have the changes done, but need to write a lot more tests so that I don't hit the problems I did with the first attempt at fixing this, which resulted in having to revert the change.

@dmurphy18
Contributor

PR #66737 is a rework of the work done in #65792, which was accidentally closed when the branch the work was on was accidentally closed.

@dwoz dwoz modified the milestones: Sulfur v3006.9, Sulfur v3006.10 Jul 29, 2024