Eval clears all variables #984

wanaryytel · 2016-10-21T20:30:27Z

Hey,
I have a small TICKscript for monitoring when the server runs out of free disk space. Looks like this:

// Alerts when more than 3 GB + 10% of the server total disk space are used up.

var gb = 1000 * 1000 * 1000

stream
    |from()
        .measurement('disk')
// Some unexplainable magic happens with this eval. It would be lovely to use but as long as
// it breaks everything, let it be commented out.
//  |eval(lambda: "free" / gb)
//          .as('free_gb')
//      .keep('free', 'total', 'used', 'used_percent')
    |groupBy('host', 'path')
    |window()
        .period(30s)
        .every(30s)
    |alert()
        .id('Free disk space @ {{ index .Tags "host" }}')
        .message('{{ .ID }} is {{ .Level }}! {{ index .Fields "used_percent" | printf "%0.2f" }}% is used up.')

        .crit(lambda: gb / 2 + "total" / 100 * 10 > "free")
        .warn(lambda: 3 * gb + "total" / 100 * 10 > "free")
        .stateChangesOnly(12h)

        .email()
        .slack()

        .log('/var/log/kapacitor/low_disk.log')

PS. Although Github displays it weird, I don't have indentation like that in the actual script.

Problem is the commented out |eval() part that for some reason clears all my incoming variables and if I comment that in, no alerts are triggered since all variables are <nil>. Is this desired behavior? I guess one problem might be that I'm passing more variables to keep() than are changed in that |eval(), but I tried it with only free as well and got the same results. It seems to me that I copied the |eval() directly from documentation and can't figure out what's wrong with it.
But the reason I need it at all is that the incoming data gives free space on disk in bytes and I would like to send out MB-s or GB-s instead of 6584745867 bytes. :) So another question, is there a way to
a) add the eval() output to message()?
or b) do that in any other way? I tried to do some magic with the printf thingie, but I couldn't find any documentation for it anywhere so didn't get anywhere with that either.

I'm running on a 64bit Ubuntu 14.04 and have the latest Kapacitor installed from https://dl.influxdata.com/kapacitor/releases/kapacitor_1.0.2_amd64.deb.

The text was updated successfully, but these errors were encountered:

nathanielc · 2016-10-24T21:12:05Z

@wanaryytel Your example looks correct. Can you share some sample data that can reproduce the error?

wanaryytel · 2016-10-25T08:23:00Z

@nathanielc How should that data be gathered? By recording and saving it to a file?

nathanielc · 2016-10-25T15:01:30Z

Sure that works well, the recording data files are stored in /var/lib/kapacitor/replays by default. Or just a text file with line protocol that reproduces the error would be enough. Thanks

wanaryytel · 2016-10-25T18:58:36Z

@nathanielc Is there a way to go through the .srpl file data to make sure there isn't any sensitive data in that? It seems to be binary format by default. Or another way to get the output of the recording?

nathanielc · 2016-10-25T19:02:30Z

@wanaryytel Sure, the .srpl file is really just gzipped text data so you can extract it and take a look.

wanaryytel · 2016-10-25T20:38:56Z

@nathanielc
Here you go :)

8b940579-27f7-4d90-9a91-cdf74e1ad96d.txt

wanaryytel · 2016-11-06T22:40:46Z

@nathanielc Hey, I updated to the latest release but it didn't fix the problem. I think you missed the part in the initial description that said "I guess one problem might be that I'm passing more variables to keep() than are changed in that |eval(), but I tried it with only free as well and got the same results."

That's exactly what I'm seeing. Output from the same script's kapactior show low_disk:

ID: low_disk
Error: 
Template: 
Type: stream
Status: enabled
Executing: true
Created: 20 Oct 16 06:55 EDT
Modified: 06 Nov 16 17:35 EST
LastEnabled: 06 Nov 16 17:35 EST
Databases Retention Policies: ["telegraf"."autogen"]
TICKscript:
// Alerts when more than 3 GB + 10% of the server total disk space are used up.
var gb = 1000 * 1000 * 1000

stream
    |from()
        .measurement('disk')
    // Some unexplainable magic happens with this eval. It would be lovely to use but as long as
    // it breaks everything, let it be commented out.
    |eval(lambda: "free" / gb)
        .as('free_gb')
        .keep('free', 'free_gb')
    |groupBy('host', 'path')
    |window()
        .period(30s)
        .every(30s)
    |alert()
        .id('Free disk space @ `{{ index .Tags "host" }}`')
        .message('{{ .ID }} is *{{ .Level }}*! {{ index .Fields "used_percent" | printf "%0.2f" }}% is used up.')
        .crit(lambda: gb / 2 + "total" / 100 * 10 > "free")
        .warn(lambda: 3 * gb + "total" / 100 * 10 > "free")
        //      .stateChangesOnly(12h)
        .email()
        .slack()
        .log('/var/log/kapacitor/low_disk.log')

DOT:
digraph low_disk {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0" ];
stream0 -> from1 [processed="69"];

from1 [avg_exec_time_ns="6.097µs" ];
from1 -> eval2 [processed="69"];

eval2 [avg_exec_time_ns="114.429µs" eval_errors="0" ];
eval2 -> groupby3 [processed="69"];

groupby3 [avg_exec_time_ns="4.01µs" ];
groupby3 -> window4 [processed="69"];

window4 [avg_exec_time_ns="5.229µs" ];
window4 -> alert5 [processed="13"];

alert5 [alerts_triggered="0" avg_exec_time_ns="320.375µs" crits_triggered="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" ];

stats6 [avg_exec_time_ns="0" ];
stats6 -> derivative7 [processed="6"];

derivative7 [avg_exec_time_ns="0" ];
derivative7 -> alert8 [processed="5"];

alert8 [alerts_triggered="5" avg_exec_time_ns="763.275µs" crits_triggered="5" infos_triggered="0" oks_triggered="0" warns_triggered="0" ];
}

No alerts are triggered with this script. When I comment out the eval, everything works fine. Could you reopen this issue?

nathanielc · 2016-11-07T16:16:36Z

@wanaryytel I tested this again with your replay file and the above script. It works as is. Both the free and free_gb fields exist on the data when it is passed to the alert node.

Specifically which version did you test?

wanaryytel · 2016-11-07T20:35:53Z

@nathanielc Both v1.0.2 and v1.1.0-rc2. Exactly the same behavior. I'm not 100% sure if all the variables are cleared, but when the eval is commented in, no alerts are triggered. It was just my assumption that when all variables are undefined, then that could be a probable reason for not triggering alerts.

nathanielc · 2016-11-07T22:51:01Z

@wanaryytel Hmm, that is odd.

Can you try this TICKscript? It just has a few log nodes added in:

stream
    |from()
        .measurement('disk')
    // Some unexplainable magic happens with this eval. It would be lovely to use but as long as
    // it breaks everything, let it be commented out.
    |log()
        .prefix('PRE EVAL')
    |eval(lambda: "free" / gb)
        .as('free_gb')
        .keep('free', 'free_gb')
    |log()
        .prefix('POST EVAL')
    |groupBy('host', 'path')
    |window()
        .period(30s)
        .every(30s)
    |alert()
        .id('Free disk space @ `{{ index .Tags "host" }}`')
        .message('{{ .ID }} is *{{ .Level }}*! {{ index .Fields "used_percent" | printf "%0.2f" }}% is used up.')
        .crit(lambda: gb / 2 + "total" / 100 * 10 > "free")
        .warn(lambda: 3 * gb + "total" / 100 * 10 > "free")
        //      .stateChangesOnly(12h)
        .email()
        .slack()
        .log('/var/log/kapacitor/low_disk.log')

Then could you share the log lines that start with PRE/POST EVAL? That should help us get to the bottom of what is going on.

wanaryytel · 2016-11-08T10:01:43Z

@nathanielc Ran your script (and added the gb variable at the start) for about 5 minutes, but again - no alerts triggered and no log file created for it.

ID: ld
Error: 
Template: 
Type: stream
Status: enabled
Executing: true
Created: 08 Nov 16 04:54 EST
Modified: 08 Nov 16 04:54 EST
LastEnabled: 08 Nov 16 04:54 EST
Databases Retention Policies: ["telegraf"."autogen"]
TICKscript:
var gb = 1000 * 1000 * 1000

stream
    |from()
        .measurement('disk')
    // Some unexplainable magic happens with this eval. It would be lovely to use but as long as
    // it breaks everything, let it be commented out.
    |log()
        .prefix('PRE EVAL')
    |eval(lambda: "free" / gb)
        .as('free_gb')
        .keep('free', 'free_gb')
    |log()
        .prefix('POST EVAL')
    |groupBy('host', 'path')
    |window()
        .period(30s)
        .every(30s)
    |alert()
        .id('Free disk space @ `{{ index .Tags "host" }}`')
        .message('{{ .ID }} is *{{ .Level }}*! {{ index .Fields "used_percent" | printf "%0.2f" }}% is used up.')
        .crit(lambda: gb / 2 + "total" / 100 * 10 > "free")
        .warn(lambda: 3 * gb + "total" / 100 * 10 > "free")
        //      .stateChangesOnly(12h)
        .email()
        .slack()
        .log('/var/log/kapacitor/ld.log')

DOT:
digraph ld {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0s" ];
stream0 -> from1 [processed="253"];

from1 [avg_exec_time_ns="8.359µs" ];
from1 -> log2 [processed="253"];

log2 [avg_exec_time_ns="0s" ];
log2 -> eval3 [processed="253"];

eval3 [avg_exec_time_ns="59.18µs" eval_errors="0" ];
eval3 -> log4 [processed="253"];

log4 [avg_exec_time_ns="0s" ];
log4 -> groupby5 [processed="253"];

groupby5 [avg_exec_time_ns="3.283µs" ];
groupby5 -> window6 [processed="253"];

window6 [avg_exec_time_ns="16.909µs" ];
window6 -> alert7 [processed="70"];

alert7 [alerts_triggered="0" avg_exec_time_ns="464.241µs" crits_triggered="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" ];

stats8 [avg_exec_time_ns="13.33µs" ];
stats8 -> derivative9 [processed="23"];

derivative9 [avg_exec_time_ns="5.774µs" ];
derivative9 -> alert10 [processed="22"];

alert10 [alerts_triggered="22" avg_exec_time_ns="0s" crits_triggered="22" infos_triggered="0" oks_triggered="0" warns_triggered="0" ];
}

File at /var/log/kapacitor/ld.log wasn't created, but I do have logfiles for all the other scripts that don't include an eval function. Also checked the kapacitord.err log, which indicated that everything was fine. Any more ideas what I could try?

nathanielc · 2016-11-15T21:55:16Z

Can you check the /var/log/kapacitor/kapacitor.log file? It will have the PRE EVAL and POST EVAL log lines so we can determine what might be causing the issue.

wanaryytel · 2016-11-15T22:31:16Z

@nathanielc
Hey,
here's an excerpt:

[ld:log2] 2016/11/15 17:27:01 I! PRE EVAL {"Name":"disk","Database":"telegraf","RetentionPolicy":"autogen","Group":"","Dimensions":{"ByName":false,"TagNames":null},"Tags":{"dc":"eu-west-1","fstype":"ext4","host":"host1.company.com","hosttype":"alfa","path":"/","provider":"big"},"Fields":{"free":7524429824,"inodes_free":838045,"inodes_total":1310720,"inodes_used":472675,"total":21137846272,"used":12539834368,"used_percent":62.4983515368576},"Time":"2016-11-15T22:27:00Z"}

[ld:log2] 2016/11/15 17:27:01 I! PRE EVAL {"Name":"disk","Database":"telegraf","RetentionPolicy":"autogen","Group":"","Dimensions":{"ByName":false,"TagNames":null},"Tags":{"dc":"eu-west-1","fstype":"ext3","host":"host1.company.com","hosttype":"alfa","path":"/disk","provider":"big"},"Fields":{"free":3931746304,"inodes_free":262133,"inodes_total":262144,"inodes_used":11,"total":4221763584,"used":75558912,"used_percent":1.8855292503878998},"Time":"2016-11-15T22:27:00Z"}

[ld:log2] 2016/11/15 17:27:01 I! PRE EVAL {"Name":"disk","Database":"telegraf","RetentionPolicy":"autogen","Group":"","Dimensions":{"ByName":false,"TagNames":null},"Tags":{"dc":"eu-west-1","fstype":"ext4","host":"host1.company.com","hosttype":"alfa","path":"/disk2","provider":"big"},"Fields":{"free":17060515840,"inodes_free":1310544,"inodes_total":1310720,"inodes_used":176,"total":21137846272,"used":3003588608,"used_percent":14.969960985721439},"Time":"2016-11-15T22:27:00Z"}

[ld:log4] 2016/11/15 17:27:01 I! POST EVAL {"Name":"disk","Database":"telegraf","RetentionPolicy":"autogen","Group":"","Dimensions":{"ByName":false,"TagNames":null},"Tags":{"dc":"eu-west-1","fstype":"ext4","host":"host1.company.com","hosttype":"alfa","path":"/","provider":"big"},"Fields":{"free":7524429824,"free_gb":7},"Time":"2016-11-15T22:27:00Z"}

[ld:log4] 2016/11/15 17:27:01 I! POST EVAL {"Name":"disk","Database":"telegraf","RetentionPolicy":"autogen","Group":"","Dimensions":{"ByName":false,"TagNames":null},"Tags":{"dc":"eu-west-1","fstype":"ext3","host":"host1.company.com","hosttype":"alfa","path":"/disk","provider":"big"},"Fields":{"free":3931746304,"free_gb":3},"Time":"2016-11-15T22:27:00Z"}

[ld:log4] 2016/11/15 17:27:01 I! POST EVAL {"Name":"disk","Database":"telegraf","RetentionPolicy":"autogen","Group":"","Dimensions":{"ByName":false,"TagNames":null},"Tags":{"dc":"eu-west-1","fstype":"ext4","host":"host1.company.com","hosttype":"alfa","path":"/disk2","provider":"big"},"Fields":{"free":17060515840,"free_gb":17},"Time":"2016-11-15T22:27:00Z"}

nathanielc · 2016-11-16T16:50:50Z

@wanaryytel Great, the logs show that the fields are being kept correctly. As for why the alert is not triggering it seems to be a different issue. Could it be that these conditions are not correct?

        .crit(lambda: gb / 2 + "total" / 100 * 10 > "free")
        .warn(lambda: 3 * gb + "total" / 100 * 10 > "free")

wanaryytel · 2016-11-17T08:58:37Z

@nathanielc It is the first thing that comes to mind, but considering

I have another script (without eval) that has the same threshold values that does alert me with the same data
I have tried much higher thresholds with no avail

it seems improbable at best.

Is there anything else I can try?

nathanielc · 2016-11-17T16:11:14Z

@wanaryytel Use the |log() to debug why your alert is not firing. With the log node you can see exactly the data that the alert node is processing and determine if it is working as expected.

nathanielc added a commit that referenced this issue Oct 25, 2016

fix #984, eval keep now works with non referenced vars

a23ed27

nathanielc mentioned this issue Oct 25, 2016

Eval.Keep now works with non referenced vars #994

Merged

3 tasks

nathanielc closed this as completed in #994 Oct 25, 2016

nathanielc pushed a commit that referenced this issue Oct 25, 2016

fix #984, eval keep now works with non referenced vars (#994)

412b10c

jdursel mentioned this issue Nov 8, 2016

Fields cannot be found after using eval keep with named fields #1026

Closed

nathanielc reopened this Nov 15, 2016

nathanielc closed this as completed Nov 17, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Eval clears all variables #984

Eval clears all variables #984

wanaryytel commented Oct 21, 2016 •

edited

Loading

nathanielc commented Oct 24, 2016

wanaryytel commented Oct 25, 2016

nathanielc commented Oct 25, 2016

wanaryytel commented Oct 25, 2016

nathanielc commented Oct 25, 2016

wanaryytel commented Oct 25, 2016

wanaryytel commented Nov 6, 2016 •

edited

Loading

nathanielc commented Nov 7, 2016

wanaryytel commented Nov 7, 2016

nathanielc commented Nov 7, 2016

wanaryytel commented Nov 8, 2016

nathanielc commented Nov 15, 2016

wanaryytel commented Nov 15, 2016

nathanielc commented Nov 16, 2016

wanaryytel commented Nov 17, 2016

nathanielc commented Nov 17, 2016

Eval clears all variables #984

Eval clears all variables #984

Comments

wanaryytel commented Oct 21, 2016 • edited Loading

nathanielc commented Oct 24, 2016

wanaryytel commented Oct 25, 2016

nathanielc commented Oct 25, 2016

wanaryytel commented Oct 25, 2016

nathanielc commented Oct 25, 2016

wanaryytel commented Oct 25, 2016

wanaryytel commented Nov 6, 2016 • edited Loading

nathanielc commented Nov 7, 2016

wanaryytel commented Nov 7, 2016

nathanielc commented Nov 7, 2016

wanaryytel commented Nov 8, 2016

nathanielc commented Nov 15, 2016

wanaryytel commented Nov 15, 2016

nathanielc commented Nov 16, 2016

wanaryytel commented Nov 17, 2016

nathanielc commented Nov 17, 2016

wanaryytel commented Oct 21, 2016 •

edited

Loading

wanaryytel commented Nov 6, 2016 •

edited

Loading