Redundant information on job responses #42

Closed
juanpablo-santos opened this issue Feb 22, 2018 · 6 comments

Comments

@juanpablo-santos
Contributor

Hi,

currently a generic-webhook-trigger invocation returns too much information, as the response covers all jobs, whether or not they were triggered by the generic webhook. The entries for jobs that were not triggered still include the resolved variables, which adds nothing, since those variables were never used.

Furthermore, for a given invocation, the resolved variables are always the same for all the pinged jobs, so they could be extracted to their own variable on GenericWebHookRequestReceiver, and returned only once.

Also, from 1.22 onwards, the actual JSON/XML is contributed as a resolved variable, which means the resolved variables' content is returned twice for each job: once as the sum of each "json/xml leaf" variable, and once as the contributed variable (although I suppose we can manually remove the contributed variable in the pipeline of the triggered jobs).

Why is this a problem? We're using Gitea's webhooks to trigger some jobs on Jenkins, and because of all this we're beginning to see webhook responses of more than 65K characters, which is the maximum allowed by the column that stores the webhook response. Gitea then doesn't store the response, doesn't swallow the unexpected error, and doesn't update the appropriate table row, so the webhook remains marked as not sent and is called again on the next webhook call, and again, and so on. This became more pressing when we updated to the latest version of the plugin, which doubled the size of the response.

I understand we've most probably hit a bug on the Gitea side, but it would be nice to be able to control the amount of information returned by the generic-webhook-trigger-plugin, or at least not to have duplicated info returned. I can prepare a PR for the latter if needed, as I'm not really sure how the former could be accomplished.

br,
juan pablo

@tomasbjerre
Contributor

Just read this quickly, I'll have a closer look later. But to avoid a very big response I do:

  • Invoke without authorization, except for the token-parameter.
  • I use a different token parameter in each job.

With this method only 1 job will be returned in the response, as all other jobs will be invisible. This also gives better performance.
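
The per-job token approach above could be sketched like this. It is only an illustration: the host `http://jenkins`, the job names, and the token values are hypothetical, and the `invoke?token=` URL form is the one quoted later in this thread.

```python
from urllib.parse import urlencode


def build_invoke_url(base_url: str, token: str) -> str:
    """Build an invoke URL that carries only the token parameter."""
    query = urlencode({"token": token})
    return f"{base_url.rstrip('/')}/generic-webhook-trigger/invoke?{query}"


# Hypothetical mapping: one distinct token per job, so that a given
# invocation only "sees" the single job configured with that token.
JOB_TOKENS = {
    "build-frontend": "frontend-token",
    "build-backend": "backend-token",
}

INVOKE_URLS = {
    job: build_invoke_url("http://jenkins", token)
    for job, token in JOB_TOKENS.items()
}
```

Each webhook on the git server side would then point at the URL for the one job it is meant to start.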

@juanpablo-santos
Contributor Author

Hi @tomasbjerre,

thanks for the tip on the token parameter, it indeed alleviates the problem. We still have two jobs with the same token (we have to launch two jobs based on the same event, and we would like to keep it that way), but the response is already smaller.

br,
juan pablo

@tomasbjerre
Contributor

> Hi,
>
> currently a generic-webhook-trigger invocation returns too much information, as the response covers all jobs, whether or not they were triggered by the generic webhook. The entries for jobs that were not triggered still include the resolved variables, which adds nothing, since those variables were never used.

Actually the response is intended to reduce support issues. If someone expects a job to be triggered but it is not, then this information is valuable for them to understand why it did not trigger.

> Furthermore, for a given invocation, the resolved variables are always the same for all the pinged jobs, so they could be extracted to their own variable on GenericWebHookRequestReceiver, and returned only once.

But the jobs may extract different values. And they might use the regexp cleaning feature of the extracted variables differently.

> Also, from 1.22 onwards, the actual JSON/XML is contributed as a resolved variable, which means the resolved variables' content is returned twice for each job: once as the sum of each "json/xml leaf" variable, and once as the contributed variable (although I suppose we can manually remove the contributed variable in the pipeline of the triggered jobs).

Yes, this is not perfect...

> Why is this a problem? We're using Gitea's webhooks to trigger some jobs on Jenkins, and because of all this we're beginning to see webhook responses of more than 65K characters, which is the maximum allowed by the column that stores the webhook response. Gitea then doesn't store the response, doesn't swallow the unexpected error, and doesn't update the appropriate table row, so the webhook remains marked as not sent and is called again on the next webhook call, and again, and so on. This became more pressing when we updated to the latest version of the plugin, which doubled the size of the response.

OK, and that was solved by my first reply, right?

> I understand we've most probably hit a bug on the Gitea side, but it would be nice to be able to control the amount of information returned by the generic-webhook-trigger-plugin, or at least not to have duplicated info returned. I can prepare a PR for the latter if needed, as I'm not really sure how the former could be accomplished.
>
> br,
> juan pablo

@juanpablo-santos
Contributor Author

Hi @tomasbjerre,

> > Furthermore, for a given invocation, the resolved variables are always the same for all the pinged jobs, so they could be extracted to their own variable on GenericWebHookRequestReceiver, and returned only once.
>
> But the jobs may extract different values. And they might use the regexp cleaning feature of the extracted variables differently.

indeed, but the resolved variables passed to all jobs are always the same, regardless of how those jobs consume them. OTOH, that would mean changing the returned JSON, which may break any process expecting the current output.
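
As a rough sketch of the trade-off discussed here, the following hoists the resolved variables to the top level only when every job reports identical values, and otherwise leaves the per-job entries untouched. The response shape (a per-job dict with `triggered` and `resolvedVariables` keys) and the output keys `jobs` and `resolvedVariables` are assumptions made for illustration, not the plugin's actual format.

```python
def hoist_shared_variables(trigger_results: dict) -> dict:
    """Report resolved variables once if all jobs resolved the same values.

    `trigger_results` maps a job name to a dict that is assumed to hold a
    "resolvedVariables" entry (hypothetical shape, for illustration only).
    """
    variable_sets = [
        job.get("resolvedVariables", {}) for job in trigger_results.values()
    ]
    all_identical = bool(variable_sets) and all(
        variables == variable_sets[0] for variables in variable_sets
    )
    if not all_identical:
        # Jobs extracted different values: keep the per-job detail as it is.
        return {"jobs": trigger_results}

    # Same values everywhere: strip them from each job and report them once.
    slim_jobs = {
        name: {key: value for key, value in job.items() if key != "resolvedVariables"}
        for name, job in trigger_results.items()
    }
    return {"resolvedVariables": variable_sets[0], "jobs": slim_jobs}
```

Note that either branch changes the top-level layout, which is exactly the compatibility concern raised above for consumers that parse the current output.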

In any case, I'm nitpicking, the token parameter should be the way to go; plus, the real issue isn't here but on the git server side, so please feel free to close the issue if you think it's ok as it is,

thanks,
juan pablo

@justoaguilar

Hi @tomasbjerre,
we are facing a similar problem: the same token is used by a webhook for a lot of repos, and the webhook's response is 4MB in size.

I wonder if it might be possible to add a new parameter to enable/disable full tracing in the response, and to return only triggered jobs if this flag is not enabled.

For example, something like this:
http://jenkins/generic-webhook-trigger/invoke?token=my-token&action=my-action&response=triggered/all/....

This way anybody could debug their configuration and then, by changing that parameter, reduce the overhead between Jenkins and the git system.
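
A minimal sketch of what such a switch might do, assuming the response is a mapping from job name to a result dict with a boolean `triggered` field (an assumed shape; the plugin's real response may differ). The mode names `all` and `triggered` are taken from the example URL above, and the function name is hypothetical.

```python
def filter_trigger_results(trigger_results: dict, mode: str = "all") -> dict:
    """Select which job entries to return, based on a `response` mode."""
    if mode == "all":
        # Verbose output: every job, triggered or not, for debugging.
        return dict(trigger_results)
    if mode == "triggered":
        # Compact output: only the jobs that were actually started.
        return {
            name: result
            for name, result in trigger_results.items()
            if result.get("triggered")
        }
    raise ValueError(f"unsupported response mode: {mode!r}")
```

Defaulting to `all` keeps the verbose output for anyone who does not pass the parameter.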

Best regards
Justo

@tomasbjerre
Contributor

Should be possible, yes. I would have it default to verbose logging, as I think this feature helps a lot of users. PRs are welcome.
