
[RFC] Patch building and Webhooks API #307

Open
yurinnick opened this issue Jul 20, 2023 · 22 comments · May be fixed by kernelci/kernelci-pipeline#299 or kernelci/kernelci-pipeline#297

Comments

@yurinnick

yurinnick commented Jul 20, 2023

The Webhooks API should help integrate KernelCI with patch management systems (Patchwork) and version control systems (GitHub, GitLab). It will provide an interface to trigger non-upstream patch builds, which should be able to publish their results back later.

Implementation

As a first step, I'd like to implement Patchwork integration with KernelCI. The following changes are needed:

  1. Add a /webhooks/patchwork API that will expect certain (TBD) input from the Patchwork side, enough to build and test a kernel and report results back
  2. Extend kernelci.Node, KernelBuildMetadata and kernelci.config.Tree with patch-related fields
  3. Implement applying patches in mbox format on top of a git checkout (see the sketch after this list)
  4. Implement a mechanism to update Patchwork patch checks
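
For illustration, here's a minimal sketch of point 3, assuming the series comes as an mbox file and gets applied with git am (the function name and error handling are placeholders, not an agreed design):

import subprocess

def apply_mbox(checkout_dir: str, mbox_path: str) -> None:
    """Apply a patch series in mbox format on top of an existing git checkout."""
    # "git am" applies each mail in the mbox as a separate commit; --3way
    # lets git fall back to a three-way merge when the context has drifted.
    result = subprocess.run(
        ["git", "am", "--3way", mbox_path],
        cwd=checkout_dir,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Abort the half-applied series so the checkout stays clean, then let
        # the caller report a failed "apply" check back to Patchwork (point 4).
        subprocess.run(["git", "am", "--abort"], cwd=checkout_dir)
        raise RuntimeError(f"git am failed: {result.stderr}")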

Minimal patch information

We need information about the patch and all patches it depends on, as well as submitter information for email notifications.

{
  "patches": [
    {
      "patchwork_id": "str",
      "hash": "str",
      "web_url": "str",
      "date": "str",
      "mbox": "str"
    }
  ],
  "submitter": { 
    "id": "int", 
    "url": "str", 
    "name": "str", 
    "email": "str" 
  }
}
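
For illustration only, the payload above could be validated in the webhook handler with Pydantic-style models (a sketch; the field names simply mirror the JSON and are not a settled schema):

from pydantic import BaseModel

class Patch(BaseModel):
    patchwork_id: str
    hash: str       # patch content hash as reported by Patchwork
    web_url: str
    date: str
    mbox: str       # URL of the patch in mbox format

class Submitter(BaseModel):
    id: int
    url: str
    name: str
    email: str

class PatchworkWebhook(BaseModel):
    patches: list[Patch]
    submitter: Submitter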

Discussion

  • How do we make the patch information generic enough to share across multiple systems? Should we make it generic at all?
  • How do we pass Patchwork-specific metadata into a Node: the data field or a separate field?
  • Calculating and passing revision information
  • What data should we expect in /webhooks/patchwork? (Collaboration with Patchwork developers)
@gctucker
Contributor

Add a /webhooks/patchwork API that will expect certain (TBD) input from the Patchwork side, enough to build and test a kernel and report results back

Do you mean adding an endpoint to the KernelCI API?

I think a better approach is to have a micro-service receiving the webhooks from Patchwork and then relaying things to the KernelCI API. We have something like this already for LAVA callbacks:
https://github.com/kernelci/kernelci-pipeline/blob/main/src/lava_callback.py

That way, the KernelCI API doesn't need to "know" about Patchwork, and we can update the bridge at any time just by restarting the client-side service. It's a key principle of the new API to be modular and let all the specific logic live on the client side.
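
As a very rough sketch of what such a bridge could look like (assuming Flask and plain HTTP calls; the endpoint path, node fields and token handling here are illustrative, not how lava_callback.py actually works):

import os

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
KCI_API_URL = os.environ["KCI_API_URL"]      # base URL of the KernelCI API
KCI_API_TOKEN = os.environ["KCI_API_TOKEN"]  # client-side credentials

@app.route("/patchwork", methods=["POST"])
def patchwork_webhook():
    payload = request.get_json()
    # Translate the Patchwork payload into a node creation request; all the
    # Patchwork-specific details stay in the free-form "data" field.
    node = {
        "name": "checkout",
        "data": {"patchwork": payload},
    }
    resp = requests.post(
        f"{KCI_API_URL}/node",
        json=node,
        headers={"Authorization": f"Bearer {KCI_API_TOKEN}"},
    )
    resp.raise_for_status()
    return jsonify(resp.json()), 202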

@gctucker
Contributor

How do we pass Patchwork-specific metadata into a Node: the data field or a separate field?

Yes, the data field is like a void * data structure managed by the client-side logic. The database is NoSQL (MongoDB), so there's no schema for the payload stored in data. All the other fields are validated through a schema though. A bit like a file system: the metadata is structured by the kernel, but what goes into the files is managed by users - if that's a suitable enough analogy.
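
To make the analogy concrete, a node could carry all the Patchwork metadata inside data while the validated fields stay schema-driven (the keys below are only an example, not a defined layout):

# Only the top-level fields are validated by the API schema; everything under
# "data" is opaque to the API and owned by the client-side service.
node = {
    "name": "checkout",
    "revision": {"tree": "net-next", "branch": "main", "commit": "<sha>"},
    "data": {
        "patchwork": {
            "patchwork_id": "<id>",
            "mbox": "https://patchwork.kernel.org/.../mbox/",
            "submitter": {"name": "<name>", "email": "<email>"},
        },
    },
}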

@gctucker
Contributor

How do we make the patch information generic enough to share across multiple systems? Should we make it generic at all?

I would suggest making it entirely specific to Patchwork and whatever data BPF CI currently has, with a dedicated micro-service just for this use-case. This would be the first time we add "pre-merge" testing capability to KernelCI, so we can keep things as simple as possible and tailored to this particular use-case. Then, when we start adding other kinds of pre-merge testing sources, we might have other dedicated micro-services for them and potentially consolidate things with a more standard way of triggering pipelines based on patch series. But I don't think we need to solve that kind of generic problem right now just for BPF CI integration.

@yurinnick
Author

Going the dedicated micro-service route will certainly help with iteration and with decoupling the logic from the core part. But now I am thinking about how to handle sending information back to Patchwork.
We can store a patch_id<>pipeline_id mapping on the "patchwork webhook microservice" side and poll results from there. Or we can pass patch metadata in the free-form data field and create a special Patchwork notifier on the kernelci-pipeline side.
The latter is easier, but it makes kernelci-pipeline at least somewhat Patchwork-aware.

@yurinnick
Author

yurinnick commented Jul 20, 2023

As for BPF CI, we already have a CI set up and ready to send results into KCIDB. There are a few missing pieces, mainly missing patch support in KCIDB and the web UI.

@gctucker
Contributor

Going the dedicated micro-service route will certainly help with iteration and with decoupling the logic from the core part.

Yes, it's one of the key design principles of the new API. The legacy API had too many custom built-in features and that made it very hard to do anything with it. Testing the upstream kernel requires a very modular design as each use-case is different - here it's Patchwork for BPF but nearly every maintainer has some custom tooling.

But now I am thinking about how to handle sending information back to Patchwork. We can store a patch_id<>pipeline_id mapping on the "patchwork webhook microservice" side and poll results from there. Or we can pass patch metadata in the free-form data field

Yes, the latter. It's much better to keep the micro-service stateless; the Node.data field can be used for this kind of extra data.

and create a special Patchwork notifier on the kernelci-pipeline side. The latter is easier, but it makes kernelci-pipeline at least somewhat Patchwork-aware.

This is fine, in fact the micro-service doesn't even need to be in the kernelci-pipeline repository. It could be something hosted as part of BPF CI, or it could be in kernelci-pipeline with a separate docker-compose file like we have for the LAVA callback handler. Having a micro-service dedicated to BPF patch testing is absolutely the kind of thing the new API was designed for.

@gctucker
Contributor

As for BPF CI, we already have a CI set up and ready to send results into KCIDB. There are a few missing pieces, mainly missing patch support in KCIDB and the web UI.

OK, I guess it'll be good to clarify what you want to cover with your existing CI and what you're expecting KernelCI to do. If you want the KernelCI infrastructure to build kernels and run tests for BPF patches, then would it be duplicating some things you're already doing with your current BPF CI? Would it be a way to transition from BPF CI to KernelCI?

@yurinnick
Author

yurinnick commented Jul 25, 2023

In kcidb/orm/data.py there are patchset_hash and patchset_files fields for checkout, and a revision structure available in KCIDB. I can't find any mention of the revision structure in the docs though, so I assume it's not used.

Does KernelCI store any patchset information like mbox URLs and/or titles? If not, what's the best place to add these fields? KCIDB itself?

@yurinnick
Author

If you want the KernelCI infrastructure to build kernels and run tests for BPF patches, then would it be duplicating some things you're already doing with your current BPF CI?

Part of the reason I want this functionality is to reduce the BPF CI footprint. I'd like to rely on KernelCI as much as possible in the future, while keeping the more Meta-workload-specific BPF tests on the BPF CI side.

Would it be a way to transition from BPF CI to KernelCI?

Once we have a patch/patchset abstraction in KCIDB and the dashboards, we'll publish data from BPF CI into KernelCI. Then we can start removing duplicated signals and move some BPF CI testing onto the KernelCI infrastructure.

@gctucker
Contributor

@yurinnick I think we need to clarify a few things, KCIDB is only a database for storing data from any CI system. It doesn't run builds or tests or anything itself. The point of it is to have a common email report (and dashboard) with results aggregated from many CI systems e.g. 0-Day, Syzbot, CKI, Gentoo, Linaro's TuxSuite as well as the "native" KernelCI results.

The "native" KernelCI results are the builds, tests and anything that is orchestrated by KernelCI itself. There's a legacy database and frontend for it and it's running with Jenkins. This is now being replaced with the new API & Pipeline which will have a new web dashboard too at some point.

So you could already choose to send your current BPF CI results to KCIDB. I'm not an expert regarding the schema and how it handles patch sets, that's more a question for @spbnick.

Then if you want to rely on KernelCI to run tests for BPF, that's not related to KCIDB. It's something we can do with the new API & Pipeline by setting up a dedicated pipeline service to make the bridge between the patch sets you have to test for BPF and KernelCI. I believe this is what you wanted to do, and offload your current CI system or maybe refocus it on some "internal" testing while KernelCI would be gradually covering all the upstream-oriented testing. Is that right?

I'm not familiar enough yet with your setup around Patchwork and GitHub but we can go through the details if integrating it with KernelCI is what you want to achieve.

@yurinnick
Author

yurinnick commented Aug 1, 2023

@gctucker
Contributor

gctucker commented Aug 1, 2023

Could you please explain which service is going to be receiving what kind of data from Patchwork to start with?

We basically need something to know the Git base revision and a list of patch files. Then the simplest way to deal with this would be to apply the patches on top of the git revision and create a source tarball "checkout" with it, and the rest can remain unchanged. Once we have this in place, I think we can gradually add more features specific to patches like keeping them as separate files, doing incremental builds etc. but that will require many development iterations I think.
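
A sketch of that simplest flow, assuming local git commands and a plain tarball (paths and naming are made up for the example):

import subprocess
import tarfile

def make_patched_tarball(repo_dir, base_commit, patch_files, output="linux.tar.gz"):
    """Check out the base revision, apply the patch series on top, then pack a
    source tarball the rest of the pipeline can treat like any other checkout."""
    subprocess.run(["git", "checkout", base_commit], cwd=repo_dir, check=True)
    for patch in patch_files:
        subprocess.run(["git", "am", "--3way", patch], cwd=repo_dir, check=True)

    def skip_git(tarinfo):
        # Leave the .git directory out of the source tarball.
        if tarinfo.name == "linux/.git" or tarinfo.name.startswith("linux/.git/"):
            return None
        return tarinfo

    with tarfile.open(output, "w:gz") as tar:
        tar.add(repo_dir, arcname="linux", filter=skip_git)
    return output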

@yurinnick
Author

So I created kernelci-patchwork-webhook, where I receive webhook data (its exact structure is WIP) and transform it into a new node request here.

Once a new node is created with a data field containing the Patchwork information, I use it to apply the patches on top of a specified kernel tree, pass it to the test flow, and include the patch information in the final test report.

@spbnick

spbnick commented Aug 2, 2023

I assume the new KernelCI API will be sending its testing results into KCIDB, so whatever you submit there would automatically end up in KCIDB. We have some data coming in already from the WIP. I only see checkouts at this point, though. @gctucker, please correct me if I misunderstood anything, and if not, do we already have more data to send from the API, perhaps?

If you have any data from your own CI system to send to KCIDB in addition to that, you'll be more than welcome. Reach out to me on Slack or the mailing list and we can arrange that, no problem.

@gctucker
Copy link
Contributor

gctucker commented Aug 3, 2023

That's right, as discussed yesterday in the meeting the data from the new API will be forwarded to KCIDB and that can include patch sets.

@spbnick The staging API instance is not very stable and probably the bridge that sends data to KCIDB hasn't been very operational. That should improve with the Early Access instance which should be more production-like, so hopefully there'll be more data sent to KCIDB in September.

@yurinnick
Author

@gctucker please let me know which pull requests we can review and merge, and which one I should rework.

@gctucker
Contributor

For the record, here's a sample Patchwork page showing results from the current BPF CI, with a link to the GitHub PR that ran the tests:

https://patchwork.kernel.org/project/netdevbpf/patch/20230727082536.1974154-1-xiangyu.chen@eng.windriver.com/

kernel-patches/bpf#5415

@gctucker
Contributor

Thanks @yurinnick for this, it looks like a very good first step. It's mostly functional and ready to be added to the code base; I think the only issue is that we might need to update the API schema for node revisions, or at least precisely define how to handle patchsets. Node revisions are important for the whole pipeline, so we can't really deal with patchsets using only the arbitrary .data field. The revision is used to group results together; if we don't follow this we'll have issues with reporting and inconsistencies in the data.

I'll create a GitHub issue to cover this part then we can discuss it and evaluate various options. Then once that's clarified I think it'll unblock things to land a first functional implementation.

@gctucker
Contributor

Following up from kernelci/kernelci-pipeline#295, it would seem like a good starting point to look at how to manage patchsets via the API.

Right now, the Revision data model includes everything related to the kernel revision based on Git information. The checkout nodes have a source tarball which needs to be a plain version of the kernel source tree matching the revision data exactly. Now, how should we deal with patches?

One issue is that if we keep the same Git information for a checkout node that has patches, querying the results for a particular kernel revision could be misleading. If patches are applied, it's not the same source code as the base Git revision, so the results for the patched kernel shouldn't be found when looking for that particular Git revision.

So here's a proposal for how to deal with this in Node data, using some hash algorithm for the patch series like KCIDB or CKI (here some simplified YAML because it's easier):

# Root node for the base Git revision
id: 123
parent: null
name: checkout
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: null
artifacts:
  linux.tar.gz: https://some.storage.com/linux.tar.gz
---
# Build node for the base revision
id: 124
parent: 123
name: kbuild-gcc-10-x86
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: null
artifacts:
  bzImage: https://some.storage.com/bzImage-1
---
# Patched revision
id: 125
parent: 123
name: patchset
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: 6b7c28b5b6fc0c8194b8b5fd54f2f57ecc2d73fa45ba97125e775c55cf6825cf
artifacts:
  0001-accel-ivpu-Use-struct_size.patch: https://some.storage.com/0001.patch
  0002-accel-ivpu-Remove-configuration-of-MMU-TBU1-and-TBU3.patch: https://some.storage.com/0002.patch
---
# Build node for the patched revision
id: 126
parent: 125
name: kbuild-gcc-10-x86
revision:
  commit: 374a7f47bf401441edff0a64465e61326bf70a82
  patchset: 6b7c28b5b6fc0c8194b8b5fd54f2f57ecc2d73fa45ba97125e775c55cf6825cf
artifacts:
  bzImage: https://some.storage.com/bzImage-2

A few things to note about this proposal:

  • The source tarball is only generated once, so it's more efficient and flexible than uploading another patched one.
  • It's easy to query for only base revisions or patched ones using the node name, i.e. checkout or patchset (these names can be adjusted if they're misleading).
  • Build jobs using the tarball + patches would typically apply each patch individually and do incremental builds. The same build job could be producing the root checkout node with the base revision as well as the patchset one.
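
For the patchset value used in the nodes above, one plausible way to derive it (an assumption here; the actual algorithm should follow whatever KCIDB and CKI already define) is to hash the patch contents in series order:

import hashlib

def patchset_hash(patch_files):
    """Stable identifier for a patch series: sha256 over the per-patch sha256
    digests, taken in application order."""
    digests = []
    for path in patch_files:
        with open(path, "rb") as f:
            digests.append(hashlib.sha256(f.read()).hexdigest())
    return hashlib.sha256("".join(digests).encode()).hexdigest()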

Some more investigation needs to be done for incremental builds, but I think this can provide the basis for a solution as a first step. We could have a hierarchy of patch nodes, each being the child of the previous one (e.g. the base checkout is the parent, then a child node has patch file 0001, which can then have its own child build nodes as well as a child with another patch file 0002, etc.). Or we could stick to this proposal with a patchset node for the whole series as well as a single build node, but then it would in turn have several child nodes with the steps of the incremental builds (and probably the same common job name to group the nodes as one set of results).

Additional information such as the patch author etc. should already be in the patch file itself, so while we might add it to the data via API it seems like an incremental improvement. We could easily store this in the .data field for example.

How does this sound?

@yurinnick
Author

One important part we haven't discussed yet is how we are going to pick the base revision for a patch or patchset. In my basic implementation I set commit: HEAD, so the latest commit from the main branch was picked up. However, I don't think that would be possible with the current hierarchy implementation, since we have to know the base revision in advance.
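
One option (just a thought, not something we've agreed on) would be for the webhook service to resolve HEAD to a concrete SHA before creating the checkout node, for example with git ls-remote:

import subprocess

def resolve_head(repo_url, ref="HEAD"):
    """Resolve a symbolic ref on the remote to a concrete commit SHA, so the
    checkout node can carry a fixed base revision."""
    out = subprocess.run(
        ["git", "ls-remote", repo_url, ref],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()[0]  # first column of the output is the SHA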

@yurinnick
Author

A few questions have arisen since last week:

  1. Who is supposed to create patch/patchset nodes? Should it be the trigger service or kernelci-patchwork-webhook?
  2. How does KernelCI handle the Node lifecycle? Currently I see services and jobs changing Node states and then another service taking them for processing, so I assume there's no defined state machine that exclusively handles Node state.

Given [2], what's the best way to create a patch service? Wait for the tarball service to make a Node available and take it from there?

@gctucker
Copy link
Contributor

It doesn't matter in this proposal where the nodes are created; this is only about how it would work from the API side initially. The patchset nodes would be child nodes of checkout nodes, and the checkout node has the information about the base git commit.

Surely a client could send a checkout node with the tarball and the base git revision data, as well as a patchset child node with the patchset data? Or a client for patchsets could first look up base git revisions in checkout nodes - there are many ways to do this depending on the use-case.
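
For example, a patchset client could look up the base revision first and then attach a child node (a sketch with plain HTTP calls; the endpoint paths, query parameters and response shape are assumptions rather than the documented API):

import requests

API = "https://kernelci-api.example.com"       # hypothetical API base URL
HEADERS = {"Authorization": "Bearer <token>"}  # client credentials

def create_patchset_node(commit, patchset_hash, artifacts):
    # Find the existing checkout node for the base git revision (assuming the
    # query returns a list of matching nodes).
    checkouts = requests.get(
        f"{API}/nodes", headers=HEADERS,
        params={"name": "checkout", "revision.commit": commit},
    ).json()
    parent = checkouts[0]
    # Attach a patchset child node carrying the series information.
    node = {
        "name": "patchset",
        "parent": parent["id"],
        "revision": {"commit": commit, "patchset": patchset_hash},
        "artifacts": artifacts,
    }
    return requests.post(f"{API}/node", json=node, headers=HEADERS).json()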
