[BUG] pcn_pkt_redirect doesn't work in xdp cubes with transparent services attached #280

Closed
FedeParola opened this issue Mar 11, 2020 · 9 comments · Fixed by #286
Labels
bug Something isn't working

Comments

@FedeParola
Collaborator

FedeParola commented Mar 11, 2020

Describe the bug

pcn_pkt_redirect doesn't work in standard XDP cubes when a transparent service is attached to the destination interface.

To Reproduce

Using router and transparenthelloworld (but the same applies to other services).
Set up the router:

polycubectl router add r1 type=xdp_drv loglevel=trace
polycubectl r1 ports add p1 peer=<physical_iface> ip=10.0.0.2/24

Pinging 10.0.0.2 from a host connected to the router port works fine.
Add a transparent service:

polycubectl transparenthelloworld add th1 type=xdp_drv
polycubectl attach th1 r1:p1

Clear the ARP cache on the source host (since the ARP reply in the router is the operation that requires a pcn_pkt_redirect):

sudo ip -s -s neigh flush all

Pinging 10.0.0.2 no longer works.

Expected behavior

Ping should work normally.

Please tell us about your environment:

  1. OS details: Ubuntu 18.04.4 LTS
  2. Kernel details: 5.5.0-050500-generic

Additional context

Looking at the log of the router, instead of being sent out of the interface, the ARP Reply packet is sent back to the router dataplane code, with a nonexistent port as its input port:

[2020-03-11 10:24:31.957] [Transparenthelloworld] [th1] [debug] Ingress: passing packet
[2020-03-11 10:24:31.957] [Router] [r1] [trace] in_port: 0, proto: 0x806, mac_src: 00:e0:ed:22:ee:e4 mac_dst: ff:ff:ff:ff:ff:ff
[2020-03-11 10:24:31.957] [Router] [r1] [debug] somebody is asking for my address
[2020-03-11 10:24:31.957] [Router] [r1] [trace] in_port: 3, proto: 0x806, mac_src: 00:e0:ed:22:6a:46 mac_dst: 00:e0:ed:22:ee:e4
[2020-03-11 10:24:31.957] [Router] [r1] [error] received packet from non valid port: 3
[2020-03-11 10:24:31.957] [Router] [r1] [trace] in: 3 out: -- DROP

@FedeParola added the bug label Mar 11, 2020

@sebymiano
Collaborator

Is the issue only present when the service is in xdp mode?

@FedeParola
Collaborator Author

Yes, both xdp_drv and xdp_skb

@FedeParola changed the title from "[BUG] pcn_pkt_redirect doesn't work in xdp_drv cubes with transparent services attached" to "[BUG] pcn_pkt_redirect doesn't work in xdp cubes with transparent services attached" on Mar 11, 2020

@FedeParola
Collaborator Author

I think I found part of the problem, here:

PatchPanel::get_tc_instance(), level, type, attach) {

Transparent cubes are created using the TC PatchPanel for egress programs, so when the egress XDP code of the transparent service is injected, its reference isn't added to the list of xdp_nodes.
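
The effect described above can be illustrated with a toy model. This is only a sketch of the reasoning, not Polycube's actual code: `ToyPatchPanel`, `ToyPanels`, `register_egress_tc_only`, `xdp_cube_finds_egress` and the index `7` are all hypothetical names invented for the illustration, and the assumption that an XDP cube only consults the XDP-side table is taken from the discussion in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a patch panel: maps a module index to the
// name of the program registered at that index.
class ToyPatchPanel {
 public:
  void add(std::uint16_t index, const std::string &program) {
    nodes_[index] = program;
  }
  bool contains(std::uint16_t index) const { return nodes_.count(index) > 0; }

 private:
  std::map<std::uint16_t, std::string> nodes_;
};

// One table per hook type (assumption: the two tables are independent).
struct ToyPanels {
  ToyPatchPanel tc;
  ToyPatchPanel xdp;
};

// Models the behaviour reported above: the egress program of the
// transparent service is registered only on the TC side.
inline void register_egress_tc_only(ToyPanels &panels, std::uint16_t index) {
  panels.tc.add(index, "th1_egress");
}

// Models an XDP cube looking for the next program: it can only see
// entries in the XDP table.
inline bool xdp_cube_finds_egress(const ToyPanels &panels,
                                  std::uint16_t index) {
  return panels.xdp.contains(index);
}

// Usage sketch: the egress program exists for TC but is invisible to XDP.
inline void example_tc_only() {
  ToyPanels panels;
  register_egress_tc_only(panels, 7);
  assert(panels.tc.contains(7));
  assert(!xdp_cube_finds_egress(panels, 7));
}
```

In this model the redirect issued by an XDP cube has no egress entry to jump to, which matches the symptom that the packet never reaches the interface.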

@sebymiano
Collaborator

I fear that the problem is more complicated than that.
The fact that the egress code of the transparent service (in XDP) uses the TC patch panel is correct, since XDP does not have support for egress (or, at least, it did not have that support when we implemented it).

I have the impression, although I am not 100% sure of it, that the send_packet_out function of the port does not see the egress code of the attached transparent service as the next module.
Maybe there is a case missing in this function.
Can you have a look at it?

@FedeParola
Collaborator Author

> I fear that the problem is more complicated than that.
> The fact that the egress code of the transparent service (in XDP) uses the TC patch panel is correct, since XDP does not have support for egress (or, at least, it did not have that support when we implemented it).

The egress code of XDP transparent services should be injected in both XDP and TC mode. If the service processes a packet received from the Linux networking stack, then the TC program will be executed, since it is the only one available (in the future, egress XDP could be used). On the other hand, if the packet is leaving an XDP cube (e.g. the router), then the XDP program should be executed, and at the moment this program is missing.

> I have the impression, although I am not 100% sure of it, that the send_packet_out function of the port does not see the egress code of the attached transparent service as the next module.
> Maybe there is a case missing in this function.
> Can you have a look at it?

I don't think the send_packet_out function has anything to do with it, since the problem appears without involving the slow path.

@FedeParola
Collaborator Author

I was able to solve the problem by replacing all TC operations for the egress program (compile, load, PatchPanel) with XDP ones, but this isn't a real solution, since my service would then no longer be able to handle egress packets coming from the networking stack.

@sebymiano
Collaborator

> > I fear that the problem is more complicated than that.
> > The fact that the egress code of the transparent service (in XDP) uses the TC patch panel is correct, since XDP does not have support for egress (or, at least, it did not have that support when we implemented it).
>
> The egress code of XDP transparent services should be injected in both XDP and TC mode. If the service processes a packet received from the Linux networking stack, then the TC program will be executed, since it is the only one available (in the future, egress XDP could be used). On the other hand, if the packet is leaving an XDP cube (e.g. the router), then the XDP program should be executed, and at the moment this program is missing.

That's correct, @FedeParola. Nice catch :)
I initially thought it was easier to send the packet back to the stack and then call the egress TC program, but this would lose all the advantages of using XDP.

> > I have the impression, although I am not 100% sure of it, that the send_packet_out function of the port does not see the egress code of the attached transparent service as the next module.
> > Maybe there is a case missing in this function.
> > Can you have a look at it?
>
> I don't think the send_packet_out function has anything to do with it, since the problem appears without involving the slow path.

You're right. I don't know why, but I was thinking that we use the slow path to generate the ARP Reply :)

@sebymiano
Collaborator

> I was able to solve the problem by replacing all TC operations for the egress program (compile, load, PatchPanel) with XDP ones, but this isn't a real solution, since my service would then no longer be able to handle egress packets coming from the networking stack.

Unfortunately, the only solution for now is to load the program in both modes.
The easiest way would be to use XDP_EGRESS; however, it has not been merged into the kernel yet (and I don't know whether it will be in the future).
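
As a rough illustration of "load the program in both modes", the sketch below registers one egress program under both hook types, so that it is reachable whether the packet comes from the networking stack (TC) or from an XDP cube. It is a simplified model under stated assumptions, not the actual patch: `Hook`, `EgressRegistry`, `load`, `load_both`, `reachable` and the index values are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical hook types for the two ways a packet can reach the
// egress side of a transparent service.
enum class Hook { TC, XDP };

// Hypothetical registry tracking at which indexes an egress program has
// been loaded for each hook type.
class EgressRegistry {
 public:
  // Registers the egress program at `index` for a single hook type.
  void load(Hook hook, std::uint16_t index) { table(hook).insert(index); }

  // Registers the same egress program for both hook types.
  void load_both(std::uint16_t index) {
    load(Hook::TC, index);
    load(Hook::XDP, index);
  }

  // True if a packet arriving through `source` can find the program.
  bool reachable(Hook source, std::uint16_t index) const {
    const std::set<std::uint16_t> &t = (source == Hook::TC) ? tc_ : xdp_;
    return t.count(index) > 0;
  }

 private:
  std::set<std::uint16_t> &table(Hook hook) {
    return (hook == Hook::TC) ? tc_ : xdp_;
  }
  std::set<std::uint16_t> tc_;
  std::set<std::uint16_t> xdp_;
};

// Usage sketch: after load_both, both paths can reach the program.
inline void example_load_both() {
  EgressRegistry registry;
  registry.load_both(3);
  assert(registry.reachable(Hook::TC, 3));   // packet from the stack
  assert(registry.reachable(Hook::XDP, 3));  // packet from an XDP cube
}
```

The cost of this approach in the model is simply a second registration per service; in the real system it would mean compiling and loading the egress code twice, once per mode.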

@FedeParola
Collaborator Author

Ok then, I can work on a patch for this
