build_grpc error and vip configuration #110

Closed
WagleTanvi opened this issue Jan 4, 2021 · 11 comments

Comments

@WagleTanvi

Hi, I am trying to install/build katran with Ubuntu 18.04 on a physical server using the latest commit of katran.

Katran installs successfully with build_katran.sh. However, when running ./build_grpc_client.sh, I get this error:

+ get_goclient_deps
+ pushd .
~/katran/example_grpc ~/katran/example_grpc
+ cd goclient/src/katranc/main
+ go get
# katranc/katranc
../katranc/katranc.go:179:2: undefined: ok
../katranc/katranc.go:179:6: undefined: err
../katranc/katranc.go:180:13: undefined: err
../katranc/katranc.go:181:5: undefined: ok
../katranc/katranc.go:335:13: kc.GetVipFlags undefined (type *KatranClient has no field or method GetVipFlags)
../katranc/katranc.go:340:36: cannot use real.Flags (type int32) as type uint32 in argument to parseRealFlags

After some trial and error, I found that when I check out an earlier commit such as 92313218fe81aa5cc112a87a7a9493200a66d8ee, the build succeeds.

With this build, though, I am having trouble getting katran to respond to the VIP. I set everything up according to the instructions in example.md. As an initial setup, I have two physical servers in my topology, each with one active link on the same subnet. One server runs katran and the second runs an Apache web server (the REAL server).

Configuring Katran with VIP and Real

cd ~/katran
./katran_goclient -A -t 10.200.200.1:80
./katran_goclient -a -t 10.200.200.1:80 -r IP_OF_REAL_WEB_SERVER
# From Katran Server. Curl to the Real Server works. 
curl IP_OF_REAL_WEB_SERVER 

On the Katran server (I tried from another server as well), curling the VIP I set up does not work. There is no output, as if Katran is not responding to the VIP.

curl 10.200.200.1 

Please advise. Thanks!

@udippant
Contributor

udippant commented Jan 5, 2021

Thanks for reporting this issue. It was introduced in commit f6e5cbc.

I have a candidate fix going through review internally (for that missing function GetVipFlags in goclient/src/katranc/katranc/katranc.go and undeclared variable). Should be addressed by tomorrow.

@udippant
Contributor

udippant commented Jan 5, 2021

For the second issue, what does the output of ./katran_goclient -l show? (This shows the list of configured services.)

Also, do you have decapsulation support on the IP_OF_REAL_WEB_SERVER? If you tcpdump on that host, do you see any incoming packets for the VIP? (Also, try tcpdump with the additional filter proto 4 to see IPIP-encapsulated packets.)

@WagleTanvi
Author

WagleTanvi commented Jan 5, 2021

Thank you @udippant.

Here is the output of ./katran_goclient -l (I masked part of the IP for security):

username@node-1:~/katran$ ./katran_goclient -l
2021/01/04 19:58:32 vips len 1
VIP:         10.200.200.1 Port:     80 Protocol: tcp
Vip's flags: 
 ->128.105.xxx.xxx  weight: 1
exiting

On the real server (where Apache is running), there is no output from tcpdump.

username@node-2:~$ sudo tcpdump -ni enp1s0f0 proto 4 or host 10.200.200.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp1s0f0, link-type EN10MB (Ethernet), capture size 262144 bytes

Also, on node-2 (the REAL), I executed the following, as described in example.md.

sudo ip link add name ipip0 type ipip external
sudo ip link add name ipip60 type ip6tnl external
sudo ip link set up dev ipip0
sudo ip link set up dev ipip60
sudo ip a a 127.0.0.42/32 dev ipip0

for sc in $(sysctl -a | awk '/\.rp_filter/ {print $1}'); do  echo $sc ; sudo sysctl ${sc}=0; done

sudo ip a a 10.200.200.1/32 dev lo

Finally, just to make sure katran is running properly, I ran os_run_tester.sh, which showed that all tests passed, except for the following messages at the end:

…
I0104 20:04:09.550020 33857 BpfTester.cpp:220] Test: QUIC: short header w/ conn id. host id = 0. CH. LRU hit      result: Passed
I0104 20:04:09.550057 33857 BpfTester.cpp:220] Test: UDP: big packet of length 1515. trigger PACKET TOOBIG        result: Passed
I0104 20:04:09.550065 33857 BpfTester.cpp:220] Test: QUIC: short header w/ connection id. CIDv2                   result: Passed
I0104 20:04:09.550073 33857 BpfTester.cpp:220] Test: QUIC: short header w/ connection id but non-existing mapping. CIDv2 result: Passed
I0104 20:04:09.550079 33857 katran_tester.cpp:270] Testing counter's sanity. Printing on errors only
I0104 20:04:09.550211 33857 katran_tester.cpp:338] Testing of counters is complete
E0104 20:04:09.550237 33857 KatranSimulator.cpp:168] src and dst must have same address family
E0104 20:04:09.550243 33857 KatranSimulator.cpp:161] malformed src or dst ip address. src: aaaa dst: bbbb
E0104 20:04:09.550249 33857 BpfLoader.cpp:97] Can't find prog with name: cls-hc
I0104 20:04:09.550256 33857 katran_tester.cpp:192] Healthchecking not enabled. Skipping HC related tests

@udippant
Contributor

udippant commented Jan 7, 2021

To see whether Katran received the packets, can you check some stats (such as goclient -s, -sum, -lru, etc.)? xdpdump is another quite useful tool for this.

Also, can you check the MAC address?

@WagleTanvi
Author

No output from goclient -s, -sum, or -lru (0 packets). xdpdump was not compiled by build_katran.sh; should it have been, or does it need to be built separately?

I am using the MAC address of the default gateway for the machine running katran (double-checked again).

On another note, when I start katran I see this netlink message. Is this normal, or is katran having a problem receiving traffic?

...
libbpf: elf: skipping relo section(26) .rel.eh_frame for section(25) .eh_frame
libbpf: elf: skipping unrecognized data section(16) .eh_frame
libbpf: elf: skipping relo section(17) .rel.eh_frame for section(16) .eh_frame
E0107 13:37:49.527056 22842 BpfAdapter.cpp:219] Error receiving netlink message: File exists [17]
Server listening on 0.0.0.0:50051

@udippant
Contributor

udippant commented Jan 8, 2021

It looks like katran didn't even receive the packet. Do you see XDP drops (e.g. with ethtool -S eth0 | grep xdp_drops) while running the curl command?
Regarding the xdpdump tool: yes, the build script doesn't build it by default. You'll need to build this target (https://github.com/facebookincubator/katran/blob/master/tools/xdpdump/CMakeLists.txt#L38). (As a quick workaround, I was able to build it locally by simply adding add_subdirectory(tools) here. I'll add an integration to build this tool from the build-katran script separately.)

That netlink message likely comes from adding cls-act on the network interface, so it shouldn't affect anything.

@WagleTanvi
Author

Thanks. I ran ethtool while executing curl 10.200.200.1. However, there seems to be no correlation between the xdp_drop counters and running curl: they increase slowly even when curl is not running.

username@node-0:~/katran$ ethtool -S eno49 | grep xdp_drop
     rx_xdp_drop: 9
     rx0_xdp_drop: 0
     rx1_xdp_drop: 0
     rx2_xdp_drop: 9
     rx3_xdp_drop: 0
     rx4_xdp_drop: 0
     rx5_xdp_drop: 0
     rx6_xdp_drop: 0
     rx7_xdp_drop: 0
     rx8_xdp_drop: 0
     rx9_xdp_drop: 0
     rx10_xdp_drop: 0
     rx11_xdp_drop: 0
     rx12_xdp_drop: 0
     rx13_xdp_drop: 0
     rx14_xdp_drop: 0
     rx15_xdp_drop: 0
     rx16_xdp_drop: 0
     rx17_xdp_drop: 0
     rx18_xdp_drop: 0
     rx19_xdp_drop: 0
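To separate background drops from drops caused by the request, one option is to read the aggregate counter immediately before and after the curl and compare that delta with the delta over an idle interval of similar length. The sketch below assumes the driver prints a line of the form "rx_xdp_drop: N" in ethtool -S, as in the output above (counter names differ between drivers); the helper names xdp_drop_total and xdp_drop_delta are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: assumes `ethtool -S <intf>` prints "rx_xdp_drop: N".
# Counter names vary between NIC drivers, so adjust the match if needed.

# Extract the aggregate rx_xdp_drop value from `ethtool -S` text on stdin.
# Per-queue lines such as "rx2_xdp_drop:" are ignored by the exact match.
xdp_drop_total() {
    awk '$1 == "rx_xdp_drop:" { print $2 }'
}

# Print how much the counter grew while the given command ran.
# Usage: xdp_drop_delta <interface> <command> [args...]
xdp_drop_delta() {
    local intf=$1
    shift
    local before after
    before=$(ethtool -S "$intf" | xdp_drop_total)
    "$@" >/dev/null 2>&1 || true   # the command may fail; we only want the counters
    after=$(ethtool -S "$intf" | xdp_drop_total)
    echo $(( after - before ))
}
```

For example, xdp_drop_delta eno49 curl -m 5 10.200.200.1 prints the growth during the request, and xdp_drop_delta eno49 sleep 5 gives an idle baseline to compare against.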

ip link output is as below:

@node-0:~/katran$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno49: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 98:f2:b3:c4:6b:60 brd ff:ff:ff:ff:ff:ff
    prog/xdp id 11
3: eno50: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether 98:f2:b3:c4:6b:61 brd ff:ff:ff:ff:ff:ff
4: ens1f0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether 9c:dc:71:5d:d5:b0 brd ff:ff:ff:ff:ff:ff
5: ens1f1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether 9c:dc:71:5d:d5:b1 brd ff:ff:ff:ff:ff:ff
6: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
7: ipip0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
8: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/tunnel6 :: brd ::
9: ipip60@NONE: <NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/tunnel6 :: brd ::

Full katran command

root     30694 30693  0 16:17 pts/0    00:00:00 ./build/example_grpc/katran_server_grpc -hc_forwarding=false -balancer_prog ./deps/bpfprog/bpf/balancer_kern.o -default_mac 44:31:92:b8:57:40 -healthchecker_prog ./deps/bpfprog/bpf/healthchecking_ipip.o -intf=eno49 -ipip_intf=ipip0 -ipip6_intf=ipip60 -lru_size=10000 -map_path /sys/fs/bpf/jmp_eno49 -prog_pos=2

Mac address of the default router:

@node-0:~/katran$ ip n show
128.110.nn.nn dev eno49 lladdr 44:31:92:b8:57:40 REACHABLE
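The comparison between -default_mac and the gateway's neighbor entry can also be scripted. This is a minimal sketch, assuming a single default route and a resolved neighbor entry for the gateway; default_gw, neigh_mac, and check_default_mac are hypothetical helper names, not part of katran.

```shell
#!/usr/bin/env bash
# Sketch only: assumes one default route and a resolved neighbor entry.

# Print the gateway IP from `ip route show default` text on stdin.
default_gw() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Print the lladdr of neighbor <ip> from `ip neigh show` text on stdin.
neigh_mac() {
    awk -v ip="$1" '$1 == ip {
        for (i = 2; i < NF; i++)
            if ($i == "lladdr") { print $(i + 1); exit }
    }'
}

# Compare the MAC passed to katran (-default_mac) with the gateway's MAC.
# Usage: check_default_mac <mac given to katran>
check_default_mac() {
    local expected gw actual
    expected=$(printf '%s' "$1" | tr 'A-F' 'a-f')   # ip prints lowercase hex
    gw=$(ip route show default | default_gw)
    actual=$(ip neigh show | neigh_mac "$gw")
    if [ "$expected" = "$actual" ]; then
        echo "match: $gw is at $actual"
    else
        echo "mismatch: $gw is at ${actual:-unknown}, katran was given $expected"
        return 1
    fi
}
```

With the values shown above, check_default_mac 44:31:92:b8:57:40 would be the call to make on node-0.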

I will try to get xdpdump running shortly but wanted to give you above info to see if you can spot any obvious issues.

@WagleTanvi
Author

I got xdpdump working. There was a linking issue; I had to remove https://github.com/facebookincubator/katran/blob/master/tools/xdpdump/CMakeLists.txt#L48 (not sure whether it was needed).

It seems katran is not advertising the VIP (10.200.200.1), so curling the VIP from the katran host or from other servers (nn1/nn2) doesn't produce any traffic.

When I curl the base IP (128.110.nn.nn) of the host where katran is running, I see the traffic in xdpdump. But curl to 10.200.200.1 doesn't show up in xdpdump.

@node-0:~$ sudo ./katran/_build/build/tools/xdpdump/xdpdump -map_path /sys/fs/bpf/jmp_eno49 -dport 80
src: 128.110.nn1.nn1 dst: 128.110.nn.nn
proto: 6 sport: 53026 dport: 80 pkt size: 74 chunk size: 74
src: 128.110.nn2.nn2 dst: 128.110.nn.nn
proto: 6 sport: 45876 dport: 80 pkt size: 74 chunk size: 74

Is there any way to check if katran is advertising the VIP?

@udippant
Contributor

udippant commented Jan 9, 2021

Katran itself does not advertise the VIP. That part is not open-sourced, and it depends heavily on the environment katran is running in.
In a typical setup:

  • Initialize katran and attach the XDP programs
  • Set up your backend servers for VIPs (and start health checks)
  • Update the state within Katran (e.g. via the KatranLb interface itself within the same process, or via RPC clients from a separate process)
  • Advertise VIPs (or withdraw them), for example with BGP
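For the example gRPC setup discussed in this thread, the "update the state" step can be sketched with the katran_goclient flags already used above (-A, -a, -t, -r, -l). The run and configure_vip helpers and the DRY_RUN variable are made up for this illustration; starting katran and advertising the VIP are left out because they depend on the environment.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the katran_goclient calls shown earlier in this thread.

# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Register a VIP, attach one real to it, then list the configured services.
# Usage: configure_vip <vip:port> <real_ip> [path to katran_goclient]
configure_vip() {
    local vip=$1 real=$2 client=${3:-./katran_goclient}
    run "$client" -A -t "$vip" &&
    run "$client" -a -t "$vip" -r "$real" &&
    run "$client" -l
}
```

Running DRY_RUN=1 configure_vip 10.200.200.1:80 IP_OF_REAL_WEB_SERVER prints the three commands without contacting the katran server, which is a cheap way to review them first.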

@WagleTanvi
Author

Ok sure. For now, I just defined a static route from my client to the katran VIP and everything seems to be working. Thanks.
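The exact route command is not given in the thread. A static route of this kind might look like the sketch below, assuming the client and the katran host share a subnet so that the katran host's interface IP is a valid next hop. The add_vip_route and run helpers, the DRY_RUN variable, and the 192.0.2.10 address in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: run on the client, as root. Assumes the katran host is
# directly reachable from the client (same L2 segment).

# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Route a single VIP via the katran host, then show the path the kernel picks.
# `ip route replace` is used so that re-running the function is harmless.
# Usage: add_vip_route <vip> <katran_host_ip>
add_vip_route() {
    local vip=$1 katran_ip=$2
    run ip route replace "${vip}/32" via "$katran_ip" &&
    run ip route get "$vip"
}
```

For example, add_vip_route 10.200.200.1 192.0.2.10 installs the route, and prefixing the call with DRY_RUN=1 only prints the two ip commands.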

facebook-github-bot pushed a commit that referenced this issue Jan 12, 2021
Summary:
This allows building of xdpdump via the central build-script. (follow up from
#110)

Reviewed By: avasylev

Differential Revision: D25877843

fbshipit-source-id: 7977e88a2251b4d0b98a23e90f521992c8c9bc08
@bienkma

bienkma commented Dec 1, 2021

./katran/_build/build/tools/xdpdump/xdpdump

Hi @WagleTanvi, how did you build xdpdump? Could you please explain here? I need the tool for debugging on my server. Thank you!

facebook-github-bot pushed a commit that referenced this issue Feb 16, 2022
Summary:
Pull Request resolved: facebook/sapling#110

Pull Request resolved: facebookexperimental/rust-shed#27

Make it so that changes to rust-shed or other common rust source are used locally vendored, so they don't need to be pushed to github before they are visible in a build.

There was already some support for cargo vendoring in getdeps, but it was limited to dependencies between manifests built with cargo builder.  This wasn't enough to build something like eden (cmake is main entry point, with later calls to cargo) or eden_scm (make is main entry point, with later calls to cargo), so this diff adds a cargo prepare step for getdeps' other primary build systems.

The cargo vendoring is done by using a cargo config file to point to the source files used by getdeps.  It has two modes:

1. per crate, existing mode which is already automatic for cargo to cargo manifest dependencies.  To use it for a non cargo build manifest, add crate.pathmap
2. per git url, existing mode which was only used for crates.io third-party crates, now can be enabled by setting cargo.cargo_config_file

Reviewed By: yancouto

Differential Revision: D33895469

fbshipit-source-id: 7b13c0b679532492a336ce217de875c25fe1be90