Greetings!
I have deployed an OpenZiti environment for data migration into AWS. One of the challenges we would like to overcome is making the system fully HA in our use case. We are using routers at each site and treating them as LAN router implementations, because we have storage appliances at both ends that cannot use the OpenZiti SDK (we have little control over the vendor). In my current implementation, I install 4 router instances in AWS and use entries in the VPC routing table to load balance between them, creating a separate route for each on-prem node of the storage cluster. These routes are manually distributed among the 4 router instances in AWS when the stack is deployed by either CloudFormation or Terraform. This is something that I would love to change, using something like an AWS Gateway Load Balancer. However, GWLB requires that whichever "virtual appliances" it balances across use the Geneve protocol. I searched this site for GENEVE and came up empty. Is this something that is on your roadmap, or could be?
Hi @greggw01,
Welcome to the community! Thank you for trying out OpenZiti.
We have developed an ebpf program to proxy the geneve tunnel in the AWS GLB scenario. It is documented under our NetFoundry docs, but it can be used with OpenZiti as well. In short, we attach this ebpf program at the tc ingress level on the main interface. The program strips the geneve header to expose the client IP packets, which in turn are forwarded directly to the tproxy sockets initiated by the OpenZiti router. In this first release, the return packets are forwarded directly back to the client, so the GLB needs to be in the same VPC as the back-end VMs (i.e. the OpenZiti routers). Link to the doc: https://support.netfoundry.io/hc/en-us/articles/14206309681421-AWS-Cloud-Ingress-High-Availability
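If it helps to sanity-check the setup: Geneve is UDP-encapsulated on port 6081, so you can watch encapsulated GLB traffic arriving on a router instance before the ebpf program strips it. The interface name ens5 here is just an example; substitute your own.

```shell
# Geneve uses UDP port 6081 (the encapsulation AWS GWLB requires).
# Watch the main interface for encapsulated packets from the GLB:
sudo tcpdump -nni ens5 udp port 6081
```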
FYI, the steps were written for use with our AWS image. That image is optimized for CloudZiti, but this does not mean you can't manage it with OpenZiti. Here is the README located in the ebpf module repo: zfw/README.md at main · netfoundry/zfw · GitHub.
If you look at the router listener section in the OpenZiti docs (Router Configuration Reference | OpenZiti), there is this tunneler section:
listeners:
  - binding: tunnel
    options:
      mode: tproxy
      resolver: udp://127.0.0.1:53
      dnsSvcIpRange: 100.80.0.0/12 # optionally customize the dynamic IP range used by Ziti DNS
Once you install zfw using the Debian package, the user-space binary will be installed at /opt/openziti/bin/zfw, along with the ebpf bytecode and other helper scripts. Then the configuration shown above (tproxy with iptables) needs to be updated to use the ebpf bytecode instead of iptables to filter the incoming packets:
listeners:
  - binding: tunnel
    options:
      mode: tproxy:/opt/openziti/bin/zfw
      resolver: udp://127.0.0.1:53
      dnsSvcIpRange: 100.80.0.0/12 # optionally customize the dynamic IP range used by Ziti DNS
The caveat here is that once you restart the router with this config, it will look for ebpf maps created by the kernel ebpf bytecode, which is also located in the same bin directory, i.e. /opt/openziti/bin/zfw_tc_ingress.o. You will need to attach this bytecode to the main interface beforehand, and then manage it using something like the tc filter add command.
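As a rough sketch, attaching the bytecode manually with iproute2 looks something like the following. This is not the zfw docs verbatim; the interface name and the object section name ("action") are assumptions on my part, so check the zfw README for the exact invocation it uses.

```shell
# Add a clsact qdisc so tc filters can hook ingress on the main interface
# (ens5 here; substitute your own):
sudo tc qdisc add dev ens5 clsact
# Load the ebpf object as a direct-action ingress filter
# (section name "action" is an assumption; see the zfw README):
sudo tc filter add dev ens5 ingress bpf da obj /opt/openziti/bin/zfw_tc_ingress.o sec action
# Verify the filter is attached:
sudo tc filter show dev ens5 ingress
```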
Hopefully, this provides more color.
Hi Dariusz,
Thanks for your help. I am trying to get things working on Amazon Linux 2023. It appears that the ufw install went well, and the zfw compile worked.
The compile looked like this:
[root@ip-10-213-0-205 zfw]# cd src
[root@ip-10-213-0-205 src]# make all
clang -D BPF_MAX_ENTRIES=100000 -O1 -lbpf -o zfw zfw.c -I/usr/include/aarch64-linux-gnu/
clang -D BPF_MAX_ENTRIES=100000 -g -O2 -Wall -Wextra -target bpf -c zfw_tc_ingress.c -o zfw_tc_ingress.o -I/usr/include/aarch64-linux-gnu/
clang -O2 -g -Wall -target bpf -c zfw_xdp_tun_ingress.c -o zfw_xdp_tun_ingress.o -I/usr/include/aarch64-linux-gnu/
clang -g -O2 -Wall -Wextra -target bpf -c -o zfw_tc_outbound_track.o zfw_tc_outbound_track.c -I/usr/include/aarch64-linux-gnu/
clang -o zfw_tunnwrapper zfw_tunnel_wrapper.c -l json-c
[root@ip-10-213-0-205 src]#
[root@ip-10-213-0-205 src]# ./install.sh router
[root@ip-10-213-0-205 src]#
In order to make the install work, I had to make a directory and symlink for my router config file, so it could be found in /opt/openziti/ziti-router/config.yml.
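For anyone else hitting this, the workaround amounts to something like the following. The source path of the config file is hypothetical; point it at wherever your router config actually lives.

```shell
# install.sh expects the router config at /opt/openziti/ziti-router/config.yml
sudo mkdir -p /opt/openziti/ziti-router
sudo ln -s /home/greggw01/router/config.yml /opt/openziti/ziti-router/config.yml
```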
However, I am running into this issue when I attempt to run:
[root@ip-10-213-0-205 etc]# /opt/openziti/bin/start_ebpf_router.py
File already exist: /opt/openziti/etc/ebpf_config.json
Attempting to add ebpf ingress to: ens5
Ebpf not running no maps to clear
tc parent add : ens5
BTF debug data section '.BTF' rejected: Invalid argument (22)!
- Length: 20674
Verifier analysis:
Skipped 4691 bytes, use 'verb' option for the full verbose log.
[...]
on) size=40 vlen=5
type type_id=17 bits_offset=0
key_size type_id=59 bits_offset=64
value_size type_id=53 bits_offset=128
max_entries type_id=55 bits_offset=192
pinning type_id=1 bits_offset=256
[64] VAR tun_map type_id=63 linkage=1
[65] PTR (anon) type_id=66
[66] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=27
[67] PTR (anon) type_id=68
[68] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=262144
[69] STRUCT (anon) size=24 vlen=3
type type_id=65 bits_offset=0
max_entries type_id=67 bits_offset=64
pinning type_id=1 bits_offset=128
[70] VAR rb_map type_id=69 linkage=1
[71] PTR (anon) type_id=72
[72] STRUCT __sk_buff size=192 vlen=34
len type_id=73 bits_offset=0
pkt_type type_id=73 bits_offset=32
mark type_id=73 bits_offset=64
queue_mapping type_id=73 bits_offset=96
protocol type_id=73 bits_offset=128
vlan_present type_id=73 bits_offset=160
vlan_tci type_id=73 bits_offset=192
vlan_proto type_id=73 bits_offset=224
priority type_id=73 bits_offset=256
ingress_ifindex type_id=73 bits_offset=288
ifindex type_id=73 bits_offset=320
tc_index type_id=73 bits_offset=352
cb type_id=75 bits_offset=384
hash type_id=73 bits_offset=544
tc_classid type_id=73 bits_offset=576
data type_id=73 bits_offset=608
data_end type_id=73 bits_offset=640
napi_id type_id=73 bits_offset=672
family type_id=73 bits_offset=704
remote_ip4 type_id=73 bits_offset=736
local_ip4 type_id=73 bits_offset=768
remote_ip6 type_id=76 bits_offset=800
local_ip6 type_id=76 bits_offset=928
remote_port type_id=73 bits_offset=1056
local_port type_id=73 bits_offset=1088
data_meta type_id=73 bits_offset=1120
(anon) type_id=77 bits_offset=1152
tstamp type_id=79 bits_offset=1216
wire_len type_id=73 bits_offset=1280
gso_segs type_id=73 bits_offset=1312
(anon) type_id=81 bits_offset=1344
gso_size type_id=73 bits_offset=1408
tstamp_type type_id=83 bits_offset=1440
hwtstamp type_id=79 bits_offset=1472
[73] TYPEDEF __u32 type_id=74
[74] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
[75] ARRAY (anon) type_id=73 index_type_id=4 nr_elems=5
[76] ARRAY (anon) type_id=73 index_type_id=4 nr_elems=4
[77] UNION (anon) size=8 vlen=1
flow_keys type_id=78 bits_offset=0
[78] PTR (anon) type_id=113
[79] TYPEDEF __u64 type_id=80
[80] INT unsigned long long size=8 bits_offset=0 nr_bits=64 encoding=(none)
[81] UNION (anon) size=8 vlen=1
sk type_id=82 bits_offset=0
[82] PTR (anon) type_id=112
[83] TYPEDEF __u8 type_id=84
[84] INT unsigned char size=1 bits_offset=0 nr_bits=8 encoding=(none)
[85] FUNC_PROTO (anon) return=2 args=(71 skb)
[86] FUNC bpf_sk_splice type_id=85
[87] PTR (anon) type_id=88
[88] STRUCT bpf_event size=48 vlen=14
tstamp type_id=80 bits_offset=0
ifindex type_id=73 bits_offset=64
tun_ifindex type_id=73 bits_offset=96
daddr type_id=73 bits_offset=128
saddr type_id=73 bits_offset=160
sport type_id=89 bits_offset=192
dport type_id=89 bits_offset=208
tport type_id=89 bits_offset=224
proto type_id=83 bits_offset=240
direction type_id=83 bits_offset=248
error_code type_id=83 bits_offset=256
tracking_code type_id=83 bits_offset=264
source type_id=91 bits_offset=272
dest type_id=91 bits_offset=320
[89] TYPEDEF __u16 type_id=90
[90] INT unsigned short size=2 bits_offset=0 nr_bits=16 encoding=(none)
[91] ARRAY (anon) type_id=84 index_type_id=4 nr_elems=6
[92] FUNC_PROTO (anon) return=0 args=(87 new_event)
[93] FUNC send_event type_id=92
[94] FUNC_PROTO (anon) return=2 args=(71 skb)
[95] FUNC bpf_sk_splice1 type_id=94
[96] FUNC_PROTO (anon) return=2 args=(71 skb)
[97] FUNC bpf_sk_splice2 type_id=96
[98] FUNC_PROTO (anon) return=2 args=(71 skb)
[99] FUNC bpf_sk_splice3 type_id=98
[100] FUNC_PROTO (anon) return=2 args=(71 skb)
[101] FUNC bpf_sk_splice4 type_id=100
[102] FUNC_PROTO (anon) return=2 args=(71 skb)
[103] FUNC bpf_sk_splice5 type_id=102
[104] CONST (anon) type_id=105
[105] INT char size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
[106] ARRAY (anon) type_id=104 index_type_id=4 nr_elems=13
[107] VAR __license type_id=106 linkage=1
[108] VAR ifindex type_id=74 linkage=1
[109] DATASEC .bss size=0 vlen=1 size == 0
ELF contains non-{map,call} related relo data in entry 0 pointing to section 18! Compiler bug?!
Error fetching program/map!
Unable to load program
tc ingress filter action/0 not set : ens5
Cant attach ens5 to tc ingress with /opt/openziti/bin/zfw_tc_ingress.o
Not enough privileges or ebpf not enabled!
Run as "sudo" with ingress tc filter [filter -X, --set-tc-filter] set on at least one interface
Not enough privileges or ebpf not enabled!
Run as "sudo" with ingress tc filter [filter -X, --set-tc-filter] set on at least one interface
Not enough privileges or ebpf not enabled!
Run as "sudo" with ingress tc filter [filter -X, --set-tc-filter] set on at least one interface
Not enough privileges or ebpf not enabled!
Run as "sudo" with ingress tc filter [filter -X, --set-tc-filter] set on at least one interface
Skipping ziti-router.service conversion. File does not exist or is already converted to run ebpf!
Hi Gregg,
I assume you compiled zfw and the bpf bytecode on Amazon Linux 2023. Based on this article, Amazon Linux 2023: A Comprehensive Overview of New Features and Updates, it sounds like it is a mix of Fedora 34, 35, 36 and CentOS 9 Stream. What is the kernel version?
FYI, I was able to run it successfully in ubi8, but needed to install these packages to compile:
- OS/Platform: ubi8
yum install libbpf clang gcc glibc-devel.i686 json-c \
https://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/libzstd-devel-1.4.4-1.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/zlib-devel-1.2.11-25.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/elfutils-libelf-devel-0.189-3.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/PowerTools/x86_64/os/Packages/libbpf-devel-0.5.0-1.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/PowerTools/x86_64/os/Packages/bcc-devel-0.25.0-5.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/AppStream/x86_64/os/Packages/bcc-0.25.0-5.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/AppStream/x86_64/os/Packages/llvm-16.0.6-3.module_el8+605+fdf31679.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/AppStream/x86_64/os/Packages/llvm-libs-16.0.6-3.module_el8+605+fdf31679.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/iproute-tc-6.2.0-5.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/PowerTools/x86_64/os/Packages/iproute-devel-6.2.0-5.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/iproute-6.2.0-5.el8.x86_64.rpm \
https://rpmfind.net/linux/centos/8-stream/AppStream/x86_64/os/Packages/json-c-devel-0.13.1-3.el8.x86_64.rpm
I have not started testing on ubi9 yet.
Hi Dariusz,
I forgot to mention, I am running on ARM64, and not x86_64. My kernel version is:
[greggw01@ip-10-213-0-70 ~]$ uname -a
Linux ip-10-213-0-70.us-east-2.compute.internal 6.1.79-99.164.amzn2023.aarch64 #1 SMP Tue Feb 27 18:02:03 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
-Gregg
OK, the kernel is v6, but iproute is at v5.10. It looks like that version of iproute does not work well with the BTF-defined BPF maps that we are using in our bytecode. The Ubuntu version that we use also has a v6 kernel, but iproute is at v5.15. I'm not sure why Amazon images with a v6 kernel would not have a newer version of iproute. If you want to use this image, you would need to update iproute to at least v5.15.
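A quick way to check both versions on an instance before deciding (tc prints the iproute2 version with -V):

```shell
# Kernel version (should be 6.x on Amazon Linux 2023):
uname -r
# iproute2 version; BTF-defined BPF maps want roughly v5.15 or newer:
tc -V 2>/dev/null || echo "tc (iproute2) not installed"
```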
I tested the Amazon Linux 2023 build and interface attachment of the bytecode successfully with iproute-6.0.0-2.el9.aarch64 after removing v5.10. I have not tested the data-plane forwarding, though.
But removing v5.10 also removes some dependencies, cloud-init among other things, so you would be losing something that you may need. Perhaps this is one of the reasons they use v5.10.
sudo yum remove iproute
Dependencies resolved.
=====================================================================================================================================================================================================================================
Package Architecture Version Repository Size
=====================================================================================================================================================================================================================================
Removing:
iproute aarch64 5.10.0-2.amzn2023.0.5 @System 3.3 M
Removing dependent packages:
amazon-chrony-config noarch 4.3-1.amzn2023.0.4 @System 5.9 k
amazon-ec2-net-utils noarch 2.4.1-1.amzn2023.0.1 @System 19 k
cloud-init noarch 22.2.2-1.amzn2023.1.12 @System 5.6 M
cloud-init-cfg-ec2 noarch 22.2.2-1.amzn2023.1.12 @System 237
Transaction Summary
=====================================================================================================================================================================================================================================
Remove 5 Packages