Add cilium as a CNI #33

Merged · 8 commits · Apr 11, 2024
4 changes: 4 additions & 0 deletions .vsls.json
@@ -0,0 +1,4 @@
{
  "$schema": "http://json.schemastore.org/vsls",
  "gitignore": "none"
}
7 changes: 7 additions & 0 deletions REQUIREMENTS.md
@@ -0,0 +1,7 @@
# Cluster node requirements

As per [Rancher's Installation Requirements](https://ranchermanager.docs.rancher.com/pages-for-subheaders/installation-requirements#operating-systems-and-container-runtime-requirements), all nodes need the `ntp` package installed, and `firewalld` must be stopped and disabled (it might be possible to keep `firewalld` running if it is configured correctly). Since we'll be using `RKE2`, neither `docker` nor `containerd` needs to be installed.

## Port requirements

See [Rancher's port requirements](https://ranchermanager.docs.rancher.com/getting-started/installation-and-upgrade/installation-requirements/port-requirements) for the ports that must be open between nodes.
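
For reference, a minimal sketch of the manual prep these requirements imply on a single node (the package manager and package name are assumptions; newer distributions may ship `chrony` instead of `ntp` — the Ansible playbooks in this PR automate the `firewalld` part):

```sh
# install time sync and make sure firewalld is stopped and disabled
sudo dnf install -y ntp
sudo systemctl disable --now firewalld
```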
13 changes: 13 additions & 0 deletions Vagrantfile
@@ -2,13 +2,24 @@ require 'yaml'

config = File.exist?('local-dev-cluster.yaml') ? YAML.load_file('local-dev-cluster.yaml') : YAML.load_file('dev-cluster.yaml')
$cluster_vm_ram = config["cluster"]["node"]["ram"]
$cluster_vm_cpus = config["cluster"]["node"]["cpu"] || 2
num_of_nodes = config["cluster"]["nodeCount"]
$router_ram = config["router"]["ram"]
$router_cpus = config["router"]["cpu"] || 1
router_count = config["router"]["count"] || 1
$bridge_interface = config["networking"] == nil ? nil : config["networking"]["bridgeInterface"]
$host_only = config["networking"] == nil ? false : (config["networking"]["hostOnly"] || false)
$ip = 2 # start at 2 because the virtualbox adapter reserves 10.10.0.1 for the host

def configure_cpus(vm, cpus)
  vm.vm.provider "virtualbox" do |v|
    v.cpus = cpus
  end
  vm.vm.provider :libvirt do |l|
    l.cpus = cpus
  end
end

def configure_ram(vm, ram)
  vm.vm.provider "virtualbox" do |v|
    v.memory = ram
@@ -58,6 +69,7 @@ def configure_router(i, config)
      # expose the router to your network
    end

    configure_cpus(router, $router_cpus)
    configure_ram(router, $router_ram)
    configure_private_network(router, true)
    router.vm.provision "shell" do |s|
@@ -81,6 +93,7 @@ def configure_cluster_node(i, config)
      s.inline = "hostnamectl set-hostname $1"
      s.args = ["cluster"+i.to_s]
    end
    configure_cpus(clustervm, $cluster_vm_cpus)
    configure_ram(clustervm, $cluster_vm_ram)
    configure_private_network(clustervm, false)
    clustervm.ssh.username = "ni"
2 changes: 2 additions & 0 deletions ansible.cfg
@@ -0,0 +1,2 @@
[defaults]
host_key_checking = False
6 changes: 4 additions & 2 deletions deploy-playbook.yaml
@@ -1,8 +1,6 @@
 ---
 - name: Create a new SSH key for clusters and routers
   ansible.builtin.import_playbook: networking/add-ssh-key-to-nodes-playbook.yaml
-- name: Accept ssh keys for the first time
-  ansible.builtin.import_playbook: networking/accept-ssh-keys-playbook.yaml
 - name: Pre-setup - get correct interfaces
   ansible.builtin.import_playbook: networking/get-interface-playbook.yaml
 - name: Networking - Router BGP
@@ -11,3 +9,7 @@
   ansible.builtin.import_playbook: networking/router-vrrp-playbook.yaml
 - name: Networking - Router Controlplane HA
   ansible.builtin.import_playbook: networking/controlplane-ha-playbook.yaml
+- name: Node - Pre-setup K8S
+  ansible.builtin.import_playbook: k8s/node-k8s-pre-setup-playbook.yaml
+- name: Setup K8s on control-plane
+  ansible.builtin.import_playbook: k8s/rke-first-setup-playbook.yaml
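
With the two new K8s playbooks chained at the end, one run now takes the cluster from bare VMs to a bootstrapped RKE2 control plane; a sketch, assuming an inventory file named `inventory.ini` like the one sketched further below:

```sh
ansible-playbook -i inventory.ini deploy-playbook.yaml
```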
2 changes: 2 additions & 0 deletions dev-cluster.yaml
@@ -1,9 +1,11 @@
cluster:
  node:
    ram: 4098
    cpu: 2
  nodeCount: 3
router:
  ram: 512
  cpu: 1
networking:
  # specific networking options, only uncomment this if you need a specific setting below
  # If you are using LibVirt, you need to specify an interface to bridge from (can't be a wireless interface)
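
Because the Vagrantfile prefers `local-dev-cluster.yaml` when it exists, the new `cpu` knobs can be overridden per machine without touching this file; a sketch with illustrative values:

```sh
cat > local-dev-cluster.yaml <<'EOF'
cluster:
  node:
    ram: 8192   # give each node more headroom than the 4098 MB default
    cpu: 4
  nodeCount: 3
router:
  ram: 512
  cpu: 1
EOF
```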
2 changes: 1 addition & 1 deletion dev/node-networking.sh
@@ -1,5 +1,5 @@
 #!/bin/sh
-sudo ip r del 0.0.0.0
+sudo ip r del 0.0.0.0/0 dev ens5
 sudo nmcli device modify ens5 ipv4.never-default yes
 sudo nmcli con add type ethernet con-name main-network ifname ens6 ip4 10.10.0.$1/24 \
   gw4 10.10.0.254
67 changes: 39 additions & 28 deletions dev/setup-kind-cluster.sh
@@ -2,11 +2,15 @@

KIND_EXECUTABLE=kind
KUBECTL_EXECUTABLE=kubectl
CILIUM_EXECUTABLE=cilium-cli
HELM_EXECUTABLE=helm

# first check that all required executables exist

command -v $KIND_EXECUTABLE >/dev/null 2>&1 || { echo >&2 "I require '$KIND_EXECUTABLE' but it's not installed. Aborting."; exit 1; }
command -v $KUBECTL_EXECUTABLE >/dev/null 2>&1 || { echo >&2 "I require '$KUBECTL_EXECUTABLE' but it's not installed. Aborting."; exit 1; }
command -v $CILIUM_EXECUTABLE >/dev/null 2>&1 || { echo >&2 "I require '$CILIUM_EXECUTABLE' but it's not installed. Aborting."; exit 1; }
command -v $HELM_EXECUTABLE >/dev/null 2>&1 || { echo >&2 "I require '$HELM_EXECUTABLE' but it's not installed. Aborting."; exit 1; }

# Create "kind" network, deleting any old ones if they exist

@@ -29,33 +33,40 @@ docker network create "$KIND_NETWORK_NAME" \

$KIND_EXECUTABLE create cluster --config "$(dirname "$0")"/test-cluster.kind.yaml

-# install MetalLB so services are assigned an IP address on creation.
-
-$KUBECTL_EXECUTABLE apply -f https://github.com/metallb/metallb/raw/main/config/manifests/metallb-native.yaml
-
-$KUBECTL_EXECUTABLE wait --namespace metallb-system \
-  --for=condition=ready pod \
-  --selector=app=metallb \
-  --timeout=120s
-
-echo "\
-apiVersion: metallb.io/v1beta1
-kind: IPAddressPool
-metadata:
-  name: example
-  namespace: metallb-system
-spec:
-  addresses:
-  - 172.28.255.200-172.28.255.250
----
-apiVersion: metallb.io/v1beta1
-kind: L2Advertisement
-metadata:
-  name: empty
-  namespace: metallb-system" | \
-$KUBECTL_EXECUTABLE apply -f -
+# deploy cilium
+
+$HELM_EXECUTABLE repo add cilium https://helm.cilium.io
+$HELM_EXECUTABLE repo update
+$HELM_EXECUTABLE upgrade --install cilium cilium/cilium \
+  --version 1.15.3 \
+  --namespace kube-system \
+  --values "$(dirname "$0")"/../services/cilium/values.yaml \
+  --set k8sServiceHost=niployments-test-cluster-external-load-balancer \
+  --set k8sServicePort=6443 \
+  --set bgpControlPlane.enabled=false \
+  --set l2announcements.enabled=true \
+  --set ipam.mode=kubernetes \
+  --set ipv4NativeRoutingCIDR=172.28.0.0/16 \
+  --set enableIPv4Masquerade=true \
+  --set autoDirectNodeRoutes=true \
+  --set routingMode=native
+
+$CILIUM_EXECUTABLE status --wait
+
+$KUBECTL_EXECUTABLE apply -f "$(dirname "$0")"/../services/cilium/load-balancer-pool-dev.yaml
+
+cat <<EOF | $KUBECTL_EXECUTABLE apply -f -
+apiVersion: "cilium.io/v2alpha1"
+kind: CiliumL2AnnouncementPolicy
+metadata:
+  name: default
+spec:
+  nodeSelector:
+    matchExpressions:
+      - key: node-role.kubernetes.io/control-plane
+        operator: DoesNotExist
+  interfaces:
+    - ^eth[0-9]+
+  externalIPs: true
+  loadBalancerIPs: true
+EOF
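
Once the script finishes, the CNI can be sanity-checked with the tools the script already requires; a sketch:

```sh
# every node should run a Ready cilium agent pod, and the e2e suite should pass
kubectl -n kube-system get pods -l k8s-app=cilium
cilium-cli status --wait
cilium-cli connectivity test
```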
3 changes: 3 additions & 0 deletions dev/test-cluster.kind.yaml
@@ -1,6 +1,9 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: niployments-test-cluster
networking:
  disableDefaultCNI: true # do not install kindnet
  kubeProxyMode: none # do not run kube-proxy
nodes:
  - role: control-plane
  - role: control-plane
25 changes: 25 additions & 0 deletions k8s/node-k8s-pre-setup-playbook.yaml
@@ -0,0 +1,25 @@
- name: Node k8s pre-setup
  hosts: nodes
  tasks:
    - name: NetworkManager exclude CNI interfaces
      become: true
      ansible.builtin.copy:
        src: templates/rke.conf
        dest: /etc/NetworkManager/conf.d/rke.conf
        mode: "644"
    - name: Reload NetworkManager
      become: true
      ansible.builtin.systemd:
        name: NetworkManager
        state: reloaded
    - name: Stop and disable firewalld
      become: true
      ansible.builtin.systemd:
        name: firewalld
        state: stopped
        enabled: false
    - name: Install RKE2 # noqa: command-instead-of-module
      become: true
      ansible.builtin.shell:
        cmd: set -o pipefail && curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="{{ "server" if "controlplane" in group_names else "agent" }}" sh -
        creates: /usr/bin/rke2
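
These plays address the `nodes`, `controlplane`, and `workers` groups; a sketch of a matching inventory (hostnames, IPs, and the `ni` user are assumptions derived from the Vagrantfile above):

```sh
cat > inventory.ini <<'EOF'
[controlplane]
cluster0 ansible_host=10.10.0.2 ansible_user=ni

[workers]
cluster1 ansible_host=10.10.0.3 ansible_user=ni
cluster2 ansible_host=10.10.0.4 ansible_user=ni

[nodes:children]
controlplane
workers
EOF
```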
68 changes: 68 additions & 0 deletions k8s/rke-first-setup-playbook.yaml
@@ -0,0 +1,68 @@
- name: Setup RKE on the first control-plane node
  hosts: controlplane[0]
  tasks:
    - name: Create the rke2 config directory
      become: true
      ansible.builtin.file:
        path: /etc/rancher/rke2
        state: directory
        mode: "755"
        recurse: true
    - name: Copy first server node yaml to the correct destination
      become: true
      ansible.builtin.copy:
        src: templates/config-rke-first.yaml
        dest: /etc/rancher/rke2/config.yaml
        mode: "644"
    - name: Start and Enable RKE2
      become: true
      ansible.builtin.systemd:
        name: rke2-server
        state: started
        enabled: true
    - name: Wait for RKE2 to start
      become: true
      ansible.builtin.wait_for:
        port: 6443
        delay: 5
        timeout: 300
    - name: Extract RKE2 token
      become: true
      ansible.builtin.command: cat /var/lib/rancher/rke2/server/node-token
      changed_when: false # reading the token never changes the node
      register: rke2_token
    - name: Set RKE2 token fact
      ansible.builtin.set_fact:
        cluster_token: "{{ rke2_token.stdout }}"

- name: Setup RKE on the remaining nodes
  hosts: controlplane[1:]:workers
  tasks:
    - name: Create the rke2 config directory
      become: true
      ansible.builtin.file:
        path: /etc/rancher/rke2
        state: directory
        mode: "755"
        recurse: true
    - name: Copy server node yaml to the correct destination
      become: true
      ansible.builtin.template:
        src: templates/config-rke-additional.j2
        dest: /etc/rancher/rke2/config.yaml
        mode: "644"
    - name: Start and Enable RKE2
      become: true
      timeout: 300
      throttle: 1
      retries: 2
      ansible.builtin.systemd:
        name: "{{ 'rke2-server' if inventory_hostname in groups['controlplane'] else 'rke2-agent' }}"
        state: started
        enabled: true
    - name: Wait for RKE2 to start
      become: true
      ansible.builtin.wait_for:
        port: 6443
        delay: 5
        timeout: 300
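
After both plays converge, the cluster can be inspected from the first control-plane node using the defaults RKE2 writes; a sketch (both paths are RKE2's standard locations):

```sh
# rke2-server drops a kubeconfig and a bundled kubectl on startup
sudo /var/lib/rancher/rke2/bin/kubectl \
  --kubeconfig /etc/rancher/rke2/rke2.yaml \
  get nodes -o wide
```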
7 changes: 7 additions & 0 deletions k8s/templates/config-rke-additional.j2
@@ -0,0 +1,7 @@
token: {{ hostvars[groups['controlplane'][0]]['cluster_token'] }}
server: https://10.11.11.1:9345
selinux: true
tls-san:
- 10.11.11.1
cni: none
disable-kube-proxy: "true"
5 changes: 5 additions & 0 deletions k8s/templates/config-rke-first.yaml
@@ -0,0 +1,5 @@
selinux: true
tls-san:
- 10.11.11.1
cni: none
disable-kube-proxy: "true"
2 changes: 2 additions & 0 deletions k8s/templates/rke.conf
@@ -0,0 +1,2 @@
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:flannel*
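
Since this PR swaps the CNI to Cilium, the same pattern likely needs to cover Cilium-managed interfaces too; a hedged sketch (the extra patterns follow Cilium's NetworkManager guidance and are not verified against this cluster):

```sh
cat <<'EOF' | sudo tee /etc/NetworkManager/conf.d/rke.conf
[keyfile]
# keep NetworkManager's hands off CNI-managed interfaces
unmanaged-devices=interface-name:cali*;interface-name:flannel*;interface-name:cilium*;interface-name:lxc*
EOF
sudo systemctl reload NetworkManager
```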
10 changes: 0 additions & 10 deletions networking/accept-ssh-keys-playbook.yaml

This file was deleted.

1 change: 1 addition & 0 deletions networking/router-bgp-playbook.yaml
@@ -15,6 +15,7 @@
      become: true
      ansible.builtin.apt:
        name: bird2
        update_cache: true
    - name: Configure bird2
      become: true
      ansible.builtin.template:
2 changes: 1 addition & 1 deletion networking/templates/router-bird.conf.j2
@@ -1,5 +1,5 @@
 define myas = 65512;
-router id 10.11.11.1;
+router id 10.10.0.254;

 protocol device {
   scan time 10;
25 changes: 25 additions & 0 deletions networking/templates/router-haproxy.cfg.j2
@@ -49,4 +49,29 @@ backend apiserver
    server {{nodename}} {{ hostvars[nodename]["ansible_"~hostvars[nodename]["ansible_facts"]["target_interface"]]['ipv4']['address'] }}:6443 check
{% endfor %}

#---------------------------------------------------------------------
# RKE2 supervisor server frontend which proxies to the control plane nodes
#---------------------------------------------------------------------
frontend supervisorserver
    bind 10.11.11.1:9345
    mode tcp
    option tcplog
    default_backend supervisorserver

#---------------------------------------------------------------------
# round robin balancing for RKE2 supervisor
#---------------------------------------------------------------------
backend supervisorserver
    option httpchk GET /cacerts
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance roundrobin
{% for nodename in groups["controlplane"]%}
    server {{nodename}} {{ hostvars[nodename]["ansible_"~hostvars[nodename]["ansible_facts"]["target_interface"]]['ipv4']['address'] }}:9345 check
{% endfor %}
12 changes: 12 additions & 0 deletions services/cilium/bgp-peering-policy.yaml
@@ -0,0 +1,12 @@
---
apiVersion: "cilium.io/v2alpha1"
kind: CiliumBGPPeeringPolicy
metadata:
  name: niployments-bgp
spec:
  virtualRouters:
    - localASN: 65512
      exportPodCIDR: true
      neighbors:
        - peerAddress: "10.10.0.254/32"
          peerASN: 65512
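
For this policy to do anything, the `bird2` router has to accept an iBGP session from each node; a sketch of one such stanza (the node address is an assumption from the 10.10.0.0/24 plan, and the real template would loop over the cluster group):

```sh
cat >> /etc/bird/bird.conf <<'EOF'
protocol bgp cluster0 {
  local 10.10.0.254 as myas;   # matches the router id set in this PR
  neighbor 10.10.0.2 as myas;  # one session per cluster node
  ipv4 {
    import all;                # learn the PodCIDRs Cilium exports
    export none;
  };
}
EOF
```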
12 changes: 12 additions & 0 deletions services/cilium/deploy.sh
@@ -0,0 +1,12 @@
#!/bin/bash

helm repo add cilium https://helm.cilium.io
helm repo update
helm upgrade --install cilium cilium/cilium \
  --version 1.15.3 \
  --namespace kube-system \
  --values "$(dirname "$0")"/values.yaml

cilium-cli status --wait

kubectl apply -f "$(dirname "$0")"/bgp-peering-policy.yaml
8 changes: 8 additions & 0 deletions services/cilium/load-balancer-pool-dev.yaml
@@ -0,0 +1,8 @@
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "niployments-lb-pool"
spec:
  blocks:
    - start: "172.28.255.200"
      stop: "172.28.255.250"