OpenStack - Layer 2 Gateway (VXLAN -> real-world bridge)
This article is the culmination of hundreds of hours of work; I hope it can save others some time.
Here are some super useful articles that got me across the line:
- https://networkop.co.uk/blog/2016/05/21/neutron-l2gw
- https://wiki.openstack.org/wiki/Ovs-flow-logic
- http://kimizhang.com/neutron-l2-gateway-hp-5930-switch-ovsdb-integration/
- https://drive.google.com/file/d/0Bx8nDIFktlzBRm0tV3pmYURnZ3M/view
- https://github.com/openstack/networking-l2gw
Setting up an Open vSwitch VTEP
Step 1 - Kill all Open vSwitch processes
Use ps ax | grep ovs to find any OVS processes that are running and kill them all.
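A minimal sketch of that cleanup, assuming the usual daemon names and that nothing else on the box still needs them:
ps ax | grep ovs                 # see what's actually running
sudo pkill -f ovs-vtep           # -f matches the full command line (ovs-vtep runs as a script)
sudo pkill -f ovs-vswitchd
sudo pkill -f ovsdb-server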
Step 2 - Bring up Open vSwitch as a VTEP
Now configure the script below to suit.
This process will kill any OVS config you have in place; if you like your config... well... do something else!
Here we use ens4 as our 'trunk port', and the name of our 'physical switch' (actually Open vSwitch running on a server/VM) is switch-l2gw02.
172.0.0.170 is the IP of the machine running OVS (presumably the machine running this script).
#!/bin/bash
# Load the Open vSwitch kernel module and bring up the trunk interface
modprobe openvswitch
ip link set up dev ens4
# WARNING: wipes any existing OVS databases
rm /etc/openvswitch/*
# Create fresh vswitch and hardware_vtep databases
ovsdb-tool create /etc/openvswitch/vtep.db /usr/share/openvswitch/vtep.ovsschema
ovsdb-tool create /etc/openvswitch/vswitch.db /usr/share/openvswitch/vswitch.ovsschema
mkdir /var/run/openvswitch/
# Serve both databases: TCP 6632 for the L2GW agent, plus the local unix socket
ovsdb-server --pidfile --detach --log-file --remote ptcp:6632:172.0.0.170 \
  --remote punix:/var/run/openvswitch/db.sock --remote=db:hardware_vtep,Global,managers \
  /etc/openvswitch/vswitch.db /etc/openvswitch/vtep.db
ovs-vswitchd --log-file --detach --pidfile unix:/var/run/openvswitch/db.sock
# Create the bridge and register it as a 'physical switch' in the VTEP database
ovs-vsctl add-br switch-l2gw02
vtep-ctl add-ps switch-l2gw02
vtep-ctl set Physical_Switch switch-l2gw02 tunnel_ips=172.0.0.170
# Attach the trunk port to both the bridge and the VTEP physical switch
ovs-vsctl add-port switch-l2gw02 ens4
vtep-ctl add-port switch-l2gw02 ens4
# Start the ovs-vtep emulator that maps the hardware_vtep DB onto OVS flows
/usr/share/openvswitch/scripts/ovs-vtep \
  --log-file=/var/log/openvswitch/ovs-vtep.log \
  --pidfile=/var/run/openvswitch/ovs-vtep.pid \
  --detach switch-l2gw02
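Once the script has run, a quick sanity check with plain ovs-vsctl/vtep-ctl should show the bridge, the 'physical switch' and its tunnel IP (names here are just the ones used above):
sudo ovs-vsctl show
sudo vtep-ctl list-ps
sudo vtep-ctl show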
Install and Configure the Neutron L2 Agent
For me the L2 gateway agent was available in the APT repo, so the installation was nice and simple.
This is a configuration agent: it doesn't move any packets itself, it just orchestrates the required changes on the VTEPs, so I run it on the same VM as my Neutron server services.
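The install itself was just an apt-get; the package name below is an assumption from my Ubuntu-based setup, so search your repo if it doesn't match:
apt-cache search l2gateway                      # search term may need tweaking for your repo
sudo apt-get install neutron-l2gateway-agent    # package name assumed; adjust to what the search returns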
Set the following line in l2gateway_agent.ini
ovsdb_hosts = 'l2gw01:172.0.0.169:6632,l2gw02:172.0.0.170:6632'
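On my install that line sits under the [ovsdb] section of /etc/neutron/l2gateway_agent.ini (section name as per the packaged sample config), and the agent needs a restart to pick it up; the service name below is an assumption from an Ubuntu-style package:
sudo systemctl restart neutron-l2gateway-agent   # service name assumed; adjust to your packaging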
Inbound ARP bug
This is a biggie and it's a giant PIA
Inbound ARP requests will hit your VTEP but will not be forwarded on; even if the VTEP wanted to forward them, it doesn't have an OVS table suitable for sending broadcast packets (that is, a table that specifies an output port for every VXLAN endpoint).
So to achieve this we use a bit of a workaround: first we set a kind of 'failover' for all unknown multicast packets on the VTEP, forwarding these unknown packets (inbound ARP requests) to one of the 'network nodes', that is, a Neutron node that has a line in "ovs-ofctl dump-flows br-tun" that looks like this:
table=22, n_packets=15, n_bytes=1030, idle_age=11531, priority=1,dl_vlan=9 actions=strip_vlan,load:0x3fa->NXM_NX_TUN_ID[],output:9,output:2,output:4,output:13
This is a broadcast rule: anything that hits it will be sent to all relevant VXLAN endpoints. (I say relevant because it seems to output only to ports that have devices on the other end using the same VXLAN, e.g. if you have a compute node that doesn't have any VMs on that VXLAN network, the output port entry for that VXLAN tunnel won't appear.)
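A quick way to find a suitable network node is to dump just that table on each candidate's br-tun and look for a flow like the one above (table 22 is taken from the flow shown here; adjust if your tables are numbered differently):
sudo ovs-ofctl dump-flows br-tun table=22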
To configure this 'failover' run
sudo vtep-ctl add-mcast-remote 818b4779-645c-49bb-ae4a-aa9340604019 unknown-dst 10.0.3.10
Where the UUID is the result of vtep-ctl list-ls, and the IP address is the IP of the Neutron network node with the table 22 rule in place.
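To confirm the entry took, list the logical switches and then the remote MACs for the one you touched; the unknown-dst entry pointing at 10.0.3.10 should show up (standard vtep-ctl commands; substitute the logical switch name that list-ls returned):
sudo vtep-ctl list-ls
sudo vtep-ctl list-remote-macs <logical-switch-from-list-ls>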
Helpful hint: To find the names and numbers of the ports use this command
ovs-vsctl -f table -- --columns=name,ofport,external-ids,options list interface
OK, so now the ARP packets are heading to the network node, but we aren't quite done: we need to convince the network node to shunt all ARP requests out to table 22 and table 10 (see https://networkop.co.uk/blog/2016/05/21/neutron-l2gw/ under the heading "Programming Network Node as BUM replication service node" for a more detailed explanation from someone who actually knows what they are talking about).
To achieve this we need to add the following rule to br-tun on table 4
table=4,arp,tun_id=0x3fa,priority=2,actions=mod_vlan_vid:9,resubmit(,10),resubmit(,22)
Where 0x3fa is the segmentation ID of our network in hex format and VLAN 9 is the VLAN used on THAT node for processing. You can find this VLAN ID by running ovs-ofctl dump-flows br-tun | grep 0x3fa; you'll see a few entries and they'll all share the same vlan_id, and that's what we are after for our custom rule. When you have all the info, run:
ovs-ofctl add-flow br-tun "table=4,arp,tun_id=0x3fa,priority=2,actions=mod_vlan_vid:9,resubmit(,10),resubmit(,22)"
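As an aside, if you only have the decimal segmentation ID for the network (e.g. the provider:segmentation_id field from neutron net-show), a one-liner converts it to the hex form used in the rule; 0x3fa here corresponds to 1018:
printf '0x%x\n' 1018    # prints 0x3fa, the tun_id used in the rule above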
And that's it! Now the ARP requests from the outside world hit the VTEP, overflow to the network node, which kindly broadcasts them out to the VXLAN endpoints for us.
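If you want to confirm the new flow is doing its job, watch its packet counters climb while something on the physical side ARPs for a VM (nothing assumed here beyond the rule added above):
sudo watch -n1 'ovs-ofctl dump-flows br-tun table=4 | grep arp'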
I hope this has helped you in some way to join your OpenStack VXLAN networks to the real world.