OVS Bridge on Linux bond - Deployment fails

Hello there,

I am using MAAS 3.1.0, deployed via snap:

maas 3.1.0-10901-g.f1f8f1505 18199 3.1/stable canonical✓ -

I am having trouble deploying nodes that use two network interfaces in a bond, with an OVS bridge on top of it. The goal is to deploy an OpenStack platform.
Docs about the process: Here

The only things I ‘customized’ relative to the docs are:

  • Using MAAS 3.1.0
  • Installing MAAS in a production way; in other words, I created the PostgreSQL database manually

I tested some other configurations, which work fine:

  • OVS bridge on a single physical interface -> OK
  • Bond (balance-alb) using two physical interfaces -> OK
  • Linux bridge on top of Bond -> OK

The problem only occurs when using OVS as the bridge type on top of a bond.

enp2s0f0 ---┐
            │----> bond0  ------> br0
enp2s0f1 ---┘

When that config is deployed, the server boots and the install seems to run correctly, but after the server reboots it has no functional networking. I can’t SSH into the node, and MAAS keeps waiting for the node to confirm that the installation succeeded.

I wonder: could this be related to the OVS package possibly being missing from the Ubuntu image hosted by maas.io, or something like that?

Is anyone else affected by this issue?

Regards,

Hello again,

The problem seems to be related to the bond MAC address switching process (alb mode).

When MAAS tries to ping the node, its ARP cache holds the enp2s0f1 MAC address for the node’s IP, but the bridge uses the enp2s0f0 MAC address, which it inherited from bond0, which in turn originally inherited it from enp2s0f0.
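A minimal sketch of how that mismatch could be checked from the MAAS side, assuming the text printed by `ip neigh show` is captured and passed in as a string. The function names, the IP, and the MAC addresses in the usage note are placeholders, not values from my setup.

```python
import re

# Matches neighbour entries such as:
#   "10.0.0.5 dev eth0 lladdr aa:bb:cc:dd:ee:ff REACHABLE"
# Entries without an lladdr (e.g. FAILED) are ignored.
NEIGH_RE = re.compile(
    r"^(?P<ip>\S+)\s+dev\s+(?P<dev>\S+)\s+lladdr\s+(?P<mac>[0-9a-fA-F:]{17})\b"
)


def parse_neigh(output):
    """Return {ip: mac} built from the text printed by `ip neigh show`."""
    table = {}
    for line in output.splitlines():
        match = NEIGH_RE.match(line.strip())
        if match:
            table[match.group("ip")] = match.group("mac").lower()
    return table


def arp_mismatch(output, node_ip, expected_mac):
    """True when a MAC is cached for node_ip and it differs from expected_mac."""
    cached = parse_neigh(output).get(node_ip)
    return cached is not None and cached != expected_mac.lower()
```

For example, `arp_mismatch(text, "10.0.0.5", "aa:bb:cc:00:00:01")` would be true if the cache holds the second interface's MAC for that IP instead of the one carried by bond0 and br0.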

So after the installation, the node reboots, but it can’t reach MAAS correctly.
It seems that the bonding driver does not change the destination MAC address of packets received on enp2s0f1 to match the bond0 MAC, so those packets never reach the br0 interface.
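On the node itself (from a console, since SSH is down), the MAC each interface currently holds can be read from sysfs. This is only a rough sketch: it assumes the standard `/sys/class/net/<iface>/address` files, and the helper names are made up for illustration.

```python
from pathlib import Path


def read_macs(ifaces, sysfs_root="/sys/class/net"):
    """Return {iface: mac or None}, read from <sysfs_root>/<iface>/address."""
    macs = {}
    for name in ifaces:
        path = Path(sysfs_root) / name / "address"
        # A missing interface is reported as None instead of raising.
        macs[name] = path.read_text().strip().lower() if path.exists() else None
    return macs


def shares_mac(macs, a, b):
    """True when both interfaces exist and report the same MAC address."""
    return macs.get(a) is not None and macs.get(a) == macs.get(b)


if __name__ == "__main__":
    # Interface names taken from the topology described in this thread.
    current = read_macs(["enp2s0f0", "enp2s0f1", "bond0", "br0"])
    for iface, mac in current.items():
        print(f"{iface}: {mac}")
    print("br0 shares bond0 MAC:", shares_mac(current, "br0", "bond0"))
```

If br0 and bond0 report the same MAC while the MAAS ARP cache holds a different one for the node's IP, that would be consistent with the behaviour described above.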

not sure – there could still be a MAAS bug – but off the top of my head, it sounds like something is either misconfigured or working as intended. i’ve seen one situation where one of the MAC addresses made it into the bridge forwarding db, but the other one didn’t, and nothing works right after that. have you checked this OvS bonding explanation?

i’ll also ask around the engineering team and see if anyone has a better idea.

everybody else i ask is also thinking something is misconfigured in your network setup. can you try reading through the OvS link above and see if you can see it? if not, bounce back and i’ll try to gather more information and help, if i can.

one easy thing to try is to see whether you can ssh to br0.<hostname>.maas. that should tell you something right away.
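a rough sketch of that check as a script, in case it's easier to run repeatedly. it only tests whether something accepts a TCP connection on port 22; the function name is made up here, and `<hostname>` stays a placeholder for the node's actual name.

```python
import socket


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False
```

calling `ssh_port_open("br0.<hostname>.maas")` with the real hostname filled in returns False both when the name doesn't resolve and when the port is unreachable, so it won't tell those two cases apart.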

@berthe01, did you figure this out yet?

Hello @billwear,

I read the docs you sent me and, if I understand correctly, SLB mode is applied when I select alb mode for an OVS bridge in MAAS. Is that right?

Sorry for the slow response. I deployed my nodes with an active-backup bond, so I ran my tests on that setup. I’ll try deploying with alb mode again in my lab next week and I’ll tell you what I see.

cool. i’ll leave this one open for now.