ConnectX-7 Card - Connection Refused after fetching boot-kernel and boot-initrd

I’m not really sure if this is a MAAS issue, but I figured I’d post here to see if anyone has any insight.
I have a machine with a ConnectX-7 card using FlexBoot, and I am running into an issue during commissioning. The machine successfully fetches boot-kernel and boot-initrd, but when it then tries to fetch the squashfs image it gets a “Connection Refused” error, drops into an initramfs shell, and commissioning stops. I do not see the Connection Refused error even in the debug logs.

I don’t believe it’s a firewall issue, as there are no firewalls and the machine successfully fetches the other files beforehand. Further, I am able to fetch the squashfs image from another machine while the one I am trying to commission is getting connection refused. Finally, if I let the initramfs shell sit for a while (5+ minutes), the connection refused errors eventually stop and I am able to wget the squashfs image.
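To put a number on how long the refusals last, a small probe like the one below could be run from any host on the same network segment that has Python available (the initramfs shell itself most likely does not). This is only a sketch: the `probe` and `watch` names are made up for illustration, and the host and port passed on the command line are whatever the failing squashfs URL points at, not values taken from this thread.

```python
import errno
import socket
import sys
import time


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or something in between) answered with a TCP reset.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        return "error: " + errno.errorcode.get(exc.errno, str(exc))


def watch(host, port, interval=5.0, max_wait=600.0):
    """Poll until the port accepts a connection; print each state change.

    Returns the elapsed seconds when the port first opened, or None if
    max_wait passed without a successful connection.
    """
    start = time.monotonic()
    last = None
    while True:
        elapsed = time.monotonic() - start
        state = probe(host, port)
        if state != last:
            print(f"{elapsed:6.1f}s  {host}:{port}  {state}")
            last = state
        if state == "open":
            return elapsed
        if elapsed >= max_wait:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    # Usage (hypothetical values): python3 probe.py <maas-host> <port>
    watch(sys.argv[1], int(sys.argv[2]))
```

Distinguishing "refused" from "timeout" matters here: a refusal means something actively rejected the connection, whereas a timeout would point more towards the link or switch port not forwarding traffic yet.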

I suspect it has something to do with the machine using the mlx5_core driver after it gets the boot files, but I am not really sure.
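One way to check the driver theory, once the machine is in an environment with Python (for example a deployed or rescue system rather than the initramfs shell), is to read the interface details from sysfs. The sketch below assumes the standard Linux layout under `/sys/class/net`; the `iface_info` helper and the interface name passed on the command line are illustrative only.

```python
import os
import sys


def iface_info(name, sysfs_root="/sys/class/net"):
    """Return the bound driver and link state for one network interface."""
    base = os.path.join(sysfs_root, name)

    # device/driver is a symlink to the driver directory, e.g. .../mlx5_core
    driver_link = os.path.join(base, "device", "driver")
    driver = None
    if os.path.exists(driver_link):
        driver = os.path.basename(os.path.realpath(driver_link))

    def read(attr):
        # Some attributes (such as carrier) cannot be read while the
        # interface is administratively down, so treat errors as "unknown".
        try:
            with open(os.path.join(base, attr)) as fh:
                return fh.read().strip()
        except OSError:
            return None

    return {
        "driver": driver,
        "operstate": read("operstate"),
        "carrier": read("carrier"),
    }


if __name__ == "__main__":
    # Usage (hypothetical interface name): python3 iface_info.py enp1s0f0np0
    print(iface_info(sys.argv[1]))
```

If the driver shows as mlx5_core and operstate only reaches "up" some time after boot, that would support the idea that the link is renegotiating when the kernel driver takes over from FlexBoot.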

Any advice would be appreciated.

Hi @rwalsh0975, if I understand correctly, the problem occurs on the host that has the ConnectX-7 card.

  • Does the host commission correctly without the card?
  • What MAAS version are you using and what OS image are you using?

If you need Mellanox kernel modules, you should use jammy hwe-22.04 or higher.

Hi @javier-fs,

Yes, the problem occurs when the host is trying to fetch the squashfs image from the MAAS server.

The ConnectX-7 card is currently the only one that I can hook up to the correct network, as we are waiting for dongles so that the other interfaces can serve as a fallback. So I have not had a chance to commission the host with another card.

I have tried MAAS 3.6.2 and the 3.7 beta, with 24.04 and both the GA and HWE kernels.

Thanks,
Ryan

24.04 should have the modules needed to use the ConnectX-7.

Can you check the card and upgrade the firmware?