MAAS database can't deal with infiniband's 20-octet long MAC addresses

I’m installing MAAS 3.0 (regiond+rackd) on an HPC management node (Ubuntu 18.04). The InfiniBand network interfaces immediately caused problems because their MAC addresses are 20 octets long, not 6. The MAAS DB, however, uses type macaddr for column mac_address in table maasserver_interface, which assumes 6 octets.

2021-07-21 07:37:31 maasserver.start_up: [error] Database error during start-up
Traceback (most recent call last):
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type macaddr: "20:00:19:07:fe:80:00:00:00:00:00:00:b8:83:03:ff:ff:7f:31:d5"
LINE 1: ...15:fe', '00:11:0a:6c:25:4d', '14:02:ec:33:2b:75', '20:00:19:...
                                                         ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/start_up.py", line 68, in start_up
yield deferToDatabase(inner_start_up, master=master)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 250, in inContext
   result = inContext.theWork()
  File "/snap/maas/15003/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 266, in <lambda>
inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/twisted/python/context.py", line 122, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/twisted/python/context.py", line 85, in callWithContext
return func(*args,**kw)
  File "/snap/maas/15003/lib/python3.8/site-packages/provisioningserver/utils/twisted.py", line 870, in callInContext
return func(*args, **kwargs)
  File "/snap/maas/15003/lib/python3.8/site-packages/provisioningserver/utils/twisted.py", line 202, in wrapper
result = func(*args, **kwargs)
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/utils/orm.py", line 706, in call_with_connection
return func(*args, **kwargs)
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/utils/__init__.py", line 194, in call_with_lock
return func(*args, **kwargs)
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/utils/orm.py", line 751, in call_within_transaction
return func_outside_txn(*args, **kwargs)
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/utils/orm.py", line 554, in retrier
return func(*args, **kwargs)
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/start_up.py", line 121, in inner_start_up
node = RegionController.objects.get_or_create_running_controller()
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/models/node.py", line 741, in get_or_create_running_controller
node = self._find_or_create_running_controller()
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/models/node.py", line 778, in _find_or_create_running_controller
node = self._find_running_node()
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/models/node.py", line 798, in _find_running_node
return get_one(nodes.distinct())
  File "/snap/maas/15003/lib/python3.8/site-packages/maasserver/utils/orm.py", line 112, in get_one
retrieved_items = tuple(islice(items, 0, 2))
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/models/query.py", line 274, in __iter__
self._fetch_all()
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/models/query.py", line 1242, in _fetch_all
self._result_cache = list(self._iterable_class(self))
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/models/query.py", line 55, in __iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/models/sql/compiler.py", line 1140, in execute_sql
cursor.execute(sql, params)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
return executor(sql, params, many, context)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/snap/maas/15003/usr/lib/python3/dist-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
django.db.utils.DataError: invalid input syntax for type macaddr: "20:00:19:07:fe:80:00:00:00:00:00:00:b8:83:03:ff:ff:7f:31:d5"
LINE 1: ...15:fe', '00:11:0a:6c:25:4d', '14:02:ec:33:2b:75', '20:00:19:...
                                                         ^
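
The traceback above comes down to a length mismatch: PostgreSQL's macaddr type holds 6 octets, while the address it rejected has 20. Here is a minimal sketch of that check in Python. The helper names octet_count and fits_macaddr are made up for illustration and are not part of MAAS; the two addresses are copied from the error message.

```python
def octet_count(address: str) -> int:
    """Count the colon-separated octets in a hardware address string."""
    return len(address.split(":"))


def fits_macaddr(address: str) -> bool:
    """Return True if the address has the 6 octets a macaddr column expects.

    Hypothetical helper for illustration only; MAAS does not ship this.
    """
    return octet_count(address) == 6


# Addresses taken from the failing query in the traceback.
ethernet = "00:11:0a:6c:25:4d"
ipoib = "20:00:19:07:fe:80:00:00:00:00:00:00:b8:83:03:ff:ff:7f:31:d5"

print(octet_count(ethernet), fits_macaddr(ethernet))
print(octet_count(ipoib), fits_macaddr(ipoib))
```

The second address is the one the database refuses, because it is more than three times longer than the column type allows.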

Is there a way to restrict the network interfaces that MAAS will read? For example, to exclude the InfiniBand interfaces (ib0 & ib1 in my case).

Or use a different type (character varying) for column mac_address in the database table?

Thanks,
Yu

Hi, unfortunately infiniband interfaces are not currently supported by maas.

Since you have such a setup, mind running sudo /snap/maas/current/usr/share/maas/machine-resources/amd64 and pasting the output?

It’s really long, so I’ll post only the parts related to the network. The InfiniBand adapters are for the Lustre file system. The Mellanox drivers provided by the kernel (version below) are not up to the task, so I installed the latest OFED drivers from the Mellanox website and enabled IP over IB.

{
"api_extensions": [
    "resources",
    "resources_cpu_socket",
    "resources_gpu",
    "resources_numa",
    "resources_v2",
    "resources_disk_sata",
    "resources_network_firmware",
    "resources_disk_id",
    "resources_usb_pci",
    "resources_cpu_threads_numa",
    "resources_cpu_core_die",
    "api_os",
    "resources_system",
    "resources_pci_iommu",
    "resources_network_usb",
    "resources_disk_address"
],
"api_version": "1.0",
"environment": {
    "kernel": "Linux",
    "kernel_architecture": "x86_64",
    "kernel_version": "4.15.0-76-generic",
    "os_name": "ubuntu",
    "os_version": "18.04",
    "server": "maas-machine-resources",
   ...
    },
    "network": {
        "cards": [
            {
                "driver": "ixgbe",
                "driver_version": "5.1.0-k",
                "ports": [
                    {
                        "id": "e10g0",
                        "address": "00:11:0a:6c:25:4d",
                        "port": 0,
                        "protocol": "ethernet",
                        "supported_modes": [
                            "10000baseT/Full"
                        ],
                        "supported_ports": [
                            "fibre"
                        ],
                        "port_type": "fibre",
                        "transceiver_type": "internal",
                        "auto_negotiation": false,
                        "link_detected": true,
                        "link_speed": 10000,
                        "link_duplex": "full"
                    }
                ],
                "sriov": {
                    "current_vfs": 0,
                    "maximum_vfs": 63,
                    "vfs": null
                },
            {
                "driver": "mlx5_core",
                "driver_version": "4.7-3.2.9",
                "ports": [
                    {
                        "id": "ib0",
                        "address": "20:00:11:07:fe:80:00:00:00:00:00:00:b8:83:03:ff:ff:7f:31:d4",
                        "port": 0,
                        "protocol": "infiniband",
                        "port_type": "other",
                        "transceiver_type": "internal",
                        "auto_negotiation": false,
                        "link_detected": true,
                        "link_speed": 100000,
                        "link_duplex": "full",
                        "infiniband": {
                            "issm_name": "issm0",
                            "issm_device": "231:64",
                            "mad_name": "umad0",
                            "mad_device": "231:0",
                            "verb_name": "uverbs0",
                            "verb_device": "231:192"
                        }
                    }
                ],
                "sriov": {
                    "current_vfs": 0,
                    "maximum_vfs": 16,
                    "vfs": null
                },
                "numa_node": 1,
                "pci_address": "0000:88:00.0",
                "vendor": "Mellanox Technologies",
                "vendor_id": "15b3",
                "product": "MT27800 Family [ConnectX-5]",
                "product_id": "1017",
                "firmware_version": "16.26.1040 (HPE0000000009)"
            },
            {
                "driver": "mlx5_core",
                "driver_version": "4.7-3.2.9",
                "ports": [
                    {
                        "id": "ib1",
                        "address": "20:00:19:07:fe:80:00:00:00:00:00:00:b8:83:03:ff:ff:7f:31:d5",
                        "port": 0,
                        "protocol": "infiniband",
                        "auto_negotiation": false,
                        "link_detected": false,
                        "infiniband": {
                            "issm_name": "issm1",
                            "issm_device": "231:65",
                            "mad_name": "umad1",
                            "mad_device": "231:1",
                            "verb_name": "uverbs1",
                            "verb_device": "231:193"
                        }
                    }
                ],
                "sriov": {
                    "current_vfs": 0,
                    "maximum_vfs": 16,
                    "vfs": null
                },
                "numa_node": 1,
                "pci_address": "0000:88:00.1",
                "vendor": "Mellanox Technologies",
                "vendor_id": "15b3",
                "product": "MT27800 Family [ConnectX-5]",
                "product_id": "1017",
                "firmware_version": "16.26.1040 (HPE0000000009)"
            }
        ],
        "total": 8
   ...
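
To illustrate the kind of exclusion asked about earlier, here is a rough sketch that post-processes output shaped like the above and drops every card that has a port with "protocol": "infiniband". This is not a MAAS feature or setting, and I don't know of a supported way to make MAAS do it; the function name drop_infiniband_cards is hypothetical, and the sample data is a trimmed copy of the fields shown above.

```python
import json


def drop_infiniband_cards(resources: dict) -> dict:
    """Return a copy of the resources dict without InfiniBand cards.

    Hypothetical helper, not part of MAAS. A card is dropped when any
    of its ports reports protocol "infiniband"; everything else is kept.
    """
    network = resources.get("network", {})
    kept = []
    for card in network.get("cards", []):
        ports = card.get("ports", [])
        if any(port.get("protocol") == "infiniband" for port in ports):
            continue  # skip ib0/ib1-style cards
        kept.append(card)
    # Build new dicts so the input is left unmodified.
    return {**resources, "network": {**network, "cards": kept}}


# Trimmed sample using the ids, addresses and protocols from the output above.
sample = {
    "network": {
        "cards": [
            {"driver": "ixgbe", "ports": [
                {"id": "e10g0", "address": "00:11:0a:6c:25:4d",
                 "protocol": "ethernet"}]},
            {"driver": "mlx5_core", "ports": [
                {"id": "ib0", "protocol": "infiniband",
                 "address": "20:00:11:07:fe:80:00:00:00:00:00:00:b8:83:03:ff:ff:7f:31:d4"}]},
            {"driver": "mlx5_core", "ports": [
                {"id": "ib1", "protocol": "infiniband",
                 "address": "20:00:19:07:fe:80:00:00:00:00:00:00:b8:83:03:ff:ff:7f:31:d5"}]},
        ]
    }
}

filtered = drop_infiniband_cards(sample)
print(json.dumps(filtered, indent=2))
```

With the sample above, only the ixgbe card with port e10g0 remains, so every address left over fits a 6-octet macaddr column.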

Hi, did you get this working?

Is there support for infiniband in maas 3.1?

Hi @sircam, we also have infiniband devices in our deployment. Last I tested (MaaS 3.1 beta 3), this is still a breaking issue. I created a ticket on it, but it hasn’t gotten any traction yet: