MAAS 2.6 storage tests can’t be executed, failing with HTTP 500


#1

Hi guys,
I’m new to MAAS, and I can’t figure out what is causing this problem.
MAAS version: 2.6.0
Hardware: 2x HP ProLiant DL380p Gen8

I log in to the UI. I start the machines, and they are discovered.
The commissioning phase passes, and all hardware tests pass except the storage ones.
The storage tests are failing, but in the UI they appear to be pending (that is probably a separate issue).

I didn’t find any exception messages in the UI, so I opened the KVM console for the machines and saw that:
fio, smartctl-short, and smartctl-validate are failing with HTTP 500.

badblocks is failing with no authorized ssh keys for user

During the install, no SSH keys were imported. That might explain one failure, but the other ones …

Can someone help me with this?


#2

Can you post the MAAS logs found in /var/log/maas? According to that error message, the machine is unable to signal to MAAS that the test is starting, which is what is causing the failure.


#3

Hello ttsvetkov,

Does the machine have Internet access?


#4

Thanks for the responses, guys.
Yes, it does have Internet access; that test is passing.


#5

@ltrager I couldn’t upload the files here, but here is a link to an archive of the folder.
Both archives are the same, just in different formats.
Interesting. If that is the case, how do the other tests pass? Don’t they use the same API?


#6

Thanks for the logs. This appears to be a bug in how events are emitted. However, I suspect that there may also be an issue detecting storage devices.

  1. Are storage devices detected after commissioning runs? If so, you should see devices on the storage tab in the UI.
  2. Can you post the output of
    maas $PROFILE node-script-result read $SYSTEM_ID current-testing

#7

@ltrager

  1. No, storage is not detected; I don’t see it in the UI.

Storage devices are Samsung SSDs. All are detected in the iLO (iLO v4, firmware 2.61).

  2. The output: the command is invalid.

tuser@maas:~$ maas $PROFILE node-script-result read $SYSTEM_ID current-testing
usage: maas [-h] COMMAND ...

optional arguments:
  -h, --help      show this help message and exit

drill down:
  COMMAND
    login         Log in to a remote API, and remember its description and
                  credentials.
    logout        Log out of a remote API, purging any stored credentials.
    list          List remote APIs that have been logged-in to.
    refresh       Refresh the API descriptions of all profiles.
    init          Initialize controller.
    apikey        Used to manage a user's API keys. Shows existing keys unless
                  --generate or --delete is passed.
    configauth    Configure external authentication.
    createadmin   Create a MAAS administrator account.
    changepassword
                  Change a MAAS user's password.

argument COMMAND: invalid choice: 'node-script-result' (choose from 'login', 'logout', 'list', 'refresh', 'init', 'apikey', 'configauth', 'createadmin', 'changepassword')


#8

You are using the non-standard port 5248, I’m assuming you changed that somewhere? Hopefully that’s not the problem.


#9

I’ve posted a patch to fix LP:1840181; however, that isn’t the root of your problem. MAAS isn’t detecting that your system has any storage devices. If you click on the ‘Commissioning’ tab, what is the output of the script 00-maas-07-block-devices?

You may want to try selecting the option ‘Allow SSH access and prevent machine from powering off’ when commissioning so you can log in and see if your storage devices are available. If they are not, you can also look at dmesg for any errors.
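
Once logged in over SSH, one way to check whether the kernel sees any disks at all is to list the entries under /sys/block. The following is only a rough sketch, assuming python3 is available in the commissioning environment; `list_block_devices` is a hypothetical helper name, not something MAAS provides. Note that the listing will also include virtual devices such as loop devices.

```python
from pathlib import Path


def list_block_devices(sysfs_root="/sys/block"):
    """Return {device name: size in bytes} for every entry under sysfs_root."""
    devices = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux system, or sysfs is not mounted: report nothing.
        return devices
    for entry in sorted(root.iterdir()):
        size_file = entry / "size"
        if size_file.is_file():
            # The kernel reports this value in 512-byte sectors, whatever
            # the real sector size of the device is.
            devices[entry.name] = int(size_file.read_text().strip()) * 512
    return devices


if __name__ == "__main__":
    found = list_block_devices()
    if not found:
        print("no block devices visible to the kernel")
    for name, size in found.items():
        print(f"{name}: {size / 10**9:.1f} GB")
```

If no sdX entries show up here, the disks are not being presented to the OS at all, and dmesg is the next place to look.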


#10

ltrager,
You need to create a CLI user with the command below.
Syntax:
maas-region apikey --username=<username> > <key_file_path>
Example:
maas-region apikey --username=admin > /root/admin_apikey

Then log in as the CLI user with the command below.
Syntax:
maas login <profile_name> <maas_api_url> - < <key_file_path>
Example:
maas login admin http://192.168.4.43:5240/MAAS/api/2.0/ - < /root/admin_apikey

Then you need to run the (maas $PROFILE node-script-result read $SYSTEM_ID current-testing) command, with $PROFILE and $SYSTEM_ID replaced by your own values.

Example:
maas admin node-script-result read hff7gp current-testing

You can take the system_id like this

Sample Output:
root@gk-VM:~# maas admin node-script-result read hff7gp current-testing
Success.
Machine-readable output follows:
{
    "type": 2,
    "id": 204,
    "status": 2,
    "status_name": "Passed",
    "type_name": "Testing",
    "ended": "Wed, 14 Aug 2019 19:02:32 -0000",
    "runtime": "0:00:00",
    "started": "Wed, 14 Aug 2019 19:02:31 -0000",
    "system_id": "hff7gp",
    "results": [
        {
            "id": 873,
            "created": "Wed, 14 Aug 2019 18:58:22 -0000",
            "updated": "Wed, 14 Aug 2019 19:02:32 -0000",
            "name": "smartctl-validate",
            "status": 9,
            "status_name": "Skipped",
            "exit_status": 0,
            "started": "Wed, 14 Aug 2019 19:02:31 -0000",
            "ended": "Wed, 14 Aug 2019 19:02:32 -0000",
            "runtime": "0:00:00",
            "starttime": 1565789551.820075,
            "endtime": 1565789552.650411,
            "estimated_runtime": "0:00:00",
            "parameters": {
                "storage": {
                    "argument_format": "{path}",
                    "type": "storage",
                    "value": {
                        "id_path": "/dev/vda",
                        "model": "",
                        "name": "sda",
                        "physical_blockdevice_id": 17,
                        "serial": ""
                    }
                }
            },
            "script_id": 1,
            "script_revision_id": 1,
            "suppressed": false
        }
    ],
    "last_ping": "Wed, 14 Aug 2019 19:02:32 -0000",
    "resource_uri": "/MAAS/api/2.0/nodes/hff7gp/results/204/"
}
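
If the output is long, it may help to boil it down to one line per test script. Here is a minimal sketch in Python (standard library only) that does this for the JSON part of the output. It assumes you pass in only the JSON object, without the "Success." and "Machine-readable output follows:" lines; `summarize_results` and `QUOTE_FIXES` are hypothetical names of my own, not part of the MAAS CLI.

```python
import json

# Forum software tends to turn straight quotes into curly ones; map them back
# so text copied from a post parses as JSON again.
QUOTE_FIXES = str.maketrans({"\u201c": '"', "\u201d": '"'})


def summarize_results(raw):
    """Return (script name, status_name, exit_status) for each test result."""
    data = json.loads(raw.translate(QUOTE_FIXES))
    return [
        (r["name"], r["status_name"], r.get("exit_status"))
        for r in data.get("results", [])
    ]


# Trimmed copy of the sample output above (most keys dropped for brevity).
sample = """
{
    "system_id": "hff7gp",
    "status_name": "Passed",
    "results": [
        {"name": "smartctl-validate", "status": 9,
         "status_name": "Skipped", "exit_status": 0}
    ]
}
"""

for name, status, exit_status in summarize_results(sample):
    print(f"{name}: {status} (exit status {exit_status})")
```

For the trimmed sample this prints `smartctl-validate: Skipped (exit status 0)`, which makes it easy to spot which storage tests did not pass.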


#11

@jdelaros1 No, I haven’t changed anything; that is a default communication port (see https://maas.io/docs/maas-communcation). The installation that I have is really basic.


#12

@ltrager
You were right. It turns out that the RAID controllers on these machines were not configured correctly, and the output of 00-maas-07-block-devices was empty.

Now it is showing the devices:

[
{ "NAME": "sda", "RO": "0", "RM": "0", "MODEL": "LOGICAL VOLUME", "ROTA": "0", "MAJ:MIN": "8:0", "PATH": "/dev/sda", "DEVPATH": "/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:1:0/2:1:0:0/block/sda", "FIRMWARE_VERSION": "8.32", "SERIAL": "600508b1001cf4efd32f026f95297dd9", "ID_PATH": "/dev/disk/by-id/wwn-0x600508b1001cf4efd32f026f95297dd9", "SIZE": "128001807360", "BLOCK_SIZE": "1024" },
{ "NAME": "sdb", "RO": "0", "RM": "0", "MODEL": "LOGICAL VOLUME", "ROTA": "0", "MAJ:MIN": "8:16", "PATH": "/dev/sdb", "DEVPATH": "/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:1:0/2:1:0:1/block/sdb", "FIRMWARE_VERSION": "8.32", "SERIAL": "600508b1001cd825ea48aba397fdf698", "ID_PATH": "/dev/disk/by-id/wwn-0x600508b1001cd825ea48aba397fdf698", "SIZE": "128001807360", "BLOCK_SIZE": "1024" },
{ "NAME": "sdc", "RO": "0", "RM": "0", "MODEL": "LOGICAL VOLUME", "ROTA": "0", "MAJ:MIN": "8:32", "PATH": "/dev/sdc", "DEVPATH": "/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:1:0/2:1:0:2/block/sdc", "FIRMWARE_VERSION": "8.32", "SERIAL": "600508b1001cb06cb204a5076403a165", "ID_PATH": "/dev/disk/by-id/wwn-0x600508b1001cb06cb204a5076403a165", "SIZE": "128001807360", "BLOCK_SIZE": "1024" },
{ "NAME": "sdd", "RO": "0", "RM": "0", "MODEL": "LOGICAL VOLUME", "ROTA": "0", "MAJ:MIN": "8:48", "PATH": "/dev/sdd", "DEVPATH": "/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:1:0/2:1:0:3/block/sdd", "FIRMWARE_VERSION": "8.32", "SERIAL": "600508b1001cb8d0e25eb4b0440928df", "ID_PATH": "/dev/disk/by-id/wwn-0x600508b1001cb8d0e25eb4b0440928df", "SIZE": "128001807360", "BLOCK_SIZE": "1024" },
{ "NAME": "sde", "RO": "0", "RM": "0", "MODEL": "LOGICAL VOLUME", "ROTA": "0", "MAJ:MIN": "8:64", "PATH": "/dev/sde", "DEVPATH": "/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:1:0/2:1:0:4/block/sde", "FIRMWARE_VERSION": "8.32", "SERIAL": "600508b1001c842fab611c5e6da4b0ca", "ID_PATH": "/dev/disk/by-id/wwn-0x600508b1001c842fab611c5e6da4b0ca", "SIZE": "128001807360", "BLOCK_SIZE": "1024" }
]
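
To make that output easier to read, here is a minimal sketch in Python (standard library only) that turns the script output into one short line per device. It assumes the output is a JSON list of objects with string values, as in my paste; `summarize_block_devices` and `QUOTE_FIXES` are hypothetical names of my own, and the sample keeps only three of the keys.

```python
import json

# Forum software tends to turn straight quotes into curly ones; map them back
# so text copied from a post parses as JSON again.
QUOTE_FIXES = str.maketrans({"\u201c": '"', "\u201d": '"'})


def summarize_block_devices(raw):
    """Return (name, model, size in GB) for each device in the script output."""
    devices = json.loads(raw.translate(QUOTE_FIXES))
    # SIZE is a string holding a byte count, so convert before dividing.
    return [(d["NAME"], d["MODEL"], int(d["SIZE"]) / 10**9) for d in devices]


# Abbreviated sample taken from the output above (only three keys kept).
sample = (
    '[{"NAME": "sda", "MODEL": "LOGICAL VOLUME", "SIZE": "128001807360"},'
    ' {"NAME": "sdb", "MODEL": "LOGICAL VOLUME", "SIZE": "128001807360"}]'
)

for name, model, size_gb in summarize_block_devices(sample):
    print(f"{name}: {model}, {size_gb:.1f} GB")
```

An empty list (`[]`) yields no lines at all, which would correspond to the case where no devices are reported.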

But the 500 is still there.

@antonyjohnson
Thank you for the guide. I will try it and post the results back.


#13

Any particular reason you’re using IP addresses tagged for public use?
20.1.0.0/24
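
For reference, a quick way to check whether a range is reserved for private use is Python's standard `ipaddress` module. This is just a small sketch; `describe` is a hypothetical helper name, and 10.1.0.0/24 is only an illustrative private range, not something taken from your setup.

```python
import ipaddress


def describe(cidr):
    """Label a network as 'private' or 'not private' using ipaddress."""
    net = ipaddress.ip_network(cidr)
    # is_private is True for RFC 1918 ranges and other reserved blocks.
    return "private" if net.is_private else "not private"


for cidr in ("20.1.0.0/24", "10.1.0.0/24"):
    print(f"{cidr}: {describe(cidr)}")
```

The first range is reported as not private, the second as private.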


#14

Nope, it is just a test environment, and on the other hand it is easy for me personally to remember … I set up networks with 10…, 20… and so forth.
If that were a real IP, I would think twice before posting it on a public forum :)


#15

I got the storage showing and the tests passing, but it is a bit odd.
How we got it to work: under “Default Ubuntu release used for commissioning” I changed the boot image to Ubuntu 16.04 LTS Xenial, and everything went green! I will do some more testing with the 18.04 image in the coming days.

Any idea or suggestion why one image is working and the other is not?