tl;dr:
Automated OS Image Testing uses temporal workflows to build images from the instructions in packer-maas, test them with our system-tests, and report the results of those tests to the results repo, which is mirrored on the results page. To add more tests, extend test_full_circle and add the matching reporting details to parse_test_results. To add more images, create the correct packer-maas instructions and insert their details into image_mapping.yaml.
Note: tests are slow. As a rough figure, allow 20-30 minutes to test each configuration of each feature, for every machine, for every image in a test. For example, ignoring setup overhead and assuming all tests pass, three architectures with one machine each, testing two features with three configurations each, come to 18 test runs, which should take between 6 and 9 hours per image tested.
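The arithmetic behind that estimate can be sketched in a few lines. This is only a back-of-the-envelope helper: the function name and its arguments are hypothetical, and the 20-30 minute range is the rough figure quoted above, not a measured value.

```python
def estimate_hours(architectures, machines_per_arch, features,
                   configs_per_feature, minutes_per_run=(20, 30)):
    """Return (runs, low_hours, high_hours) for a single image."""
    # Every configuration of every feature runs on every machine.
    runs = architectures * machines_per_arch * features * configs_per_feature
    low, high = minutes_per_run
    return runs, runs * low / 60, runs * high / 60


runs, low_h, high_h = estimate_hours(3, 1, 2, 3)
print(f"{runs} runs: {low_h:.0f}-{high_h:.0f} hours per image")
# prints "18 runs: 6-9 hours per image"
```

Multiply the result by the number of images to get a rough wall-clock figure for a whole request, unless the images are tested in parallel.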
Execution
The end-to-end pipeline is implemented as a single temporal workflow, which should be called when running image tests. It requests the correct jobs from Jenkins with the correct parameters, and passes the correct information to the child workflows to produce the desired set of results in our results repo.
How to start an image test
temporal
If you already have a temporal server you can request workflows on, you can skip this step.
If not, you can launch a local instance using temporal-server.
Quickly summarised:
- Download the binary from https://docs.temporal.io/cli#manual
- Extract the binary to /usr/local/bin to add it to your path
- Done
By default we use the image-testing namespace, so that we can coexist on other temporal servers. A local server with this namespace can be started with:
temporal server start-dev --namespace image-testing
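Before starting any workers it can be handy to confirm that something is listening on the server port. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and it only checks that a TCP connection can be opened on the default port 7233. It does not verify that the image-testing namespace exists.

```python
import socket


def server_reachable(host="localhost", port=7233, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    print("temporal server reachable:", server_reachable())
```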
worker(s)
Start the monolithic worker to execute all tasks in the image tests:
python3 monolithic_worker localhost:7233
You can additionally pass a namespace argument if you are not using the default image-testing namespace:
python3 monolithic_worker localhost:7233 --namespace <namespace>
Note:
If you would instead like fine-grained control over how many resources go to each workflow, you can start separate workers for each (e2e_worker, image_building_worker, image_testing_worker, image_reporting_worker). This additionally requires passing "use_seperate_queues": true when requesting image tests.
workflow
Requesting image tests requires only a single call to the python script test_images.py. Its help text describes each of the parameters (see below). A call may look like:
test_images.py centos7 --snap-channel 3.3/stable --url $jenkins_url --user $jenkins_user --pass $jenkins_pass
Help text is available in the usual way:
$ python3 test_images.py -h
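If you want to trigger image tests from another Python script, one option is to build the same command line and run it with subprocess. The sketch below uses only the flags shown in the example call above; the helper names are hypothetical, and it assumes test_images.py is in the current working directory.

```python
import subprocess
import sys


def build_test_command(image, snap_channel, url, user, password,
                       script="test_images.py"):
    """Build the argument list for one test_images.py call."""
    return [
        sys.executable, script, image,
        "--snap-channel", snap_channel,
        "--url", url,
        "--user", user,
        "--pass", password,
    ]


def run_image_test(*args, **kwargs):
    """Run the script and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_test_command(*args, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with URLs or passwords that contain special characters.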
Older method
Starting the workflow directly with temporal still works, i.e. calling e2e_workflow, for example:
temporal workflow start -t mono_queue --type e2e_workflow -w 'centos_tests' -i '{"image_name": ["centos7", "centos8"], "maas_snap_channel": "3.3/stable", "jenkins_url": "'"$jenkins_url"'", "jenkins_user": "'"$jenkins_user"'", "jenkins_pass": "'"$jenkins_pass"'"}'
where -t defines the task queue, --type defines the workflow being executed, -w defines the workflow id, and -i defines the input parameters to the workflow. -t must be mono_queue, as that is the task queue e2e_worker listens on by default. (If use_seperate_queues was set as True when calling an image test, -t should instead be set to e2e_tests.)
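Writing the -i JSON by hand is error-prone, because shell variables are not expanded inside single quotes and every string value must be quoted. A small sketch that builds the input with json.dumps and prints a correctly quoted command is shown here. The helper name and the placeholder Jenkins values are hypothetical; the task queue, workflow type and workflow id are taken from the text above.

```python
import json
import shlex


def workflow_start_command(params, task_queue="mono_queue",
                           workflow_id="centos_tests"):
    """Return the argv list for a `temporal workflow start` call."""
    return [
        "temporal", "workflow", "start",
        "-t", task_queue,
        "--type", "e2e_workflow",
        "-w", workflow_id,
        "-i", json.dumps(params),  # always produces valid JSON
    ]


params = {
    "image_name": ["centos7", "centos8"],
    "maas_snap_channel": "3.3/stable",
    "jenkins_url": "https://jenkins.example.invalid",  # placeholder
    "jenkins_user": "placeholder-user",                # placeholder
    "jenkins_pass": "placeholder-pass",                # placeholder
}

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(workflow_start_command(params)))
```

When separate queues are in use, pass task_queue="e2e_tests" and include "use_seperate_queues": True in params.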
The workflow will then run to completion, calling its child workflows to build, test, and report on the image. Navigating to the UI of your temporal server (if it has one) will show you the status as the image tests progress.
There are a number of parameters, both required and optional, that can be given:
Required
- image_name - The name, or list of names, of images to test.
- Jenkins details
  - jenkins_url - The url of the Jenkins server where image tests are located.
  - jenkins_user - The username to use to login to the Jenkins server.
  - jenkins_pass - The password to use to login to the Jenkins server.
Optional
- Filepaths
  - image_mapping - The filepath of the image mapping YAML distributed as part of MAAS-Integration-CI, defaults to image_mapping.yaml in the current working directory.
  - repo_location - The filepath of the location where the image results repo is to be cloned.
- Test instances
  - maas_snap_channel - The snap channel to use when installing MAAS in image tests, defaults to latest/edge.
  - system_test_repo - The url of the system-tests repo to use for building and testing images, defaults to https://git.launchpad.net/~maas-committers/maas-ci/+git/system-tests.
  - system_test_branch - The branch in the system-tests repo to use for building and testing images, defaults to master.
  - packer_maas_repo - The url of the packer-maas repo to use for building images, defaults to https://github.com/canonical/packer-maas.git.
  - packer_maas_branch - The branch in the packer-maas repo to use for building images, defaults to main.
  - parallel_tests - A flag to request a single image test build for all images, rather than a test build per image, defaults to False.
  - overwite_results - A flag to request that new results overwrite old results rather than being combined with them, defaults to False.
  - use_seperate_queues - A flag to determine whether we are using the mono queue or separate queues for each workflow, defaults to False.
- Retries
  - max_retry_attempts - How many times workflow activities should retry before throwing an exception, defaults to 10.
  - heartbeat_delay - How many seconds between heartbeats for long-running workflow activities, defaults to 15.
- Timeouts
  - Timeouts given are in seconds, and are passed to temporal as the start_to_close timeout, which defines the maximum execution time of a single invocation.
  - default_timeout - How long a workflow activity can run before being timed out, defaults to 300. This is used in place of any timeouts below that are not set.
  - jenkins_login_timeout - How long we wait to log into the Jenkins server.
  - return_status_timeout - How long we wait for an activity to fetch the status of a Jenkins build.
  - get_results_timeout - How long we wait for the results of a Jenkins build to be available.
  - fetch_results_timeout - How long we wait for an activity to fetch the results of a Jenkins build, and perform some operation on them.
  - log_details_timeout - How long we wait for an activity to fetch logs from a Jenkins build, and perform some operation on them.
  - request_build_timeout - How long we wait for an activity to request a Jenkins build.
  - build_complete_timeout - How long we wait for a Jenkins build to complete, defaults to 7200.
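The fallback rule for timeouts can be illustrated with a short sketch. This is not the project's implementation: the function name is hypothetical, and it assumes that build_complete_timeout keeps its own documented default of 7200 rather than falling back to default_timeout, which the description does not state explicitly.

```python
from datetime import timedelta

TIMEOUT_NAMES = (
    "jenkins_login_timeout",
    "return_status_timeout",
    "get_results_timeout",
    "fetch_results_timeout",
    "log_details_timeout",
    "request_build_timeout",
    "build_complete_timeout",
)


def resolve_timeouts(params):
    """Map each timeout name to a timedelta, applying the fallbacks."""
    default = params.get("default_timeout", 300)
    resolved = {}
    for name in TIMEOUT_NAMES:
        # Assumption: build_complete_timeout has its own default of 7200.
        fallback = 7200 if name == "build_complete_timeout" else default
        value = params.get(name)
        resolved[name] = timedelta(seconds=fallback if value is None else value)
    return resolved


timeouts = resolve_timeouts({"default_timeout": 60, "jenkins_login_timeout": 10})
print(timeouts["jenkins_login_timeout"], timeouts["get_results_timeout"])
# prints "0:00:10 0:01:00"
```

The resulting timedelta values are in the form temporal's Python SDK expects for its start-to-close timeout.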
Specific to the python call script:
- ip - The ip address of the temporal server, defaults to localhost.
- port - The port used to communicate with the temporal server, defaults to 7233.
- namespace - The namespace used on the temporal server, defaults to image-testing.
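To see how the required parameters and the documented defaults fit together, here is a minimal sketch that checks a parameter dictionary and fills in the defaults listed above. The helper name is hypothetical, and the real workflow may apply its defaults differently. Parameters without a documented default (such as repo_location) are left out.

```python
REQUIRED = ("image_name", "jenkins_url", "jenkins_user", "jenkins_pass")

# Defaults copied from the parameter descriptions above.
DOCUMENTED_DEFAULTS = {
    "image_mapping": "image_mapping.yaml",
    "maas_snap_channel": "latest/edge",
    "system_test_repo": "https://git.launchpad.net/~maas-committers/maas-ci/+git/system-tests",
    "system_test_branch": "master",
    "packer_maas_repo": "https://github.com/canonical/packer-maas.git",
    "packer_maas_branch": "main",
    "parallel_tests": False,
    "overwite_results": False,
    "use_seperate_queues": False,
    "max_retry_attempts": 10,
    "heartbeat_delay": 15,
    "default_timeout": 300,
    "build_complete_timeout": 7200,
}


def with_defaults(params):
    """Return a copy of params with the documented defaults filled in."""
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    # Values given by the caller take precedence over the defaults.
    return {**DOCUMENTED_DEFAULTS, **params}
```

For example, with_defaults({"image_name": "centos7", "jenkins_url": ..., "jenkins_user": ..., "jenkins_pass": ..., "maas_snap_channel": "3.3/stable"}) keeps the 3.3/stable channel and fills in everything else.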