Lifecycle Management

As an administrator, you want to deploy Cumulus Networks product software onto your network devices (servers, appliances, and switches) as efficiently as possible, with as much information about the process as possible. With this release, NetQ expands the lifecycle management (LCM) feature to include the discovery of Cumulus Linux switches that are not running NetQ, and a workflow for installing and upgrading NetQ on switches in the LCM inventory.

LCM enables you to:

  • Manage Cumulus Linux and Cumulus NetQ images in a local repository
  • Configure switch access credentials (required for installations and upgrades)
  • Manage Cumulus Linux switches
  • Create snapshots of the network state at various times
  • Create Cumulus NetQ configuration profiles
  • Upgrade Cumulus NetQ (Agents and CLI) on Cumulus Linux switches with Cumulus NetQ Agents version 2.4.x or later
  • Install or upgrade Cumulus NetQ (Agents and CLI) on Cumulus Linux switches with or without Cumulus NetQ Agents, all in a single job
  • Upgrade Cumulus Linux on switches with Cumulus NetQ Agents version 2.4.x or later (includes upgrade of NetQ to 3.0.0 or 3.1.0)

This feature is fully enabled for on-premises deployments and fully disabled for cloud deployments. Contact your local Cumulus Networks sales representative or submit a support ticket to activate LCM on cloud deployments.

Access Lifecycle Management Features

You can access the lifecycle management features from several places in NetQ. All of them take you to the same location:

  • Click Main Menu and select Upgrade Switches
  • If you have a workbench open:
    • Click Switches in the workbench header, then click Manage switches
    • Click Upgrade in the workbench header (this option is planned for removal in later releases)

The first time you open the Manage Switch Assets view, it provides a summary card for switch inventory, uploaded Cumulus Linux images, uploaded NetQ images, NetQ configuration profiles, and switch access settings. Additional cards appear after that based on your activity.

Manage Cumulus Linux and NetQ Images

You can manage both Cumulus Linux and Cumulus NetQ images with LCM. They are managed in a similar manner.

You can upload Cumulus Linux binary images to a local LCM repository to upgrade your switches, and Cumulus NetQ Debian packages to the local LCM repository for installation or upgrade. You can upload images from an external drive.

The Linux and NetQ images are available in several variants based on the software version (x.y.z), the CPU architecture (ARM, x86), platform (based on ASIC vendor, Broadcom or Mellanox), SHA checksum, and so forth. When LCM discovers Cumulus Linux switches running NetQ 2.x or later in your network, it extracts the metadata needed to select the appropriate image for a given switch. Similarly, LCM discovers and extracts the metadata from NetQ images.

The Cumulus Linux Images and NetQ Images cards provide a summary of image status in LCM. They show the total number of images in the repository, a count of missing images, and the starting points for adding and managing your images.

Default Cumulus Linux or Cumulus NetQ Version Assignment

You can assign a specific Cumulus Linux or Cumulus NetQ version as the default version to use during installation or upgrade of switches. It is recommended that you choose the newest version that you intend to install or upgrade on all, or the majority, of your switches. The default selection can be overridden during individual installation and upgrade job creation if an alternate version is needed for a given set of switches.

Missing Images

You should upload images for each variant of Cumulus Linux and Cumulus NetQ currently installed on the switches in your inventory if you want to support rolling back to a known good version should an installation or upgrade fail. LCM prompts you to upload any missing images to the repository.

For example, if you have both Cumulus Linux 3.7.3 and 3.7.11 versions, some running on ARM and some on x86 architectures, then LCM verifies the presence of each of these images. If only the 3.7.3 x86, 3.7.3 ARM, and 3.7.11 x86 images are in the repository, LCM would list the 3.7.11 ARM image as missing. For Cumulus NetQ, you need both the netq-apps and netq-agent packages for each release variant.
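
The check in this example amounts to a set difference between the image variants in use on your switches and the variants present in the repository. The following is a minimal sketch of that idea, not the LCM implementation: ImageVariant and missing_images are hypothetical names, and the sketch ignores the ASIC vendor, which also distinguishes real image variants.

```python
from typing import NamedTuple


class ImageVariant(NamedTuple):
    """Hypothetical record for one image variant (ASIC vendor omitted)."""
    version: str
    cpu_arch: str


def missing_images(inventory, repository):
    """Return variants in use on switches that have no image in the repository."""
    return sorted(set(inventory) - set(repository))


# Variants running on the switches, taken from the example above.
inventory = [
    ImageVariant("3.7.3", "x86"),
    ImageVariant("3.7.3", "ARM"),
    ImageVariant("3.7.11", "x86"),
    ImageVariant("3.7.11", "ARM"),
]

# Images already uploaded to the repository.
repository = [
    ImageVariant("3.7.3", "x86"),
    ImageVariant("3.7.3", "ARM"),
    ImageVariant("3.7.11", "x86"),
]

print(missing_images(inventory, repository))
```

With these inputs, the only variant reported as missing is 3.7.11 for ARM, matching the example.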

If you have specified a default Cumulus Linux and/or Cumulus NetQ version, LCM also verifies that the necessary versions of the default image are available based on the known switch inventory, and if not, lists those that are missing.

You are not required to upload the images that NetQ determines to be missing, but leaving them out may cause failures when you attempt to upgrade your switches.

Upload Images

For fresh installations of NetQ 3.x, no images have yet been uploaded to the LCM repository. If you are upgrading from NetQ 3.0.0, the Cumulus Linux images you have previously added are still present.

In preparation for Cumulus Linux upgrades, the recommended image upload flow is:

Step  Task                                                                   Instructions
1     In a fresh NetQ install, add images that match your current inventory  Upload Missing Images
2     Add images you want to use for upgrade                                 Upload Upgrade Images
3     Optionally specify a default version for upgrades                      Specify a Default Upgrade Version

In preparation for Cumulus NetQ installation or upgrade, the recommended image upload flow is:

Step  Task                                                              Instructions
1     Add images you want to use for installation or upgrade            Upload Upgrade Images
2     Add any missing images based on NetQ discovery                    Upload Missing Images
3     Optionally specify a default version for installation or upgrade  Specify a Default Upgrade Version

Upload Missing Images

Use the following instructions to upload missing images:

For Cumulus Linux images:

  1. On the Cumulus Linux Images card, click the View missing CL images link to see what images you need. This opens the list of missing images.

    If you have already specified a default image, you must click Manage and then Missing to see the missing images.

  2. Select one of the missing images and make note of the version, ASIC vendor, and CPU architecture.

  3. Click Add Image above the table.

  4. Provide the .bin file from an external drive that matches the criteria for the selected image, either by dragging and dropping it onto the dialog or by selecting it from a directory.

  5. Click Import.

    On successful completion, you receive confirmation of the upload.
    If the upload was not successful, an Image Import Failed message is shown. Close the Import Image dialog and try uploading the file again.

  6. Click Done.

  7. Click the Uploaded tab to verify the image is in the repository.

  8. Repeat Steps 1-7 until all of the missing images are uploaded to the repository. When all of the missing images have been uploaded, the Missing list is empty.

  9. Close the view to return to the LCM dashboard.

    The Cumulus Linux Images card now shows the number of images you uploaded.

For NetQ images:

  1. On the NetQ Images card, click the View missing NetQ images link to see what images you need. This opens the list of missing images.

    If you have already specified a default image, you must click Manage and then Missing to see the missing images.

  2. Select one of the missing images and make note of the OS version, CPU architecture, and image type. Remember that you need both image types for NetQ to perform the installation or upgrade.

  3. Click Add Image above the table.

  4. Provide the .deb file from an external drive that matches the criteria for the selected image, either by dragging and dropping it onto the dialog or by selecting it from a directory.

  5. Click Import.

    On successful completion, you receive confirmation of the upload.
    If the upload was not successful, an Image Import Failed message is shown. Close the Import Image dialog and try uploading the file again.

  6. Click Done.

  7. Click the Uploaded tab to verify the image is in the repository.

  8. Repeat Steps 1-7 until all of the missing images are uploaded to the repository. When all of the missing images have been uploaded, the Missing list is empty.

  9. Close the view to return to the LCM dashboard.

    The NetQ Images card now shows the number of images you uploaded.

Upload Upgrade Images

To upload the Cumulus Linux or Cumulus NetQ images that you want to use for upgrade:

  1. Click Add Image on the Cumulus Linux Images or NetQ Images card.

  2. Provide an image from an external drive, either by dragging and dropping it onto the dialog or by selecting it from a directory.

  3. Click Import.

  4. Monitor the progress until it completes. Click Done.

  5. Repeat Steps 1-4 to upload additional images as needed.

    For example, if you are upgrading switches with different ASIC vendors or CPU architectures, you will need more than one image. For NetQ, you need both the netq-apps and netq-agent packages for each variant.

Specify a Default Upgrade Version

Lifecycle management does not have a default Cumulus Linux or Cumulus NetQ upgrade version specified automatically. You must specify the version that is appropriate for your network.

To specify a default Cumulus Linux or Cumulus NetQ version:

  1. Click the Click here to set the default CL version link in the middle of the Cumulus Linux Images card, or click the Click here to set the default NetQ version link in the middle of the NetQ Images card.

  2. Select the version you want to use as the default for switch upgrades.

  3. Click Save. The default version is now displayed on the relevant Images card.

After you have specified a default version, you have the option to change it.

To change the default Cumulus Linux or Cumulus NetQ version:

  1. Click change next to the currently identified default image on the Cumulus Linux Images or NetQ Images card.

  2. Select the image you want to use as the default version for upgrades.

  3. Click Save.

Export Images

You can export the image listings for reference.

To export image listings:

  1. Open the LCM dashboard.

  2. Click Manage on the Cumulus Linux Images or NetQ Images card.

  3. Optionally, use the filter option above the table on the Uploaded tab to narrow down a large listing of images.

  4. Click the export icon above the table.

  5. Choose the export file type and click Export.

Remove Images from Local Repository

Once you have upgraded all of your switches beyond a particular release of Cumulus Linux or NetQ, you may want to remove those images from the LCM repository to save space on the server.

To remove images:

  1. Open the LCM dashboard.

  2. Click Manage on the Cumulus Linux Images or NetQ Images card.

  3. On the Uploaded tab, select the images you want to remove. Use the filter option above the table to narrow down a large listing of images.

  4. Click the delete icon.

Manage Switch Access Credentials

Switch access credentials are needed for performing installations and upgrades. You can choose between basic authentication (SSH password) and SSH (public/private key) authentication. These credentials apply to all switches. If you have switches with varying access credentials, you must work with one set at a time and change the credentials as needed.

Specify Switch Credentials

Switch access credentials are not specified by default. You must add these.

To specify access credentials:

  1. Open the LCM dashboard.

  2. Click the Click here to add switch access link on the Access card.

  3. Select the authentication method you want to use: SSH or Basic Authentication. Basic authentication is selected by default.

    For basic authentication:

    1. Enter a username.

    2. Enter a password.

    3. Click Save.

      The Access card now indicates your credential configuration.

    For SSH key authentication:

    You must have sudoer permission to properly configure switch access for the SSH Key method.

    1. Enter the username of the user that has access to the switches for configuration.

    2. Create a pair of SSH private and public keys.

      ssh-keygen -t rsa -C "<USER>"

    3. Copy the SSH public key to each switch that you want to upgrade using one of the following methods:

      • Manually copy the SSH public key to the /home/<USER>/.ssh/authorized_keys file on each switch, or
      • Run ssh-copy-id <USER>@<switch_ip> on the server where the SSH key pair was generated for each switch

    4. Copy the SSH private key into the text box in the Create Switch Access card.

      For security, your private key is stored in an encrypted format, and only provided to internal processes while encrypted.

      The Access card now indicates your credential configuration.

Modify Switch Credentials

You can modify your switch access credentials at any time. You can change between authentication methods or change values for either method.

To change your access credentials:

  1. Open the LCM dashboard.

  2. On the Access card, click the Click here to change access mode link in the center of the card.

  3. Select the authentication method you want to use: SSH or Basic Authentication. Basic authentication is selected by default.

  4. Based on your selection:

    • Basic: Enter a new username and/or password
    • SSH: Enter a new username and/or SSH private key

    Refer to Specify Switch Credentials for details.

  5. Click Save.

Manage Switches

This lifecycle management feature provides an inventory of switches that have been automatically discovered by NetQ and are available for software installation or upgrade through NetQ. This includes all Cumulus Linux switches in your network, with or without Cumulus NetQ Agent 2.4 or later installed. You assign network roles to switches and select switches for software installation and upgrade from this inventory listing.

A count of the switches NetQ was able to discover and the Cumulus Linux versions that are running on those switches is available from the LCM dashboard.

To view a list of all switches known to LCM, click Manage on the Switches card.

Review the list, filtering as needed (click Filter Switch List) to determine if the switches you want to upgrade are included.

If the switches you are looking to upgrade are not present in the final list, you can:

  • Work with the list you have and add them later
  • Verify the missing switches are reachable using ping
  • Verify NetQ Agent is fresh for switches that already have the agent installed (click Main Menu, then click Agents or run netq show agents)
  • Install NetQ on the switch (refer to Install NetQ)

After all of the switches you want to upgrade are contained in the list, you can assign roles to them.

Role Management

Four predefined switch roles are available based on a Clos architecture:

  • Superspine
  • Spine
  • Leaf
  • Exit

With this release, you cannot create your own roles.

Switch roles are used to:

  • Identify switch dependencies and determine the order in which switches are upgraded
  • Determine when to stop the process if a failure is encountered

When roles are assigned, the upgrade process begins with switches having the superspine role, then continues with the spine switches, leaf switches, exit switches, and finally switches with no role assigned. All switches with a given role must be successfully upgraded before the switches with the closest dependent role can be upgraded.

For example, suppose a group of seven switches is selected for upgrade. Three are spine switches and four are leaf switches. After all of the spine switches are successfully upgraded, the leaf switches are upgraded. If one of the spine switches fails to upgrade, the other two spine switches are still upgraded, but the upgrade process stops after that, leaving the leaf switches untouched, and the upgrade job fails. The spine switch that failed to upgrade is rolled back to its original release if that option is chosen in the upgrade job.

When only some of the selected switches have roles assigned in an upgrade job, the switches with roles are upgraded first and then all the switches with no roles assigned are upgraded.
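
The ordering and stop-on-failure behavior described above can be sketched as follows. This is an illustrative model only, not the LCM implementation: run_upgrade_job and the upgrade callback are hypothetical, switches in a role group are processed one at a time for simplicity, and rollback is not modeled.

```python
# Order described above; None stands for "no role assigned".
ROLE_ORDER = ["superspine", "spine", "leaf", "exit", None]


def run_upgrade_job(switches, upgrade):
    """Upgrade switches role by role.

    switches: dict mapping switch name -> role (or None).
    upgrade:  callable(name) -> bool, True when the upgrade succeeds.
    Returns (upgraded, failed, skipped) lists of switch names.
    """
    upgraded, failed = [], []
    remaining = dict(switches)
    for role in ROLE_ORDER:
        group = sorted(name for name, r in switches.items() if r == role)
        for name in group:
            remaining.pop(name)
            # Every switch in the current role group is attempted.
            (upgraded if upgrade(name) else failed).append(name)
        if failed:
            # A failure in this group stops the job before the next role.
            break
    return upgraded, failed, sorted(remaining)


# Seven switches as in the example: three spines, four leaves.
# The host names are made up for illustration.
job = {"spine01": "spine", "spine02": "spine", "spine03": "spine",
       "leaf01": "leaf", "leaf02": "leaf", "leaf03": "leaf", "leaf04": "leaf"}

result = run_upgrade_job(job, upgrade=lambda name: name != "spine02")
print(result)
```

In this run spine02 fails, so the other two spine switches are upgraded and all four leaf switches are left untouched.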

While role assignment is optional, using roles can prevent switches from becoming unreachable due to dependencies between switches or single attachments. And when MLAG pairs are deployed, switch roles avoid upgrade conflicts. For these reasons, Cumulus Networks highly recommends assigning roles to all of your switches.

Assign Switch Roles

  1. Open the LCM dashboard.

  2. On the Switches card, click Manage.

  3. Select one switch or multiple switches that should be assigned to the same role.

  4. Click Assign Role.

  5. Select the role that applies to the selected switch(es).

  6. Click Assign.

    Note that the Role column is updated with the role assigned to the selected switch(es).

  7. Continue selecting switches and assigning roles until most or all switches have roles assigned.

A bonus of assigning roles to switches is that you can then filter the list of switches by their roles by clicking the appropriate tab.

Change the Role of a Switch

If you accidentally assign an incorrect role to a switch, it can easily be changed to the correct role.

To change a switch role:

  1. Open the LCM dashboard.

  2. On the Switches card, click Manage.

  3. Select the switch(es) with the incorrect role from the list.

  4. Click Assign Role.

  5. Select the correct role.

  6. Click Assign.

Export List of Switches

Using the Switch Management feature you can export a listing of all or a selected set of switches.

To export the switch listing:

  1. Open the LCM dashboard.

  2. On the Switches card, click Manage.

  3. Select one or more switches, filtering as needed, or select all switches.

  4. Click the export icon.

  5. Choose the export file type and click Export.

Configuration Management

With the NetQ 3.1.0 release, you can set up a configuration profile to indicate how you want NetQ configured when it is installed or upgraded on your Cumulus Linux switches.

The default configuration profile, NetQ default config, is set up to run in the management VRF and provide info level logging. Both WJH and CPU Limiting are disabled.

You can view, add, and remove NetQ configuration profiles at any time.

View Cumulus NetQ Configuration Profiles

To view existing profiles:

  1. Click Switches in the workbench header, then click Manage switches, or click Main Menu and select Upgrade Switches.

  2. Click Manage on the NetQ Configurations card.

    Note that the initial value on first installation of NetQ shows one profile. This is the default profile provided with NetQ.

  3. Review the profiles.

Create Cumulus NetQ Configuration Profiles

You can specify four options when creating NetQ configuration profiles:

  • Basic: VRF assignment and Logging level
  • Advanced: CPU limit and What Just Happened (WJH)

To create a profile:

  1. Click Switches in the workbench header, then click Manage switches, or click Main Menu and select Upgrade Switches.

  2. Click Manage on the NetQ Configurations card.

  3. Click Add Config Profile.

  4. Enter a name for the profile.

  5. If you do not want NetQ Agent to run in the management VRF, select either Default or Custom. The Custom option lets you enter the name of a user-defined VRF.

  6. Optionally enable WJH.

    Refer to WJH for information about this feature. WJH is only available on Mellanox switches.

  7. To set a logging level, click Advanced, then choose the desired level.

  8. Optionally set a CPU usage limit for the NetQ Agent. Click Enable and drag the dot to the desired limit. Refer to this Knowledge Base article for information about this feature.

  9. Click Add to complete the configuration or Close to discard the configuration.

    This example shows the addition of a profile with the CPU limit set to 75 percent.
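
A configuration profile can be thought of as a small record with the four options listed above. The sketch below is one way to model it, assuming hypothetical field names; it is not a NetQ data structure or API. The defaults mirror the NetQ default config profile described earlier, and "mgmt" is assumed to be the name of the management VRF.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NetQConfigProfile:
    """Hypothetical model of a NetQ configuration profile."""
    name: str
    vrf: str = "mgmt"                        # assumed management VRF name
    log_level: str = "info"
    wjh_enabled: bool = False                # What Just Happened
    cpu_limit_percent: Optional[int] = None  # None means CPU limiting is off

    def __post_init__(self):
        limit = self.cpu_limit_percent
        if limit is not None and not 1 <= limit <= 100:
            raise ValueError("CPU limit must be a percentage from 1 to 100")


# The default profile, and a profile with the CPU limit set to 75 percent.
default_profile = NetQConfigProfile("NetQ default config")
cpu_limited = NetQConfigProfile("cpu-limit-75", cpu_limit_percent=75)
```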

Remove Cumulus NetQ Configuration Profiles

To remove a NetQ configuration profile:

  1. Click Switches in the workbench header, then click Manage switches, or click Main Menu and select Upgrade Switches.

  2. Click Manage on the NetQ Configurations card.

  3. Select the profile(s) you want to remove and click Delete.

Upgrade Cumulus NetQ

LCM enables you to upgrade to Cumulus NetQ 3.1.0 on switches with an existing NetQ Agent 2.4.x or 3.0.0 release. You can upgrade the entire application or only the NetQ Agent. Up to five jobs can be run simultaneously; however, a given switch can only be contained in one running job at a time.

The upgrade workflow includes the following steps:

Upgrades can be performed from NetQ 2.4.x and 3.0.0 releases to the NetQ 3.1.0 release. Lifecycle management does not support upgrades from NetQ 2.3.1 or earlier releases; you must perform a new installation in these cases.
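
The supported starting releases can be expressed as a simple predicate. This is a minimal sketch based only on the statement above; netq_upgrade_supported is a hypothetical helper, not part of NetQ.

```python
def netq_upgrade_supported(installed_version):
    """Return True if LCM can upgrade this NetQ release to 3.1.0.

    Per the text above, only 2.4.x and 3.0.0 qualify; 2.3.1 and
    earlier require a new installation.
    """
    parts = tuple(int(p) for p in installed_version.split("."))
    return parts[:2] == (2, 4) or parts == (3, 0, 0)


for version in ("2.3.1", "2.4.1", "3.0.0"):
    print(version, netq_upgrade_supported(version))
```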

Prepare for a Cumulus NetQ Upgrade

In preparation for Cumulus NetQ upgrade on switches, perform the following steps:

  1. Click Main Menu and select Upgrade Switches, or click Switches in the workbench header, then click Manage switches.

  2. Add the upgrade images.

  3. Optionally, specify a default upgrade version.

  4. Optionally, create a new configuration profile.

Your LCM dashboard should look similar to this after you have completed the above steps:

Perform a Cumulus NetQ Upgrade

To upgrade Cumulus NetQ on switches:

  1. Click Manage on the Switches card.

  2. Select the individual switches (or select all switches) with older NetQ releases that you want to upgrade. If needed, use the filter to narrow the listing and find the relevant switches.

  3. Click above the table.

    From this point forward, the software walks you through the upgrade process, beginning with a review of the switches that you selected for upgrade.

  4. Verify that the number of switches selected for upgrade matches your expectation.

  5. Enter a name for the upgrade job. The name can contain a maximum of 22 characters.

  6. Review each switch:

    • Is the NetQ version 2.4.x or 3.0.0? If not, this switch can only be upgraded through the switch discovery process.
    • Is the configuration profile the one you want to apply? If not, click Change config, then select an alternate profile to apply to all selected switches.

    You can apply different profiles to switches in a single upgrade job by selecting a subset of switches (click the checkbox for each switch) and then choosing a different profile. You can also change the profile on a per-switch basis by clicking the current profile link and selecting an alternate one.

    Scroll down to view all selected switches or use Search to find a particular switch of interest.

  7. After you are satisfied with the included switches, click Next.

  8. Review the summary indicating the number of switches and the configuration profile to be used. If either is incorrect, click Back and review your selections.

  9. Select the version of NetQ for upgrade. If you have designated a default version, keep the Default selection. Otherwise, select an alternate version by clicking Custom and selecting it from the list.

    By default, the NetQ Agent and CLI are upgraded on the selected switches. If you do not want to upgrade the NetQ CLI, click Advanced and change the selection to No.

  10. Click Next.

  11. Three checks are performed to eliminate preventable problems during the upgrade process.

    The first check verifies that the selected switches are not currently scheduled for, or in the middle of, a Cumulus Linux or NetQ upgrade. The second check verifies that the selected versions of Cumulus Linux and NetQ are valid upgrade paths. The final check verifies that all mandatory parameters have valid values.

    If any of the pre-checks fail, review the error messages and take appropriate action.

    If all of the pre-checks pass, click Upgrade to initiate the upgrade job.
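
The three pre-checks in step 11 can be pictured as a function that collects failure messages before a job is allowed to start. The sketch below is an assumption-laden illustration: the job dictionary layout, the field names, and run_prechecks are all hypothetical, and the messages are not the ones NetQ displays.

```python
def run_prechecks(job, busy_switches, valid_paths):
    """Return a list of failure messages; an empty list means all checks pass.

    job:           dict with 'name', 'switches' (name -> installed version)
                   and 'target_version' (hypothetical layout).
    busy_switches: names already scheduled for, or in, an upgrade.
    valid_paths:   set of (installed, target) version pairs.
    """
    failures = []
    switches = job.get("switches") or {}
    target = job.get("target_version")

    # Check 1: no selected switch is already scheduled or upgrading.
    for name in sorted(set(switches) & set(busy_switches)):
        failures.append(f"{name}: already scheduled or upgrading")

    # Check 2: installed version to target version is a valid upgrade path.
    for name, installed in sorted(switches.items()):
        if (installed, target) not in valid_paths:
            failures.append(
                f"{name}: no valid upgrade path from {installed} to {target}")

    # Check 3: every mandatory parameter has a value.
    for field in ("name", "switches", "target_version"):
        if not job.get(field):
            failures.append(f"missing mandatory parameter: {field}")
    return failures


job = {"name": "netq-upgrade-1",
       "switches": {"leaf01": "2.4.1", "leaf02": "2.3.1"},
       "target_version": "3.1.0"}
paths = {("2.4.1", "3.1.0"), ("3.0.0", "3.1.0")}
print(run_prechecks(job, busy_switches={"leaf01"}, valid_paths=paths))
```

Collecting every failure, rather than stopping at the first, lets you fix all reported problems before creating the job again.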

Analyze the NetQ Upgrade Results

After starting the upgrade you can monitor the progress from the preview page or the Upgrade History page.

From the preview page, a green circle with rotating arrows is shown on each switch as it is working. Alternately, you can close the detail of the job and see a summary of all current and past upgrade jobs on the NetQ Install and Upgrade History page. The job started most recently is shown at the top, and the data is refreshed periodically.

If you are disconnected while the job is in progress, it may appear as if nothing is happening. Try closing and reopening your view, or refreshing the page.

Monitor the NetQ Upgrade Job

Several viewing options are available for monitoring the upgrade job.

  • Monitor the job with full details open.

  • Monitor the job with only summary information in the NetQ Install and Upgrade History page. Open this view by closing the full details view. This is useful when you have multiple jobs running simultaneously.

  • Monitor the job through the NetQ Install and Upgrade History card on the LCM dashboard. Close the current view twice to return to the LCM dashboard.

Sample Successful NetQ Upgrade

This example shows that all four of the selected switches were upgraded successfully. You can see the results in the Switches list as well.

Sample Failed NetQ Upgrade

This example shows that an error has occurred trying to upgrade two of the four switches in a job. The error indicates that the access permissions for the switches are invalid. In this case, you need to modify the switch access credentials and then create a new upgrade job.

If you were watching this job from the LCM dashboard view, click View on the NetQ Install and Upgrade History card to return to the detailed view to resolve any issues that occurred.

Reasons for NetQ Upgrade Failure

Upgrades can fail at any of the stages of the process, including when backing up data, upgrading the Cumulus NetQ software, and restoring the data. Failures can occur when attempting to connect to a switch or perform a particular task on the switch.

Some of the common reasons for upgrade failures and the errors they present:

Reason                                                                   Error Message
Switch is not reachable via SSH                                          Data could not be sent to remote host “192.168.0.15”. Make sure this host can be reached over ssh: ssh: connect to host 192.168.0.15 port 22: No route to host
Switch is reachable, but user-provided credentials are invalid           Invalid/incorrect username/password. Skipping remaining 2 retries to prevent account lockout: Warning: Permanently added ‘<hostname-ipaddr>’ to the list of known hosts. Permission denied, please try again.
Switch is reachable, but a valid Cumulus Linux license is not installed  1587866683.880463 2020-04-26 02:04:43 license.c:336 CRIT No license file. No license installed!
Upgrade task could not be run                                            Failure message depends on why the task could not be run. For example: /etc/network/interfaces: No such file or directory
Upgrade task failed                                                      Failed at- <task that failed>. For example: Failed at- MLAG check for the peerLink interface status
Retry failed after five attempts                                         FAILED In all retries to process the LCM Job

Use Switch Discovery to Install and Upgrade NetQ

When you want to install or upgrade Cumulus NetQ on Cumulus Linux switches both with and without NetQ installed, use the LCM switch discovery feature. The feature browses your network to find all Cumulus Linux switches, with and without NetQ currently installed, and determines the versions of Cumulus Linux and NetQ installed. The results of switch discovery are then used to install or upgrade NetQ on all discovered switches in a single procedure rather than in two steps. Up to five jobs can be run simultaneously; however, a given switch can only be contained in one running job at a time.

The upgrade workflow includes the following steps:

Upgrades can be performed from NetQ 2.4.x and 3.0.0 releases to the NetQ 3.1.0 release. Lifecycle management does not support upgrades from NetQ 2.3.1 or earlier releases; you must perform a new installation in these cases.

If all of your Cumulus Linux switches already have NetQ 2.4.x or later installed, you can upgrade them directly. Refer to Upgrade Cumulus NetQ.
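
The job limits stated above (at most five jobs running at once, and a switch in only one running job at a time) can be sketched as an admission check. The function and data layout are hypothetical and only illustrate the rule; they do not describe how NetQ schedules jobs.

```python
MAX_RUNNING_JOBS = 5  # limit stated in the text above


def can_start_job(running_jobs, candidate_switches):
    """Decide whether a new job may start.

    running_jobs:       list of sets, one set of switch names per running job.
    candidate_switches: switch names selected for the new job.
    Returns (allowed, reason).
    """
    if len(running_jobs) >= MAX_RUNNING_JOBS:
        return False, "five jobs are already running"
    busy = set().union(*running_jobs) & set(candidate_switches)
    if busy:
        return False, f"already in a running job: {sorted(busy)}"
    return True, ""


running = [{"leaf01", "leaf02"}, {"spine01"}]
print(can_start_job(running, ["leaf03", "leaf04"]))
print(can_start_job(running, ["leaf02", "leaf03"]))
```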

To discover Cumulus Linux switches and install or upgrade NetQ on them:

  1. Click Main Menu and select Upgrade Switches, or click Switches in the workbench header, then click Manage switches.

  2. On the Switches card, click Discover.

  3. Enter a name for the scan.

  4. Choose whether you want to look for switches by entering IP address ranges or by importing switches using a comma-separated values (CSV) file.

    If you do not have a switch listing, then you can manually add the address ranges where your switches are located in the network. This has the advantage of catching switches that may have been missed in a file.

    A maximum of 50 addresses can be included in an address range. If necessary, break the range into smaller ranges.

    To discover switches using address ranges:

    1. Enter an IP address range in the IP Range field.

      Ranges can be contiguous, for example 192.168.0.24-64, or non-contiguous, for example 192.168.0.24-64,128-190,235, but they must be contained within a single subnet.

    2. Optionally, add another IP address range (in a different subnet).

      For example, 198.51.100.0-128 or 198.51.100.0-128,190,200-253.

    3. Add additional ranges as needed, and remove any range you no longer need.

    If you decide to use a CSV file instead, the ranges you entered will remain if you return to using IP ranges again.

    If you have a file of switches that you want to import, it can be easier to use that file than to enter the IP address ranges manually.

    To import switches through a CSV file:

    1. Click Browse.

    2. Select the CSV file containing the list of switches.

      The CSV file must include a header containing hostname, ip, and port. They can be in any order you like, but the data must match that order. For example, a CSV file that represents the Cumulus reference topology could look like this:

    or this:

    You must have an IP address in your file, but the hostname is optional. If the port is blank, NetQ uses switch port 22 by default.

    Click Remove if you decide to use a different file or want to use IP address ranges instead. Any ranges you entered before selecting the CSV file option are preserved.

  5. Note that the switch access credentials defined in Manage Switch Access Credentials are used to access these switches. If you have issues accessing the switches, you may need to update your credentials.

  6. Click Next.

    When the network discovery is complete, NetQ presents the number of Cumulus Linux switches it has found. They are displayed in categories:

    • Discovered without NetQ: Switches found without NetQ installed
    • Discovered with NetQ: Switches found with some version of NetQ installed
    • Discovered but Rotten: Switches found that are unreachable
    • Incorrect Credentials: Switches found that cannot be reached because the provided access credentials do not match those for the switches
    • OS not Supported: Switches found that are running a Cumulus Linux version not supported by the LCM upgrade feature
    • Not Discovered: IP addresses which did not have an associated Cumulus Linux switch

    If no switches are found for a particular category, that category is not displayed.

  7. Select which switches you want to upgrade from each category by clicking the checkbox on each switch card.

  8. Click Next.

  9. Verify that the number of switches identified for upgrade and the configuration profile to be applied are correct.

  10. Accept the default NetQ version or click Custom and select an alternate version.

  11. By default, the NetQ Agent and CLI are upgraded on the selected switches. If you do not want to upgrade the NetQ CLI, click Advanced and change the selection to No.

  12. Click Next.

  13. Three checks are performed to eliminate preventable problems during the install process.

    The first check verifies that the selected switches are not currently scheduled for, or in the middle of, a Cumulus Linux or NetQ upgrade. The second check verifies that the selected versions of Cumulus Linux and NetQ are valid upgrade paths. The final check verifies that all mandatory parameters have valid values.

    If any of the pre-checks fail, review the error messages and take appropriate action.

    If all of the pre-checks pass, click Install to initiate the job.

  14. Monitor the job progress.

    After starting the upgrade you can monitor the progress from the preview page or the Upgrade History page.

    From the preview page, a green circle with rotating arrows is shown on each switch while it is working. Alternatively, you can close the detail of the job and see a summary of all current and past upgrade jobs on the NetQ Install and Upgrade History page. The job started most recently is shown at the top, and the data is refreshed periodically.

    If you are disconnected while the job is in progress, it may appear as if nothing is happening. Try closing (click ) and reopening your view (click ), or refreshing the page.

    Several viewing options are available for monitoring the upgrade job.

    • Monitor the job with full details open:

    • Monitor the job with only summary information in the NetQ Install and Upgrade History page. Open this view by clicking in the full details view; useful when you have multiple jobs running simultaneously

    • Monitor the job through the NetQ Install and Upgrade History card on the LCM dashboard. Click twice to return to the LCM dashboard.

  15. Investigate any failures and create new jobs to reattempt the upgrade.
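
The three pre-checks described in step 13 can be pictured with a short Python sketch. This is only an illustration of the logic, not NetQ's implementation: the `InstallRequest` type, its field names, and the parameter names in the usage example are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class InstallRequest:
    """Hypothetical description of an install or upgrade job request."""
    switches: List[str]
    busy_switches: Set[str]             # switches already scheduled for, or in, an upgrade
    current_version: str
    target_version: str
    valid_paths: Set[Tuple[str, str]]   # (from, to) pairs assumed to be valid upgrade paths
    parameters: Dict[str, str]          # mandatory parameter name -> value


def run_prechecks(req: InstallRequest) -> List[str]:
    """Return a list of error messages; an empty list means all checks passed."""
    errors = []

    # Check 1: no selected switch is scheduled for, or in the middle of, an upgrade.
    busy = sorted(set(req.switches) & req.busy_switches)
    if busy:
        errors.append("switches already in an upgrade job: " + ", ".join(busy))

    # Check 2: the selected versions form a valid upgrade path.
    if (req.current_version, req.target_version) not in req.valid_paths:
        errors.append(
            "no valid upgrade path from %s to %s"
            % (req.current_version, req.target_version)
        )

    # Check 3: every mandatory parameter has a non-empty value.
    missing = sorted(name for name, value in req.parameters.items() if not value)
    if missing:
        errors.append("missing mandatory parameters: " + ", ".join(missing))

    return errors


if __name__ == "__main__":
    request = InstallRequest(
        switches=["leaf01", "leaf02"],
        busy_switches={"leaf02"},
        current_version="3.0.0",
        target_version="3.1.0",
        valid_paths={("3.0.0", "3.1.0")},
        parameters={"job_name": "install-1", "access_profile": ""},
    )
    for message in run_prechecks(request):
        print(message)
```

Collecting every failure before returning mirrors the guidance above: you review all error messages, correct them, and then start the job again.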

Upgrade Cumulus Linux

LCM enables you to upgrade Cumulus Linux on switches with an existing NetQ Agent 2.4.x or 3.0.0 release. As part of the Cumulus Linux upgrade, if a NetQ Agent 2.4.x release is installed, that is also upgraded. Up to five jobs can be run simultaneously; however, a given switch can only be contained in one running job at a time.
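
The two concurrency limits stated above (at most five simultaneous jobs, and a switch in only one running job at a time) amount to a simple admission rule. The sketch below is an illustration under assumed names; `can_start_job` and the job and switch names are hypothetical and do not correspond to any NetQ API.

```python
from typing import Dict, Set, Tuple

MAX_RUNNING_JOBS = 5  # limit stated in this section


def can_start_job(new_switches: Set[str],
                  running_jobs: Dict[str, Set[str]]) -> Tuple[bool, str]:
    """Decide whether a new job may start.

    running_jobs maps a running job's name to the switches it contains.
    """
    if len(running_jobs) >= MAX_RUNNING_JOBS:
        return False, "the maximum number of simultaneous jobs is already running"

    for job_name, switches in running_jobs.items():
        overlap = sorted(new_switches & switches)
        if overlap:
            return False, "%s already in running job %s" % (", ".join(overlap), job_name)

    return True, "ok"


if __name__ == "__main__":
    running = {"job-a": {"spine01", "spine02"}}
    print(can_start_job({"leaf01"}, running))
    print(can_start_job({"spine02", "leaf01"}, running))
```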

The upgrade workflow includes the following steps:

Upgrades can be performed between Cumulus Linux 3.x releases, and between Cumulus Linux 4.x releases. Lifecycle management does not support upgrades from Cumulus Linux 3.x to 4.x releases.
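
As a rough illustration of this rule, the following Python sketch accepts an upgrade only when both releases share the same major version (3 or 4). The function name is hypothetical, and the requirement that the target be newer than the current release is an assumption made for the example rather than something stated here.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '3.7.12' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_supported_upgrade(current: str, target: str) -> bool:
    """Return True for 3.x -> 3.x or 4.x -> 4.x upgrades only."""
    cur, tgt = parse_version(current), parse_version(target)
    same_major = cur[0] == tgt[0] and cur[0] in (3, 4)
    # Assumption: an upgrade must move to a newer release.
    return same_major and tgt > cur


if __name__ == "__main__":
    print(is_supported_upgrade("3.7.12", "3.7.13"))  # within 3.x
    print(is_supported_upgrade("3.7.13", "4.1.1"))   # 3.x to 4.x is not supported
```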

Prepare for a Cumulus Linux Upgrade

In preparation for switch installation or upgrade, first perform the following steps:

  1. Click Main Menu (Main Menu) and select Upgrade Switches, or click (Switches) in the workbench header, then click Manage switches.

  2. Upload the Cumulus Linux and NetQ upgrade images.

  3. Optionally, specify a default upgrade version.

  4. Verify the switches you want to manage are running NetQ Agent 2.4 or later. Refer to Switch Management.

  5. Optionally, create a new NetQ configuration profile.

  6. Configure switch access credentials.

  7. Assign each switch a role (optional, but recommended). Refer to Role Management.

Your LCM dashboard should look similar to this after you have completed these steps:

Perform a Cumulus Linux Upgrade

To upgrade switches:

  1. Click Main Menu (Main Menu) and select Upgrade Switches, or click (Switches) in the workbench header, then click Manage switches.

  2. Click Manage on the Switches card.

  3. Select the individual switches (or click to select all switches) that you want to upgrade. If needed, use the filter to narrow the listing and find the relevant switches.

  4. Click (Upgrade CL) above the table.

    From this point forward, the software walks you through the upgrade process, beginning with a review of the switches that you selected for upgrade.

  5. Give the upgrade job a name. This is required.

    The name can be a maximum of 22 characters and contain spaces and special characters.

  6. Verify that the switches you selected are included, and that they have the correct IP address and roles assigned.

    • If you accidentally included a switch that you do NOT want to upgrade, hover over the switch information card and click to remove it from the upgrade job.
    • If the role is incorrect or missing, click to select a role for that switch, then click . Click to discard a role change.

    In this example, some of the selected switches do not have roles assigned.

  7. When you are satisfied that the list of switches is accurate for the job, click Next.

  8. Verify that you want to use the default Cumulus Linux or NetQ version for this upgrade job. If not, click Custom and select an alternate image from the list.

    Default CL Version Selected

    Custom CL Version Selected

  9. Note that the switch access authentication method, Using global access credentials, indicates you have chosen either basic authentication with a username and password or SSH key-based authentication for all of your switches. Authentication on a per switch basis is not currently available.

  10. Click Next.

  11. Verify the upgrade job options.

    By default, NetQ takes a network snapshot before the upgrade and another after the upgrade is complete. It also rolls back to the original Cumulus Linux version on any switch that fails to upgrade.

    You can exclude selected services and protocols from the snapshots. By default, node and services are included, but you can deselect any of the other items. Click on one to remove it; click again to include it. This is helpful when you are not running a particular protocol or you have concerns about the amount of time it will take to run the snapshot. Note that removing services or protocols from the job may produce results that are not equivalent to prior snapshots.

    While these options provide a smoother upgrade process and are highly recommended, you can disable them by clicking No next to one or both options.

  12. Click Next.

  13. After the pre-checks have completed successfully, click Preview.

    If one or more of the pre-checks fail, resolve the related issue and start the upgrade again. Expand the following dropdown to view common failures, their causes and corrective actions.

    Pre-check Failure Messages

  14. Review the job preview.

    • When all of your switches have roles assigned, this view displays the chosen job options (top center), the pre-checks status (top right and left in Pre-Upgrade Tasks), the order in which the switches are planned for upgrade (center; upgrade starts from the left), and the post-upgrade tasks status (right).

      Roles assigned

    • When none of your switches have roles assigned (or they are all of the same role), this view displays the chosen job options (top center), the pre-checks status (top right and left in Pre-Upgrade Tasks), a list of switches planned for upgrade (center), and the post-upgrade tasks status (right).

      No roles assigned

    • When some of your switches have roles assigned, any switches without roles are upgraded last and are grouped under the label Stage1.

      Some roles assigned

  15. When you are happy with the job specifications, click Start Upgrade.

  16. Confirm the upgrade request.
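
The job options gathered in steps 5 and 11 (a required name of at most 22 characters, plus snapshot and roll back options that are enabled by default) can be modeled as a small data structure. This is a sketch for illustration only; the class and field names are hypothetical.

```python
from dataclasses import dataclass

MAX_NAME_LENGTH = 22  # maximum job name length given in step 5


@dataclass
class UpgradeJobOptions:
    """Hypothetical container for the options chosen in the upgrade workflow."""
    name: str
    take_snapshots: bool = True        # snapshot before and after the upgrade
    roll_back_on_failure: bool = True  # restore the original release on failure

    def __post_init__(self) -> None:
        # The name is required; spaces and special characters are allowed.
        if not self.name:
            raise ValueError("a job name is required")
        if len(self.name) > MAX_NAME_LENGTH:
            raise ValueError(
                "job name exceeds %d characters" % MAX_NAME_LENGTH
            )


if __name__ == "__main__":
    job = UpgradeJobOptions(name="Spine upgrade #1")
    print(job)
```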

Analyze Cumulus Linux Results

After starting the upgrade you can monitor the progress from the preview page or the Upgrade History page.

From the preview page, a green circle with rotating arrows is shown above each set of switches (if roles are configured) and on each switch while the job is working. Alternatively, you can close the detail of the job and see a summary of all current and past upgrade jobs on the Upgrade History page. The job started most recently is shown at the top, and the data is refreshed periodically.

Switches are displayed in the order of upgrade, by role/category and within roles/categories. Switches that are planned for upgrade first are listed first. You can scroll down within a role or category to see the additional switches to be upgraded.
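
One way to picture this ordering is as a sort by role first and by switch second, with switches that have no role placed last (as noted in the job preview). The sketch below is illustrative only: the role order is passed in as a parameter because it is not defined in this section, and sorting by hostname within a role is an assumption.

```python
from typing import List, Optional, Sequence, Tuple


def plan_upgrade_order(switches: Sequence[Tuple[str, Optional[str]]],
                       role_order: Sequence[str]) -> List[str]:
    """Order switches by role, then hostname; switches without a role go last.

    switches is a sequence of (hostname, role) pairs, where role may be None.
    """
    rank = {role: position for position, role in enumerate(role_order)}
    last = len(role_order)  # rank used for switches with no (or an unknown) role

    def sort_key(item: Tuple[str, Optional[str]]) -> Tuple[int, str]:
        hostname, role = item
        return (rank.get(role, last), hostname)

    return [hostname for hostname, _ in sorted(switches, key=sort_key)]


if __name__ == "__main__":
    selected = [("leaf01", "leaf"), ("spine01", "spine"),
                ("sw09", None), ("spine02", "spine")]
    # Hypothetical role order for the example.
    print(plan_upgrade_order(selected, ["spine", "leaf"]))
```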

If you are disconnected while the job is in progress, it may appear as if nothing is happening. Try closing (click ) and reopening your view (click ), or refreshing the page.

Monitoring the Cumulus Linux Upgrade

Several viewing options are available for monitoring the upgrade job.

  • Monitor the job with full details open:

    Single role or no roles

    Multiple roles and some without roles

    Each switch goes through a number of steps. To view these steps, click and scroll down as needed. Click to close the detail.

  • Monitor the job with summary information only in the CL Upgrade History page. Open this view by clicking in the full details view:

    This view is refreshed automatically. Click to view what stage the job is in.

    Click to view the detailed view.

  • Monitor the job through the CL Upgrade History card on the LCM dashboard. Click twice to return to the LCM dashboard. As you perform more upgrades the graph displays the success and failure of each job.

    Click View to return to the Upgrade History page as needed.

After either a successful or failed upgrade attempt has been performed, the CL Upgrade History card is updated on your LCM dashboard.

Sample Successful Upgrade

On successful completion, you can:

  • Compare the network snapshots taken before and after the upgrade.

    Click Compare Snapshots in the detail view.

    Refer to Interpreting the Comparison Data for information about analyzing these results.

  • Download details about the upgrade in the form of a JSON-formatted file, by clicking Download Report.

  • View the changes on the Switches card of the LCM dashboard.

    Click (Switches) in the workbench header, then click Manage switches.

    In our example, all spine switches have been upgraded to Cumulus Linux 3.7.13. Leaf and other switches have not been upgraded, so both Cumulus Linux versions 3.7.12 and 3.7.13 are shown.

Upgrades can be considered successful and still have post-check warnings. For example, the OS has been updated, but not all services are fully up and running after the upgrade. If one or more of the post-checks fail, warning messages are provided in the Post-Upgrade Tasks section of the preview. Click on the warning category to view the detailed messages. Sometimes waiting another few minutes will clear service-related warnings.

Expand the following dropdown to view common failures, their causes and corrective actions.

Post-check Failure Messages

Sample Failed Upgrade

If an upgrade job fails for any reason, you can view the associated error(s):

  1. From the Upgrade History dashboard, find the job of interest.

  2. Click .

  3. Click .

    In this example, all of the pre-upgrade tasks were successful, but the spine switches were unreachable. A check of the switch status showed that they were rotten.

  4. Double-click on an error to view a more detailed error message.

    This example shows that the upgrade failure was due to bad switch access credentials. You would need to fix those and then create a new upgrade job.

    This example shows that only one spine switch was upgraded; the other three failed to upgrade and also failed to roll back to the original release.

Reasons for Upgrade Failure

Upgrades can fail at any of the stages of the process, including when backing up data, upgrading the Cumulus Linux software, and restoring the data. Failures can occur when attempting to connect to a switch or perform a particular task on the switch.

Some of the common reasons for upgrade failures and the errors they present:

Reason | Error Message
Switch is not reachable via SSH | Data could not be sent to remote host “192.168.0.15”. Make sure this host can be reached over ssh: ssh: connect to host 192.168.0.15 port 22: No route to host
Switch is reachable, but user-provided credentials are invalid | Invalid/incorrect username/password. Skipping remaining 2 retries to prevent account lockout: Warning: Permanently added ‘<hostname-ipaddr>’ to the list of known hosts. Permission denied, please try again.
Switch is reachable, but a valid Cumulus Linux license is not installed | 1587866683.880463 2020-04-26 02:04:43 license.c:336 CRIT No license file. No license installed!
Upgrade task could not be run | Failure message depends on why the task could not be run. For example: /etc/network/interfaces: No such file or directory
Upgrade task failed | Failed at- <task that failed>. For example: Failed at- MLAG check for the peerLink interface status
Retry failed after five attempts | FAILED In all retries to process the LCM Job
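
When reviewing many failed jobs, it can help to map an error message back to one of the reasons listed above. The following Python sketch does this by substring matching on fragments taken from those sample messages. It is an illustration only: the function name is hypothetical, actual message wording may vary between releases, and messages for tasks that could not be run are too varied to match, so they fall through as unclassified.

```python
from typing import List, Tuple

# (fragment of the error message, reason) pairs taken from the list above.
FAILURE_PATTERNS: List[Tuple[str, str]] = [
    ("Data could not be sent to remote host",
     "Switch is not reachable via SSH"),
    ("Invalid/incorrect username/password",
     "Switch is reachable, but user-provided credentials are invalid"),
    ("No license installed",
     "Switch is reachable, but a valid Cumulus Linux license is not installed"),
    ("FAILED In all retries",
     "Retry failed after five attempts"),
    ("Failed at-",
     "Upgrade task failed"),
]


def classify_failure(message: str) -> str:
    """Return the first matching reason, or 'Unclassified' if none matches."""
    for fragment, reason in FAILURE_PATTERNS:
        if fragment in message:
            return reason
    return "Unclassified"


if __name__ == "__main__":
    sample = "Failed at- MLAG check for the peerLink interface status"
    print(classify_failure(sample))
```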

Create and Compare Network Snapshots

Creating and comparing network snapshots can be useful to validate that the network state has not changed. Snapshots are typically created when you upgrade or change the configuration of your switches in some way. This section describes the Snapshot card and content, as well as how to create and compare network snapshots at any time. Snapshots can be automatically created during the upgrade process for Cumulus Linux or NetQ. Refer to Image Installation and Upgrade.

Create a Network Snapshot

The snapshot feature lets you capture the state of your network as it is now or as it was at a time in the past.

To create a network snapshot:

  1. From any workbench, click in the workbench header.

  2. Click Create Snapshot.

  3. Enter a name for the snapshot.

  4. Choose the time for the snapshot:

    • For the current network state, click Now.

    • For the network state at a previous date and time, click Past, then click in the Start Time field to use the calendar to step through selection of the date and time. You may need to scroll down to see the entire calendar.

  5. Choose the services to include in the snapshot.

    In the Choose options field, click any service name to remove that service from the snapshot. This is appropriate if you do not support a particular service, or you are concerned that including it might cause the snapshot to take an excessive amount of time to complete. When a service is removed, the service name and the checkmark next to it are grayed out. Click the service again to include it in the snapshot; the service name is no longer grayed out and the checkmark next to it is highlighted in green.

    The Node and Services options are mandatory, and cannot be selected or unselected.

    If you remove services, be aware that snapshots taken in the past or future may not be equivalent when performing a network state comparison.

    This example removes the OSPF and Route services from the snapshot being created.

  6. Optionally, scroll down and click in the Notes field to add descriptive text for the snapshot to remind you of its purpose. For example: “This was taken before adding MLAG pairs,” or “Taken after removing the leaf36 switch.”

  7. Click Finish.

    A medium Snapshot card appears on your desktop. Spinning arrows are visible while it works. When it finishes you can see the number of items that have been captured, and if any failed. This example shows a successful result.

    If you have already created other snapshots, Compare is active. Otherwise it is inactive (grayed out).

Click Dismiss to close the snapshot. The snapshot is not deleted, merely removed from the workbench.
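
The click-to-toggle behavior of the snapshot options in step 5 can be sketched as a set operation in which the mandatory Node and Services options are never removed. This is an illustration only: the function name is hypothetical, and the option list is an assumption based on the element types that appear in snapshot comparisons.

```python
from typing import FrozenSet, Iterable, Set

# Assumed option list, based on the element types shown in snapshot comparisons.
SNAPSHOT_OPTIONS: FrozenSet[str] = frozenset({
    "BGP", "CLAG", "Interface", "IP Address", "Links", "LLDP", "MAC Address",
    "Neighbor", "Node", "OSPF", "Route", "Sensors", "Services",
})
MANDATORY: FrozenSet[str] = frozenset({"Node", "Services"})


def toggle_option(selected: Iterable[str], option: str) -> Set[str]:
    """Return the selection after clicking one option.

    Clicking removes an included option or re-includes a removed one.
    Mandatory options are left unchanged.
    """
    if option not in SNAPSHOT_OPTIONS:
        raise ValueError("unknown snapshot option: %s" % option)
    current = set(selected)
    if option in MANDATORY:
        return current
    return current ^ {option}  # symmetric difference flips membership


if __name__ == "__main__":
    selection = set(SNAPSHOT_OPTIONS)
    for name in ("OSPF", "Route"):  # the services removed in the example above
        selection = toggle_option(selection, name)
    print(sorted(selection))
```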

Compare Network Snapshots

You can compare the state of your network before and after an upgrade or other configuration change to validate that the changes have not created an unwanted change in your network state.

To compare network snapshots:

  1. Create a snapshot (as described in the previous section) before you make any changes.

  2. Make your changes.

  3. Create a second snapshot.

  4. Compare the results of the two snapshots.

    Depending on what, if any, cards are open on your workbench:

    • If you have the two desired snapshot cards open:

      • Simply put them next to each other to view a high-level comparison.
      • Scroll down to see all of the items.
      • To view a more detailed comparison, click Compare on one of the cards. Select the other snapshot from the list.
    • If you have only one of the cards open:

      • Click Compare on the open card.
      • Select the other snapshot to compare.
    • If no snapshot cards are open (you may have created them some time before):

      • Click .
      • Click Compare Snapshots.
      • Click on the two snapshots you want to compare.
      • Click Finish. Note that two snapshots must be selected before Finish is active.

    In the latter two cases, the large Snapshot card opens. The only difference is in the card title. If you opened the comparison card from a snapshot on your workbench, the title includes the name of that card. If you open the comparison card through the Snapshot menu, the title is generic, indicating a comparison only. Functionally, you have reached the same point.

    Scroll down to view all element comparisons.

Interpreting the Comparison Data

For each network element that is compared, count values and changes are shown:

In this example, a change was made to the VLAN. The snapshot taken before the change (17Apr2020) had a total count of 765 neighbors. The snapshot taken after the change (20Apr2020) had a total count of 771 neighbors. Between the two totals you can see the number of neighbors added and removed from one time to the next, resulting in six new neighbors after the change.

The red and green coloring indicates only that items were removed (red) or added (green). The coloring does not indicate whether the removal or addition of these items is bad or good.
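
The relationship between the four numbers shown for each element (count before, items removed, items added, count after) can be expressed with set differences. The sketch below is a minimal illustration with made-up item names; it does not reflect how NetQ stores or compares snapshots.

```python
from typing import Dict, Iterable


def compare_counts(before: Iterable[str], after: Iterable[str]) -> Dict[str, int]:
    """Summarize one element type across two snapshots."""
    before_set, after_set = set(before), set(after)
    return {
        "before": len(before_set),
        "removed": len(before_set - after_set),  # shown in red
        "added": len(after_set - before_set),    # shown in green
        "after": len(after_set),
    }


if __name__ == "__main__":
    # Hypothetical neighbor identifiers for two snapshots.
    first = {"n1", "n2", "n3"}
    second = {"n2", "n3", "n4", "n5"}
    summary = compare_counts(first, second)
    print(summary)
    # The counts always satisfy: before - removed + added == after
    assert summary["before"] - summary["removed"] + summary["added"] == summary["after"]
```

In the example above, the same identity holds: the net difference between the two totals (771 minus 765) equals the number of neighbors added minus the number removed.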

From this card, you can also change which snapshots to compare. Select an alternate snapshot from one of the two snapshot dropdowns and then click Compare.

View Change Details

You can view additional details about the changes that have occurred between the two snapshots by clicking View Details. This opens the full screen Detailed Snapshot Comparison card.

From this card you can:

  • View changes for each of the elements that had added and/or removed items, and various information about each; only elements with changes are presented
  • Filter the added and removed items by clicking
  • Export all differences in JSON file format by clicking

The following table describes the information provided for each element type when changes are present:

Element | Data Descriptions
BGP
  • Hostname: Name of the host running the BGP session
  • VRF: Virtual route forwarding interface if used
  • BGP Session: Session that was removed or added
  • ASN: Autonomous system number
CLAG
  • Hostname: Name of the host running the CLAG session
  • CLAG Sysmac: MAC address for a bond interface pair that was removed or added
Interface
  • Hostname: Name of the host where the interface resides
  • IF Name: Name of the interface that was removed or added
IP Address
  • Hostname: Name of the host where address was removed or added
  • Prefix: IP address prefix
  • Mask: IP address mask
  • IF Name: Name of the interface that owns the address
Links
  • Hostname: Name of the host where the link was removed or added
  • IF Name: Name of the link
  • Kind: Bond, bridge, eth, loopback, macvlan, swp, vlan, vrf, or vxlan
LLDP
  • Hostname: Name of the discovered host that was removed or added
  • IF Name: Name of the interface
MAC Address
  • Hostname: Name of the host where MAC address resides
  • MAC address: MAC address that was removed or added
  • VLAN: VLAN associated with the MAC address
Neighbor
  • Hostname: Name of the neighbor peer that was removed or added
  • VRF: Virtual route forwarding interface if used
  • IF Name: Name of the neighbor interface
  • IP address: Neighbor IP address
Node
  • Hostname: Name of the network node that was removed or added
OSPF
  • Hostname: Name of the host running the OSPF session
  • IF Name: Name of the associated interface that was removed or added
  • Area: Routing domain for this host device
  • Peer ID: Network subnet address of router with access to the peer device
Route
  • Hostname: Name of the host running the route that was removed or added
  • VRF: Virtual route forwarding interface associated with route
  • Prefix: IP address prefix
Sensors
  • Hostname: Name of the host where sensor resides
  • Kind: Power supply unit, fan, or temperature
  • Name: Name of the sensor that was removed or added
Services
  • Hostname: Name of the host where service is running
  • Name: Name of the service that was removed or added
  • VRF: Virtual route forwarding interface associated with service

Manage Network Snapshots

You can create as many snapshots as you like and view them at any time. When a snapshot becomes old and no longer useful, you can remove it.

To view an existing snapshot:

  1. From any workbench, click in the workbench header.

  2. Click View/Delete Snapshots.

  3. Click View.

  4. Click one or more snapshots you want to view, then click Finish.

    Click Back or Choose Action to cancel viewing of your selected snapshot(s).

To remove an existing snapshot:

  1. From any workbench, click in the workbench header.

  2. Click View/Delete Snapshots.

  3. Click Delete.

  4. Click one or more snapshots you want to remove, then click Finish.

    Click Back or Choose Action to cancel the deletion of your selected snapshot(s).