Monitor Network Elements

In addition to network performance monitoring, the Cumulus NetQ UI provides a tabular, network-wide view of the current status and configuration of the network elements. These views are helpful when you want to see all data for every instance of a particular element in your network for troubleshooting, or when you want to export a list view.

Some of these views provide data that is also available through the card workflows, but these views are not treated like cards. They provide only the current status; you cannot change the time period of the views or graph the data within the UI.

Access these tables through the Main Menu, under the Network heading.

If you do not have administrative rights, the Admin menu options are not available to you.

You can manipulate the tables using the settings above them, as described in Table Settings.

Pagination options are shown when there are more than 25 results.

View All NetQ Agents

The Agents view provides all available parameter data about all NetQ Agents in the system.

Parameter: Description
Hostname: Name of the switch or host
Timestamp: Date and time the data was captured
Last Reinit: Date and time that the switch or host was reinitialized
Last Update Time: Date and time that the switch or host was updated
Lastboot: Date and time that the switch or host was last booted up
NTP State: Status of NTP synchronization on the switch or host; yes = in synchronization, no = out of synchronization
Sys Uptime: Amount of time the switch or host has been continuously up and running
Version: NetQ version running on the switch or host
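
As an illustration of working with an exported list, the following minimal Python sketch picks out the hosts whose NTP State is no. It assumes the export is a CSV file whose column headers match the parameter names above; the sample rows and the function name out_of_sync_hosts are hypothetical, and the real export format may differ.

```python
import csv
import io

# Hypothetical export of the Agents view. The header names mirror the
# parameters listed above; the actual export layout is an assumption.
SAMPLE_EXPORT = """Hostname,NTP State,Version
leaf01,yes,2.4.0
leaf02,no,2.4.0
spine01,yes,2.3.1
"""


def out_of_sync_hosts(csv_text):
    """Return the hostnames whose NTP State column reads 'no'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Hostname"]
        for row in reader
        if row["NTP State"].strip().lower() == "no"
    ]


print(out_of_sync_hosts(SAMPLE_EXPORT))
```

The same pattern applies to any other column in the view, for example listing hosts by Version.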

View All Events

The Events view provides all available parameter data about all events in the system.

Parameter: Description
Hostname: Name of the switch or host that experienced the event
Timestamp: Date and time the event was captured
Message: Description of the event
Message Type: Network service or protocol that generated the event
Severity: Importance of the event. Values include critical, warning, info, and debug.
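
Once the events list is exported, a quick tally by severity can help you decide where to look first. The sketch below assumes each event is available as a dictionary keyed by the parameter names above; the sample rows, including the Message Type values, are hypothetical.

```python
from collections import Counter

# Severity values named in the Events view, from most to least important.
SEVERITY_ORDER = ["critical", "warning", "info", "debug"]

# Hypothetical rows; real data would come from an exported Events list.
SAMPLE_EVENTS = [
    {"Hostname": "leaf01", "Message Type": "bgp", "Severity": "critical"},
    {"Hostname": "leaf01", "Message Type": "ntp", "Severity": "warning"},
    {"Hostname": "spine02", "Message Type": "bgp", "Severity": "critical"},
    {"Hostname": "leaf03", "Message Type": "agent", "Severity": "info"},
]


def count_by_severity(rows):
    """Return a dict of severity -> event count, in SEVERITY_ORDER."""
    counts = Counter(row["Severity"].lower() for row in rows)
    return {level: counts.get(level, 0) for level in SEVERITY_ORDER}


print(count_by_severity(SAMPLE_EVENTS))
```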

View All MACs

The MACs (media access control addresses) view provides all available parameter data about all MAC addresses in the system.

Parameter: Description
Hostname: Name of the switch or host where the MAC address resides
Timestamp: Date and time the data was captured
Egress Port: Port where traffic exits the switch or host
Is Remote: Indicates if the address is
Is Static: Indicates if the address is a static (true) or dynamic assignment (false)
MAC Address: MAC address
Nexthop: Next hop for traffic hitting this MAC address on this switch or host
Origin: Indicates if address is owned by this switch or host (true) or by a peer (false)
VLAN: VLAN associated with the MAC address, if any

View All VLANs

The VLANs (virtual local area networks) view provides all available parameter data about all VLANs in the system.

Parameter: Description
Hostname: Name of the switch or host where the VLAN(s) reside(s)
Timestamp: Date and time the data was captured
If Name: Name of interface used by the VLAN(s)
Last Changed: Date and time when this information was last updated
Ports: Ports on the switch or host associated with the VLAN(s)
SVI: Switch virtual interface associated with a bridge interface
VLANs: VLANs associated with the switch or host

View IP Routes

The IP Routes view provides all available parameter data about all IP routes. The list of routes can be filtered to view only the IPv4 or IPv6 routes by selecting the relevant tab.

Parameter: Description
Hostname: Name of the switch or host where the route resides
Timestamp: Date and time the data was captured
Is IPv6: Indicates if the address is an IPv6 (true) or IPv4 (false) address
Message Type: Network service or protocol; always Route in this table
Nexthops: Possible ports/interfaces where traffic can be routed to next
Origin: Indicates if this switch or host is the source of this route (true) or not (false)
Prefix: IPv4 or IPv6 address prefix
Priority: Rank of this route to be used before another, where the lower the number, the less likely the route is to be used; value determined by routing protocol
Protocol: Protocol responsible for this route
Route Type: Type of route
Rt Table ID: The routing table identifier where the route resides
Src: Prefix of the address where the route is coming from (the previous hop)
VRF: Virtual route interface associated with this route
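
The IPv4 and IPv6 tabs correspond to the Is IPv6 value of each route. As a rough sketch of that split, the following Python uses the standard ipaddress module to derive the address family from a Prefix value. The function names and sample prefixes are hypothetical, and this is not a description of how NetQ itself computes the field.

```python
import ipaddress


def is_ipv6_prefix(prefix):
    """Return True for an IPv6 prefix and False for an IPv4 prefix."""
    # strict=False tolerates prefixes that have host bits set.
    return ipaddress.ip_network(prefix, strict=False).version == 6


def split_by_family(prefixes):
    """Group prefixes the way the IPv4 and IPv6 tabs do."""
    tabs = {"IPv4": [], "IPv6": []}
    for prefix in prefixes:
        tabs["IPv6" if is_ipv6_prefix(prefix) else "IPv4"].append(prefix)
    return tabs


# Hypothetical Prefix values from an exported IP Routes list.
print(split_by_family(["10.0.0.0/24", "2001:db8::/32", "0.0.0.0/0"]))
```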

View IP Neighbors

The IP Neighbors view provides all available parameter data about all IP neighbors. The list of neighbors can be filtered to view only the IPv4 or IPv6 neighbors by selecting the relevant tab.

Parameter: Description
Hostname: Name of the neighboring switch or host
Timestamp: Date and time the data was captured
IF Index: Index of interface used to communicate with this neighbor
If Name: Name of interface used to communicate with this neighbor
IP Address: IPv4 or IPv6 address of the neighbor switch or host
Is IPv6: Indicates if the address is an IPv6 (true) or IPv4 (false) address
Is Remote: Indicates if the address is
MAC Address: MAC address of the neighbor switch or host
Message Type: Network service or protocol; always Neighbor in this table
VRF: Virtual route interface associated with this neighbor

View IP Addresses

The IP Addresses view provides all available parameter data about all IP addresses. The list of addresses can be filtered to view only the IPv4 or IPv6 addresses by selecting the relevant tab.

Parameter: Description
Hostname: Name of the switch or host where the IP address resides
Timestamp: Date and time the data was captured
If Name: Name of the interface to which the IP address is assigned
Is IPv6: Indicates if the address is an IPv6 (true) or IPv4 (false) address
Mask: Host portion of the address
Prefix: Network portion of the address
VRF: Virtual route interface associated with this address prefix and interface on this switch or host
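
To make the distinction between the network portion and the host portion of an address concrete, the sketch below splits an interface address with Python's standard ipaddress module. It only illustrates the general concept; exactly how the Mask and Prefix columns are rendered in the view is not specified here, and the function name and sample address are hypothetical.

```python
import ipaddress


def split_address(cidr):
    """Split an interface address into its network and host portions.

    Returns (network address, prefix length, host bits as an integer).
    """
    iface = ipaddress.ip_interface(cidr)
    network = iface.network
    # Host portion: the address bits not covered by the network mask.
    host_bits = int(iface.ip) & int(network.hostmask)
    return str(network.network_address), network.prefixlen, host_bits


# Hypothetical address: 10.1.2.77 on a /24 network.
print(split_address("10.1.2.77/24"))
```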

View What Just Happened

The What Just Happened (WJH) feature, available on Mellanox switches, streams detailed and contextual telemetry data for analysis. This provides real-time visibility into problems in the network, such as hardware packet drops due to buffer congestion, incorrect routing, and ACL or layer 1 problems. You must have Cumulus Linux 4.0.0 or later and NetQ 2.4.0 or later to take advantage of this feature.
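
The version requirement can be expressed as a simple comparison. The following sketch assumes you already have the Cumulus Linux and NetQ version strings in hand (how you obtain them is outside this example); the function names are hypothetical.

```python
import re

MIN_CUMULUS_LINUX = (4, 0, 0)
MIN_NETQ = (2, 4, 0)


def parse_version(text):
    """Extract the leading major.minor.patch numbers as a tuple of ints."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def wjh_supported(cumulus_linux_version, netq_version):
    """Return True when both versions meet the minimums for WJH."""
    return (
        parse_version(cumulus_linux_version) >= MIN_CUMULUS_LINUX
        and parse_version(netq_version) >= MIN_NETQ
    )


print(wjh_supported("4.0.0", "2.3.1"))
print(wjh_supported("4.0.0", "2.4.0"))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings directly, where "2.10.0" would sort before "2.4.0".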

If your switches are sourced from a vendor other than Mellanox, this view is blank as no data is collected.

When WJH capabilities are combined with Cumulus NetQ, you can home in on losses anywhere in the fabric from a single management console. You can:

  • View any current or historic drop information, including the reason for the drop
  • Identify problematic flows or endpoints, and pinpoint exactly where communication is failing in the network

By default, Cumulus Linux 4.0.0 provides the NetQ 2.3.1 Agent and CLI. If you installed Cumulus Linux 4.0.0 on your Mellanox switch, you need to upgrade the NetQ Agent and optionally the CLI to release 2.4.0 or later (preferably the latest release).

cumulus@<hostname>:~$ sudo apt-get update
cumulus@<hostname>:~$ sudo apt-get install -y netq-agent
cumulus@<hostname>:~$ netq config restart agent
cumulus@<hostname>:~$ sudo apt-get install -y netq-apps
cumulus@<hostname>:~$ netq config restart cli

Configure the WJH Feature

WJH is enabled by default on Mellanox switches and no configuration is required in Cumulus Linux 4.0.0; however, you must enable the NetQ Agent to collect the data in NetQ 2.4.0 or later.

To enable WJH in NetQ:

  1. Configure the NetQ Agent on the Mellanox switch.

    cumulus@switch:~$ netq config add agent wjh
    
  2. Restart the NetQ Agent to start collecting the WJH data.

    cumulus@switch:~$ netq config restart agent
    

When you are finished viewing the WJH metrics, you might want to stop the NetQ Agent from collecting WJH data to reduce network traffic. Use netq config del agent wjh followed by netq config restart agent to disable the WJH feature on the given switch.

Using wjh_dump.py on a Mellanox platform that is running Cumulus Linux 4.0 and the NetQ 2.4.0 agent causes the NetQ WJH client to stop receiving packet drop callbacks. To prevent this issue, run wjh_dump.py on a different system than the one where the NetQ Agent has WJH enabled, or disable wjh_dump.py and restart the NetQ Agent (run netq config restart agent).

View What Just Happened Metrics

The What Just Happened view displays events based on conditions detected in the data plane. The most recent 1000 events from the last 24 hours are presented for each drop category.
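
The selection rule described above (the newest 1000 events within the last 24 hours, per drop category) can be sketched as follows. This illustrates the rule only and is not NetQ's implementation; the event dictionaries and their category and timestamp keys are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def recent_events_by_category(events, now, limit=1000,
                              window=timedelta(hours=24)):
    """Keep the newest `limit` events per category within `window`."""
    cutoff = now - window
    grouped = defaultdict(list)
    for event in events:
        if event["timestamp"] >= cutoff:
            grouped[event["category"]].append(event)
    # Sort each category newest first, then truncate to the limit.
    return {
        category: sorted(items, key=lambda e: e["timestamp"],
                         reverse=True)[:limit]
        for category, items in grouped.items()
    }


now = datetime(2020, 1, 2, 12, 0, 0)
sample = [
    {"category": "L1 Drops", "timestamp": now - timedelta(hours=1)},
    {"category": "L1 Drops", "timestamp": now - timedelta(hours=30)},
    {"category": "ACL Drops", "timestamp": now - timedelta(hours=2)},
]
# The 30-hour-old event falls outside the 24-hour window.
print({k: len(v) for k, v in recent_events_by_category(sample, now).items()})
```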

Tab: Description
L1 Drops: Displays the reason why a port is in the down state. By default, the listing is sorted by Last Timestamp. The tab provides the following additional data about each drop event:
  • Hostname: Name of the Mellanox server
  • Port Down Reason: Reason why the port is down
    • Port admin down: Port has been purposely set down by user
    • Auto-negotiation failure: Negotiation of port speed with peer has failed
    • Logical mismatch with peer link: Logical mismatch with peer link
    • Link training failure: Link is not able to go operational up due to link training failure
    • Peer is sending remote faults: Peer node is not operating correctly
    • Bad signal integrity: Integrity of the signal on port is not sufficient for good communication
    • Cable/transceiver is not supported: The attached cable or transceiver is not supported by this port
    • Cable/transceiver is unplugged: A cable or transceiver is missing or not fully plugged into the port
    • Calibration failure: Calibration failure
    • Port state changes counter: Cumulative number of state changes
    • Symbol error counter: Cumulative number of symbol errors
    • CRC error counter: Cumulative number of CRC errors
  • Corrective Action: Provides recommended action(s) to take to resolve the port down state
  • First Timestamp: Date and time this port was marked as down for the first time
  • Ingress Port: Port accepting incoming traffic
  • CRC Error Count: Number of CRC errors generated by this port
  • Symbol Error Count: Number of Symbol errors generated by this port
  • State Change Count: Number of state changes that have occurred on this port
  • OPID: Operation identifier; used for internal purposes
  • Is Port Up: Indicates whether the port is in an Up (true) or Down (false) state
L2 Drops: Displays the reason why a link is down. By default, the listing is sorted by Last Timestamp. The tab provides the following additional data about each drop event:
  • Hostname: Name of the Mellanox server
  • Source Port: Port ID where the link originates
  • Source IP: Port IP address where the link originates
  • Source MAC: Port MAC address where the link originates
  • Destination Port: Port ID where the link terminates
  • Destination IP: Port IP address where the link terminates
  • Destination MAC: Port MAC address where the link terminates
  • Reason: Reason why the link is down
    • MLAG port isolation: Not supported for port isolation implemented with system ACL
    • Destination MAC is reserved (DMAC=01-80-C2-00-00-0x): The address cannot be used by this link
    • VLAN tagging mismatch: VLAN tags on the source and destination do not match
    • Ingress VLAN filtering: Frames whose port is not a member of the VLAN are discarded
    • Ingress spanning tree filter: Port is in Spanning Tree blocking state
    • Unicast MAC table action discard: Currently not supported
    • Multicast egress port list is empty: No ports are defined for multicast egress
    • Port loopback filter: Port is operating in loopback mode; packets are being sent to itself (source MAC address is the same as the destination MAC address)
    • Source MAC is multicast: Packets have multicast source MAC address
    • Source MAC equals destination MAC: Source MAC address is the same as the destination MAC address
  • First Timestamp: Date and time this link was marked as down for the first time
  • Aggregate Count: Total number of dropped packets
  • Protocol: ID of the communication protocol running on this link
  • Ingress Port: Port accepting incoming traffic
  • OPID: Operation identifier; used for internal purposes
Router Drops: Displays the reason why the server is unable to route a packet. By default, the listing is sorted by Last Timestamp. The tab provides the following additional data about each drop event:
  • Hostname: Name of the Mellanox server
  • Reason: Reason why the server is unable to route a packet
    • Non-routable packet: Packet has no route in routing table
    • Blackhole route: Packet received with action equal to discard
    • Unresolved next-hop: The next hop in the route is unknown
    • Blackhole ARP/neighbor: Packet received with blackhole adjacency
    • IPv6 destination in multicast scope FFx0:/16: Packet received with multicast destination address in FFx0:/16 address range
    • IPv6 destination in multicast scope FFx1:/16: Packet received with multicast destination address in FFx1:/16 address range
    • Non-IP packet: Cannot read packet header because it is not an IP packet
    • Unicast destination IP but non-unicast destination MAC: Cannot read packet with IP unicast address when destination MAC address is not unicast (FF:FF:FF:FF:FF:FF)
    • Destination IP is loopback address: Cannot read packet as destination IP address is a loopback address (dip=>127.0.0.0/8)
    • Source IP is multicast: Cannot read packet as source IP address is a multicast address (ipv4 SIP => 224.0.0.0/4)
    • Source IP is in class E: Cannot read packet as source IP address is a Class E address
    • Source IP is loopback address: Cannot read packet as source IP address is a loopback address (IPv4 => 127.0.0.0/8; IPv6 => ::1/128)
    • Source IP is unspecified: Cannot read packet as source IP address is unspecified (IPv4 = 0.0.0.0/32; IPv6 = ::0)
    • Checksum or IP ver or IPv4 IHL too short: Cannot read packet due to header checksum error, IP version mismatch, or IPv4 header length is too short
    • Multicast MAC mismatch: For IPv4, destination MAC address is not equal to {0x01-00-5E-0 (25 bits), DIP[22:0]} and DIP is multicast. For IPv6, destination MAC address is not equal to {0x3333, DIP[31:0]} and DIP is multicast.
    • Source IP equals destination IP: Packet has a source IP address equal to the destination IP address
    • IPv4 source IP is limited broadcast: Packet has broadcast source IP address
    • IPv4 destination IP is local network (destination = 0.0.0.0/8): Packet has IPv4 destination address that is a local network (destination=0.0.0.0/8)
    • IPv4 destination IP is link local: Packet has IPv4 destination address that is a local link
    • Ingress router interface is disabled: Packet destined to a different subnet cannot be routed because ingress router interface is disabled
    • Egress router interface is disabled: Packet destined to a different subnet cannot be routed because egress router interface is disabled
    • IPv4 routing table (LPM) unicast miss: No route available in routing table for packet
    • IPv6 routing table (LPM) unicast miss: No route available in routing table for packet
    • Router interface loopback: Packet has destination IP address that is local. For example, SIP = 1.1.1.1, DIP = 1.1.1.128.
    • Packet size is larger than MTU: Packet is larger than the MTU configured on the VLAN
    • TTL value is too small: Packet has TTL value of 1
Tunnel Drops: Displays the reason why a tunnel is down. By default, the listing is sorted by Last Timestamp. The tab provides the following additional data about each drop event:
  • Hostname: Name of the Mellanox server
  • Reason: Reason why the tunnel is down
    • Overlay switch - source MAC is multicast: Overlay packet's source MAC address is multicast
    • Overlay switch - source MAC equals destination MAC: Overlay packet's source MAC address is the same as the destination MAC address
    • Decapsulation error: Decapsulation produced incorrect format of packet. For example, encapsulation of packet with many VLANs or IP options on the underlay can cause decapsulation to result in a short packet.
Buffer Drops: Displays the reason why the server buffer dropped packets. By default, the listing is sorted by Last Timestamp. The tab provides the following additional data about each drop event:
  • Hostname: Name of the Mellanox server
  • Reason: Reason why the buffer dropped packets
    • Tail drop: Tail drop is enabled, and buffer queue is filled to maximum capacity
    • WRED: Weighted Random Early Detection is enabled, and buffer queue is filled to maximum capacity or the RED engine dropped the packet as part of random congestion prevention.
ACL Drops: Displays the reason why an ACL dropped packets. By default, the listing is sorted by Last Timestamp. The tab provides the following additional data about each drop event:
  • Hostname: Name of the Mellanox server
  • Reason: Reason why ACL dropped packets
    • Ingress port ACL: ACL action set to deny on the ingress port
    • Ingress router ACL: ACL action set to deny on the ingress router interface
    • Egress port ACL: ACL action set to deny on the egress port
    • Egress router ACL: ACL action set to deny on the egress router interface
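
The Multicast MAC mismatch reason listed under Router Drops relies on the standard mapping from a multicast destination IP (DIP) to a destination MAC address: for IPv4, the prefix 01-00-5E followed by a zero bit and the low 23 bits of the DIP; for IPv6, the prefix 33-33 followed by the low 32 bits of the DIP. The sketch below computes the expected MAC address so that you can compare it with the one reported for a drop. The function names are hypothetical, and the check only approximates what the switch hardware evaluates.

```python
import ipaddress


def expected_multicast_mac(dip):
    """Return the MAC address that a multicast DIP maps to."""
    addr = ipaddress.ip_address(dip)
    if not addr.is_multicast:
        raise ValueError(f"{dip} is not a multicast address")
    if addr.version == 4:
        # 01-00-5E, one zero bit, then the low 23 bits of the DIP.
        value = (0x01005E << 24) | (int(addr) & 0x7FFFFF)
    else:
        # 33-33, then the low 32 bits of the DIP.
        value = (0x3333 << 32) | (int(addr) & 0xFFFFFFFF)
    return ":".join(f"{byte:02x}" for byte in value.to_bytes(6, "big"))


def multicast_mac_mismatch(dip, dmac):
    """Return True when dmac is not the MAC expected for the DIP."""
    normalized = dmac.lower().replace("-", ":")
    return normalized != expected_multicast_mac(dip)


print(expected_multicast_mac("224.0.0.251"))
print(expected_multicast_mac("ff02::1"))
```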

View Sensors

The Sensors view provides all available parameter data provided by the power supply units (PSUs), fans, and temperature sensors in the system. Select the relevant tab to view the data.

PSU Parameter: Description
Hostname: Name of the switch or host where the power supply is installed
Timestamp: Date and time the data was captured
Message Type: Type of sensor message; always PSU in this table
PIn(W): Input power (Watts) for the PSU on the switch or host
POut(W): Output power (Watts) for the PSU on the switch or host
Sensor Name: User-defined name for the PSU
Previous State: State of the PSU when data was captured in previous window
State: State of the PSU when data was last captured
VIn(V): Input voltage (Volts) for the PSU on the switch or host
VOut(V): Output voltage (Volts) for the PSU on the switch or host

Fan Parameter: Description
Hostname: Name of the switch or host where the fan is installed
Timestamp: Date and time the data was captured
Message Type: Type of sensor message; always Fan in this table
Description: User-specified description of the fan
Speed (RPM): Revolution rate of the fan (revolutions per minute)
Max: Maximum speed (RPM)
Min: Minimum speed (RPM)
Message: Message
Sensor Name: User-defined name for the fan
Previous State: State of the fan when data was captured in previous window
State: State of the fan when data was last captured

Temperature Parameter: Description
Hostname: Name of the switch or host where the temperature sensor is installed
Timestamp: Date and time the data was captured
Message Type: Type of sensor message; always Temp in this table
Critical: Current critical maximum temperature (°C) threshold setting
Description: User-specified description of the temperature sensor
Lower Critical: Current critical minimum temperature (°C) threshold setting
Max: Maximum temperature threshold setting
Min: Minimum temperature threshold setting
Message: Message
Sensor Name: User-defined name for the temperature sensor
Previous State: State of the temperature sensor when data was captured in previous window
State: State of the temperature sensor when data was last captured
Temperature(Celsius): Current temperature (°C) measured by sensor
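
As an example of using the temperature data, the sketch below flags sensors whose current reading has reached the Critical threshold. It assumes each row is a dictionary keyed by the parameter names above, with numeric values stored as text as they might appear in an export; the sample rows and the function name are hypothetical.

```python
# Hypothetical rows from an exported Temperature sensor list.
SAMPLE_SENSORS = [
    {"Hostname": "leaf01", "Sensor Name": "temp1",
     "Temperature(Celsius)": "41.0", "Critical": "85.0"},
    {"Hostname": "leaf02", "Sensor Name": "temp1",
     "Temperature(Celsius)": "88.5", "Critical": "85.0"},
    {"Hostname": "spine01", "Sensor Name": "temp2",
     "Temperature(Celsius)": "85.0", "Critical": "85.0"},
]


def sensors_at_or_over_critical(rows):
    """Return (hostname, sensor name) pairs at or above Critical."""
    flagged = []
    for row in rows:
        current = float(row["Temperature(Celsius)"])
        critical = float(row["Critical"])
        if current >= critical:
            flagged.append((row["Hostname"], row["Sensor Name"]))
    return flagged


print(sensors_at_or_over_critical(SAMPLE_SENSORS))
```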

View Digital Optics

The Digital Optics view provides all available parameter data provided by any digital optics modules in the system. View laser power and bias current for a given interface and channel on a switch, and temperature and voltage for a given module. Select the relevant tab to view the data.

Laser Parameter: Description
Hostname: Name of the switch or host where the digital optics module resides
Timestamp: Date and time the data was captured
If Name: Name of interface where the digital optics module is installed
Units: Measurement unit for the power (mW) or current (mA)
Channel 1–8: Value of the power or current on each channel where the digital optics module is transmitting

Module Parameter: Description
Hostname: Name of the switch or host where the digital optics module resides
Timestamp: Date and time the data was captured
If Name: Name of interface where the digital optics module is installed
Degree C: Current module temperature, measured in degrees Celsius
Degree F: Current module temperature, measured in degrees Fahrenheit
Units: Measurement unit for module voltage; Volts
Value: Current module voltage
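
The Degree C and Degree F columns report the same module temperature in two units. If you need to cross-check them in exported data, the standard conversion is shown in this short sketch; the function name and the rounding tolerance are assumptions for illustration.

```python
def celsius_to_fahrenheit(degrees_c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return degrees_c * 9 / 5 + 32


def readings_agree(degrees_c, degrees_f, tolerance=0.5):
    """Return True when the two readings match within the tolerance."""
    return abs(celsius_to_fahrenheit(degrees_c) - degrees_f) <= tolerance


print(celsius_to_fahrenheit(45))
```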