Multiple Queues per Dispatcher

Overview

This feature is introduced in SE DP DPDK mode to achieve higher PPS in environments where the size of the NIC queue is limited.

The following are the supported NICs:

  • VIRTIO - KVM, OpenStack

  • ENA - AWS

You achieve higher PPS with shallow rings by utilising more than one queue per dispatcher, which provides increased packet burst ability.

The following are the two modes of operation based on operating environment:

  • Compact Mode - Single dispatcher manages all the queues of the VNIC

  • Distributed Mode - Multiple dispatchers each manage a subset of the queues of the VNIC

You can configure the maximum number of queues per VNIC using the max_queues_per_vnic parameter in the SE group properties.

The max_queues_per_vnic parameter supports the following values:

  • Zero (Reserved) — Auto (deduces optimal number of queues per dispatcher based on the NIC and operating environment)

  • One (Reserved) — One Queue per NIC (Default)

  • Integer Value — Power of 2; maximum limit is 16.
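
The accepted values listed above can be checked with a small validator. This is a minimal sketch only; the function name is_valid_max_queues_per_vnic is hypothetical and is not part of the product.

```python
def is_valid_max_queues_per_vnic(value: int) -> bool:
    """Return True if value matches the documented rules for max_queues_per_vnic."""
    if value in (0, 1):
        # 0 is reserved for auto, 1 is reserved for one queue per NIC (default).
        return True
    # Any other value must be a power of 2 and must not exceed 16.
    is_power_of_two = value > 0 and (value & (value - 1)) == 0
    return is_power_of_two and value <= 16


for candidate in (0, 1, 2, 3, 8, 16, 32):
    print(candidate, is_valid_max_queues_per_vnic(candidate))
```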

Note: You should set max_queues_per_vnic to 0 (auto) for non-DPDK mode of operation to utilise multiple dispatchers.

The migration routine sets max_queues_per_vnic to num_dispatcher_cores if distribute_queues is enabled and num_dispatcher_cores is set. If distribute_queues is not enabled, max_queues_per_vnic is set to 1. If distribute_queues is enabled but num_dispatcher_cores is not set, the number of queues equals the number of dispatcher cores.
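
The migration rules can be summarised as a small decision function. This is an illustrative sketch, not the actual migration code: the function name and the detected_dispatcher_cores argument (the dispatcher core count used when num_dispatcher_cores is unset) are assumptions, and an unset num_dispatcher_cores is modelled as 0.

```python
def migrated_max_queues_per_vnic(distribute_queues: bool,
                                 num_dispatcher_cores: int,
                                 detected_dispatcher_cores: int) -> int:
    """Sketch of the value max_queues_per_vnic takes after migration."""
    if not distribute_queues:
        # distribute_queues disabled: a single queue per VNIC.
        return 1
    if num_dispatcher_cores > 0:
        # distribute_queues enabled and num_dispatcher_cores set explicitly.
        return num_dispatcher_cores
    # distribute_queues enabled, num_dispatcher_cores unset (assumed 0):
    # the number of queues follows the number of dispatcher cores.
    return detected_dispatcher_cores


print(migrated_max_queues_per_vnic(False, 4, 2))
print(migrated_max_queues_per_vnic(True, 4, 2))
print(migrated_max_queues_per_vnic(True, 0, 2))
```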

The following is the environment-specific behaviour when max_queues_per_vnic is set to 0 (auto):

  • OpenStack, AWS, KVM (DPDK mode) - The number of queues can exceed the number of dispatchers, so each dispatcher can utilise more than one queue.

  • Baremetal (DPDK mode) - The number of queues is the same as the number of dispatchers. Each dispatcher utilises one queue.

  • Azure, AWS (Non-DPDK mode) - The number of queues is the same as the number of dispatchers. Each dispatcher utilises one queue.

Note: You need to enable the se_image_property hw_vif_multiqueue_enabled parameter in OpenStack to utilise max_queues_per_vnic. This ensures that the number of queues is equal to the number of vCPUs.

Code                           Description
se_dispatcher_cores            Total number of SE cores handled by the dispatcher
g_num_queues_per_dispatcher    Total number of queues handled by the dispatcher
g_num_queue_per_vnic           Total number of queues per VNIC

The max_queues_per_vnic value is taken from the Service Engine group property. In environments that use VIRTIO (OpenStack and KVM, excluding GCP), the queue size is 256 and one dispatcher core can handle all the queues of the VNIC. The number of dispatcher cores is held in the se_dispatcher_cores variable.

In AWS, the ring size is 1024 and the number of queues is equal to the number of cores. Hence, the queues are distributed across dispatcher cores.

The se_dispatcher_cores value is derived automatically for each environment as follows:

se_dispatcher_cores = max_queues_per_vnic / g_num_queues_per_dispatcher

In VIRTIO (excluding GCP): g_num_queues_per_dispatcher = max_queues_per_vnic

In AWS: g_num_queues_per_dispatcher = max_queues_per_vnic / num_dispatcher_cores_available
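
The formulas above can be worked through with a short sketch. The function names and the "virtio"/"ena" labels are illustrative assumptions, and the divisions are assumed to come out even (the queue count is a multiple of the available dispatcher cores).

```python
def queues_per_dispatcher(nic: str, max_queues_per_vnic: int,
                          num_dispatcher_cores_available: int) -> int:
    """Compute g_num_queues_per_dispatcher for the two documented cases."""
    if nic == "virtio":
        # VIRTIO (excluding GCP): one dispatcher owns every queue of the VNIC.
        return max_queues_per_vnic
    if nic == "ena":
        # AWS: queues are split across the available dispatcher cores.
        # Assumes max_queues_per_vnic is a multiple of the core count.
        return max_queues_per_vnic // num_dispatcher_cores_available
    raise ValueError(f"unsupported NIC type: {nic}")


def dispatcher_cores(max_queues_per_vnic: int,
                     g_num_queues_per_dispatcher: int) -> int:
    """se_dispatcher_cores = max_queues_per_vnic / g_num_queues_per_dispatcher."""
    return max_queues_per_vnic // g_num_queues_per_dispatcher


# VIRTIO with 4 queues: 4 queues per dispatcher, 1 dispatcher core (compact).
virtio_qpd = queues_per_dispatcher("virtio", 4, 2)
print(virtio_qpd, dispatcher_cores(4, virtio_qpd))

# AWS with 8 queues and 4 cores: 2 queues per dispatcher, 4 cores (distributed).
ena_qpd = queues_per_dispatcher("ena", 8, 4)
print(ena_qpd, dispatcher_cores(8, ena_qpd))
```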

If se_dispatcher_cores is more than 1, ipstk_drv_send receives the queue number. However, the core handling this queue still needs to be determined.

The g_rss_queue_to_core_table array is populated during initialisation of the se_dp process. Indexing this array with the queue number gives the corresponding core, and the queue number is assigned to the m_rsshash field in the mbuf. Because the dispatcher handles multiple queues, it needs the queue number to send the packet out in rte_eth_tx_burst.
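
The lookup can be modelled as follows. This is a simplified sketch rather than the se_dp implementation: the Mbuf class is a stand-in for the DPDK mbuf, the function names are hypothetical, and assigning contiguous blocks of queues to each core is an assumption, since the text does not say how the table is filled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mbuf:
    """Stand-in for a packet buffer; only the field used here is modelled."""
    m_rsshash: int = 0


def build_rss_queue_to_core_table(num_queues: int,
                                  dispatcher_core_ids: List[int]) -> List[int]:
    """Build a queue-to-core table, giving each core a contiguous block of queues."""
    if num_queues % len(dispatcher_core_ids) != 0:
        raise ValueError("num_queues must be a multiple of the dispatcher count")
    per_dispatcher = num_queues // len(dispatcher_core_ids)
    return [dispatcher_core_ids[q // per_dispatcher] for q in range(num_queues)]


def route_packet(mbuf: Mbuf, queue: int, table: List[int]) -> int:
    """Record the queue number in the mbuf and return the core that owns the queue."""
    mbuf.m_rsshash = queue  # the dispatcher reads this to pick the TX queue
    return table[queue]


table = build_rss_queue_to_core_table(8, [0, 1, 2, 3])
packet = Mbuf()
core = route_packet(packet, 5, table)
print(table, core, packet.m_rsshash)
```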

If the number of dispatchers is equal to 1, ipstk_drv_send still gets the queue number and assigns it to the m_rsshash field, even though the packet is sent out by only one core, i.e., the owner core. This ensures that the packets belonging to a particular flow use the same queue, which indirectly keeps all the queues handled by the owner core uniformly loaded.

num_dispatcher_cores   max_queues_per_vnic   ENA           LSC           VIRTIO    Comments
Auto (0)               Auto (0) / N          Distributed   Distributed   Compact   If num_dispatcher_cores is N and max_queues_per_vnic is 0, then max_queues_per_vnic is also N.
N                      Auto (0) / N          Distributed   Distributed   Compact

The number of queues per dispatcher is indicated in the output of show serviceengine <se> se_agent.

With the migration routine, max_queues_per_vnic will be equal to num_dispatcher_cores (if num_dispatcher_cores > 0).

When max_queues_per_vnic is auto, the maximum number of queues per dispatcher is bounded by first determining the least common maximum ring size and then capping the number of queues per dispatcher accordingly. The intention is to use enough queues to realise an aggregate ring size of 4096 whenever possible. This ensures that CPU resources are preserved and optimal PPS and burst ability are achieved.

"ring-size-max-qs-per-disp": {
    "128": 32,
    "256": 16,
    "512": 8,
    "1024": 4,
    "2048": 2,
    "4096": 1,
    "8192": 1,
    "default": 1
}
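
The mapping above can be read as a lookup from ring size to the queue cap. The sketch below copies the mapping into a Python dictionary and shows that ring size multiplied by the cap reaches 4096 for ring sizes up to 4096. The function name is hypothetical, and falling back to the "default" entry for unlisted ring sizes is an assumption.

```python
RING_SIZE_MAX_QS_PER_DISP = {
    "128": 32,
    "256": 16,
    "512": 8,
    "1024": 4,
    "2048": 2,
    "4096": 1,
    "8192": 1,
    "default": 1,
}


def max_queues_per_dispatcher(ring_size: int) -> int:
    """Return the queue cap for a ring size, using 'default' when it is unlisted."""
    return RING_SIZE_MAX_QS_PER_DISP.get(
        str(ring_size), RING_SIZE_MAX_QS_PER_DISP["default"])


for ring_size in (128, 256, 512, 1024, 2048, 4096, 8192):
    cap = max_queues_per_dispatcher(ring_size)
    # Aggregate ring size = ring size per queue * queues per dispatcher.
    print(ring_size, cap, ring_size * cap)
```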