Cluster scheduler support has been queued for the Linux 5.16 kernel on AArch64 and x86_64 to improve CPU scheduler behavior on systems whose processor cores are grouped into clusters.
Cluster scheduler support in this context means enhancing the Linux kernel scheduler for systems where sets of processor cores share an L2 cache or other mid-level caches / resources.
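To make the notion of a "cluster" concrete, here is a minimal Python sketch that groups CPUs by the cluster they report. It assumes a kernel that exposes per-CPU `topology/cluster_cpus_list` files under `/sys/devices/system/cpu` (part of the topology changes that accompany this work); on kernels without them the reader function simply returns nothing. The function names are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8' into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_clusters(cpulists):
    """Collapse per-CPU cpulist strings into the distinct clusters they describe."""
    clusters = {tuple(parse_cpulist(s)) for s in cpulists.values()}
    return sorted(clusters)


def read_sysfs_cluster_lists(root="/sys/devices/system/cpu"):
    """Read cpuN/topology/cluster_cpus_list (assumed layout); empty dict if absent."""
    result = {}
    for path in Path(root).glob("cpu[0-9]*/topology/cluster_cpus_list"):
        # path.parts[-3] is the 'cpuN' directory name
        result[path.parts[-3]] = path.read_text()
    return result


if __name__ == "__main__":
    for cluster in group_clusters(read_sysfs_cluster_lists()):
        print("cluster:", list(cluster))
```

On a machine where cores share an L2 cache in groups of four, each printed list would be expected to hold four CPU ids; that is the grouping the scheduler can take into account when balancing load.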
This cluster scheduler work stems from HiSilicon and Huawei's efforts to improve Linux performance on the Kunpeng 920 server chip. This HiSilicon SoC has six or eight clusters per NUMA node, with four processor cores per cluster and a shared L3 cache. With the cluster scheduler changes, they were able to improve overall system performance as well as efficiency.
Although it started as an ARM / AArch64 effort, the cluster scheduler code has also already been adapted for x86_64 systems. In particular, Intel has added the cluster scheduler level for hardware like its Jacobsville family, where the L2 cache is shared among a cluster of cores. Intel's own testing with Jacobsville showed up to a 25% improvement, with load being better balanced between the L2 clusters. Intel was testing with SPECrate's MCF test while acknowledging that not all workloads will benefit from this cluster scheduler work.
On the ARM side, there have been Stream memory benchmark improvements of up to 19%, PBZIP2 performance gains of up to 8%, PIXZ performance gains of up to a few percent, as well as a number of SPECrate wins.
The cluster scheduler work was queued today via sched/core, bringing the topology changes, the cluster scheduler for ARM64, and the x86 cluster scheduler level.
Now that the work is queued in sched/core, it should debut in mainline once the Linux 5.16 merge window opens in November. The support is tucked behind a new SCHED_CLUSTER Kconfig switch on x86 / x86_64 and ARM64.
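For readers who want to see whether a given kernel build has the switch turned on, the following Python sketch looks for `CONFIG_SCHED_CLUSTER=y` in the kernel configuration. The two file locations it tries (`/proc/config.gz` and `/boot/config-<release>`) are common conventions that depend on the distribution and on how the kernel was built, so treat them as assumptions; the helper names are hypothetical.

```python
import gzip
import platform
from pathlib import Path


def kconfig_enabled(config_text, option="CONFIG_SCHED_CLUSTER"):
    """Return True if `option` is built in (=y) according to kernel config text."""
    for line in config_text.splitlines():
        if line.strip() == f"{option}=y":
            return True
    return False


def load_kernel_config():
    """Try common config locations (assumed, not guaranteed); None if unreadable."""
    candidates = [
        Path("/proc/config.gz"),                      # needs CONFIG_IKCONFIG_PROC
        Path(f"/boot/config-{platform.release()}"),   # typical distribution layout
    ]
    for path in candidates:
        try:
            if path.suffix == ".gz":
                with gzip.open(path, "rt") as fh:
                    return fh.read()
            return path.read_text()
        except OSError:
            continue
    return None


if __name__ == "__main__":
    text = load_kernel_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        print("SCHED_CLUSTER enabled:", kconfig_enabled(text))
```

A disabled option normally appears in the config as a comment line (`# CONFIG_SCHED_CLUSTER is not set`), which is why the check matches the exact `=y` assignment rather than searching for the option name alone.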