System Performance: Multi-Tasking

One of the key drivers of advancements in computing systems is multi-tasking. On mobile devices, this is quite lightweight - cases such as background email checks while the user is playing a mobile game are quite common. Towards optimizing user experience in those types of scenarios, mobile SoC manufacturers started integrating heterogeneous CPU cores - some with high performance for demanding workloads, while others were frugal in terms of both power consumption / die area and performance. This trend is now slowly making its way into the desktop PC space.

Multi-tasking in typical PC usage is much more demanding compared to phones and tablets. Desktop OSes allow users to launch and utilize a large number of demanding programs simultaneously. Responsiveness is dictated largely by the OS scheduler allowing different tasks to move to the background. Intel's Alder Lake processors work closely with the Windows 11 thread scheduler to optimize performance in these cases. Keeping these aspects in mind, the evaluation of multi-tasking performance is an interesting subject to tackle.

We have augmented our systems benchmarking suite to quantitatively analyze the multi-tasking performance of various platforms. The evaluation involves triggering a VLC transcoding task that converts 1716 frames at 3840x1714, encoded as a 24fps AVC video (Blender Project's 'Tears of Steel' 4K version), into a 1080p HEVC version in a loop. VLC internally uses the x265 encoder, and the settings are configured to allow the CPU usage to be saturated across all cores. The transcoding rate is monitored continuously. One complete transcoding pass is allowed to finish before starting the first multi-tasking workload - the PCMark 10 Extended bench suite. A comparative view of the PCMark 10 scores for various scenarios is presented in the graphs below. Also shown are the scores in the normal case, where the benchmark was processed without any concurrent load, and a graph presenting the loss in performance.
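The loss in performance mentioned above can be expressed as a percentage drop relative to the unloaded score. The sketch below is one plausible way to compute it. The exact formula behind the graphs is not stated in the article, so the function name `performance_loss_percent`, the handling of timed-out runs, and the sample scores are all assumptions for illustration.

```python
from typing import Optional


def performance_loss_percent(baseline_score: float,
                             loaded_score: Optional[float]) -> Optional[float]:
    """Percentage drop of the loaded score relative to the unloaded baseline.

    Returns None when the loaded run produced no score (for example, a
    benchmark component that timed out under the transcoding load).
    """
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    if loaded_score is None:
        return None
    return (baseline_score - loaded_score) / baseline_score * 100.0


# Hypothetical scores, not taken from the review's graphs.
print(performance_loss_percent(2000, 1500))  # 25.0
print(performance_loss_percent(2000, None))  # None (timed-out component)
```

A lower value indicates a system that copes better with the concurrent transcoding load.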

UL PCMark 10 Load Testing - Digital Content Creation Scores

 

UL PCMark 10 Load Testing - Productivity Scores

 

UL PCMark 10 Load Testing - Essentials Scores

 

UL PCMark 10 Load Testing - Gaming Scores

 

UL PCMark 10 Load Testing - Overall Scores

The presence of a transcoding workload on the CPU cores makes handling other tasks an uphill battle for low-power PCs. The PCMark 10 workloads above bring out that aspect. The ECS LIVA Z2 and LIVA Z3 are able to handle only the 'Productivity' and 'Essentials' workload components, and end up with a timeout on the others. Other than in the 'Gaming' component, we see the June Canyon NUC being most effective at handling multi-tasking due to its actively cooled nature - it has the least performance loss across almost all PCMark 10 components.

Following the completion of the PCMark 10 benchmark, a short delay is introduced prior to the processing of Principled Technologies WebXPRT4 on MS Edge. Similar to the PCMark 10 results presentation, the graph below shows the scores recorded with the transcoding load active. Available for comparison are the scores recorded with the CPU dedicated to the benchmark, and a measure of the performance loss.

Principled Technologies WebXPRT4 Load Testing Scores (MS Edge)

The June Canyon's WebXPRT4 scores are well behind those of the Jasper Lake-based units under normal conditions. However, the addition of the transcoding workload results in a significant loss in performance for the latter set. The June Canyon has limited performance loss, with its active cooling probably allowing it to go the extra mile in the presence of heavy sustained workloads.

The final workload tested as part of the multitasking evaluation routine is CINEBENCH R23.

3D Rendering - CINEBENCH R23 Load Testing - Single Thread Score

 

3D Rendering - CINEBENCH R23 Load Testing - Multiple Thread Score

The June Canyon NUC with its active cooling comes out on top with the transcoding load active.

After the completion of all the workloads, we let the transcoding routine run to completion. The monitored transcoding rate throughout the above evaluation routine (in terms of frames per second) is tabulated below.

VLC Transcoding Rate (Multi-Tasking Test) - Frames per Second

System (CPU)                              Enc. Pass #1   PCMark 10   WebXPRT4   Cinebench   Enc. Pass #2
ECS LIVA Z3 (Pentium Silver N6000)        0.1541         0.1071      0.1294     0.1476      0.1424
ECS JSLM-MINI (Pentium Silver N6000)      0.2223         0.1635      0.1635     0.2092      0.2216
ZOTAC ZBOX CI331 nano (Celeron N5100)     0.3016         0.1943      0.1859     0.2025      0.1864

The transcoding rates drop with simultaneous loading, as expected. For the JSLM-MINI, the first-pass and second-pass rates are nearly equal, pointing to the absence of throttling. However, both the LIVA Z3 and the ZBOX CI331 nano suffer from reduced rates in the second pass - the internal temperatures are high enough for the CPU to be throttled after extended sustained loading.
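The throttling observation can be checked directly against the table by comparing the second encoding pass with the first. The snippet below is a minimal sketch of that arithmetic. The rates are copied from the table above, while the helper name `pass_to_pass_change` is an assumption introduced here.

```python
# Transcoding rates (frames per second) copied from the table above, in
# column order: Enc. Pass #1, PCMark 10, WebXPRT4, Cinebench, Enc. Pass #2.
RATES = {
    "ECS LIVA Z3": [0.1541, 0.1071, 0.1294, 0.1476, 0.1424],
    "ECS JSLM-MINI": [0.2223, 0.1635, 0.1635, 0.2092, 0.2216],
    "ZOTAC ZBOX CI331 nano": [0.3016, 0.1943, 0.1859, 0.2025, 0.1864],
}


def pass_to_pass_change(rates):
    """Percentage change in transcoding rate from pass #1 to pass #2."""
    first, last = rates[0], rates[-1]
    return (last - first) / first * 100.0


for system, rates in RATES.items():
    print(f"{system}: {pass_to_pass_change(rates):+.1f}%")
```

With these figures, the JSLM-MINI changes by well under 1% between passes, the LIVA Z3 loses roughly 8%, and the ZBOX CI331 nano loses roughly 38%, which is consistent with the throttling reading given above.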


52 Comments


  • mode_13h - Saturday, July 9, 2022 - link

    > tldr both benches would have been a wash one way of the other.

    Huh? If old Skylake is 50% faster, and Jasper Lake is 3.5x as fast as Pi 4 Model B (which seems rather generous), then it wouldn't be "a wash", which is defined as:

    13. an action or situation in which the gains and losses are
    equal, or closely compensate each other.
    (source: http://dict.org/bin/Dict?Form=Dict2&Database=g... )

    or

    8: any enterprise in which losses and gains cancel out; "at the
    end of the year the accounting department showed that it was
    a wash"
    (source: http://dict.org/bin/Dict?Form=Dict2&Database=w... )

    Since both comparisons are projected to be substantially lopsided, I think what you meant to call it is a "washout"?
  • abufrejoval - Thursday, July 14, 2022 - link

    I have a PI4 with 8GB of RAM in a metal case that supports a 2GHz overclock without active cooling: pretty much the best PI you can have these days.

    I also have an Nvidia Tegra based Jetson Nano with 4GB of RAM.

    At 2GHz the PI reaches 272/648 on Geekbench 4, the Tegra has to make do with 206/718 at 1.4GHz. The N6005 Jasper Lake reaches 781/2540 very similar to a Sandy Bridge i7-2600 at 3.8GHz Turbo.

    The Jetson Nano actually does reasonably well on my 43" 4k desktop for basic 2D work, because it has a GPU with 128 Maxwell cores. Of course its CPU power is at the level of a Snapdragon 800 mobile phone.

    The PI struggles badly at 4k, because the GPU has much less muscle. The slightly faster CPU is hard to notice.

    Actually it was when Tom's Hardware did a report on a PI compute cluster that I wanted to retort just how stupid that project was, because you could get a single Jasper Lake Atom for much less money that would run rings around that cluster and could in fact simulate it all in software via VMs.

    And that's when I found that finally a Jasper Lake NUC was available for purchase at €200 (including VAT) and immediately ordered one of the first and last ever sold here.

    And yes, it runs rings around both with roughly 4x the CPU power, 64GB of RAM expandability and quite a reasonable GPU performance on a 4k display.

    My favorite usability test is to use the "3D Globe View" on Google Maps under a Chrome based browser on Windows and to then tilt and turn a city landscape there. It's about the most efficient 3D graphics pipeline I've ever seen (puts Flight Simulator to total shame!) and performs quite reasonably on such a Jasper Lake NUC. With Firefox it's much worse on these low power devices, but with a beefy PC you'd never notice.

    After quite a bit of tweaking I managed to get it to work on both the PI and the Tegra at 1920x1080 and the Tegra even gave a bit of interactivity thanks to its much stronger GPU. But on the PI that was about one frame a minute.

    The PI and Nano are toys and ok for the €100 I spent on each.

    A Jasper Lake NUC is quite a reasonable desktop machine and even an interesting micro server for some real workloads.

    At €200 (without RAM or storage) the price/performance ratio is very hard to beat, but evidently none of the vendors really want you to know or buy that. I think it's the major reason you never could.
  • mode_13h - Thursday, July 14, 2022 - link

    > At 2GHz the PI reaches 272/648 on Geekbench 4, the Tegra ... 206/718 at 1.4GHz.

    Keep in mind that Jetson Nano has ostensibly 2x the memory bandwidth of the Pi v4. That surely helps offset the difference in raw CPU performance, as well as with 4k display performance.

    Oh, and if that test was with the machines driving a 4k display, then merely refreshing your monitor will have been using a non-insignificant amount of the Pi's memory bandwidth (about 1 GB/s).

    > N6005 Jasper Lake reaches 781/2540

    Wow! Dual-channel memory configuration, I presume?

    > on the PI that was about one frame a minute.

    Uh... that sure sounds like you were using a software rendering path. The Pi's GPU is trash, but that's simply atrocious!

    > evidently none of the vendors really want you to know or buy that.
    > I think it's the major reason you never could.

    I'm reasonably confident it's actually just supply chain-related. Intel has been steering its limited fab capacity towards more profitable models and probably steering its limited supply of Jasper Lakes to chromebooks, where they're probably desperate not to lose market share.
  • timecop1818 - Friday, July 8, 2022 - link

    There are Chinese mini PCs with Intel Celeron N5100 that are like 250$ with 16G ram and 256gb sata SSD.

    https://www.lazada.com.my/products/walkfish-m6-11t...

    there's like 5 different "brands" selling same thing on AliExpress etc. it runs win 10 just fine and is enough for 1080p Minecraft and basic office computing. great deal. most models have Intel 2.5G Ethernet too.
  • Jorgp2 - Friday, July 8, 2022 - link

    I just want a Jasper lake motherboard with plenty of sata and a PCI-E slot
  • mode_13h - Friday, July 8, 2022 - link

    You could get SATA, but not PCIe. According to this, Jasper Lake and Elkhart Lake have only x8 PCIe 3.0 and x2 SATA ports.

    Most boards are probably going to give you a x4 NVMe slot. Then, they could use a 3rd Party SATA controller to give you 4 more ports. Then, if they compromise on the bandwidth to that SATA controller, you can have a second Ethernet port and then a x1 PCIe slot that just might be open-ended (but probably not), to support a graphics card.

    Sorry, but they really kneecapped this platform relative to what it could've been. You might do better with some equivalent Atom-branded CPUs. Atom C-series (Parker Ridge) has 16 integrated SATA ports, x32 PCIe 3.0 lanes, and up to 8 cores. P-series (Snow Ridge) has the same, but up to 24 cores.

    * https://ark.intel.com/content/www/us/en/ark/produc...
    * https://ark.intel.com/content/www/us/en/ark/produc...
  • mode_13h - Friday, July 8, 2022 - link

    Oops, forgot the link for Jasper Lake. For good measure, here's Elkhart Lake, as well.

    * https://ark.intel.com/content/www/us/en/ark/produc...
    * https://ark.intel.com/content/www/us/en/ark/produc...
  • Thala - Friday, July 8, 2022 - link

    Interestingly my 3 years old Surface Pro X scores higher than any of the tested devices in Cinebench R23 under x64 emulation!
  • mode_13h - Saturday, July 9, 2022 - link

    The Surface Pro X from 2019 has a Microsoft SQ1 SoC, which is basically a Snapdragon 8cx and consists of 4x Kryo 495 Gold @ 3 GHz + 4x Kryo 495 Silver @ 1.80 GHz (manufactured on TSMC 7 nm). According to wikichip, these are tweaked A76 and A55 cores. So, that seems credible, if not exactly an outcome I'd have presumed.

    Something to keep in mind is that Jasper Lake is meant for cheap chromebooks. Like, sub-$200 cheap, whereas Snapdragon 8cx is a premium part.
  • nandnandnand - Saturday, July 9, 2022 - link

    They wanted it to be thought of as premium, it's more of an expensive joke. Like Lakefield but with no excuses.

    https://semiaccurate.com/2021/12/01/qualcomm-8cx-g...

    https://www.gizchina.com/2022/01/04/qualcomm-blame...

    Snapdragon 7c (Gen 1?) should be more comparable in price to Jasper Lake. I think I've seen that as low as $170-200. Also, the Apcsilmic Dot 1 and ECS LIVA Mini Box QC710 mini PCs recently launched with the 7c starting at around $219.

    If the leaks about Alder Lake-N are true, it will shake things up, if the price is right.
