GPU Performance

In moving from Gemini Lake to Jasper Lake, the integrated GPU didn't get as much attention as the CPU did. While retaining the same microarchitecture, the shift to 10nm allowed for integrating more execution units and slight improvements in the maximum clocks. The systems we are looking at today come with different variants of the same GPU microarchitecture:

  • Intel June Canyon and ECS LIVA Z2 (Gemini Lake): 18EU @ 750 MHz
  • ECS LIVA Z3 / JSLM-MINI (Jasper Lake): 32EU @ 850 MHz
  • ZOTAC ZBOX CI331 nano (Jasper Lake): 24EU @ 850 MHz
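
As a rough way to put the list above in numbers, the sketch below multiplies EU count by maximum clock and normalizes against the Gemini Lake configuration. This is only a first-order proxy: it assumes equal work per EU per clock across all three parts and ignores power limits, thermals, and memory bandwidth. The function name `relative_throughput` is ours, purely for illustration.

```python
# (system, execution units, max GPU clock in MHz), taken from the list above
GPUS = [
    ("June Canyon / LIVA Z2 (Gemini Lake)", 18, 750),
    ("LIVA Z3 / JSLM-MINI (Jasper Lake)", 32, 850),
    ("ZBOX CI331 nano (Jasper Lake)", 24, 850),
]


def relative_throughput(gpus, baseline_index=0):
    """Return EU-count x clock for each GPU, relative to the baseline entry.

    Assumption: every EU does the same amount of work per clock, so the
    product is only a crude stand-in for peak shader throughput.
    """
    _, base_eus, base_mhz = gpus[baseline_index]
    base = base_eus * base_mhz
    return {name: (eus * mhz) / base for name, eus, mhz in gpus}


if __name__ == "__main__":
    for name, ratio in relative_throughput(GPUS).items():
        print(f"{name}: {ratio:.2f}x")
```

On paper, this puts the 32EU part at roughly twice the Gemini Lake baseline and the 24EU part at roughly one and a half times.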

Based on these specifications alone, we expect the JSLM-MINI / LIVA Z3 to handily best the other systems in GPU performance. However, there are a few caveats to consider:

  • Power budget / PL1 limit is higher for the ZBOX CI331 nano compared to the JSLM-MINI
  • The June Canyon NUC is actively cooled and is not affected by thermal throttling
  • The LIVA Z3 and JSLM-MINI operate with DDR4-2666 SODIMMs, while the ZBOX operates with DDR4-2933 SODIMMs
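
The memory speed gap in the last caveat can be quantified with a short sketch. A DDR4 channel is 64 bits wide, so the theoretical peak bandwidth per channel is the transfer rate multiplied by 8 bytes. The code assumes a single channel and the nominal transfer rates (2666 and 2933 MT/s); how many channels each reviewed system actually populates is not stated here, and sustained bandwidth is always lower than the theoretical peak.

```python
def peak_bandwidth_gbs(mt_per_s, bus_width_bits=64, channels=1):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    mt_per_s is the transfer rate in mega-transfers per second,
    e.g. 2666 for DDR4-2666. The channel count is an assumption.
    """
    bytes_per_transfer = bus_width_bits // 8
    return mt_per_s * 1e6 * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    slow = peak_bandwidth_gbs(2666)  # LIVA Z3 / JSLM-MINI
    fast = peak_bandwidth_gbs(2933)  # ZBOX CI331 nano
    print(f"DDR4-2666: {slow:.1f} GB/s per channel")
    print(f"DDR4-2933: {fast:.1f} GB/s per channel")
    print(f"Advantage: {100 * (fast / slow - 1):.0f}%")
```

The ratio works out to about a 10% theoretical advantage for the ZBOX, independent of the channel count as long as it is the same on both sides.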

Keeping these aspects in mind, the GPU performance has to be evaluated in the context of each workload. We put the systems through some standard 3D workloads to get an idea of what they have to offer for GPU-intensive tasks.

GFXBench

The DirectX 12-based GFXBench tests from Kishonti are cross-platform, and available all the way down to smartphones. As such, they are not very taxing for discrete GPUs and modern integrated GPUs. We processed the offscreen versions of the 'Aztec Ruins' benchmark.

GFXBench 5.0: Aztec Ruins Normal 1080p Offscreen

GFXBench 5.0: Aztec Ruins High 1440p Offscreen

The ZBOX CI331 nano's iGPU has more EUs than the one in the June Canyon NUC, and it also has faster RAM access. Though it has fewer EUs than the iGPU in the LIVA Z3 / JSLM-MINI, the higher PL1 limit and faster RAM access help the ZBOX emerge as the leader in both GFXBench workloads.

UL 3DMark

Four different workload sets were processed in 3DMark - Fire Strike, Time Spy, Night Raid, and Wild Life.

3DMark Fire Strike

The Fire Strike benchmark has three workloads. The base version is meant for high-performance gaming PCs. It uses DirectX 11 (feature level 11) to render frames at 1920 x 1080. The Extreme version targets 1440p gaming requirements, while the Ultra version targets 4K gaming systems and renders at 3840 x 2160. The graph below presents the overall scores for the Fire Strike Extreme and Fire Strike Ultra benchmarks across all the systems that are being compared.

UL 3DMark - Fire Strike Workloads

The Extreme workload sees the CI331 nano come out comfortably on top for the same reasons as the ones discussed in the GFXBench subsection - the higher PL1 limits, extra EUs compared to June Canyon, and faster DRAM. The Ultra workload (which doesn't make much sense for UCFF PCs based on low-power processors like Jasper Lake anyway) sees both the LIVA Z3 and the ZBOX CI331 nano get timed out - in all probability due to thermal throttling.
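
To give a sense of how the render targets scale across the three Fire Strike presets, the sketch below compares pixel counts per frame. It takes 1440p to mean 2560 x 1440, and it treats pixel count as a proxy only: the Extreme and Ultra presets also differ from the base test in ways other than resolution, so the real load gap is not captured by this ratio alone.

```python
# Render resolutions of the three Fire Strike presets (width, height)
RESOLUTIONS = {
    "Fire Strike": (1920, 1080),
    "Fire Strike Extreme": (2560, 1440),
    "Fire Strike Ultra": (3840, 2160),
}


def pixels_per_frame(resolutions):
    """Return the number of pixels rendered per frame for each preset."""
    return {name: w * h for name, (w, h) in resolutions.items()}


def relative_pixels(resolutions, baseline="Fire Strike"):
    """Return each preset's pixel count relative to the baseline preset."""
    counts = pixels_per_frame(resolutions)
    return {name: count / counts[baseline] for name, count in counts.items()}


if __name__ == "__main__":
    for name, ratio in relative_pixels(RESOLUTIONS).items():
        print(f"{name}: {ratio:.2f}x the pixels of the base test")
```

Ultra pushes four times the pixels of the base test per frame, which is consistent with it being a poor fit for passively cooled low-power systems.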

3DMark Time Spy

The Time Spy workload has two levels with different complexities. Both use DirectX 12 (feature level 11). However, the plain version targets high-performance gaming PCs with a 2560 x 1440 render resolution, while the Extreme version renders at 3840 x 2160 resolution. The graphs below present both numbers for all the systems that are being compared in this review.

UL 3DMark - Time Spy Workloads

The LIVA Z3 is thermally limited to the extent that neither Time Spy workload completes. The extra EUs and faster RAM help the ZBOX in the normal Time Spy workload. However, at higher resolutions (Extreme), the ZBOX gets thermally limited and its lowered power budget is insufficient to see it get past the June Canyon and JSLM-MINI.

3DMark Wild Life

The Wild Life workload was introduced as a cross-platform GPU benchmark in 2020. It renders at a 2560 x 1440 resolution using Vulkan 1.1 APIs on Windows. It is a relatively short-running test, reflective of mobile GPU usage. In mid-2021, UL released the Wild Life Extreme workload, a more demanding version that renders at 3840 x 2160 and runs for a much longer duration, reflective of typical desktop gaming usage.

UL 3DMark - Wild Life Workloads

The Wild Life workload was again a mixed bag for the ZBOX, with thermal behavior causing timeouts. The LIVA Z3 failed in both components. Active cooling and a consistent power budget actually see the Gemini Lake-based June Canyon NUC on the leaderboard this time.

3DMark Night Raid

The Night Raid workload is a DirectX 12 benchmark test. It is less demanding than Time Spy, and is optimized for integrated graphics. The graph below presents the overall score in this workload for different system configurations.

UL 3DMark - Night Raid Workload

Power budget seems to be the primary factor for the Night Raid workload. The June Canyon NUC is at the top despite its limited EUs and slower RAM. The JSLM-MINI is able to sustain a 6W PL1 for extended durations, and coupled with the extra EUs over the CI331 nano, it handily bests the ZBOX in this workload.


52 Comments


  • ganeshts - Wednesday, July 13, 2022 - link

    Are you aware of any boards / PCs with Elkhart Lake that supports in-band ECC? Vendors I talk to seem to indicate that there is some other feature X that gets disabled if you do in-band.. and that feature X is more important for their target market compared to in-band ECC. So, they do not enable in-band ECC in their products even if the processor supports it.
  • mode_13h - Thursday, July 14, 2022 - link

    > Are you aware of any boards / PCs with Elkhart Lake that supports in-band ECC?

    Sorry, I've not seriously investigated the matter.

    > Vendors I talk to seem to indicate that there is some other feature X
    > that gets disabled if you do in-band.

    Wow. I'd love to know more! I figured the main tradeoff was just one of performance (and probably a less significant hit on memory capacity). I wonder why they don't just make it a user-configurable option.

    TBH, I don't know specifics about how Intel implements it. I *assume* they simply set aside a chunk of physical address space to hold the ECC bits for the rest of the address space, but that's just a guess.
  • ganeshts - Tuesday, July 12, 2022 - link

    Looks like you will leave me with nothing to write about for the Atlas Canyon review coming up later this week :)
  • abufrejoval - Thursday, July 14, 2022 - link

    If you keep me updated on the things in your pipeline, I'll make sure not to spoil things ;-)
  • mode_13h - Thursday, July 14, 2022 - link

    But I love your posts! I'll bet < 1% of the article readers look this deep into the comments.
  • mode_13h - Wednesday, July 13, 2022 - link

    Nice review!

    I had one of those ASRock Apollo Lake boards, but never got it to work. It's possible the RAM I got was incompatible, but it was decent quality (Crucial, IIRC) and their website claimed it worked with that board.

    I was sad to see ASRock has no Jasper Lake-based successor, but TBH I'd rather have Elkhart Lake and its in-band ECC support. I'm just now noticing that ASRock Industrial has some tasty-looking options there. Now, if I can just figure out where to buy an IMB-1003D...
  • abufrejoval - Thursday, July 14, 2022 - link

    NUCs can take quite a while for the initial boot, even the Core based models.

    If I hadn’t been distracted at the time, I’d have already given up on the Jasper Lake NUC working with 64GB: I had been ready to turn it off by the time it showed the logo! Must have lasted something like 30 seconds or so, just to test and tune the RAM, which was DDR4-3200 after all and not quite the DDR4-2900 specs it officially prefers.

    Actually, I really hate that vendors increasingly just program at most 2 settings into DIMMs these days, so you can’t recycle them on a different machine.

    Once you realized that bits can rot, it’s very difficult to forego ECC. The very first IBM-PCs had parity and I’m not sure when it got dropped from mainline. Once I started running PCs as home servers, I’ve tried to make sure they had ECC memory. My workstations are also all 128GB ECC.

    I bought the Atoms mostly to run QA for oVirt, not as a “production” platform: low cost and low power was key, ECC simply not an economically viable option.

    They have been running non-stop for years now, with a collective 128GB of RAM and no glitch that I have ever noticed...

    The first time I ever heard of inline ECC was in one of your posts here. After a short moment of “bug-eyed disbelief” it seemed to make sense in an era, when little ever happens in RAM below the granularity of a cache line: the days of truly random RAM where all accesses were equally …slow are long past us, I believe the original Compaq 386 was the first to exploit static column RAM.

    I believe RAM compression was also implemented by an IBM server chipset many years ago, memory encryption is available on every modern laptop, so inline ECC seems very believable and not extremely costly: I’d just love to have the choice!

    As a matter of fact, this gets me asking: Core chips seem to employ ECC practically everywhere on internal registers, caches and data paths, but do Atoms do likewise? I’d guess they would have to for the server variants, so leaving that out for the entry level chips seems almost extra effort, yet I can’t recall hearing any mention one way or another.

    I’ve been trying to buy an Alder Lake replacement for a Haswell Xeon server with ASRock’s IMB-X1712 mainboard mentioned here that supports DDR4-3200 ECC RAM. Unfortunately that’s another phantom product that never seems available for sale.
  • mode_13h - Thursday, July 14, 2022 - link

    > Must have lasted something like 30 seconds or so, just to test and tune the RAM

    Does the BIOS have an option to disable it, or at least a "fast boot" option?

    > Once you realized that bits can rot, it’s very difficult to forego ECC.

    The places where you really want ECC are those where a memory error can get persisted in data of non-trivial value. On fileservers and database servers, it's a must (unless the data is virtually disposable or they're simply providing read-only access).

    In the worst case, a memory error can actually cause filesystem corruption. It's unlikely, but the thing to remember about memory errors is that they're not entirely random or isolated. A DRAM chip could conceivably fail in a way that suddenly results in a large number of memory errors. This will usually crash the machine (if not using ECC), but you could plausibly suffer data corruption just before that happens.

    > I bought the Atoms mostly to run QA for oVirt

    My ASRock board was meant to replace my Raspberry Pi as a streaming media server, for in-home use.

    > The first time I ever heard of inline ECC was in one of your posts here.

    I'm pretty sure the first I'd heard of it was on here, as well. I had a similar reaction as yours, but the more I thought about it, the more sense it made. It'd be hard for me to prefer it when I could have the real deal, but not a bad compromise on something like an Atom-tier platform.

    > little ever happens in RAM below the granularity of a cache line

    Yeah, you could implement it by blocking off 1/8th of RAM (in truth, you'd only need 1/9th, but 1/8th would keep things aligned more nicely) and associating 8 bytes of ECC information per 64-bytes of physical address space. Depending how you implement it, the hit to memory bandwidth could be as little as 11%, for linear accesses.
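
    The arithmetic in the paragraph above can be sketched as follows. This only illustrates the layout guessed at there (1/8th of RAM reserved, 8 bytes of ECC per 64-byte line); it is not a description of how Intel actually implements in-band ECC, and all names in the code are hypothetical.

```python
CACHE_LINE = 64     # bytes of data covered by one ECC entry (assumed)
ECC_PER_LINE = 8    # bytes of ECC stored per 64-byte data line (assumed)


def layout(total_bytes):
    """Split RAM into a usable region and a reserved ECC region (1/8th).

    Returns (usable_bytes, reserved_bytes, ecc_bytes_needed).
    """
    reserved = total_bytes // 8
    usable = total_bytes - reserved
    ecc_needed = (usable // CACHE_LINE) * ECC_PER_LINE
    return usable, reserved, ecc_needed


def ecc_address(data_addr, ecc_base):
    """Address of the ECC bytes protecting the line that holds data_addr."""
    line_index = data_addr // CACHE_LINE
    return ecc_base + line_index * ECC_PER_LINE


def ecc_traffic_fraction_linear():
    """Share of memory traffic spent on ECC for purely linear accesses.

    One 64-byte ECC line covers 8 data lines, so 9 lines are transferred
    for every 8 lines of data.
    """
    data_lines_per_ecc_line = CACHE_LINE // ECC_PER_LINE
    return 1 / (data_lines_per_ecc_line + 1)


if __name__ == "__main__":
    usable, reserved, needed = layout(8 * 2**30)  # hypothetical 8 GiB system
    print(f"usable: {usable / 2**30:.2f} GiB, reserved: {reserved / 2**30:.2f} GiB")
    print(f"ECC actually needed: {needed / 2**30:.3f} GiB")
    print(f"ECC share of linear traffic: {ecc_traffic_fraction_linear():.1%}")
```

    Reserving 1/8th leaves a little slack, since the ECC bytes for the remaining 7/8ths only need 7/64ths of the total, and the best-case traffic overhead comes out at 1/9th, in line with the 11% figure above.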

    TBH, I'm a little more mystified by the concept of memory compression. I guess it'd have to be block based, perhaps decompressing whole pages at a time? Then, when you page fault, some kind of index tells you where the page starts. There'd no doubt be some padding or unused space between the pages (or whatever granularity the blocks are). Perhaps the more interesting aspect would be deciding where to write newly-compressed pages.

    Of the three, memory encryption seems the most straight-forward. You would likely have a 1:1 mapping, so the only tricky part is one of key management.

    > Unfortunately that’s another phantom product that never seems available for sale.

    I'll bet availability is being hampered by just a couple key components being extremely hard to source. I heard some motherboard vendors have been unable to source certain Ethernet MACs. Another example I've heard is RAID controllers.
  • abufrejoval - Thursday, July 14, 2022 - link

    >> Must have lasted something like 30 seconds or so, just to test and tune the RAM

    >Does the BIOS have an option to disable it, or at least a "fast boot" option?

    That was only ever an issue for the initial boot with that RAM. Once it has figured out the RAM speed settings any normal boot is at reasonable speeds.

    I've researched the Elkhart Lake Atoms a bit and they seem quite hard to find. Embedded systems with them sell for eye-watering prices.

    ZFS was always the typical example for why bit flips could have catastrophic consequences when you cache aggressively and keep key data structures in RAM for months or longer.

    I use GlusterFS with VDO de-dup and compression on the Atoms, where a single bit flip could have similarly drastic consequences, but so far I've noticed no issue.

    It seems that getting a low power ECC platform is intentionally made difficult, closest I've recently got was with Ryzen 5750G APUs, which isn't that low power nor that cheap.

    DDR5 with real ECC seems even worse which is why the ASRock board with the W680 chipset and DDR4 support seems so attractive... and unavailable!

    RAM compression: It definitely requires OS support, but other than that seems not too difficult to do. I saw a demo booth at the HiPEAC 2020 conference in Bologna from a Swedish startup I believe, that tries to sell the IP e.g. for integration in RISC-V.
  • mode_13h - Friday, July 15, 2022 - link

    > Elkhart Lake Atoms a bit and they seem quite hard to find.

    They exist, if expensive and uncommon: https://www.newegg.com/p/1JW-003Z-00026

    According to the manufacturer's site, it even seems to support in-band ECC:

    https://www.mitacmct.com/IndustrialMotherboard=PD1...

    However, that would seem to require the PD10EHI-X6413E model, which is *not* so readily available.

    > ASRock board with the W680 chipset

    Yeah, I was starting to browse for W680 boards, recently. I wish I could find an ATX (or micro-ATX) with 2x DDR5 slots, but every one I've found is either DDR4 or 4x DDR5 slots. Anyway, I'm not really in a hurry.

    > RAM compression: ... seems not too difficult to do.
    > ...a Swedish startup ... that tries to sell the IP

    Okay, think about that for a second. Someone thought it offered enough value and is sufficiently hard that they started a company around it!
