It’s been nearly 10 years since Arm first announced the Armv8 architecture in October 2011, and it has been quite an eventful decade of computing: the instruction set architecture saw increased adoption from the mobile space to the server space, and is now starting to become common in consumer devices such as laptops and upcoming desktop machines. Throughout the years, Arm has evolved the ISA with various updates and extensions to the architecture, some important, some perhaps easily overlooked.

Today, as part of Arm’s Vision Day event, the company is announcing the first details of its new Armv9 architecture, setting the foundation for what Arm hopes will be the computing platform for the next 300 billion chips in the next decade.

The big question that readers will likely be asking themselves is what exactly differentiates Armv9 from Armv8 to warrant such a large jump in the ISA nomenclature. Truthfully, from a purely ISA standpoint, v9 probably isn’t as fundamental a jump as v8 was over v7. Armv8 introduced a completely different execution mode and instruction set with AArch64, which had larger microarchitectural ramifications over AArch32 such as extended registers, 64-bit virtual address spaces and many more improvements.

Armv9 continues to use AArch64 as the baseline instruction set, but adds a few very important extensions that warrant an increment in the architecture numbering. The new number probably also allows Arm to achieve a sort of software re-baselining, covering not only the new v9 features but also the various v8 extensions we’ve seen released over the years.

The three main pillars of Armv9 that Arm sees as the goals of the new architecture are security, AI, and improved vector and DSP capabilities. Security is a very big topic for v9 and we’ll go into the new extensions and features in more depth in a bit, but getting DSP and AI features out of the way first should be straightforward.

Probably the biggest new feature promised with new Armv9-compatible CPUs that will be immediately visible to developers and users is the baselining of SVE2 as a successor to NEON.

Scalable Vector Extensions, or SVE, was first announced back in 2016 and implemented for the first time in Fujitsu’s A64FX CPU cores, now powering the world’s #1 supercomputer, Fugaku, in Japan. The problem with SVE was that this first iteration of the new variable vector length SIMD instruction set was rather limited in scope and aimed more at HPC workloads, missing many of the more versatile instructions which were still covered by NEON.

SVE2 was announced back in April 2019, and looked to solve this issue by complementing the new scalable SIMD instruction set with the needed instructions to serve more varied DSP-like workloads that currently still use NEON.

The benefit of SVE and SVE2, beyond adding various modern SIMD capabilities, is in their variable vector size, ranging from 128b to 2048b in 128b increments, irrespective of what hardware the code is actually running on. Purely from a view of vector processing and programming, it means that a software developer would only ever have to compile their code once, and if in the future a CPU came out with, say, native 512b SIMD execution pipelines, the code would already be able to take advantage of the full width of the units. Similarly, the same code would be able to run on more conservative designs with a lower hardware execution width, which is important to Arm as they design CPUs for everything from IoT to mobile to datacentres. It also does all this whilst remaining within the 32b encoding space of the Arm architecture, whereas alternative implementations such as on x86 have to add new extensions and instructions depending on vector size.
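To make the "compile once, run at any vector width" idea more concrete, here is a minimal Python sketch that simulates a vector-length-agnostic loop. It is only an illustration, not real SVE code: the function names `whilelt` (borrowed from the SVE predicate-generating instruction of the same name) and `vla_add`, along with the 32-bit element size, are assumptions made for this example. The loop never hard-codes the vector width; it asks how many lanes the "hardware" has and uses a predicate mask to switch off lanes that run past the end of the data.

```python
from typing import List


def whilelt(start: int, n: int, lanes: int) -> List[bool]:
    """Build a predicate mask: lane i is active while start + i < n."""
    return [start + i < n for i in range(lanes)]


def vla_add(a: List[int], b: List[int], vector_bits: int,
            element_bits: int = 32) -> List[int]:
    """Element-wise add written without knowing the vector width in advance.

    vector_bits stands in for the hardware vector length, which the
    article gives as 128 to 2048 bits in steps of 128.
    """
    if vector_bits % 128 != 0 or not 128 <= vector_bits <= 2048:
        raise ValueError("vector length must be a multiple of 128 in 128..2048")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")

    lanes = vector_bits // element_bits  # lanes per simulated vector register
    n = len(a)
    out = [0] * n
    i = 0
    while i < n:
        mask = whilelt(i, n, lanes)  # the tail is handled by the predicate
        for lane, active in enumerate(mask):
            if active:
                out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes  # advance by however many lanes this "hardware" has
    return out


a = list(range(10))
b = [10 * x for x in a]
# The same routine, unchanged, on three different simulated vector widths.
results = {bits: vla_add(a, b, bits) for bits in (128, 512, 2048)}
```

All three entries in `results` hold the same sums; only the number of loop iterations differs between widths, which is the property that lets one binary scale across implementations.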

Machine learning is also seen as an important part of Armv9, as Arm expects more and more ML workloads to become commonplace in the coming years. Running ML workloads on dedicated accelerators will naturally still be a requirement for anything that is performance or power-efficiency critical; however, there will still be vast new adoption of smaller-scope ML workloads that will run on CPUs.

Matrix multiplication instructions are key here, and having them as a baseline feature of v9 CPUs will represent an important step towards larger adoption across the ecosystem.
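As a rough illustration of what a matrix multiply-accumulate instruction computes, the sketch below models a single small tile operation in Python. The article does not name specific instructions, so this is an assumption: the shape (a 2x8 block times an 8x2 block, accumulated into a 2x2 result) is loosely modelled on the 8-bit integer matrix multiply instructions added in Armv8.6, and the name `mmla_tile` and the operand layout are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def mmla_tile(acc: Matrix, a: Matrix, b_t: Matrix) -> Matrix:
    """Return acc + a @ transpose(b_t) for one small tile.

    a   : rows x depth block (e.g. 2x8 small integers)
    b_t : cols x depth block, i.e. the right-hand matrix stored transposed
    acc : rows x cols accumulator (wider integers in real hardware)
    """
    rows, cols, depth = len(a), len(b_t), len(a[0])
    return [
        [acc[i][j] + sum(a[i][k] * b_t[j][k] for k in range(depth))
         for j in range(cols)]
        for i in range(rows)
    ]


a = [[1] * 8, [2] * 8]
b_t = [[1, 2, 3, 4, 5, 6, 7, 8], [1] * 8]
acc = [[100, 0], [0, 0]]
tile = mmla_tile(acc, a, b_t)  # one "instruction" worth of work
```

A single such operation replaces what would otherwise be many separate multiply and add steps, which is why these instructions matter for the smaller ML workloads that stay on the CPU.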

Generally, I see SVE2 as probably the most important factor that would warrant the jump to a v9 nomenclature, as it’s a more definitive ISA feature that differentiates it from v8 CPUs in everyday usage, and one that would justify the software ecosystem actually diverging from the existing v8 stack. That has become quite a problem for Arm in the server space, as the software ecosystem is still baselining software packages on v8.0, which unfortunately is missing the all-important v8.1 Large System Extensions.

Having the whole software ecosystem move forward and be able to assume that new v9 hardware supports the new architectural extensions would help push things ahead, and would probably resolve some of the current baselining problems.
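Until software can assume a newer baseline, the usual workaround is to check at run time what the CPU supports and pick a code path accordingly. The following Python sketch shows the idea under stated assumptions: it parses text in the style of Linux's `/proc/cpuinfo` on AArch64, where flags such as `asimd` (NEON), `atomics` (the v8.1 Large System Extensions), `sve` and `sve2` are reported. The sample text is made up for the example, and the helper names `parse_features` and `pick_code_path` are hypothetical.

```python
from typing import Set


def parse_features(cpuinfo_text: str) -> Set[str]:
    """Collect the flags from the first 'Features' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return set(value.split())
    return set()


def pick_code_path(features: Set[str]) -> str:
    """Choose the widest SIMD path the reported features allow."""
    if "sve2" in features:
        return "sve2"
    if "asimd" in features:
        return "neon"
    return "scalar"


# Made-up sample; on a real Linux system the text would come from /proc/cpuinfo.
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd crc32 atomics sve sve2\n"

features = parse_features(SAMPLE)
chosen = pick_code_path(features)
has_lse = "atomics" in features
```

A guaranteed v9 baseline would make this kind of dispatch unnecessary for SVE2: a package built for v9 could simply assume the extension is present.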

However, v9 isn’t only about SVE2 and new instructions; it also has a very large focus on security, where we’ll be seeing some more radical changes.

74 Comments

  • JoeDuarte - Wednesday, March 31, 2021 - link

    This is not true. Most developers have never used any SIMD and don't plan to. Some of them don't even know what SIMD is. You're severely overestimating its importance. Software developers are generally lazy and produce lots of underperforming and poorly optimized code.

    Given that Arm introduced SVE several years ago, and no one has even implemented it in a processor that you can buy, I don't know why you think Arm's noises about SVE2 matter. It won't matter. They're so fragmented that they can't even get consistent implementation of the latest versions of v8, like v8.3/4/5.

    Apple doesn't even want developers to optimize at that level, to use assembly or intrinsics, so they make it hard to even know what instructions are supported in their Arm CPUs. They want everyone to use terrible languages like Swift. On Android, there's so much fragmentation that you can't count on support for later versions of v8.x.

    SVE2 would matter on servers if and when Arm servers become a thing, a real thing, like a you can buy one from Supermicro kind of thing. They would need to be common, with accessible hardware. Developers will need access to the chips, either on their desks or in the cloud. It would need to be reliable access – the cloud generally isn't reliable that way, as there have been cases where AWS dropped people down to instances running on pre-Haswell CPUs, which broke developers' code using AVX2 instructions...

    You can't develop for SVE2 without access to hardware that supports it. Right now that hardware does not exist. Arm v9 isn't going to yield any hardware that supports SVE2 for a year or longer, and it might be four years or so before it's easily accessed, or longer, possibly never. By the time it's readily available, so many other variables will have changed in the market dynamic between Arm, AMD, and Intel that your claim doesn't work.
  • Ppietra - Friday, April 2, 2021 - link

    A lot of developers might not even know what SIMD is, but I would argue that a lot of apps actually end up using SIMD simply because many APIs to the system make use of NEON
  • Krysto - Tuesday, March 30, 2021 - link

    MTE will likely end up more of a short-term solution, as all such solutions are.

    If Arm was serious about actually getting rid of the majority of memory bugs, they would have announced first-class support for the Rust programming language.
  • SarahKerrigan - Tuesday, March 30, 2021 - link

    https://developer.arm.com/solutions/internet-of-th...

    Rust has been well-supported on ARM for a while.
  • Wilco1 - Wednesday, March 31, 2021 - link

    Many languages have claimed to solve all computing problems, but none did as well as C/C++. Why would Rust be any better than Java, C#, D, Swift, Go etc?

    Also you're forgetting that compilers and runtimes will still have bugs. 100% memory safe is only achievable using ROM.
  • kgardas - Wednesday, March 31, 2021 - link

    Because of all the languages mentioned, Rust is the one that is not GC-based and has the highest chance of being used in systems programming. See the addition of Rust to the Linux kernel. See MS's praise for Rust, etc. Generally speaking, Rust is more type-safe and memory-safe than C, and good old C is really old enough to be replaced completely.
  • Wilco1 - Wednesday, March 31, 2021 - link

    Ditching GC is good but it doesn't solve the fundamental problem. I once worked on a new OS written in a C# variant, and it was riddled with constructs that switch off type checking and GC in order to get actual work done. So in the end it didn't gain safety while still suffering from all the extra overheads of using a "safe" language.

    So I remain sceptical that yet another new language can solve anything - it's hard to remain safe while messing about with low level registers, stacks, pointers, heaps etc. Low-level programming is something some people can do really well and others can never seem to master.
  • mdriftmeyer - Thursday, April 1, 2021 - link

    We're not talking C89 but C17 and most OS solutions are already implementing those modern features. C2x has an awful lot of work being finalized into it.

    http://www.open-std.org/jtc1/sc22/wg14/www/wg14_do...

    Good old C isn't old anymore.

    And no, Linus, Apple, Microsoft aren't ditching C/C++ in their Kernels for Rust.
  • melgross - Saturday, April 10, 2021 - link

    Rust will be safer until the hacking community is interested enough to find all of the bugs and poor thinking that undoubtedly exist in Rust, as they have in every language over the decades that was declared safe.
  • JoeDuarte - Wednesday, March 31, 2021 - link

    Are you aware of formal verification? There are formally verified OSes now, like seL4.

    There's also the CHERI CPU project, which Arm is involved in.

    And formally verified compilers like INIFRIA.

    We need to junk C and C++ and replace them with serious programming languages that are far more intuitive and sane, as well as being memory safe. Rust is terrible from a syntax and learning standpoint, and much better languages are possible. The software industry is appallingly lazy.
