Arm Cortex A725: Improvements to Middle Core Efficiency

The Arm Cortex-A725 is designed to balance performance and power efficiency, making it a key part of Arm's second-generation Armv9.2 core lineup. Positioned as the mid-tier core, it complements the high-performance Cortex-X925 by handling everyday computing tasks while keeping energy use in check. It targets devices that need consistent performance without the power draw of top-tier cores, such as smartphones, tablets, and laptops.

The Cortex-A725 builds on its predecessor, the Cortex-A720, with several key architectural enhancements. Chief among them are a larger instruction issue queue and an expanded reorder buffer, which let the core keep more instructions in flight and execute them out of order. This wider out-of-order execution window helps the Cortex-A725 keep its execution units busy, leading to faster processing of complex workloads.
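To see why a larger reorder buffer helps, here is a toy model, not a description of the A725's actual pipeline. It assumes independent instructions, one dispatch per cycle, in-order retirement, and made-up latencies and buffer sizes; the function name `cycles_to_run` is purely illustrative.

```python
from collections import deque


def cycles_to_run(latencies, rob_size):
    """Cycles needed to dispatch and retire every instruction.

    Toy assumptions: all instructions are independent, one instruction is
    dispatched per cycle when the reorder buffer (ROB) has a free entry,
    and instructions retire strictly in program order.
    """
    rob = deque()        # completion cycle of each in-flight instruction
    cycle = 0
    next_instr = 0
    while next_instr < len(latencies) or rob:
        cycle += 1
        # Retire from the head while the oldest instruction has completed.
        while rob and rob[0] <= cycle:
            rob.popleft()
        # Dispatch one new instruction if there is room in the ROB.
        if next_instr < len(latencies) and len(rob) < rob_size:
            rob.append(cycle + latencies[next_instr])
            next_instr += 1
    return cycle


# Hypothetical workload: one 20-cycle load followed by seven 1-cycle ops, repeated.
workload = ([20] + [1] * 7) * 8

small_window = cycles_to_run(workload, rob_size=4)
large_window = cycles_to_run(workload, rob_size=32)
print(f"ROB of 4:  {small_window} cycles")
print(f"ROB of 32: {large_window} cycles")
```

With the small buffer, every long-latency load blocks retirement and the buffer fills, so dispatch stalls until the load completes. The larger buffer lets younger instructions keep flowing behind the load, which is the effect a bigger out-of-order window is meant to deliver.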

The core also benefits from a new 1MB L2 cache configuration, which provides faster access to frequently used data and instructions. This larger cache size is designed to reduce latency and improve performance, particularly for applications that require rapid data retrieval. Additionally, the Cortex-A725 features enhancements in its register file structure, further streamlining data processing and reducing bottlenecks.
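The effect of a larger L2 can be sketched with the textbook average memory access time (AMAT) formula. Every latency and miss rate below is a hypothetical placeholder chosen for illustration; none of them are Arm's figures for the A725.

```python
def average_access_time(l1_hit_ns, l1_miss_rate, l2_hit_ns, l2_miss_rate, next_level_ns):
    """Textbook AMAT for a two-level cache in front of a slower next level.

    AMAT = L1 hit time + L1 miss rate * (L2 hit time + L2 miss rate * next-level time)
    """
    return l1_hit_ns + l1_miss_rate * (l2_hit_ns + l2_miss_rate * next_level_ns)


# Hypothetical numbers: a larger L2 is assumed to lower the L2 miss rate.
smaller_l2 = average_access_time(1.0, 0.10, 5.0, 0.50, 50.0)
larger_l2 = average_access_time(1.0, 0.10, 5.0, 0.35, 50.0)

print(f"AMAT with smaller L2: {smaller_l2:.2f} ns")
print(f"AMAT with larger L2:  {larger_l2:.2f} ns")
```

The sketch holds the L2 hit time constant for simplicity. In real designs a larger cache can also change hit latency, so the net benefit depends on the workload.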

Power efficiency is a crucial aspect of the Cortex-A725's design. Leading-edge 2024 Cortex chips are expected to be fabbed on newly available 3nm process technologies from TSMC and others, and the gains from these nodes can drive big improvements in energy efficiency. Arm is leaning into that heavily with the A725 and is touting significant power savings over previous generations: compared to the Cortex-A720, the Cortex-A725 offers up to a 25% improvement in power efficiency (and a 20% reduction in L3 traffic), making it a strong fit for mobile devices that need long battery life.
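As a rough way to read the 25% figure, the sketch below assumes "power efficiency" means performance per watt. That interpretation is an assumption, not something Arm's claim as quoted here spells out, and the 1 W baseline is hypothetical.

```python
def power_at_same_performance(baseline_watts, efficiency_gain):
    """Power needed to match baseline performance after a perf-per-watt gain.

    Assumes efficiency is performance per watt, so at equal performance
    power scales by 1 / (1 + gain).
    """
    return baseline_watts / (1.0 + efficiency_gain)


baseline = 1.0  # hypothetical A720 power draw in watts
a725 = power_at_same_performance(baseline, 0.25)
saving = 1.0 - a725 / baseline
print(f"Power at the same performance: {a725:.2f} W ({saving:.0%} lower)")
```

Under that assumption, a 25% efficiency gain corresponds to roughly 20% less power for the same work, not 25% less.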

The core also features advanced power management capabilities, including dynamic voltage and frequency scaling (DVFS) and half-slice power-down modes. These features allow the Cortex-A725 to adjust its power consumption based on the current workload, ensuring energy is used efficiently without sacrificing performance. 
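The reason DVFS saves energy can be sketched with the standard approximation for dynamic CMOS power, P = C * V^2 * f. The operating points, the capacitance value, and the helper names below are all hypothetical; they are not taken from any A725 documentation, and a real governor weighs far more than a single frequency target.

```python
from typing import NamedTuple


class OperatingPoint(NamedTuple):
    volts: float
    hertz: float


def dynamic_power(c_eff, point):
    """Approximate dynamic power: effective capacitance * V^2 * f."""
    return c_eff * point.volts ** 2 * point.hertz


def pick_operating_point(points, required_hz):
    """Pick the slowest point that still meets the required frequency.

    Falls back to the fastest point when no point is fast enough.
    """
    eligible = [p for p in points if p.hertz >= required_hz]
    if not eligible:
        return max(points, key=lambda p: p.hertz)
    return min(eligible, key=lambda p: p.hertz)


# Hypothetical voltage/frequency table and effective capacitance.
POINTS = [
    OperatingPoint(0.60, 0.8e9),
    OperatingPoint(0.75, 1.8e9),
    OperatingPoint(0.90, 2.8e9),
]
C_EFF = 1.0e-9  # farads, illustrative only

for demand_hz in (0.5e9, 1.0e9, 2.5e9):
    point = pick_operating_point(POINTS, demand_hz)
    watts = dynamic_power(C_EFF, point)
    print(f"demand {demand_hz / 1e9:.1f} GHz -> {point.hertz / 1e9:.1f} GHz "
          f"at {point.volts:.2f} V, about {watts:.2f} W")
```

Because voltage enters the formula squared, dropping to a lower voltage and frequency point cuts power by more than the frequency reduction alone would suggest, which is why scaling both together pays off on light workloads.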


55 Comments

  • eastcoast_pete - Wednesday, May 29, 2024 - link

    Speaking of SVE and SME: are there any applications (for Android, Windows-on-ARM or Apple devices) available to the general public that use either or both of them? SVE was originally co-developed by ARM and Fujitsu for the core that powers Fugaku, Riken's supercomputer. There are reports (rumors) that SVE is painful to implement, and someone wrote that Qualcomm elected to not enable SVE in their 8 Gen3 SoC, even though it's in their big cores. Anyone here knows, can comment? Right now, outside of 1-2 benchmarks, which applications actually use SVE, never mind SME?
  • name99 - Wednesday, May 29, 2024 - link

    Presumably ARM’s Kleidi AI libraries (and various MS equivalents) use SVE and SME if present.
    And that’s really what matters. This functionality is envisaged (for now) as “built-in”.
    Obviously they want developer buy-in over time, but that’s not what matters right now; what matters is what’s in the OS and API’s. Same as the fact that AMX was available to developers via Accelerate was great, but the primary user was Apple’s ML APIs.
  • Marlin1975 - Thursday, May 30, 2024 - link

    What do you mean, its all there. They went over the Optimized design that will take advantage of the synergies of the new NM tech from a leading edge lithography manufacture and lead them to greater performance. Its a win win for everyone, are you not onboard?

    :)
  • syxbit - Wednesday, May 29, 2024 - link

    I suspect this will still be worse than the A17 and the Nuvia chips.
  • GC2:CS - Wednesday, May 29, 2024 - link

    A17 and M3 and M4 did not show much benefit by going to the 3nm. If ARM can do better than only good for them.
  • BGQ-qbf-tqf-n6n - Wednesday, May 29, 2024 - link

    A17 was already 30% faster than S8G3 in single-core scores. In the same GB tests ARM is referring to, M4 is 27% faster still.

    Presuming the X925 is relative to the X4 with “36 percent faster”, they’ll still be behind M3, much less M4.
  • OreoCookie - Saturday, June 1, 2024 - link

    The speed ups in single and multi core were significant. To my knowledge the 10-core M4 is the fastest stock CPU in single core performance that was tested (about 13 % faster than Intel's Core i9 14900 KS, which clocks up to 6.2 GHz stock). The M3 is about 6 % behind the 14900 KS. (I am unaware of e. g. SPECmark results for the M4.)
  • mode_13h - Saturday, June 1, 2024 - link

    > I am unaware of e. g. SPECmark results for the M4.

    I'm pretty sure nobody is testing that, since Anandtech stopped doing it (i.e. after Andrei left).
  • OreoCookie - Sunday, June 2, 2024 - link

    Yeah, and it seems nobody is doing it consistently across several generations. The best dissection of the M3 architecture I remember was by a Chinese Youtube channel, but nobody is carrying the baton. Maybe Ian and Andrei are doing this as part of their work for clients. (Andrei, I think, is working for Qualcomm now, isn't he?)
  • mode_13h - Monday, June 3, 2024 - link

    name99 would know what M3 analysis is out there. He wrote/compiled the Apple M1 explainer, which is a 300-page PDF you can find with all the details about it.

    https://github.com/name99-org/AArch64-Explore/
