7 Comments
c3dtops's avatar

hey Chester - ran the following on a single E-core (core 12).

Specs:

Core Ultra 265K

OS - Fedora Linux 42

Kernel - Linux 6.15.5-200.fc42.x86_64

BIOS version - 19.01, latest as of July 16 2025 (200S Boost enabled)

Asus Motherboard Z890

DDR kit - G.SKILL DDR5-6000 MT/s, 32GBx2, CL30-40-40-96

(model F5-6000J3040G32Gx2-FX5)

https://github.com/ChipsandCheese/Microbenchmarks

Picked MemoryLatency since that was the biggest negative about Arrow Lake.

taskset -c 12 ./MemoryLatency

Usage: [-test <c/asm/tlb/mlp>] [-maxsizemb <max test size in MB>] [-iter <base iterations, default 100000000]

Region,Latency (ns)

2,0.874067

4,0.871156

8,0.869487

12,0.871046

16,0.870000

24,0.869852

32,0.872878

48,4.024554

64,4.098391

96,4.144344

128,4.167482

192,4.202611

256,4.720000

384,5.223539

512,5.475110

600,5.582734

768,5.701232

1024,5.832217

1536,5.959842

2048,6.162089

3072,6.484454

4096,8.368000

5120,10.734433

6144,12.412544

8192,14.061186

10240,14.958428

12288,15.550732

16384,16.721663

24567,25.251892

32768,42.031368

65536,82.608002

98304,94.112236

131072,100.787682

262144,107.434982

393216,109.781296

524288,111.751778

1048576,116.255997
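One way to read these numbers is to look for the region sizes where latency steps up sharply, since those roughly mark where the test spills out of one level of the memory hierarchy into the next. The sketch below is a hypothetical helper, not part of the Microbenchmarks repo. It assumes the Region column is in KB, and the 1.25x step threshold is an arbitrary choice for illustration.

```python
import csv
import io

# MemoryLatency output pasted from above (Region assumed to be in KB).
DATA = """Region,Latency (ns)
2,0.874067
4,0.871156
8,0.869487
12,0.871046
16,0.870000
24,0.869852
32,0.872878
48,4.024554
64,4.098391
96,4.144344
128,4.167482
192,4.202611
256,4.720000
384,5.223539
512,5.475110
600,5.582734
768,5.701232
1024,5.832217
1536,5.959842
2048,6.162089
3072,6.484454
4096,8.368000
5120,10.734433
6144,12.412544
8192,14.061186
10240,14.958428
12288,15.550732
16384,16.721663
24567,25.251892
32768,42.031368
65536,82.608002
98304,94.112236
131072,100.787682
262144,107.434982
393216,109.781296
524288,111.751778
1048576,116.255997
"""


def parse_latency(text):
    """Return a list of (region_kb, latency_ns) tuples from the CSV text."""
    rows = csv.reader(io.StringIO(text))
    next(rows)  # skip the "Region,Latency (ns)" header
    return [(int(r[0]), float(r[1])) for r in rows if r]


def find_jumps(points, ratio=1.25):
    """Return (prev_kb, next_kb, factor) wherever latency grows by more than `ratio`."""
    jumps = []
    for (kb0, ns0), (kb1, ns1) in zip(points, points[1:]):
        factor = ns1 / ns0
        if factor > ratio:
            jumps.append((kb0, kb1, factor))
    return jumps


if __name__ == "__main__":
    for kb0, kb1, factor in find_jumps(parse_latency(DATA)):
        print(f"{kb0} KB -> {kb1} KB: latency x{factor:.2f}")
```

With this threshold the flagged steps are 32 KB to 48 KB, the steps between 3 MB and 5 MB, and the steps between 16 MB and 64 MB. A lower threshold would also pick up the more gradual rise starting around 256 KB.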

lmk if you sense anything about ARL that's worth uncovering/deep-diving into.

Happy to run tests on my setup.

Peter W.'s avatar

Thanks Chester! I may have overlooked it, but I would be interested to see how Arrow Lake performs in games when running only (mostly) on the Skymont E-cores (apologies if you've already written about it and I missed it). I believe that one has to leave at least one P-core enabled in Arrow Lake, but it would still be interesting to see how good, bad or ugly games like CP2077 or CoD would run mostly on the Skymonts.

Marija's avatar

If I remember correctly, the 13900KS runs its uncore at 5.0 GHz, so going back down to 3.8 GHz for ARL is a major performance regression. It would be interesting to know whether it can approach RPL speeds when overclocked; lowering L3 latency might help a lot.
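A rough way to size this: if some part of an L3 access takes a fixed number of uncore cycles, the time those cycles take scales inversely with the uncore clock. The sketch below works through that arithmetic using the 5.0 GHz and 3.8 GHz figures mentioned above. The 40-cycle count is a made-up illustrative number, not a measurement, and real L3 latency also includes core-clocked portions that this ignores.

```python
def cycles_to_ns(cycles, clock_ghz):
    """Time in nanoseconds that `cycles` clock cycles take at `clock_ghz`."""
    return cycles / clock_ghz


RPL_UNCORE_GHZ = 5.0  # figure quoted above for the 13900KS (from memory)
ARL_UNCORE_GHZ = 3.8  # figure quoted above for Arrow Lake

HYPOTHETICAL_RING_CYCLES = 40  # illustrative only, not measured

rpl_ns = cycles_to_ns(HYPOTHETICAL_RING_CYCLES, RPL_UNCORE_GHZ)
arl_ns = cycles_to_ns(HYPOTHETICAL_RING_CYCLES, ARL_UNCORE_GHZ)

# The ratio is 5.0 / 3.8 no matter which cycle count is assumed.
slowdown = arl_ns / rpl_ns

if __name__ == "__main__":
    print(f"{rpl_ns:.2f} ns at {RPL_UNCORE_GHZ} GHz")
    print(f"{arl_ns:.2f} ns at {ARL_UNCORE_GHZ} GHz")
    print(f"uncore-clocked portion takes {slowdown:.2f}x as long")
```

Under that assumption, the uncore-clocked share of the latency takes roughly a third longer at 3.8 GHz than at 5.0 GHz.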

c3dtops's avatar

Great piece of investigative work, thanks for posting it up.

Questions:

Hardware -> DDR5-6000 28-36-36-96 (which brand of memory stick were you using?)

I'm working on setting up some RDMA networking with a SmartNIC and trying to work out which memory kits I should go with. Online reviews suggest more "expensive" kits like CUDIMM to close the gap with X3D.

CPU related

BIOS -> Intel 200S boost enabled?

Frontend Bandwidth -> "8 renamer-slots on Lion-Cove". So for simple ALU/AGU instructions (4-8 bytes encoded length), it can handle the allocation of physical registers at ~8 micro-ops per cycle past the instruction decoders?

Arbitration queue (ARB)

"The ARB runs at the CPU tile’s uncore clock, or 3.8 GHz"

Does "uncore" mean the base clock of the P-cores? Can that be overclocked?

Chester Lam's avatar

G.Skill F5-6000J2836G16G, actually supplied by AMD, but I'm using it in the ARL system because it loads the EXPO profile just fine and it's the fastest memory I have around. I wouldn't pay too much attention to the memory specifics.

I updated the BIOS, but didn't change anything beyond enabling EXPO. I'll check later to see if the 200S boost thing is enabled.

Allocation rate is just 8 micro-ops per cycle, not related to instruction length (unless you have a lot of long instructions, have a low op cache hit rate, and bottleneck the decoders by hitting L1i bandwidth limits).

The uncore clock is the clock of the ring bus, which connects the cores to L3. It's been decoupled from core clocks since Haswell. It can be overclocked (iirc it's a multiplier off a 100 MHz base clock just like with the cores), but I'm looking at the general picture rather than overclocking.
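The multiplier relationship mentioned above can be written out in a couple of lines. This is a minimal sketch assuming a 100 MHz base clock. The multiplier of 38 is inferred from the 3.8 GHz figure rather than read from any BIOS setting, and the function names are made up for illustration.

```python
def clock_mhz(multiplier, bclk_mhz=100.0):
    """Resulting clock in MHz for a given multiplier and base clock."""
    return multiplier * bclk_mhz


def multiplier_for(target_mhz, bclk_mhz=100.0):
    """Nearest whole multiplier that reaches `target_mhz` at the given base clock."""
    return round(target_mhz / bclk_mhz)


if __name__ == "__main__":
    # 38 x 100 MHz corresponds to the 3.8 GHz uncore clock discussed above.
    print(clock_mhz(38))
    # Multiplier an overclock to 5.0 GHz would need, under the same assumption.
    print(multiplier_for(5000))
```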

David. Hellyx's avatar

There is no memory that will magically close a gap that sometimes exceeds 50%.
