The script is already old; I couldn't manage to get it done faster.
It's basically my old Tiger Lake vs. Renoir stance with some extra details.
I will post some summary pictures and further discussion points later.
So time for some summary pictures and some extras.
Willow Cove is a chunky boi in terms of resources and die area, nothing new but some may like a rough comparison table.
Function and implementation details can be quite different between architectures, but you get the idea.
@yichensyd pointed out correctly that the cache hierarchy does function differently between the architectures.
Capacity is not the only thing that matters; latency, associativity and how cache coherency is handled also count, and they can impact effective cache size and effective bandwidth...
...
Anandtech recently captured real latency numbers, 14 clocks for the L2$ and 39-45 for the L3$:
https://www.anandtech.com/show/16084/intel-tiger-lake-review-deep-dive-core-11th-gen/4
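
To put those cycle counts into wall-clock terms, here is a minimal sketch that converts core clocks to nanoseconds. The 4.0 GHz value is an assumed, illustrative clock and not a number taken from the review, and `cycles_to_ns` is a hypothetical helper name.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency given in core clock cycles to nanoseconds."""
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    # One cycle lasts 1 / clock_ghz nanoseconds.
    return cycles / clock_ghz


ASSUMED_CLOCK_GHZ = 4.0  # assumption for illustration only

# Cycle counts as quoted above: 14 for the L2$, 39-45 for the L3$.
for name, cycles in [("L2$", 14), ("L3$ low end", 39), ("L3$ high end", 45)]:
    ns = cycles_to_ns(cycles, ASSUMED_CLOCK_GHZ)
    print(f"{name}: {cycles} clocks = {ns:.2f} ns")
```

The same cycle count maps to a different time at every clock speed, which is why the comparison is usually made in clocks.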

A little bit confusing is the cache coherency side.
Intel says the caches are non-inclusive, so when not strictly exclusive...
...this could mean that the same cacheline may live in the L1$, L2$ and L3$ at the same time.
I read some commentary about the L3$ handling of the Skylake SP processors, which also use "non-inclusive" caches, and it appears as if the L3$ is handled as strictly exclusive?
...
... @trav_downs may know how the cache hierarchy works under Skylake SP.
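
To make the three terms concrete, here is a rough sketch of how the inclusion policy bounds the amount of distinct data the L2$ and L3$ can hold together. It is a simplified model, not a description of any real replacement or coherency logic: it ignores the L1$, ignores lines shared between cores, and assumes the L3$ is at least as large as all L2$ combined. The example sizes (4 cores, 1280 KiB L2$ per core, 12288 KiB L3$) are illustrative values that I believe match a 4-core Tiger Lake, so treat them as an assumption. The function name is hypothetical.

```python
def unique_capacity_kib(l2_kib: int, l3_kib: int, cores: int, policy: str) -> tuple[int, int]:
    """Return (min, max) KiB of distinct data held by all L2$ plus the shared L3$.

    Simplified model: no L1$, no lines shared between cores,
    and l3_kib >= l2_kib * cores.
    """
    l2_total = l2_kib * cores
    if l3_kib < l2_total:
        raise ValueError("model assumes the L3$ is at least as large as all L2$")

    if policy == "inclusive":
        # Every L2$ line must also be present in the L3$,
        # so the L3$ alone bounds the distinct data.
        return (l3_kib, l3_kib)
    if policy == "exclusive":
        # A line lives in the L2$ or the L3$, never both.
        total = l2_total + l3_kib
        return (total, total)
    if policy == "non-inclusive":
        # Duplicates are allowed but not required, so the result
        # lies somewhere between the two cases above.
        return (l3_kib, l2_total + l3_kib)
    raise ValueError(f"unknown policy: {policy}")


for policy in ("inclusive", "non-inclusive", "exclusive"):
    low, high = unique_capacity_kib(1280, 12288, 4, policy)  # assumed example sizes
    print(f"{policy}: {low} to {high} KiB of distinct data")
```

The non-inclusive case is the interesting one: where a real design lands inside that range depends on the fill and eviction behaviour, which is exactly the open question here.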
There are also adaptive strategies which can be employed for replacement and coherency enforcement.
At least for replacement Intel has adaptive strategies which can change the behaviour...
One aspect of the cache hierarchies that needs clarification is AnandTech's reporting.
Based on Intel's documentation, Nehalem and Sandy Bridge used a non-inclusive L2$ and an inclusive L3$; this appears to be true at least up to Skylake (Client)...
... @IanCutress also wrote about a non-inclusive L2$ for Sunny Cove in July 2019 and for previous CPUs
https://www.anandtech.com/show/14514/examining-intels-ice-lake-microarchitecture-and-sunny-cove/3

But now, in August 2020, it is reported as if previous architectures had an inclusive L2$ and this only changed with Willow Cove:
https://www.anandtech.com/show/15971/intels-11th-gen-core-tiger-lake-soc-detailed-superfin-willow-cove-and-xelp/3
Regarding the rasterizer, there were two pictures. They showcase the imbalance between the Pixel FrontEnd Rasterizer (Scan Converter) and the Pixel Backend (ROPs), how throughput goes down when a triangle doesn't cover four pixel quads, and how Polaris10 was built.
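
The small-triangle effect can be sketched with a toy model. It assumes the scan converter works on one triangle at a time and emits at most four pixel quads (2x2 pixels each) per clock; that is a simplified reading of the point above, not a documented hardware specification, and the function name and default parameter are hypothetical.

```python
import math


def rasterizer_utilization(quads_covered: int, quads_per_clock: int = 4) -> float:
    """Fraction of the scan converter's peak quad rate a single triangle uses.

    Toy model: one triangle at a time, at most `quads_per_clock` quads
    emitted per clock, and every started clock counts as a full clock.
    """
    if quads_covered <= 0:
        raise ValueError("a visible triangle covers at least one quad")
    clocks = math.ceil(quads_covered / quads_per_clock)
    return quads_covered / (clocks * quads_per_clock)


for quads in (1, 2, 4, 5, 16):
    print(f"{quads} quad(s): {rasterizer_utilization(quads):.0%} of peak rate")
```

In this model a triangle that touches a single quad still occupies a full clock, so only a quarter of the peak rate is used, and the pixel backend downstream has correspondingly less to do.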
You can follow @Locuza_.