Mismatch Between the LLC Misses and the MBM Counter #283

Open
yangziy opened this issue Sep 23, 2024 · 0 comments

yangziy commented Sep 23, 2024

When I run a program that is designed to miss the LLC on every access, the reported LLC miss count does not match the MBM counters, which stay at zero. A minimal version of the program (about 40 LoC) is included below.

# The program is pinned at core #3
 CORE         IPC      MISSES     LLC[KB]   MBL[MB/s]   MBR[MB/s]
   3        0.34      55601k     22464.0         0.0         0.0

The miss count is 55601k, so with one 64-byte line fetched per miss, and assuming the count covers a one-second interval, I would expect the MBL counter to read approximately 55601 * 1000 * 64 / 1024^2 ≈ 3394 MB/s. Instead, it always stays at 0.
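The estimate above can be written out as a small C helper. This is only a sketch of the arithmetic: `expected_mbm_mb_per_s` is a hypothetical name (not part of the PQoS library), and it assumes that each miss transfers exactly one cache line and that the miss count covers `interval_s` seconds.

```c
#include <stdint.h>

/* Hypothetical helper: bandwidth in MB/s (1 MB = 2^20 bytes) implied by a
 * miss count, assuming each miss moves exactly one cache line and the
 * count was collected over interval_s seconds. */
static double expected_mbm_mb_per_s(uint64_t misses, unsigned line_bytes,
                                    double interval_s)
{
    double bytes = (double)misses * (double)line_bytes;
    return bytes / (1024.0 * 1024.0) / interval_s;
}
```

Calling `expected_mbm_mb_per_s(55601000, 64, 1.0)` gives roughly 3393.6, which rounds to the 3394 MB/s quoted above.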

/* This program repeatedly walks a 1GB buffer with a large stride
so that every access maps to the same cache set */

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <linux/mman.h>

#define SIZE_1GB (1024*1024*1024)

/* The MBM counter works as expected when the stride is 64B */
// #define STRIDE 64

/**
    Always access the same cache set.
    These numbers are specific to the Xeon 8275CL on EC2 c5.metal:
        NUM_CL = LLC_SIZE / CL_SIZE = 35.75MB / 64B = 585728
        NUM_SETS = NUM_CL / NUM_WAYS = 585728 / 11 = 53248
        NUM_SETS * CL_STRIDE = 53248 * 64 = 3407872 = 0x340000
 */
#define STRIDE 0x340000


int main() {
    // Use a physically contiguous 1GB huge page so that every access maps to the same cache set
    register char *buf = (char *)mmap(/*addr*/ 0, /*len*/ SIZE_1GB,
                /*prot*/ PROT_READ | PROT_WRITE,
                /*flags*/ MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB, /*fd*/ -1,
                /*offset*/ 0);
    if (buf == MAP_FAILED) {
        // mmap fails if no 1GB huge page is available
        perror("mmap");
        return 1;
    }
    register unsigned offset = 0;
    register uint64_t val = 10;

    while (1) {
        // Load 8 bytes from buf + offset; val is an output operand because the load writes it
        asm volatile("movq (%1), %0\n\t" : "=r"(val) : "r"(buf + offset) : "memory");
        offset += STRIDE;
        if (offset >= SIZE_1GB) {
            offset = 0;
        }
    }

    // Not reached: the loop above never exits
    munmap(buf, SIZE_1GB);
}
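
The stride used in the program can be re-derived with a few lines of C. The sketch below assumes, as the program's comment does, a 35.75 MB, 11-way LLC with 64-byte lines and a plain modulo set index (no slice hashing). Whether the hardware really indexes the LLC this way is an assumption, and the function names are illustrative only.

```c
#include <stdint.h>

#define CL_SIZE 64u  /* cache line size in bytes */

/* Number of sets = (LLC bytes / line size) / ways. */
static uint64_t llc_num_sets(uint64_t llc_bytes, unsigned ways)
{
    return llc_bytes / CL_SIZE / ways;
}

/* Smallest stride that returns to the same set under modulo indexing. */
static uint64_t same_set_stride(uint64_t llc_bytes, unsigned ways)
{
    return llc_num_sets(llc_bytes, ways) * CL_SIZE;
}

/* Set index of a byte offset, assuming plain modulo indexing. */
static uint64_t set_index(uint64_t offset, uint64_t num_sets)
{
    return (offset / CL_SIZE) % num_sets;
}

/* Distinct offsets visited in one pass: 0, stride, 2*stride, ... < buf_bytes. */
static uint64_t lines_per_pass(uint64_t buf_bytes, uint64_t stride)
{
    return (buf_bytes + stride - 1) / stride;
}
```

With `llc_bytes = 37486592` (35.75 MB) and `ways = 11`, `llc_num_sets` gives 53248 and `same_set_stride` gives 0x340000, matching the comment in the program. `lines_per_pass` for the 1GB buffer gives 316, so under the modulo assumption each pass touches 316 lines in one set, well above the 11 ways.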

=== Configuration ===

Platform: AWS c5.metal
CPU: Xeon 8275CL
OS And Kernel Version: Ubuntu 24.04 (6.8.0-1012-aws)
PQoS library version: 6.0.0
