Why does getarch report an L1d size (8k) different from lscpu (32k)? #1232
Could be a mistake or omission in the get_cacheinfo() function of cpuid_x86.c - is this with the current develop
As far as I can tell, the entries in the big switch{} statement around line 340 of cpuid_x86.c appear to match the enumeration at http://www.sandpile.org/x86/cpuid.htm#level_0000_0002h from which it is derived (at least as far as entries for "data L1 cache" are concerned). Perhaps it would help to know what info[i] contains in your case.
Heh. There seems to be a "break" missing after line 645 of cpuid_x86.c, causing the 0x63 case to fall through into the unrelated 0x66 case, which sets L1D to 8k. Can you check whether that fixes your problem, or would you like me to supply the modified file?
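To illustrate the kind of fall-through described above, here is a minimal sketch. It is not the code from cpuid_x86.c: the struct, its field names, the function names, and the TLB entry count are all made up for illustration, and 0x63 is assumed to be a TLB descriptor that has nothing to do with the L1 data cache.

```c
/* Hypothetical, simplified stand-in for the values get_cacheinfo() fills in;
   the names are illustrative and not taken from cpuid_x86.c. */
struct cache_sketch {
    int l1d_size_kb;   /* L1 data cache size in KB */
    int dtlb_entries;  /* data TLB entries (illustrative value only) */
};

/* Buggy variant: no break after case 0x63, so control runs on into case 0x66. */
void decode_buggy(unsigned char desc, struct cache_sketch *c)
{
    switch (desc) {
    case 0x63:                 /* assumed: a TLB descriptor, unrelated to L1D */
        c->dtlb_entries = 32;
        /* missing break here */
    case 0x66:                 /* L1D: 8 KB */
        c->l1d_size_kb = 8;
        break;
    }
}

/* Fixed variant: each case ends with its own break. */
void decode_fixed(unsigned char desc, struct cache_sketch *c)
{
    switch (desc) {
    case 0x63:
        c->dtlb_entries = 32;
        break;
    case 0x66:
        c->l1d_size_kb = 8;
        break;
    }
}
```

With descriptor 0x63, the buggy variant ends up reporting an 8 KB L1D even though no L1D descriptor was seen, while the fixed variant leaves the L1D size untouched.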
Patch committed as obvious (famous last words...), please check if this actually fixes your problem.
I applied your patch and I do not get any L1 cache information. My CPU is Kaby Lake. With lscpu I get:
Me too ...
So there must be more to this - I have reverted my "quick fix" now, sorry for that. I will try to get a dump of the info array from get_cacheinfo on Kaby Lake later today to see if there are any unhandled or conflicting returns.
On Kaby Lake the non-zero elements of the info array turn out to be 0x63 0x03 0x76 0xFF 0xB5 0xF0 0xC3, of which the unhandled 0xB5 is supposed to set the code TLB values (4/8/64) and 0xFF means "use cpuid(4,...) to query cache configuration". So it appears the inadvertent fall-through from 0x63 to 0x66 just masks this by supplying a wrong but passable value for the L1 size.
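As a rough sketch of how the 0xFF marker in that list could be detected, the helper below scans the descriptor bytes and reports whether cpuid leaf 4 has to be consulted. The function name is hypothetical and this is not how get_cacheinfo() is actually structured.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the descriptor list contains 0xFF,
   i.e. the cache configuration must be queried through cpuid leaf 4
   instead of being decoded from the descriptor bytes themselves. */
int needs_cpuid_leaf4(const unsigned char *desc, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (desc[i] == 0xFF)
            return 1;
    }
    return 0;
}
```

Fed the seven Kaby Lake bytes listed above, it returns 1, which is consistent with no descriptor in that list carrying an L1 data cache size.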
Trouble is that the extended cpuid call 0x8000_0005 does not appear to return anything non-zero on Kaby Lake (probably on Intel CPUs in general?), and standard 0x8000_0004 does not seem to provide the cache size (unless it is implicit in one of the items). I wonder if it makes sense to just fall back to a hardcoded 32k L1 code/data cache size for "modern" Intel processors if the value could not be read?
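For reference, as far as I know the size is indeed implicit in cpuid leaf 4: Intel documents it as the product of ways, partitions, line size and sets, each stored minus one in EBX and ECX. Below is a sketch of that arithmetic together with the 32k fallback proposed above. The function names and the fallback macro are hypothetical, reading the registers (for example through a compiler cpuid intrinsic) is left out, and so is the check of the cache type field in EAX that tells you whether a subleaf describes a cache at all.

```c
#include <stdint.h>

/* Assumption: the 32k fallback value suggested in this thread. */
#define FALLBACK_L1_BYTES (32u * 1024u)

/* Cache size in bytes from the EBX and ECX values returned by cpuid leaf 4.
   Every field is stored as "value minus one". */
uint32_t leaf4_cache_bytes(uint32_t ebx, uint32_t ecx)
{
    uint32_t line_size  = (ebx & 0xFFFu) + 1u;          /* EBX bits 11:0  */
    uint32_t partitions = ((ebx >> 12) & 0x3FFu) + 1u;  /* EBX bits 21:12 */
    uint32_t ways       = ((ebx >> 22) & 0x3FFu) + 1u;  /* EBX bits 31:22 */
    uint32_t sets       = ecx + 1u;                     /* ECX            */
    return ways * partitions * line_size * sets;
}

/* Hypothetical helper: keep a detected size, otherwise use the fallback. */
uint32_t l1_size_or_fallback(uint32_t detected_bytes)
{
    return detected_bytes ? detected_bytes : FALLBACK_L1_BYTES;
}
```

For an 8-way cache with one partition, 64-byte lines and 64 sets, this gives 8 * 1 * 64 * 64 = 32768 bytes, which matches the 32K that lscpu reports.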
Can either of you test with #1236 please? (I assume it will work on Linux and Windows; if anything, there may be problems on OS X or other x86-based operating systems that I cannot test.)
The patch works for me. Now I get:
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 94
Stepping: 3
CPU MHz: 816.664
BogoMIPS: 6816.07
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
hd@WellOcean12:~/OpenBLAS$ ./getarch 1
#define HASWELL
#define L1_CODE_SIZE 16384
#define L1_CODE_ASSOCIATIVE 4
#define L1_CODE_LINESIZE 64
#define L1_DATA_SIZE 8192
#define L1_DATA_ASSOCIATIVE 4
#define L1_DATA_LINESIZE 64
#define L2_SIZE 262144
#define L2_ASSOCIATIVE 8
#define L2_LINESIZE 64
hd@WellOcean12:~/OpenBLAS$ more /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 94
model name : Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
stepping : 3
microcode : 0x84
cpu MHz : 800.062
cache size : 8192 KB