Skip to content

Conversation

@wangrong1069
Copy link
Contributor

environment:
kernel
Linux uos-PC 4.19.0-18-arm64 #1 SMP Debian 4.19.208-1 (2021-09-29) aarch64 GNU/Linux
linux-headers-4.19.0-18-arm64_4.19.208-1_arm64.deb
linux-headers-4.19.0-18-common_4.19.208-1_all.deb
linux-image-4.19.0-18-arm64_4.19.208-1_arm64.deb
linux-kbuild-4.19_4.19.208-1_arm64.deb
linux-libc-dev_4.19.208-1_arm64.deb
libc
GNU C Library (Debian GLIBC 2.28.17-1+eagle) stable release version 2.28.
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 8.3.0.
libc ABIs: UNIQUE ABSOLUTE
For bug reporting instructions, please see:
http://www.debian.org/Bugs/.
trafficserver
trafficserver_8.0.2+ds-1+deb10u1_arm64.deb
os
debian 10 buster

operations:
install trafficserver
update /etc/trafficserver/records.config for proxy
CONFIG proxy.config.url_remap.remap_required INT 0
CONFIG proxy.config.http.cache.http INT 1
CONFIG proxy.config.reverse_proxy.enabled.INT 0
restart trafficserver
curl -I -x 127.0.0.1:8080 https://github.com/
# block and not get the target web page

log:
/var/log/trafficserver/manager.log
[Jan 7 10:02:29.902] {0xffff995c1010} NOTE: Relaunching proxy after 4 sec...
[Jan 7 10:02:33.902] {0xffff995c1010} NOTE: [ProxyStateSet] Traffic Server Args: ' -M'
[Jan 7 10:02:33.902] {0xffff995c1010} NOTE: [LocalManager::listenForProxy] Listening on port: 8080 (ipv4)
[Jan 7 10:02:33.902] {0xffff995c1010} NOTE: [LocalManager::listenForProxy] Listening on port: 8080 (ipv6)
[Jan 7 10:02:33.902] {0xffff995c1010} NOTE: [LocalManager::startProxy] Launching ts process
[Jan 7 10:02:33.914] {0xffff995c1010} NOTE: [LocalManager::pollMgmtProcessServer] New process connecting fd '12'
[Jan 7 10:02:33.914] {0xffff995c1010} NOTE: [Alarms::signalAlarm] Server Process born
[Jan 7 10:02:33.915] {0xffff995c1010} NOTE: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: Segmentation fault
[Jan 7 10:02:33.915] {0xffff995c1010} NOTE: [Alarms::signalAlarm] Server Process was reset
[Jan 7 10:02:34.917] {0xffff995c1010} NOTE: Relaunching proxy after 8 sec...

analyze:
"traffic_server" received signal SIGSEGV, Segmentation fault.
0x0000fffff7f91664 in freelist_free (item=0xfffffffff56c8000, f=0x9c7b00) at ink_queue.cc:319
space allocated
ats_memalign(alignment, INK_ALIGN(alloc_size, alignment)); => 0xfffff56c8000
space used
FREELIST_POINTER(item) => 0xfffffffff56c8000

more-test:
test.c
#include <stdio.h>
#include <stdlib.h>

    int main(int argc, char *argv[])
    {
        printf("%x\n", malloc(sizeof(int)));
        printf("%p\n", malloc(sizeof(int)));

        void *ptr = 0;
        posix_memalign(&ptr, 4096, 294912); # parameters from traffic_server
        printf("%x\n", ptr);
        printf("%p\n", ptr);
    }
posix_memalign on amd64
    mmap(NULL, 303104, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1bb10e7000
posix_memalign on arm64
    mmap(NULL, 303104, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9ddce000

root-cause:
on arm64, compression and decompression of pointers are inconsistent

update:
struct head_p {
data_type data;
# original:
# 0 ~ 47 bits : 48 bits, Virtual Address
# 48 ~ 63 bits : 16 bits, Freelist Version
# new:
# 0 ~ 48 bits : 49 bits, Virtual Address
# 49 ~ 63 bits : 15 bits, Freelist Version
}

@wangrong1069 wangrong1069 changed the base branch from master to 8.0.x January 12, 2022 07:54
@wangrong1069 wangrong1069 changed the title 8.0.x fix trafficserver segmentation fault on arm64 Jan 12, 2022
@ywkaras
Copy link
Contributor

ywkaras commented Jan 12, 2022

I can't locate a document that says the 15 most significant bits of aarch64 user space pointer are unused.

https://www.kernel.org/doc/html/latest/arm64/memory.html seems ambiguous as to whether one can count on 12 or 16 unused bits.

@wangrong1069
Copy link
Contributor Author

Yes, you are right.
Unused bits, depending on the page size. For simplicity and generality, we can choose 12.
How about the following modifications.

#elif defined(aarch64)
#define FREELIST_POINTER(_x) ((void *)((((intptr_t)(_x).data) & 0x000FFFFFFFFFFFFFULL)))
#define FREELIST_VERSION(_x) (((intptr_t)(_x).data) >> 52)
#define SET_FREELIST_POINTER_VERSION(_x, _p, _v) (_x).data = (((intptr_t)(_p)) | (((_v)&0xFFFULL) << 52))

@ywkaras
Copy link
Contributor

ywkaras commented Jan 13, 2022

Can you share a link or links to documents that are the best references to the aarch64 MMU/description of user space virtual address decoding?

@SolidWallOfCode
Copy link
Member

The freelist code simply assumes the top 16 are available and can be changed without damage. That documentation indicates that's not always true on ARM64 (depending on the page size). I would agree we can abuse the top 12 bits safely in that case, which I think should suffice. I'd be concerned if we get down to 8 bits.

@ywkaras
Copy link
Contributor

ywkaras commented Jan 13, 2022

It should be kept in mind that I think most x86 servers that trafficserver runs on (including those used by YCPI) support 128 bit CAS. On such an architecture, our configure script will store versioned pointers in an int128_t variable, and not depend on unused bits in 64 bit pointers. So, we may see some hidden bugs running on any arch without 128 bit CAS.

@wangrong1069
Copy link
Contributor Author

wangrong1069 commented Jan 14, 2022

Besides the problem of unused bits in pointers, another thing is that incorrect pointer fetches can also cause errors. In the original extraction mode (using sign extension).
right case (amd64): 0x7f1bb10e7000 => data => 0x7f1bb10e7000
wrong case (arm64): 0xfffff56c8000 => data => 0xfffffffff56c8000

In lastest commit, use top 12 bits to save version info and update pointer extraction method.

on arm64, compression and decompression of pointers are inconsistent
use top 12 bits to save version info and update pointer extraction method
#elif defined(__x86_64__) || defined(__ia64__) || defined(__powerpc64__) || defined(__mips64)
#define FREELIST_POINTER(_x) \
((void *)(((((intptr_t)(_x).data) << 16) >> 16) | (((~((((intptr_t)(_x).data) << 16 >> 63) - 1)) >> 48) << 48))) // sign extend
#define FREELIST_VERSION(_x) (((intptr_t)(_x).data) >> 48)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think we'd actually want to cast to uintptr_t, so that the right shift would not sign extend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, do you mean we should update the code as below.

#elif defined(aarch64)
#define FREELIST_POINTER(_x)
((void *)(((((uintptr_t)(_x).data) << 12) >> 12) | (((~((((intptr_t)(_x).data) << 12 >> 63) - 1)) >> 52) << 52))) // sign extend
#define FREELIST_VERSION(_x) (((intptr_t)(_x).data) >> 52)
#define SET_FREELIST_POINTER_VERSION(_x, _p, _v) (_x).data = ((((intptr_t)(_p)) & 0x000FFFFFFFFFFFFFULL) | (((_v)&0xFFFULL) << 52))

I think the last commit would be simpler and more efficient. maybe i'm wrong.
Can you give some detailed comments, thank you.

@ywkaras
Copy link
Contributor

ywkaras commented Jan 19, 2022

Can you provide a link to the architecture document(s) you are using for the ARM 64-bit?

@wangrong1069
Copy link
Contributor Author

I am using FT-2000/4 CPU
https://www.phytium.com.cn/en/article/97

@ywkaras
Copy link
Contributor

ywkaras commented Jan 21, 2022

I am using FT-2000/4 CPU https://www.phytium.com.cn/en/article/97

That doesn't look like an ARM CPU architecture reference document to me. The issue is, what is the proof/evidence that, for all implementations/configurations of 64-bit ARM architectures, there are at least 12 unused highest-precision bit in the 64 bit user space pointers?

@wangrong1069
Copy link
Contributor Author

I am using FT-2000/4 CPU https://www.phytium.com.cn/en/article/97

That doesn't look like an ARM CPU architecture reference document to me. The issue is, what is the proof/evidence that, for all implementations/configurations of 64-bit ARM architectures, there are at least 12 unused highest-precision bit in the 64 bit user space pointers?

https://www.kernel.org/doc/html/latest/arm64/memory.html
Unused bits, depending on page size.
4KB page, the upper 16 bits are unused.
64KB page, the upper 12 bits are unused.
Is my understanding correct.

@ywkaras
Copy link
Contributor

ywkaras commented Jan 24, 2022

I was hoping you could provide a reference from a document such as this: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/third-party/ddi0100e_arm_arm.pdf

My concern is that we'll do something that doesn't work for all instances and configurations of the ARM 64 bit architecture, which could result in a very mysterious bug for someone.

@ezelkow1
Copy link
Member

[approve ci]

1 similar comment
@bryancall
Copy link
Contributor

[approve ci]

@ywkaras
Copy link
Contributor

ywkaras commented Jan 25, 2022

This PR needs to be against the master branch, not a release branch.

@wangrong1069
Copy link
Contributor Author

wangrong1069 commented Jan 25, 2022

According to ywkaras's advice, I try to check related code of master branch and make a new PR against the master branch.
But I found this issue(segmentation fault) already be fixed in master branch.
I update 8.0.x related code by master branch and test success.

#elif defined(__x86_64__) || defined(__ia64__) || defined(__powerpc64__) || defined(__aarch64__) || defined(__mips64)
/* Layout of FREELIST_POINTER
*
*  0 ~ 47 bits : 48 bits, Virtual Address (47 bits for AMD64 and 48 bits for AArch64)
* 48 ~ 62 bits : 15 bits, Freelist Version
*      63 bits :  1 bits, The type of Virtual Address (0 = user space, 1 = kernel space)
*/
/* Detect which shift is implemented by the simple expression ((~0 >> 1) < 0):
*
* If the shift is 'logical' the highest order bit of the left side of the comparison is 0 so the result is positive.
* If the shift is 'arithmetic' the highest order bit of the left side is 1 so the result is negative.
*/
#if ((~0 >> 1) < 0)
/* the shift is `arithmetic' */
#define FREELIST_POINTER(_x) \
((void *)((((intptr_t)(_x).data) & 0x0000FFFFFFFFFFFFLL) | ((((intptr_t)(_x).data) >> 63) << 48))) // sign extend
#else
/* the shift is `logical' */
#define FREELIST_POINTER(_x) \
((void *)((((intptr_t)(_x).data) & 0x0000FFFFFFFFFFFFLL) | (((~((((intptr_t)(_x).data) >> 63) - 1)) >> 48) << 48)))
#endif

#define FREELIST_VERSION(_x) ((((intptr_t)(_x).data) & 0x7FFF000000000000LL) >> 48)
#define SET_FREELIST_POINTER_VERSION(_x, _p, _v) (_x).data = ((((intptr_t)(_p)) & 0x8000FFFFFFFFFFFFLL) | (((_v)&0x7FFFLL) << 48))

I will close this PR.

==================================

I plan to make a new PR against the master branch, is this available?

  1. Remove redundant right shift operations (>> 48).
/* the shift is `logical' */
#define FREELIST_POINTER(_x) \
((void *)((((intptr_t)(_x).data) & 0x0000FFFFFFFFFFFFLL) | ((~((((intptr_t)(_x).data) >> 63) - 1)) << 48)))
#endif

The purpose of FREELIST_POINTER is to extract original pointer.

(((intptr_t)(_x).data) & 0x0000FFFFFFFFFFFFLL), get low 48 bits 
((~((((intptr_t)(_x).data) >> 63) - 1)) << 48)
    fill 0 at top 16 bits for user space pointer, fill 1 at top 16 bits for kernel space pointer
0x0000FFFFFFFFFFFF
    (((intptr_t)(_x).data) & 0x0000FFFFFFFFFFFFLL)      => 0x0000FFFFFFFFFFFF
    ((~((((intptr_t)(_x).data) >> 63) - 1)) << 48)    => 0x0000000000000000
        0x0000FFFFFFFFFFFF -> 0 -> 0xFFFFFFFFFFFFFFFF -> 0 -> 0
0x8000FFFFFFFFFFFF
    (((intptr_t)(_x).data) & 0x0000FFFFFFFFFFFFLL)      => 0x0000FFFFFFFFFFFF
    ((~((((intptr_t)(_x).data) >> 63) - 1)) << 48)    => 0xFFFF000000000000
        0x8000FFFFFFFFFFFF -> 1 -> 0 -> 0xFFFFFFFFFFFFFFFF -> 0xFFFF000000000000
  1. use 52 ~ 62 bits for aarch64
#elif defined(__aarch64__)
/* Layout of FREELIST_POINTER
*
*  0 ~ 51 bits : 52 bits, Virtual Address
* 52 ~ 62 bits : 11 bits, Freelist Version
*      63 bits :  1 bits, The type of Virtual Address (0 = user space, 1 = kernel space)
*/
/* Detect which shift is implemented by the simple expression ((~0 >> 1) < 0):
*
* If the shift is 'logical' the highest order bit of the left side of the comparison is 0 so the result is positive.
* If the shift is 'arithmetic' the highest order bit of the left side is 1 so the result is negative.
*/
#if ((~0 >> 1) < 0)
/* the shift is `arithmetic' */
#define FREELIST_POINTER(_x) \
((void *)((((intptr_t)(_x).data) & 0x000FFFFFFFFFFFFFLL) | ((((intptr_t)(_x).data) >> 63) << 52))) // sign extend
#else
/* the shift is `logical' */
#define FREELIST_POINTER(_x) \
((void *)((((intptr_t)(_x).data) & 0x000FFFFFFFFFFFFFLL) | ((~((((intptr_t)(_x).data) >> 63) - 1)) << 52)))
#endif

#define FREELIST_VERSION(_x) ((((intptr_t)(_x).data) & 0x7FF0000000000000LL) >> 52)
#define SET_FREELIST_POINTER_VERSION(_x, _p, _v) (_x).data = ((((intptr_t)(_p)) & 0x800FFFFFFFFFFFFFLL) | (((_v)&0x7FFLL) << 52))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants