Please add a LICENSE.TXT to clarify licensing #1

Closed
mattray opened this issue Apr 20, 2015 · 3 comments

Comments

@mattray

mattray commented Apr 20, 2015

Please put a LICENSE.TXT in the top-level directory so potential users and contributors understand the licensing of this project.

@snambakam

Added LICENSE file at the top level folder.

@mbbroberg

Hey @snambakam, is it worthwhile to keep just one copy of the license and point the license.py code to the top-level one as I did in #8?

@snambakam

Hi Mark,

We have been strictly following the guidelines from our licensing team regarding the presentation and placement of the license files.

Currently the license.txt uses the EULA which is specifically about the technology preview, and is shown as part of the installation.

The LICENSE file currently is only maintained in the code and is not included as part of the installation.

Please let us leave these separate for now.

Thanks
Sriram

ghost pushed a commit that referenced this issue Nov 4, 2018
The linux-secure kernel panics during boot as shown below:

[    0.734037] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
[    0.734077] PGD 0 P4D 0
[    0.734092] Oops: 0000 [#1] SMP PTI
[    0.734108] CPU: 0 PID: 162 Comm: modprobe Tainted: G        W       T 4.18.9-2.ph3-secure #1-photon
[    0.734144] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1704110547 04/11/2017
[    0.734203] RIP: 0010:__x64_sys_brk+0x25/0x1f0
[    0.734222] Code: 18 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 4c 8d 6d c0 53 65 4c 8b 3c 25 40 4d 01 00 48 83 ec 20 4d 8b a7 7
[    0.734319] RSP: 0018:ffff9fbf414b3e20 EFLAGS: 00010282
[    0.734341] RAX: 0000000080000000 RBX: ffff9fbf414b3f58 RCX: 0000000000000000
[    0.734369] RDX: 000000000000004d RSI: 00007bc7977d41a3 RDI: 0000000000000000
[    0.734397] RBP: ffff9fbf414b3e68 R08: 0000000000000019 R09: 00007bc7977d43f9
[    0.734425] R10: 0000000000000000 R11: 0000000000000000 R12: ffff927b0ec31000
[    0.734454] R13: ffff9fbf414b3e28 R14: 0000000000000000 R15: ffff927b0ec2cb80
[    0.734483] FS:  0000000000000000(0000) GS:ffff927b7fc00000(0000) knlGS:0000000000000000
[    0.734515] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.734538] CR2: 0000000000000070 CR3: 000000000ec34004 CR4: 00000000001606f0
[    0.734614] Call Trace:
[    0.734630]  do_syscall_64+0x6d/0x330
[    0.734648]  ? invalid_op+0x14/0x20
[    0.734665]  entry_SYSCALL_64_after_hwframe+0x4f/0xb5
[    0.734687]  ? pax_randomize_kstack+0x85/0xa0
[    0.734707]  ? entry_SYSCALL_64_after_hwframe+0x42/0xb5
[    0.734729] Modules linked in:
[    0.734744] CR2: 0000000000000070
[    0.734759] ---[ end trace b8d9c1f99ecab103 ]---

The root cause of this issue is twofold:

- The PAX randkstack patch uses task_pt_regs(current) to calculate the
  new randomized stack pointer. However, that's incorrect in the
  current code, because the pt_regs pointer passed to do_syscall_64() is
  actually different from the one computed using task_pt_regs(current).
  So, we'll need to use the same pt_regs pointer to calculate the new
  randomized stack pointer as well.

- The RAP plugin patch passes 6 arguments to the syscall handler in
  do_syscall_64(). The latter used to accept 6 arguments before, but was
  recently changed to accept only 1 argument, namely the pt_regs
  pointer. This argument mismatch (which goes undetected by the compiler
  because it is coded up in in-line assembly) causes sys_brk() to try
  and access regs->di as if it was regs itself, causing the NULL pointer
  dereference down the road.

Fix both these issues to avoid the kernel panic. While at it, also
refactor the PAX randkstack patch to make sure that the stack
randomization logic is actually enclosed under CONFIG_PAX_RANDKSTACK.
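
The calling-convention change in the second bullet can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: `struct fake_pt_regs`, `handler_brk`, and `dispatch` are hypothetical names, and only the `di` field mirrors the register named above. The point is that a one-argument handler must receive the register block itself and read its first argument from `regs->di`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pt_regs (illustration only). */
struct fake_pt_regs {
    unsigned long di;  /* first syscall argument */
    unsigned long si;  /* second syscall argument */
    unsigned long dx;  /* third syscall argument */
};

/* New-style handler: takes the register block and unpacks its own arguments. */
typedef unsigned long (*handler_fn)(const struct fake_pt_regs *regs);

/* Hypothetical handler that simply returns its first argument. */
static unsigned long handler_brk(const struct fake_pt_regs *regs)
{
    return regs->di;
}

/*
 * Correct dispatch: pass the regs pointer itself. Passing regs->di as the
 * first parameter (the old six-argument style) would make the handler treat
 * an argument value as if it were the regs pointer.
 */
static unsigned long dispatch(handler_fn fn, const struct fake_pt_regs *regs)
{
    assert(fn != NULL && regs != NULL);
    return fn(regs);
}
```

Calling `dispatch(handler_brk, &regs)` from a `main()` with `regs.di` set to some value returns that value, because the handler and the dispatcher agree on the single-pointer convention.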

Change-Id: Ia93a07c1c62ed0fe33db058996fb87c085f8e2e5
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/6078
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
ghost pushed a commit that referenced this issue Dec 17, 2019
This change allows cross-building a set of core packages.

Change-Id: I1f5dfefe37501be0b24319c862721053817ec68d
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/6189
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Priyesh Padmavilasom <ppadmavilasom@vmware.com>
Reviewed-by: Anish Swaminathan <anishs@vmware.com>
gerrit-photon pushed a commit that referenced this issue Aug 5, 2020
Currently HAProxy on Photon OS 3 is built in such a way that it
dynamically links to the pthread library. This causes an issue
(https://www.mail-archive.com/haproxy@formilux.org/msg33214.html) when
running HAProxy in a `chroot` jail or a cgroup. The `pthread_cancel`
(https://man7.org/linux/man-pages/man3/pthread_cancel.3.html) call will
fail and so threads are aborted instead of cancelled, preventing the
thread from cleaning up its resources.

This is demonstrated with the following steps using a Docker image
for Photon that is installed and configured with HAProxy:

1. Start the image with a different entry point so HAProxy is not
   automatically launched:

    docker run -it --rm --name haproxy --entrypoint bash photon-haproxy

2. Inside the image, start HAProxy with the following command:

    $ haproxy -f /etc/haproxy/haproxy.cfg
    [NOTICE] 217/183238 (18) : New program 'api' (19) forked
    [NOTICE] 217/183238 (18) : New worker #1 (20) forked
    time="2020-08-05T18:32:39Z" level=info msg="HAProxy Data Plane API v2.1.0 af9b8e4"
    time="2020-08-05T18:32:39Z" level=info msg="Build from: https://github.com/haproxytech/dataplaneapi"
    time="2020-08-05T18:32:39Z" level=info msg="Build date: 2020-07-23T19:45:08"

3. Start another shell process inside the image:

    docker exec -it haproxy bash

4. When HAProxy is run in a `master-worker` mode, HAProxy will launch
   with a watchdog process that monitors the other threads/processes.
   Using this mode, you can reload HAProxy by sending it a `SIGUSR2`
   signal (https://cbonte.github.io/haproxy-dconv/2.0/management.html):

    kill -SIGUSR2 18

5. Back in the original shell the output from HAProxy will now show the
   following:

    [WARNING] 217/183315 (18) : Reexecuting Master process
    [WARNING] 217/183315 (20) : Stopping frontend GLOBAL in 0 ms.
    [WARNING] 217/183315 (20) : Proxy GLOBAL stopped (FE: 4 conns, BE: 4 conns).
    [NOTICE] 217/183315 (18) : New worker #1 (33) forked
    libgcc_s.so.1 must be installed for pthread_cancel to work
    [WARNING] 217/183316 (18) : Former worker #1 (20) exited with code 134 (Aborted)

    The above log shows the worker thread was aborted instead of
    shutting down cleanly.

THE SOLUTION
The fix for this is to build HAProxy with a flag to mark `gcc_s` as
`--no-as-needed`. Once the new HAProxy RPM is built, go back to the first shell:

1. Upgrade HAProxy with the new RPM:

    $ rpm --upgrade -vh haproxy-2.1.0-2.ph3.x86_64.rpm
    Verifying...                          ################################# [100%]
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:haproxy-2.1.0-2.ph3              ################################# [ 50%]
    Cleaning up / removing...
       2:haproxy-2.1.0-1.ph3              ################################# [100%]

2. Start HAProxy:

    $ haproxy -f /etc/haproxy/haproxy.cfg
    [NOTICE] 217/183521 (38) : New program 'api' (39) forked
    [NOTICE] 217/183521 (38) : New worker #1 (40) forked
    time="2020-08-05T18:35:21Z" level=info msg="HAProxy Data Plane API v2.1.0 af9b8e4"
    time="2020-08-05T18:35:21Z" level=info msg="Build from: https://github.com/haproxytech/dataplaneapi"
    time="2020-08-05T18:35:21Z" level=info msg="Build date: 2020-07-23T19:45:08"

3. Switch to the second shell and send the reload signal to HAProxy:

    kill -SIGUSR2 38

4. Switch back to the first shell and notice that HAProxy no longer
   resorted to aborting threads:

    [WARNING] 217/183550 (38) : Reexecuting Master process
    [WARNING] 217/183550 (40) : Stopping frontend GLOBAL in 0 ms.
    [WARNING] 217/183550 (40) : Proxy GLOBAL stopped (FE: 3 conns, BE: 3 conns).
    [NOTICE] 217/183550 (38) : New worker #1 (52) forked
    [WARNING] 217/183551 (38) : Former worker #1 (40) exited with code 0 (Exit)

    In the above logs, it's noted that the former worker thread exited
    successfully (code `0`) instead of being aborted as before.

Change-Id: I9b76f0080624065fde4d88a7be7e6f169ca9c4cf
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/10620
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Anish Swaminathan <anishs@vmware.com>
gerrit-photon pushed a commit that referenced this issue Aug 21, 2020
Fix a use-after-free in the network stack that occurs when a
fragment queue times out and ip_expire() is called.

Panic logs:

CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.9.154-3.ph2 #1-photon
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
task: ffff8803317ac380 task.stack: ffffc90001910000
RIP: 0010:[<ffffffff81434917>]  [<ffffffff81434917>] rb_replace_node+0x7/0x70
RSP: 0018:ffff88033fd83e60  EFLAGS: 00010206
RAX: ffff88032e0d6800 RBX: ffff88032e005f00 RCX: 0000000000000020
RDX: ffff88032e005f90 RSI: 0002800d00008000 RDI: ffff880147927a00
RBP: ffff88033fd83e60 R08: 00000000cccccccd R09: 000000000000004e
R10: ffff88032e762400 R11: ffff880176cc40e6 R12: ffffffff81f67ec0
R13: ffff880147927a00 R14: ffff88032e005f90 R15: ffff88032e005f00
FS:  0000000000000000(0000) GS:ffff88033fd80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa3c41b9dcc CR3: 000000032d306000 CR4: 0000000000160670
Stack:
 ffff88033fd83e90 ffffffff817c1796 ffff88033fd92440 0000000000000100
 ffff8802a1767218 ffffffff817c1600 ffff88033fd83ec8 ffffffff810dc1a2
 ffff88033fd92440 ffff88033fd83ee8 dead000000000200 0000000000000001
Call Trace:
 <IRQ>
 [<ffffffff817c1796>] ip_expire+0x196/0x1c0
 [<ffffffff817c1600>] ? ip4_obj_cmpfn+0x30/0x30
 [<ffffffff810dc1a2>] call_timer_fn+0x32/0x120
 [<ffffffff810dc71e>] run_timer_softirq+0x3ee/0x460
 [<ffffffff810e42ce>] ? ktime_get+0x3e/0xb0
 [<ffffffff81050d91>] ? lapic_next_deadline+0x21/0x30
 [<ffffffff810ea9d4>] ? clockevents_program_event+0xc4/0x110
 [<ffffffff8107b93f>] __do_softirq+0xdf/0x2d0
 [<ffffffff8107bca0>] irq_exit+0xc0/0xd0
 [<ffffffff8105136b>] smp_apic_timer_interrupt+0x4b/0x60
 [<ffffffff81890476>] apic_timer_interrupt+0x96/0xa0
 <EOI>
 [<ffffffff8188eda0>] ? __sched_text_end+0x4/0x4
 [<ffffffff8188f096>] ? native_safe_halt+0x6/0x10
 [<ffffffff8188edbb>] default_idle+0x1b/0xe0
 [<ffffffff810367f0>] arch_cpu_idle+0x10/0x20
 [<ffffffff8188f1ee>] default_idle_call+0x1e/0x30
 [<ffffffff810b647a>] cpu_startup_entry+0x1ba/0x230
 [<ffffffff8104f098>] start_secondary+0x168/0x190
Code: 0f 1f 40 00 48 8b 07 55 48 89 e5 48 85 c0 75 05 eb 0c 48 89 d0 48 8b 50
      08 48 85 d2 75 f4 5d c3 0f 1f 40 00 48 8b 07 55 48 89 e5 <48> 89 06 49
      89 c0 48 8b 47 08 49 83 e0 fc 48 89 46 08 48 8b 47
RIP  [<ffffffff81434917>] rb_replace_node+0x7/0x70
 RSP <ffff88033fd83e60>
---[ end trace ec01802e4220d0e9 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled

Bug-fix patches:
1] 0013-inet-frags-rework-rhashtable-dismantle.patch
   3c8fc8782044: "inet: frags: rework rhashtable dismantle"
2] 0014-inet-frags-fix-use-after-free-read-in-inet_frag_dest.patch
   dc93f46bc4e0: "inet: frags: fix use-after-free read in inet_frag_des"
3] 0015-inet-fix-various-use-after-free-in-defrags-units.patch
   d5dd88794a13: "inet: fix various use-after-free in defrags units"

Prerequisite patches:
1] 0001-inet-rename-netns_frags-to-fqdir.patch
   6ce3b4dcee4f9: "inet: rename netns_frags to fqdir"
2] 0002-net-rename-inet_frags_exit_net-to-fqdir_exit.patch
   89fb900514d1: "net: rename inet_frags_exit_net() to fqdir_exit()"
3] 0003-net-rename-struct-fqdir-fields.patch
   803fdd996847: "net: rename struct fqdir fields"
4] 0004-ipv4-no-longer-reference-init_net-in.patch
   8dfdb31335ee: "ipv4: no longer reference init_net in
                  ip4_frags_ns_ctl_table[]"
5] 0005-ipv6-no-longer-reference-init_net-in.patch
   8668d0e2bfdf: "ipv6: no longer reference init_net in
                  ip6_frags_ns_ctl_table[]"
6] 0006-netfilter-ipv6-nf_defrag-no-longer-reference-init_ne.patch
   3bb13dd4cae0: "netfilter: ipv6: nf_defrag: no longer reference init_net
                  in nf_ct_frag6_sysctl_table"
7] 0007-ieee820154-6lowpan-no-longer-reference-init_net-in.patch
   d2dfd43598f3: "ieee820154: 6lowpan: no longer reference init_net in
                  lowpan_frags_ns_ctl_table"
8] 0008-net-rename-inet_frags_init_net-to-fdir_init.patch
   9cce45f22cee: "net: rename inet_frags_init_net() to fdir_init()"
9] 0009-net-add-a-net-pointer-to-struct-fqdir.patch
   a39aca678a06: "net: add a net pointer to struct fqdir"
10] 0010-net-dynamically-allocate-fqdir-structures.patch
    4907abc605e3: "net: dynamically allocate fqdir structures"
11] 0011-netns-add-pre_exit-method-to-struct-pernet_operation.patch
    d7d99872c144: "netns: add pre_exit method to struct pernet_operations"
12] 0012-inet-frags-uninline-fqdir_init.patch
    6b73d19711d0: "inet: frags: uninline fqdir_init()"
16] 0016-netns-restore-ops-before-calling-ops_exit_list.patch
    b272a0ad7301: "netns: restore ops before calling ops_exit_list"

Change-Id: I3b6f8137f77d595f6b1bca19c11fdd5a44be4a80
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/9866
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: <bvikas@vmware.com>
gerrit-photon pushed a commit that referenced this issue Jul 5, 2022
Summary:
- A kernel panic is seen when a CFS task that was boosted by a
  SCHED_DEADLINE task boosts another CFS task (nested priority
  inheritance).
- Backported the fix from the 5.10 kernel.
Link:
https://lore.kernel.org/all/20201117061432.517340-1-juri.lelli@redhat.com/T/#u

Description:
kernel BUG at kernel/sched/deadline.c:1495!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 43 PID: 1154 Comm: irq/62-eth0-rxt Tainted: GOE 4.19.198-rt85-5.ph3-rt #1-photon
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:enqueue_task_dl+0x6e/0xa10
Call Trace:
 ? put_prev_entity+0x25/0x100
 rt_mutex_setprio+0x31c/0x470
 task_blocks_on_rt_mutex+0x250/0x2f0
 rt_spin_lock_slowlock_locked+0x9d/0x2b0
 ? preempt_schedule_irq+0x43/0xa0
 rt_spin_lock_slowlock+0x48/0x60
 rt_spin_lock+0x44/0x50
 __dev_queue_xmit+0x513/0x970
 ? xfrm_lookup+0x11/0x20
 dev_queue_xmit+0x10/0x20
 ? dev_queue_xmit+0x10/0x20
 ip_finish_output2+0x28b/0x3e0
 ip_finish_output+0xfe/0x1e0
 ? ip_finish_output+0xfe/0x1e0
 ? nf_hook_slow+0x48/0xc0
 ip_output+0x63/0xe0
 ? __ip_flush_pending_frames.isra.46+0x90/0x90
 ip_forward_finish+0x51/0x80
 ip_forward+0x368/0x460
 ? ip4_key_hashfn+0xc0/0xc0
 ip_rcv_finish+0x84/0xa0
 ip_rcv+0x47/0xd0
 ? ip_rcv_finish_core.isra.17+0x3a0/0x3a0
 __netif_receive_skb_one_core+0x4c/0x60
 __netif_receive_skb+0x18/0x60
 process_backlog+0xb5/0x1c0
 net_rx_action+0x203/0x4d0
 ? __switch_to_asm+0x35/0x70
 ? __switch_to_asm+0x41/0x70
 ? __switch_to_asm+0x35/0x70
 ? __switch_to_asm+0x41/0x70
 ? __switch_to_asm+0x35/0x70
 do_current_softirqs+0x1b5/0x3a0
 ? irq_finalize_oneshot.part.49+0xf0/0xf0
 __local_bh_enable+0x5d/0x70
 irq_forced_thread_fn+0x5c/0x70
 irq_thread+0xd3/0x160
 ? wake_threads_waitq+0x30/0x30
 kthread+0x160/0x180
 ? irq_thread_check_affinity+0x20/0x20
 ? kthread_create_worker_on_cpu+0x50/0x50
 ret_from_fork+0x1f/0x40


As per code analysis:

--> rt_mutex_setprio()

if (dl_prio(prio)) {
	if (!dl_prio(p->normal_prio) ||
	    (pi_task && dl_prio(pi_task->prio) &&
	     dl_entity_preempt(&pi_task->dl, &p->dl))) {
		p->dl.dl_boosted = 1;
		queue_flag |= ENQUEUE_REPLENISH;
	} else
		p->dl.dl_boosted = 0;
	p->sched_class = &dl_sched_class;
...
...
if (queued)
	enqueue_task(rq, p, queue_flag);


--> enqueue_task_dl()
} else if (!dl_prio(p->normal_prio)) {
	BUG_ON(!p->dl.dl_boosted || flags != ENQUEUE_REPLENISH); -> Hit
}

Main Patch:
- 0003-sched-deadline-Fix-priority-inheritance-with-multipl.patch:
Changes how sched_deadline attributes are inherited from the
original donor task. The change touches the same code path as described above.
Upstream also reported a similar kernel panic with a very similar
call trace and suggested that this patch would fix the issue.
Link: https://lore.kernel.org/lkml/164875194487.15351.7634869661022863588@beryllium.lan/T/

Added supporting patches, which also fix bugs related to throttling:
- 0001-sched-deadline-Unthrottle-PI-boosted-threads-while-e.patch
- 0002-sched-deadline-Fix-stale-throttling-on-de-boosted-ta.patch
- 0004-kernel-sched-Remove-dl_boosted-flag-comment.patch


Change-Id: I1bce698873f421cb3b83d96f85ed93964d5782ec
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/16738
Tested-by: Srivatsa S. Bhat <srivatsab@vmware.com>
Reviewed-by: Srivatsa S. Bhat <srivatsab@vmware.com>
gerrit-photon pushed a commit that referenced this issue Mar 8, 2023
If an invalid regex is given, cleanup_regex() is called while
  the regex is being parsed:
  - parse_task_ignore_string()
   - compile_regex()
    - cleanup_regex()

compiled = *compiled_expr;
if (compiled)
        free(compiled);
- *compiled_expr is not reset after it is freed, and it is an
  extern variable
- cleanup_regex() is then called again from stalld.c
  with the already-freed (non-NULL) address:
https://git.kernel.org/pub/scm/utils/stalld/stalld.git/tree/src/stalld.c?h=v1.18.0#n1307

- when we `systemctl restart stalld`, stalld dumps core, as the
  log shows:
systemd[1]: Stopping Stall Monitor...
systemd-coredump[780991]: Process 780670 (stalld) of user 0 dumped core.
  Stack trace of thread 780670:
  #0  0x00007f6becf1e041 raise (libc.so.6 + 0x3d041)
  #1  0x00007f6becf07536 abort (libc.so.6 + 0x26536)
  #2  0x00007f6becf5f5a8 n/a (libc.so.6 + 0x7e5a8)
  #3  0x00007f6becf66fea n/a (libc.so.6 + 0x85fea)
  #4  0x00007f6becf673dc n/a (libc.so.6 + 0x863dc)
  #5  0x000055666563ed7e n/a (/usr/bin/stalld (deleted) + 0x6d7e)
systemd[1]: stalld.service: Main process exited, code=dumped, status=6/ABRT
systemd[1]: stalld.service: Failed with result 'core-dump'

- Resetting the extern variable to NULL after
  freeing the memory solves this issue.
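
The free-then-reset pattern described above can be sketched as follows. This is a minimal illustration, not stalld's actual code: `compiled_patterns`, `setup_patterns`, and `cleanup_patterns` are hypothetical names chosen for the example, and a plain `char` buffer stands in for the compiled regex data.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the globally visible compiled-regex pointer. */
static char *compiled_patterns = NULL;

/* Allocate the shared buffer; returns 0 on success, -1 on allocation failure. */
static int setup_patterns(size_t nbytes)
{
    compiled_patterns = malloc(nbytes);
    return compiled_patterns ? 0 : -1;
}

/*
 * Free the buffer through a pointer-to-pointer and reset the caller's
 * variable, so that a second cleanup call sees NULL and does nothing.
 */
static void cleanup_patterns(char **slot)
{
    assert(slot != NULL);
    if (*slot) {
        free(*slot);
        *slot = NULL;  /* without this reset, a later call would free again */
    }
}
```

With the reset in place, calling `cleanup_patterns(&compiled_patterns)` twice in a row is harmless, which corresponds to the error path followed by the shutdown path in the scenario above.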

Change-Id: Ia75f4fea59476aea3b0aa267fdf0884f95f8b2e2
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/19931
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Srivatsa S. Bhat <srivatsab@vmware.com>
gerrit-photon pushed a commit that referenced this issue Mar 10, 2023
Change-Id: Ia6bae7379970b5633194619d00c1d19adea120c2
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/19968
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Srivatsa S. Bhat <srivatsab@vmware.com>
gerrit-photon pushed a commit that referenced this issue Mar 10, 2023
Change-Id: I18244692338599e0a0589c75a2714636bfd8a355
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/19967
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Srivatsa S. Bhat <srivatsab@vmware.com>
tapakund pushed a commit that referenced this issue Mar 20, 2023
Change-Id: I18244692338599e0a0589c75a2714636bfd8a355
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/19967
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Srivatsa S. Bhat <srivatsab@vmware.com>
gerrit-photon pushed a commit that referenced this issue Apr 19, 2023
- If strlen(buffer) is called on a 100-byte buffer that holds
100 bytes with no terminating NUL, __fortify_strlen() detects
a buffer overflow and hits BUG() in the kernel code.

[  496.370015] detected buffer overflow in __fortify_strlen
[  496.370079] ------------[ cut here ]------------
[  496.370081] kernel BUG at lib/string_helpers.c:1027!
[  496.370101] invalid opcode: 0000 [#1] SMP PTI
[  496.370111] CPU: 0 PID: 1179 Comm: mount Not tainted 6.1.10-6.ph5-esx #1-photon
[  496.370124] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.9318676.B64.1807270745 07/27/2018
[  496.370146] RIP: 0010:fortify_panic+0x13/0x15

- Fix this by replacing strlen() with strnlen().
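
A bounded length check of the kind strnlen() performs can be sketched in user-space C. This is an illustration under stated assumptions, not the patched kernel code: `bounded_len` and `measure_unterminated` are hypothetical helpers, and memchr() is used only to stay within standard C while giving the same result strnlen() would.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Bounded length: scan at most `cap` bytes for a NUL and return `cap` if
 * none is found, so the scan never runs past the end of the buffer.
 */
static size_t bounded_len(const char *buf, size_t cap)
{
    assert(buf != NULL);
    const char *nul = memchr(buf, '\0', cap);
    return nul ? (size_t)(nul - buf) : cap;
}

/* Fill a 100-byte buffer with no terminator and measure it safely. */
static size_t measure_unterminated(void)
{
    char buf[100];
    memset(buf, 'A', sizeof(buf));
    return bounded_len(buf, sizeof(buf));
}
```

An unbounded strlen() on the same buffer would keep reading past its 100 bytes, which is the overread that the fortified check reports.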

Change-Id: I1b7f1880789b18d89dfe5b3515779bdcb3a4bb6f
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/20573
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Tapas Kundu <tkundu@vmware.com>
gerrit-photon pushed a commit that referenced this issue Sep 19, 2023
Change-Id: I1b7f1880789b18d89dfe5b3515779bdcb3a4bb6f
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/20573
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Tapas Kundu <tkundu@vmware.com>
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/c/photon/+/21922
Reviewed-by: Ajay Kaher <akaher@vmware.com>
sikkamukul pushed a commit to sikkamukul/photon that referenced this issue Mar 8, 2024
Change-Id: Ia93a07c1c62ed0fe33db058996fb87c085f8e2e5
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/6078
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
sikkamukul pushed a commit to sikkamukul/photon that referenced this issue Mar 8, 2024
This change allows cross-building a set of core packages.

Change-Id: I1f5dfefe37501be0b24319c862721053817ec68d
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/6189
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Priyesh Padmavilasom <ppadmavilasom@vmware.com>
Reviewed-by: Anish Swaminathan <anishs@vmware.com>
sikkamukul pushed a commit to sikkamukul/photon that referenced this issue Mar 8, 2024
Change-Id: I1b7f1880789b18d89dfe5b3515779bdcb3a4bb6f
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/20573
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Tapas Kundu <tkundu@vmware.com>
gerrit-photon pushed a commit that referenced this issue Mar 11, 2024
Change-Id: I1b7f1880789b18d89dfe5b3515779bdcb3a4bb6f
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/20573
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Tapas Kundu <tkundu@vmware.com>
Signed-off-by: Ankit Jain <ankitja@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/c/photon/+/21922
Reviewed-by: Ajay Kaher <akaher@vmware.com>