Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix typo 'acceptible' #203

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

TheTypoMaster
Copy link

No description provided.

@flarn2006
Copy link

Username checks out

@nkeck720
Copy link

Indeed
On Oct 10, 2015 7:27 PM, "flarn2006" notifications@github.com wrote:

Username checks out


Reply to this email directly or view it on GitHub
#203 (comment).

@sagarpanchal
Copy link

@nkeck720
Copy link

I don't see why every little spelling error in the commentary. Comments do
not affect the compiled result, guys. If there is a spelling error in the
code, by all means fix it, but please don't waste Linus' time with PRs
about spelling mistakes that don't even help the kernel, other than maybe
in the readability department. Spelling in the comments does not matter
unless you absolutely cannot tell what is being said.
On Oct 12, 2015 5:54 AM, "Sagar Panchal" notifications@github.com wrote:

https://en.wiktionary.org/wiki/acceptible


Reply to this email directly or view it on GitHub
#203 (comment).

@nkeck720
Copy link

Meant to say, "I don't see why you try to fix every little spelling error
in the commentary."
On Oct 14, 2015 7:14 AM, "Noah Keck" noahkeck72@gmail.com wrote:

I don't see why every little spelling error in the commentary. Comments do
not affect the compiled result, guys. If there is a spelling error in the
code, by all means fix it, but please don't waste Linus' time with PRs
about spelling mistakes that don't even help the kernel, other than maybe
in the readability department. Spelling in the comments does not matter
unless you absolutely cannot tell what is being said.
On Oct 12, 2015 5:54 AM, "Sagar Panchal" notifications@github.com wrote:

https://en.wiktionary.org/wiki/acceptible


Reply to this email directly or view it on GitHub
#203 (comment).

@TheTypoMaster
Copy link
Author

@nkeck720 This is a bot :)

@TheTypoMaster
Copy link
Author

I should add this:
Hi! I'm a bot that checks GitHub for spelling mistakes, and I found one in your repository. When it
should be 'acceptable', you typed 'acceptible'. I created this pull request to fix it!

If you think there is anything wrong with this pull request or just have a question, be kind to mail me
at thetypomaster@hotmail.com (professional email, huh?). I’ll try to address the problem as soon as
I’m aware of it.

Looking for the source code of this bot? Well, you have to be patient! The bot is under development
and I will publish the source code as soon as I’m finished with it.

If you decide to close this pull request, pleace specify why before doing so.

With kind regards,
TheTypoMaster

@petr0001
Copy link

petr0001 commented Mar 4, 2016

@TheTypoMaster You have a typo ("pleace") in your comment here on github:
"...pleace specify why before doing so"

@TheTypoMaster
Copy link
Author

@petr0001 Yep!

@TheTypoMaster
Copy link
Author

Jesus christ people hate this bot

@TheTypoMaster
Copy link
Author

What did I expect tbh

@LuyangLi
Copy link

LuyangLi commented Apr 6, 2016

I'm not gonna hate this bot but you are making a pulling request to the Linux kernel which is..

@ghost
Copy link

ghost commented Apr 6, 2016

The issue isn't your bot. The Linux project just does not accept Pull Requests on Github. There is a system for emailing patches to the Linux mailing list, search and you shall find.

@ghost
Copy link

ghost commented Apr 6, 2016

Maybe it would be a good idea for your bot to specifically blacklist Github repos which do not accept PR:s?

matt-auld pushed a commit to matt-auld/linux that referenced this pull request Mar 17, 2017
When manually overwriting the HWS, rather than assume irq_seqno_barrier
does the right thing, we can explicitly flush the cacheline instead.
This avoids us calling the engine->irq_seqno_barrier() from an illegal
context:

[ 1472.651797] BUG: scheduling while atomic: migration/0/11/0x00000002
[ 1472.651807] Modules linked in: ctr ccm arc4 snd_hda_codec_hdmi bnep rfcomm iwldvm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_pcm dm_multipath snd_hwdep intel_powerclamp coretemp snd_seq_midi crct10dif_pclmul snd_seq_midi_event crc32_pclmul iwlwifi ghash_clmulni_intel btusb snd_rawmidi btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel cryptd glue_helper bluetooth snd_seq cfg80211 snd_timer snd_seq_device intel_ips binfmt_misc snd mei_me soundcore mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers e1000e drm ahci libahci
[ 1472.651897] CPU: 0 PID: 11 Comm: migration/0 Tainted: G     U          4.11.0-rc1+ torvalds#203
[ 1472.651899] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010
[ 1472.651900] Call Trace:
[ 1472.651913]  dump_stack+0x63/0x90
[ 1472.651922]  __schedule_bug+0x5d/0x6b
[ 1472.651930]  __schedule+0x46a/0x5f0
[ 1472.651934]  schedule+0x38/0x90
[ 1472.651938]  schedule_hrtimeout_range_clock+0x85/0x110
[ 1472.651945]  ? hrtimer_init+0x10/0x10
[ 1472.651949]  schedule_hrtimeout_range+0xe/0x10
[ 1472.651952]  usleep_range+0x4d/0x60
[ 1472.652037]  gen5_seqno_barrier+0x13/0x20 [i915]
[ 1472.652101]  intel_engine_init_global_seqno+0xd7/0x160 [i915]
[ 1472.652160]  __i915_gem_set_wedged_BKL+0xa0/0x180 [i915]
[ 1472.652166]  multi_cpu_stop+0xbb/0xe0
[ 1472.652170]  ? cpu_stop_queue_work+0x90/0x90
[ 1472.652174]  cpu_stopper_thread+0x82/0x110
[ 1472.652179]  smpboot_thread_fn+0x137/0x190
[ 1472.652184]  kthread+0xf7/0x130
[ 1472.652187]  ? sort_range+0x20/0x20
[ 1472.652191]  ? kthread_park+0x90/0x90
[ 1472.652195]  ret_from_fork+0x2c/0x40

Testcase: igt/gem_eio #ilk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170314111452.9375-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
torvalds pushed a commit that referenced this pull request May 30, 2018
Dasd uses completion_data from struct request to store per request
private data - this is problematic since this member is part of a
union which is also used by IO schedulers.
Let the block layer maintain space for per request data behind each
struct request.

Fixes crashes on block layer timeouts like this one:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:0000000001308007 R3:00000000fffc8007 S:00000000fffcc000 P:000000000000013d
Oops: 0004 ilc:2 [#1] PREEMPT SMP
Modules linked in: [...]
CPU: 0 PID: 1480 Comm: kworker/0:2H Not tainted 4.17.0-rc4-00046-gaa3bcd43b5af #203
Hardware name: IBM 3906 M02 702 (LPAR)
Workqueue: kblockd blk_mq_timeout_work
Krnl PSW : 0000000067ac406b 00000000b6960308 (do_raw_spin_trylock+0x30/0x78)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000c00 0000000000000000 0000000000000000 0000000000000001
           0000000000b9d3c8 0000000000000000 0000000000000001 00000000cf9639d8
           0000000000000000 0700000000000000 0000000000000000 000000000099f09e
           0000000000000000 000000000076e9d0 000000006247bb08 000000006247bae0
Krnl Code: 00000000001c159c: b90400c2           lgr     %r12,%r2
           00000000001c15a0: a7180000           lhi     %r1,0
          #00000000001c15a4: 583003a4           l       %r3,932
          >00000000001c15a8: ba132000           cs      %r1,%r3,0(%r2)
           00000000001c15ac: a7180001           lhi     %r1,1
           00000000001c15b0: a784000b           brc     8,1c15c6
           00000000001c15b4: c0e5004e72aa       brasl   %r14,b8fb08
           00000000001c15ba: 1812               lr      %r1,%r2
Call Trace:
([<0700000000000000>] 0x700000000000000)
 [<0000000000b9d3d2>] _raw_spin_lock_irqsave+0x7a/0xb8
 [<000000000099f09e>] dasd_times_out+0x46/0x278
 [<000000000076ea6e>] blk_mq_terminate_expired+0x9e/0x108
 [<000000000077497a>] bt_for_each+0x102/0x130
 [<0000000000774e54>] blk_mq_queue_tag_busy_iter+0x74/0xd8
 [<000000000076fea0>] blk_mq_timeout_work+0x260/0x320
 [<0000000000169dd4>] process_one_work+0x3bc/0x708
 [<000000000016a382>] worker_thread+0x262/0x408
 [<00000000001723a8>] kthread+0x160/0x178
 [<0000000000b9e73a>] kernel_thread_starter+0x6/0xc
 [<0000000000b9e734>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<0000000000b9d3cc>] _raw_spin_lock_irqsave+0x74/0xb8

Kernel panic - not syncing: Fatal exception: panic_on_oops

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
frank-w referenced this pull request in frank-w/BPI-Router-Linux Jul 8, 2018
[ Upstream commit f0f59a2 ]

Dasd uses completion_data from struct request to store per request
private data - this is problematic since this member is part of a
union which is also used by IO schedulers.
Let the block layer maintain space for per request data behind each
struct request.

Fixes crashes on block layer timeouts like this one:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:0000000001308007 R3:00000000fffc8007 S:00000000fffcc000 P:000000000000013d
Oops: 0004 ilc:2 [#1] PREEMPT SMP
Modules linked in: [...]
CPU: 0 PID: 1480 Comm: kworker/0:2H Not tainted 4.17.0-rc4-00046-gaa3bcd43b5af #203
Hardware name: IBM 3906 M02 702 (LPAR)
Workqueue: kblockd blk_mq_timeout_work
Krnl PSW : 0000000067ac406b 00000000b6960308 (do_raw_spin_trylock+0x30/0x78)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000c00 0000000000000000 0000000000000000 0000000000000001
           0000000000b9d3c8 0000000000000000 0000000000000001 00000000cf9639d8
           0000000000000000 0700000000000000 0000000000000000 000000000099f09e
           0000000000000000 000000000076e9d0 000000006247bb08 000000006247bae0
Krnl Code: 00000000001c159c: b90400c2           lgr     %r12,%r2
           00000000001c15a0: a7180000           lhi     %r1,0
          #00000000001c15a4: 583003a4           l       %r3,932
          >00000000001c15a8: ba132000           cs      %r1,%r3,0(%r2)
           00000000001c15ac: a7180001           lhi     %r1,1
           00000000001c15b0: a784000b           brc     8,1c15c6
           00000000001c15b4: c0e5004e72aa       brasl   %r14,b8fb08
           00000000001c15ba: 1812               lr      %r1,%r2
Call Trace:
([<0700000000000000>] 0x700000000000000)
 [<0000000000b9d3d2>] _raw_spin_lock_irqsave+0x7a/0xb8
 [<000000000099f09e>] dasd_times_out+0x46/0x278
 [<000000000076ea6e>] blk_mq_terminate_expired+0x9e/0x108
 [<000000000077497a>] bt_for_each+0x102/0x130
 [<0000000000774e54>] blk_mq_queue_tag_busy_iter+0x74/0xd8
 [<000000000076fea0>] blk_mq_timeout_work+0x260/0x320
 [<0000000000169dd4>] process_one_work+0x3bc/0x708
 [<000000000016a382>] worker_thread+0x262/0x408
 [<00000000001723a8>] kthread+0x160/0x178
 [<0000000000b9e73a>] kernel_thread_starter+0x6/0xc
 [<0000000000b9e734>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<0000000000b9d3cc>] _raw_spin_lock_irqsave+0x74/0xb8

Kernel panic - not syncing: Fatal exception: panic_on_oops

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ruscur pushed a commit to ruscur/linux that referenced this pull request Feb 26, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 2, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 4, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 5, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 10, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 12, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 13, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 16, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 23, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 24, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
ruscur pushed a commit to ruscur/linux that referenced this pull request Mar 27, 2020
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
torvalds#12:
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such

WARNING: braces {} are not necessary for single statement blocks
torvalds#163: FILE: fs/proc/generic.c:536:
+	if (pde->proc_ops->proc_flags & PROC_ENTRY_PERMANENT) {
+		pde->flags |= PROC_ENTRY_PERMANENT;
+	}

WARNING: line over 80 characters
torvalds#203: FILE: fs/proc/generic.c:676:
+			WARN(1, "removing permanent /proc entry '%s'", de->name);

WARNING: braces {} are not necessary for single statement blocks
torvalds#207: FILE: fs/proc/generic.c:680:
+			if (S_ISDIR(de->mode)) {
+				parent->nlink--;
+			}

WARNING: line over 80 characters
torvalds#244: FILE: fs/proc/inode.c:198:
+static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)

WARNING: line over 80 characters
torvalds#274: FILE: fs/proc/inode.c:222:
+static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#303: FILE: fs/proc/inode.c:246:
+static ssize_t pde_write(struct proc_dir_entry *pde, struct file *file, const char __user *buf, size_t count, loff_t *ppos)

WARNING: line over 80 characters
torvalds#332: FILE: fs/proc/inode.c:270:
+static __poll_t pde_poll(struct proc_dir_entry *pde, struct file *file, struct poll_table_struct *pts)

WARNING: line over 80 characters
torvalds#361: FILE: fs/proc/inode.c:294:
+static long pde_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#391: FILE: fs/proc/inode.c:319:
+static long pde_compat_ioctl(struct proc_dir_entry *pde, struct file *file, unsigned int cmd, unsigned long arg)

WARNING: line over 80 characters
torvalds#421: FILE: fs/proc/inode.c:343:
+static int pde_mmap(struct proc_dir_entry *pde, struct file *file, struct vm_area_struct *vma)

WARNING: line over 80 characters
torvalds#452: FILE: fs/proc/inode.c:368:
+pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned long orig_addr,

WARNING: line over 80 characters
torvalds#489: FILE: fs/proc/inode.c:393:
+		return pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: line over 80 characters
torvalds#491: FILE: fs/proc/inode.c:395:
+		rv = pde_get_unmapped_area(pde, file, orig_addr, len, pgoff, flags);

WARNING: braces {} are not necessary for single statement blocks
torvalds#518: FILE: fs/proc/inode.c:470:
+		if (release) {
+			return release(inode, file);
+		}

total: 0 errors, 15 warnings, 462 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/proc-faster-open-read-close-with-permanent-files.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Sep 16, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 16, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 19, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 22, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 25, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 27, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 27, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 29, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 30, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 1, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 2, 2022
…ixes

WARNING: labels should not be indented
torvalds#158: FILE: ipc/msg.c:1319:
+	fail_msg_hdrs:

WARNING: labels should not be indented
torvalds#160: FILE: ipc/msg.c:1321:
+	fail_msg_bytes:

ERROR: space required after that ';' (ctx:VxV)
torvalds#203: FILE: ipc/util.h:75:
+static inline int msg_init_ns(struct ipc_namespace *ns) { return 0;}
                                                                   ^

total: 1 errors, 2 warnings, 144 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

./patches/ipc-msg-mitigate-the-lock-contention-with-percpu-counter.patch has style problems, please review.

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Jiebin Sun <jiebin.sun@intel.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 25, 2022
Some use-cases and/or data patterns may benefit from larger zspages. 
Currently the limit on the number of physical pages that are linked into a
zspage is hardcoded to 4.  Higher limit changes key characteristics of a
number of the size clases, improving compactness of the pool and redusing
the amount of memory zsmalloc pool uses.

For instance, the huge size class watermark is currently set to 3264
bytes.  With order 3 zspages we have more normal classe and huge size
watermark becomes 3632.  With order 4 zspages huge size watermark becomes
3840.

Commit #1 has more numbers and some analysis.
	

This patch (of 6):

zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100.  That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96 we
end up storing it in size class torvalds#100.  Class torvalds#100 is for objects of 1632
bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes. 
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects. 
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages.  A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage.  As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Let's take a closer look at the bottom of
/sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes.  Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object.  To
put it slightly differently - objects in huge classes don't share physical
pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254.  Similarly to class size torvalds#96 above, higher
order zspages change key characteristics for some of those huge size
classes and thus those classes become normal classes, where stored objects
share physical pages.

We move huge class watermark with higher order zspages.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

1) ChromeOS memory pressure test
-----------------------------------------------------------------------------

Our standard memory pressure test, that is designed with the reproducibility
in mind.

zram is configured as a swap device, lzo-rle compression algorithm.
We captured /sys/block/zram0/mm_stat after every test and rebooted
device.

Columns per (Documentation/admin-guide/blockdev/zram.rst)

orig_data_size        mem_used_total      mem_used_max         pages_compacted
          compr_data_size         mem_limit           same_pages          huge_pages

ORDER 2 (BASE)

10353639424 2981711944 3166896128        0 3543158784   579494   825135   123707
10168573952 2932288347 3106541568        0 3499085824   565187   853137   126153
9950461952 2815911234 3035693056        0 3441090560   586696   748054   122103
9892335616 2779566152 2943459328        0 3514736640   591541   650696   119621
9993949184 2814279212 3021357056        0 3336421376   582488   711744   121273
9953226752 2856382009 3025649664        0 3512893440   564559   787861   123034
9838448640 2785481728 2997575680        0 3367219200   573282   777099   122739

ORDER 3

9509138432 2706941227 2823393280        0 3389587456   535856  1011472    90223
10105245696 2882368370 3013095424        0 3296165888   563896  1059033    94808
9531236352 2666125512 2867650560        0 3396173824   567117  1126396    88807
9561812992 2714536764 2956652544        0 3310505984   548223   827322    90992
9807470592 2790315707 2908053504        0 3378315264   563670  1020933    93725
10178371584 2948838782 3071209472        0 3329548288   548533   954546    90730
9925165056 2849839413 2958274560        0 3336978432   551464  1058302    89381

ORDER 4

9444515840 2613362645 2668232704        0 3396759552   573735  1162207    83475
10129108992 2925888488 3038351360        0 3499597824   555634  1231542    84525
9876594688 2786692282 2897006592        0 3469463552   584835  1290535    84133
10012909568 2649711847 2801512448        0 3171323904   675405   750728    80424
10120966144 2866742402 2978639872        0 3257815040   587435  1093981    83587
9578790912 2671245225 2802270208        0 3376353280   545548  1047930    80895
10108588032 2888433523 2983960576        0 3316641792   571445  1290640    81402

First, we establish that order 3 and 4 don't cause any statistically
significant change in `orig_data_size` (number of bytes we store during
the test), in other words larger zspages don't cause regressions.

T-test for order 3:

x order-2-stored
+ order-3-stored
+-----------------------------------------------------------------------------+
|+ +  +                     +  x   x  +  x   x         +    x+               x|
| |________________________AM__|_________M_____A____|__________|              |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 9.8384486e+09 1.0353639e+10 9.9532268e+09 1.0021519e+10 1.7916718e+08
+   7 9.5091384e+09 1.0178372e+10 9.8074706e+09 9.8026344e+09 2.7856206e+08
No difference proven at 95.0% confidence

T-test for order 4:

x order-2-stored
+ order-4-stored
+-----------------------------------------------------------------------------+
|                                                         +                   |
|+          +                     x  +x    xx  x +       ++   x              x|
|              |__________________|____A____M____M____________|_|             |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 9.8384486e+09 1.0353639e+10 9.9532268e+09 1.0021519e+10 1.7916718e+08
+   7 9.4445158e+09 1.0129109e+10  1.001291e+10 9.8959249e+09 2.7947784e+08
No difference proven at 95.0% confidence

Next we establish that there is a statistically significant improvement
in `mem_used_total` metrics.

T-test for order 3:

x order-2-usedmem
+ order-3-usedmem
+-----------------------------------------------------------------------------+
|+         +        +       x ++        x  + xx x       +       x            x|
|        |_________________A__M__|____________|__A________________|           |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 2.9434593e+09 3.1668961e+09 3.0256497e+09 3.0424532e+09      73235062
+   7 2.8233933e+09 3.0712095e+09 2.9566525e+09 2.9426185e+09      84630851
Difference at 95.0% confidence
	-9.98347e+07 +/- 9.21744e+07
	-3.28139% +/- 3.02961%
	(Student's t, pooled s = 7.91383e+07)

T-test for order 4:

x order-2-usedmem
+ order-4-usedmem
+-----------------------------------------------------------------------------+
|                    +                                 x                      |
|+                   +              +      x    ++ x   x *          x        x|
|             |__________________A__M__________|_____|_M__A__________|        |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 2.9434593e+09 3.1668961e+09 3.0256497e+09 3.0424532e+09      73235062
+   7 2.6682327e+09 3.0383514e+09 2.8970066e+09 2.8814248e+09 1.3098053e+08
Difference at 95.0% confidence
	-1.61028e+08 +/- 1.23591e+08
	-5.29272% +/- 4.0622%
	(Student's t, pooled s = 1.06111e+08)

Order 3 zspages also show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
+-----------------------------------------------------------------------------+
|+   +     + x+        x  +   + +             x                x    x        x|
|    |________M__A_________|_|_____________________A___________M____________| |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 3.3364214e+09 3.5431588e+09 3.4990858e+09 3.4592294e+09      80073158
+   7 3.2961659e+09 3.3961738e+09 3.3369784e+09 3.3481822e+09      39840377
Difference at 95.0% confidence
	-1.11047e+08 +/- 7.36589e+07
	-3.21017% +/- 2.12934%
	(Student's t, pooled s = 6.32415e+07)

Order 4 zspages, on the other hand, do not show any statistically significant
improvement in `mem_used_max` metrics.

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
+-----------------------------------------------------------------------------+
|+                 +           +   x     x +   +        x     +     *  x     x|
|              |_______________________A___M________________A_|_____M_______| |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 3.3364214e+09 3.5431588e+09 3.4990858e+09 3.4592294e+09      80073158
+   7 3.1713239e+09 3.4995978e+09 3.3763533e+09 3.3554221e+09 1.1609062e+08
No difference proven at 95.0% confidence

Overall, with sufficient level of confidence order 3 zspages appear to be
beneficial for these particular use-case and data patterns.

Rather expectedly we also observed lower numbers of huge-pages when zsmalloc
is configured with order 3 and order 4 zspages, for the reason already
explained.

2) Synthetic test
-----------------------------------------------------------------------------

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted VM.

orig_data_size        mem_used_total      mem_used_max         pages_compacted
          compr_data_size         mem_limit           same_pages          huge_pages

ORDER 2 (BASE)

1691807744 628091753 655187968        0 655187968       59        0    34042    34043
1691803648 628089105 655159296        0 655159296       60        0    34043    34043
1691795456 628087429 655151104        0 655151104       59        0    34046    34046
1691799552 628093723 655216640        0 655216640       60        0    34044    34044

ORDER 3

1691787264 627781464 641740800        0 641740800       59        0    33591    33591
1691795456 627794239 641789952        0 641789952       59        0    33591    33591
1691811840 627788466 641691648        0 641691648       60        0    33591    33591
1691791360 627790682 641781760        0 641781760       59        0    33591    33591

ORDER 4

1691807744 627729506 639627264        0 639627264       59        0    33432    33432
1691820032 627731485 639606784        0 639606784       59        0    33432    33432
1691799552 627725753 639623168        0 639623168       59        0    33432    33433
1691820032 627734080 639746048        0 639746048       61        0    33432    33432

Order 3 and order 4 show statistically significant improvement in
`mem_used_total` metrics.

T-test for order 3:

x order-2-usedmem-comp
+ order-3-usedmem-comp
+-----------------------------------------------------------------------------+
|++                                                                          x|
|++                                                                          x|
|AM                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.4169165e+08 6.4178995e+08 6.4178176e+08 6.4175104e+08         45056
Difference at 95.0% confidence
	-1.34277e+07 +/- 66089.8
	-2.04947% +/- 0.0100873%
	(Student's t, pooled s = 38195.8)

T-test for order 4:

x order-2-usedmem-comp
+ order-4-usedmem-comp
+-----------------------------------------------------------------------------+
|+                                                                           x|
|+                                                                           x|
|++                                                                          x|
|A|                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.3960678e+08 6.3974605e+08 6.3962726e+08 6.3965082e+08     64101.637
Difference at 95.0% confidence
	-1.55279e+07 +/- 86486.9
	-2.37003% +/- 0.0132005%
	(Student's t, pooled s = 49984.1)

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem-comp
+ order-3-maxmem-comp
+-----------------------------------------------------------------------------+
|++                                                                          x|
|++                                                                          x|
|AM                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.4169165e+08 6.4178995e+08 6.4178176e+08 6.4175104e+08         45056
Difference at 95.0% confidence
	-1.34277e+07 +/- 66089.8
	-2.04947% +/- 0.0100873%
	(Student's t, pooled s = 38195.8)

T-test for order 4:

x order-2-maxmem-comp
+ order-4-maxmem-comp
+-----------------------------------------------------------------------------+
|+                                                                           x|
|+                                                                           x|
|++                                                                          x|
|A|                                                                          A|
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4  6.551511e+08 6.5521664e+08 6.5518797e+08 6.5517875e+08     29795.878
+   4 6.3960678e+08 6.3974605e+08 6.3962726e+08 6.3965082e+08     64101.637
Difference at 95.0% confidence
	-1.55279e+07 +/- 86486.9
	-2.37003% +/- 0.0132005%
	(Student's t, pooled s = 49984.1)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

Data patterns that generate a considerable number of badly compressible
objects benefit from higher `huge_class_size` watermark, which is achieved
with order 4 zspages.

Link: https://lkml.kernel.org/r/20221024161213.3221725-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20221024161213.3221725-2-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 26, 2022
zsmalloc has 255 size classes. Size classes contain a number of zspages,
which store objects of the same size. zspage can consist of up to four
physical pages. The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of
objects zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

1) ChromeOS memory pressure test
=============================================================================

Our standard memory pressure test, that is designed with reproducibility
in mind.

zram is configured as a swap device, lzo-rle compression algorithm.
We captured /sys/block/zram0/mm_stat after every test and rebooted
device.

Columns per (Documentation/admin-guide/blockdev/zram.rst)

orig_data_size        mem_used_total      mem_used_max         pages_compacted
          compr_data_size         mem_limit           same_pages          huge_pages

ORDER 2 (BASE) zspage

10353639424 2981711944 3166896128        0 3543158784   579494   825135   123707
10168573952 2932288347 3106541568        0 3499085824   565187   853137   126153
9950461952 2815911234 3035693056        0 3441090560   586696   748054   122103
9892335616 2779566152 2943459328        0 3514736640   591541   650696   119621
9993949184 2814279212 3021357056        0 3336421376   582488   711744   121273
9953226752 2856382009 3025649664        0 3512893440   564559   787861   123034
9838448640 2785481728 2997575680        0 3367219200   573282   777099   122739

ORDER 3 zspage

9509138432 2706941227 2823393280        0 3389587456   535856  1011472    90223
10105245696 2882368370 3013095424        0 3296165888   563896  1059033    94808
9531236352 2666125512 2867650560        0 3396173824   567117  1126396    88807
9561812992 2714536764 2956652544        0 3310505984   548223   827322    90992
9807470592 2790315707 2908053504        0 3378315264   563670  1020933    93725
10178371584 2948838782 3071209472        0 3329548288   548533   954546    90730
9925165056 2849839413 2958274560        0 3336978432   551464  1058302    89381

ORDER 4 zspage

9444515840 2613362645 2668232704        0 3396759552   573735  1162207    83475
10129108992 2925888488 3038351360        0 3499597824   555634  1231542    84525
9876594688 2786692282 2897006592        0 3469463552   584835  1290535    84133
10012909568 2649711847 2801512448        0 3171323904   675405   750728    80424
10120966144 2866742402 2978639872        0 3257815040   587435  1093981    83587
9578790912 2671245225 2802270208        0 3376353280   545548  1047930    80895
10108588032 2888433523 2983960576        0 3316641792   571445  1290640    81402

First, we establish that order 3 and 4 don't cause any statistically
significant change in `orig_data_size` (number of bytes we store during
the test), in other words larger zspages don't cause regressions.

T-test for order 3:

x order-2-stored
+ order-3-stored
+-----------------------------------------------------------------------------+
|+ +  +                     +  x   x  +  x   x         +    x+               x|
| |________________________AM__|_________M_____A____|__________|              |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 9.8384486e+09 1.0353639e+10 9.9532268e+09 1.0021519e+10 1.7916718e+08
+   7 9.5091384e+09 1.0178372e+10 9.8074706e+09 9.8026344e+09 2.7856206e+08
No difference proven at 95.0% confidence

T-test for order 4:

x order-2-stored
+ order-4-stored
+-----------------------------------------------------------------------------+
|                                                         +                   |
|+          +                     x  +x    xx  x +       ++   x              x|
|              |__________________|____A____M____M____________|_|             |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 9.8384486e+09 1.0353639e+10 9.9532268e+09 1.0021519e+10 1.7916718e+08
+   7 9.4445158e+09 1.0129109e+10  1.001291e+10 9.8959249e+09 2.7947784e+08
No difference proven at 95.0% confidence

Next we establish that there is a statistically significant improvement
in `mem_used_total` metrics.

T-test for order 3:

x order-2-usedmem
+ order-3-usedmem
+-----------------------------------------------------------------------------+
|+         +        +       x ++        x  + xx x       +       x            x|
|        |_________________A__M__|____________|__A________________|           |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 2.9434593e+09 3.1668961e+09 3.0256497e+09 3.0424532e+09      73235062
+   7 2.8233933e+09 3.0712095e+09 2.9566525e+09 2.9426185e+09      84630851
Difference at 95.0% confidence
	-9.98347e+07 +/- 9.21744e+07
	-3.28139% +/- 3.02961%
	(Student's t, pooled s = 7.91383e+07)

T-test for order 4:

x order-2-usedmem
+ order-4-usedmem
+-----------------------------------------------------------------------------+
|                    +                                 x                      |
|+                   +              +      x    ++ x   x *          x        x|
|             |__________________A__M__________|_____|_M__A__________|        |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 2.9434593e+09 3.1668961e+09 3.0256497e+09 3.0424532e+09      73235062
+   7 2.6682327e+09 3.0383514e+09 2.8970066e+09 2.8814248e+09 1.3098053e+08
Difference at 95.0% confidence
	-1.61028e+08 +/- 1.23591e+08
	-5.29272% +/- 4.0622%
	(Student's t, pooled s = 1.06111e+08)

Order 3 zspages also show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
+-----------------------------------------------------------------------------+
|+   +     + x+        x  +   + +             x                x    x        x|
|    |________M__A_________|_|_____________________A___________M____________| |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 3.3364214e+09 3.5431588e+09 3.4990858e+09 3.4592294e+09      80073158
+   7 3.2961659e+09 3.3961738e+09 3.3369784e+09 3.3481822e+09      39840377
Difference at 95.0% confidence
	-1.11047e+08 +/- 7.36589e+07
	-3.21017% +/- 2.12934%
	(Student's t, pooled s = 6.32415e+07)

Order 4 zspages, on the other hand, do not show any statistically significant
improvement in `mem_used_max` metrics.

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
+-----------------------------------------------------------------------------+
|+                 +           +   x     x +   +        x     +     *  x     x|
|              |_______________________A___M________________A_|_____M_______| |
+-----------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   7 3.3364214e+09 3.5431588e+09 3.4990858e+09 3.4592294e+09      80073158
+   7 3.1713239e+09 3.4995978e+09 3.3763533e+09 3.3554221e+09 1.1609062e+08
No difference proven at 95.0% confidence

Overall, with sufficient level of confidence, order 3 zspages appear to be
beneficial for these particular use-case and data patterns.

Rather expectedly we also observed lower numbers of huge-pages when zsmalloc
is configured with order 3 and order 4 zspages, for the reason already
explained.

2) Synthetic test
=============================================================================

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
+--------------------------------------------------------------------------+
|+                                                                        x|
|+                                                                        x|
|+                                                                        x|
|++                                                                       x|
|A|                                                                       A|
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
+--------------------------------------------------------------------------+
|+                                                                        x|
|+                                                                        x|
|+                                                                        x|
|+                                                                        x|
|+                                                                        x|
|A                                                                        A|
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class.
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 27, 2022
zsmalloc has 255 size classes. Size classes contain a number of zspages,
which store objects of the same size. zspage can consist of up to four
physical pages. The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of
objects zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class.
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
jonhunter pushed a commit to jonhunter/linux that referenced this pull request Oct 28, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100.  That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96 we
end up storing it in size class torvalds#100.  Class torvalds#100 is for objects of 1632
bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes. 
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects. 
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class.
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221027042651.234524-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 29, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100.  That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96 we
end up storing it in size class torvalds#100.  Class torvalds#100 is for objects of 1632
bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes. 
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects. 
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class.
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221027042651.234524-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 1, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 1, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 2, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 3, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 5, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
jonhunter pushed a commit to jonhunter/linux that referenced this pull request Nov 7, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
jonhunter pushed a commit to jonhunter/linux that referenced this pull request Nov 8, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 9, 2022
zsmalloc has 255 size classes.  Size classes contain a number of zspages,
which store objects of the same size.  zspage can consist of up to four
physical pages.  The exact (most optimal) zspage size is calculated for
each size class during zsmalloc pool creation.

As a reasonable optimization, zsmalloc merges size classes that have
similar characteristics: number of pages per zspage and number of objects
zspage can store.

For example, let's look at the following size classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
..
   94  1536           0            0             0          0          0                3        0
  100  1632           0            0             0          0          0                2        0
..

Size classes torvalds#95-99 are merged with size class torvalds#100. That is, each time
we store an object of size, say, 1568 bytes instead of using class torvalds#96
we end up storing it in size class torvalds#100. Class torvalds#100 is for objects of
1632 bytes in size, hence every 1568 bytes object wastes 1632-1568 bytes.
Class torvalds#100 zspages consist of 2 physical pages and can hold 5 objects.
When we need to store, say, 13 objects of size 1568 we end up allocating
three zspages; in other words, 6 physical pages.

However, if we'll look closer at size class torvalds#96 (which should hold objects
of size 1568 bytes) and trace get_pages_per_zspage():

    pages per zspage      wasted bytes     used%
           1                  960           76
           2                  352           95
           3                 1312           89
           4                  704           95
           5                   96           99

We'd notice that the most optimal zspage configuration for this class is
when it consists of 5 physical pages, but currently we never let zspages
to consists of more than 4 pages. A 5 page class torvalds#96 configuration would
store 13 objects of size 1568 in a single zspage, allocating 5 physical
pages, as opposed to 6 physical pages that class torvalds#100 will allocate.

A higher order zspage for class torvalds#96 also changes its key characteristics:
pages per-zspage and objects per-zspage. As a result classes torvalds#96 and torvalds#100
are not merged anymore, which gives us more compact zsmalloc.

Of course the described effect does not apply only to size classes torvalds#96 and
We still merge classes, but less often so. In other words classes are grouped
in a more compact way, which decreases memory wastage:

zspage order               # unique size classes
     2                                69
     3                               123
     4                               191

Let's take a closer look at the bottom of /sys/kernel/debug/zsmalloc/zram0/classes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  254  4096           0            0             0          0          0                1        0
...

For exactly same reason - maximum 4 pages per zspage - the last non-huge
size class is torvalds#202, which stores objects of size 3264 bytes. Any object
larger than 3264 bytes, hence, is considered to be huge and lands in size
class torvalds#254, which uses a whole physical page to store every object. To put
it slightly differently - objects in huge classes don't share physical pages.

3264 bytes is too low of a watermark and we have too many huge classes:
classes from torvalds#203 to torvalds#254. Similarly to class size torvalds#96 above, higher order
zspages change key characteristics for some of those huge size classes and
thus those classes become normal classes, where stored objects share physical
pages.

Hence yet another consequence of higher order zspages: we move the huge
size class watermark with higher order zspages, have less huge classes and
store large objects in a more compact way.

For order 3, huge class watermark becomes 3632 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  211  3408           0            0             0          0          0                5        0
  217  3504           0            0             0          0          0                6        0
  222  3584           0            0             0          0          0                7        0
  225  3632           0            0             0          0          0                8        0
  254  4096           0            0             0          0          0                1        0
...

For order 4, huge class watermark becomes 3840 bytes:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
  202  3264           0            0             0          0          0                4        0
  206  3328           0            0             0          0          0               13        0
  207  3344           0            0             0          0          0                9        0
  208  3360           0            0             0          0          0               14        0
  211  3408           0            0             0          0          0                5        0
  212  3424           0            0             0          0          0               16        0
  214  3456           0            0             0          0          0               11        0
  217  3504           0            0             0          0          0                6        0
  219  3536           0            0             0          0          0               13        0
  222  3584           0            0             0          0          0                7        0
  223  3600           0            0             0          0          0               15        0
  225  3632           0            0             0          0          0                8        0
  228  3680           0            0             0          0          0                9        0
  230  3712           0            0             0          0          0               10        0
  232  3744           0            0             0          0          0               11        0
  234  3776           0            0             0          0          0               12        0
  235  3792           0            0             0          0          0               13        0
  236  3808           0            0             0          0          0               14        0
  238  3840           0            0             0          0          0               15        0
  254  4096           0            0             0          0          0                1        0
...

TESTS
=====

Test untars linux-6.0.tar.xz and compiles the kernel.

zram is configured as a block device with ext4 file system, lzo-rle
compression algorithm. We captured /sys/block/zram0/mm_stat after
every test and rebooted the VM.

orig_data_size       mem_used_total     mem_used_max       pages_compacted
          compr_data_size         mem_limit         same_pages       huge_pages

ORDER 2 (BASE) zspage

1691791360 628086729 655171584        0 655171584       60        0    34043
1691787264 628089196 655175680        0 655175680       60        0    34046
1691803648 628098840 655187968        0 655187968       59        0    34047
1691795456 628091503 655183872        0 655183872       60        0    34044
1691799552 628086877 655183872        0 655183872       60        0    34047

ORDER 3 zspage

1691803648 627792993 641794048        0 641794048       60        0    33591
1691787264 627779342 641708032        0 641708032       59        0    33591
1691811840 627786616 641769472        0 641769472       60        0    33591
1691803648 627794468 641818624        0 641818624       59        0    33592
1691783168 627780882 641794048        0 641794048       61        0    33591

ORDER 4 zspage

1691803648 627726635 639655936        0 639655936       60        0    33435
1691811840 627733348 639643648        0 639643648       61        0    33434
1691795456 627726290 639614976        0 639614976       60        0    33435
1691803648 627730458 639688704        0 639688704       60        0    33434
1691811840 627727771 639688704        0 639688704       60        0    33434

Order 3 and order 4 show statistically significant improvement in
`mem_used_max` metrics.

T-test for order 3:

x order-2-maxmem
+ order-3-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.4170803e+08 6.4181862e+08 6.4179405e+08 6.4177684e+08     42210.666
Difference at 95.0% confidence
	-1.34038e+07 +/- 44080.7
	-2.04581% +/- 0.00672802%
	(Student's t, pooled s = 30224.5)

T-test for order 4:

x order-2-maxmem
+ order-4-maxmem
    N           Min           Max        Median           Avg        Stddev
x   5 6.5517158e+08 6.5518797e+08 6.5518387e+08  6.551806e+08     6730.4157
+   5 6.3961498e+08  6.396887e+08 6.3965594e+08 6.3965839e+08     31408.602
Difference at 95.0% confidence
	-1.55222e+07 +/- 33126.2
	-2.36915% +/- 0.00505604%
	(Student's t, pooled s = 22713.4)

This test tends to benefit more from order 4 zspages, due to test's data
patterns.

zsmalloc object distribution analysis
=============================================================================

Order 2 (4 pages per zspage) tends to put many objects in size class 2048,
which is merged with size classes torvalds#112-torvalds#125:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            0          6146       6146       1756                2        0
    74  1216           0            1          4560       4552       1368                3        0
    76  1248           0            1          2938       2934        904                4        0
    83  1360           0            0         10971      10971       3657                1        0
    91  1488           0            0         16126      16126       5864                4        0
    94  1536           0            1          5912       5908       2217                3        0
   100  1632           0            0         11990      11990       4796                2        0
   107  1744           0            1         15771      15768       6759                3        0
   111  1808           0            1         10386      10380       4616                4        0
   126  2048           0            0         45444      45444      22722                1        0
   144  2336           0            0         47446      47446      27112                4        0
   151  2448           1            0         10760      10759       6456                3        0
   168  2720           0            0         10173      10173       6782                2        0
   190  3072           0            1          1700       1697       1275                3        0
   202  3264           0            1           290        286        232                4        0
   254  4096           0            0         34051      34051      34051                1        0

Order 3 (8 pages per zspage) changed pool characteristics and unmerged
some of the size classes, which resulted in less objects being put into
size class 2048, because there are lower size classes are now available
for more compact object storage:

class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
...
    71  1168           0            1          2996       2994        856                2        0
    72  1184           0            1          1632       1609        476                7        0
    73  1200           1            0          1445       1442        425                5        0
    74  1216           0            0          1510       1510        453                3        0
    75  1232           0            1          1495       1479        455                7        0
    76  1248           0            1          1456       1451        448                4        0
    78  1280           0            1          3040       3033        950                5        0
    79  1296           0            1          1584       1571        504                7        0
    83  1360           0            0          6375       6375       2125                1        0
    84  1376           0            1          1817       1796        632                8        0
    87  1424           0            1          6020       6006       2107                7        0
    88  1440           0            1          2108       2101        744                6        0
    89  1456           0            1          2072       2064        740                5        0
    91  1488           0            1          4169       4159       1516                4        0
    92  1504           0            1          2014       2007        742                7        0
    94  1536           0            1          3904       3900       1464                3        0
    95  1552           0            1          1890       1873        720                8        0
    96  1568           0            1          1963       1958        755                5        0
    97  1584           0            1          1980       1974        770                7        0
   100  1632           0            1          6190       6187       2476                2        0
   103  1680           0            0          6477       6477       2667                7        0
   104  1696           0            1          2256       2253        940                5        0
   105  1712           0            1          2356       2340        992                8        0
   107  1744           1            0          4697       4696       2013                3        0
   110  1792           0            1          7744       7734       3388                7        0
   111  1808           0            1          2655       2649       1180                4        0
   114  1856           0            1          8371       8365       3805                5        0
   116  1888           1            0          5863       5862       2706                6        0
   117  1904           0            1          2955       2942       1379                7        0
   118  1920           0            1          3009       2997       1416                8        0
   126  2048           0            0         25276      25276      12638                1        0
   128  2080           0            1          6060       6052       3232                8        0
   129  2096           1            0          3081       3080       1659                7        0
   134  2176           0            1         14835      14830       7912                8        0
   135  2192           0            1          2769       2758       1491                7        0
   137  2224           0            1          5082       5077       2772                6        0
   140  2272           0            1          7236       7232       4020                5        0
   144  2336           0            1          8428       8423       4816                4        0
   147  2384           0            1          5316       5313       3101                7        0
   151  2448           0            1          5445       5443       3267                3        0
   155  2512           0            0          4121       4121       2536                8        0
   158  2560           0            1          2208       2205       1380                5        0
   160  2592           0            0          1133       1133        721                7        0
   168  2720           0            0          2712       2712       1808                2        0
   177  2864           1            0          1100       1098        770                7        0
   180  2912           0            1           189        183        135                5        0
   184  2976           0            1           176        166        128                8        0
   190  3072           0            0           252        252        189                3        0
   197  3184           0            1           198        192        154                7        0
   202  3264           0            1           100         96         80                4        0
   211  3408           0            1           210        208        175                5        0
   217  3504           0            1            98         94         84                6        0
   222  3584           0            0           104        104         91                7        0
   225  3632           0            1            54         50         48                8        0
   254  4096           0            0         33591      33591      33591                1        0

Note, the huge size watermark is above 3632 and there are a number of new
normal classes available that previously were merged with the huge class. 
For instance, size class torvalds#211 holds 210 objects of size 3408 and uses 175
physical pages, while previously for those objects we would have used 210
physical pages.

Link: https://lkml.kernel.org/r/20221031054108.541190-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
rogerq added a commit to rogerq/linux that referenced this pull request Nov 16, 2022
The set channel operation "ethtool -L tx <n>" broke with
the recent suspend/resume changes.

Revert back to original driver behaviour of not freeing
the TX/RX IRQs at am65_cpsw_nuss_common_stop(). We will
now free them only on .suspend() as we need to release
the DMA channels (as DMA looses context) and re-acquiring
them on .resume() may not necessarily give us the same
IRQs.

Introduce am65_cpsw_nuss_remove_rx_chns() which is similar
to am65_cpsw_nuss_remove_tx_chns() and invoke them both in
.suspend().

At .resume() call am65_cpsw_nuss_init_rx/tx_chns() to
acquire the DMA channels.

To as IRQs need to be requested after knowing the IRQ
numbers, move am65_cpsw_nuss_ndev_add_tx_napi() call to
am65_cpsw_nuss_init_tx_chns().

Also fixes the below warning during suspend/resume on multi
CPU system.

[   67.347684] ------------[ cut here ]------------
[   67.347700] Unbalanced enable for IRQ 119
[   67.347726] WARNING: CPU: 0 PID: 1080 at kernel/irq/manage.c:781 __enable_irq+0x4c/0x80
[   67.347754] Modules linked in: wlcore_sdio wl18xx wlcore mac80211 libarc4 cfg80211 rfkill crct10dif_ce sch_fq_codel ipv6
[   67.347803] CPU: 0 PID: 1080 Comm: rtcwake Not tainted 6.1.0-rc4-00023-gc826e5480732-dirty torvalds#203
[   67.347812] Hardware name: Texas Instruments AM625 (DT)
[   67.347818] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   67.347829] pc : __enable_irq+0x4c/0x80
[   67.347838] lr : __enable_irq+0x4c/0x80
[   67.347846] sp : ffff80000999ba00
[   67.347850] x29: ffff80000999ba00 x28: ffff0000011c1c80 x27: 0000000000000000
[   67.347863] x26: 00000000000001f4 x25: ffff000001058358 x24: ffff000001059080
[   67.347876] x23: ffff000001058080 x22: ffff000001060000 x21: 0000000000000077
[   67.347888] x20: ffff0000011c1c80 x19: ffff000001429600 x18: 0000000000000001
[   67.347900] x17: 0000000000000080 x16: fffffc000176e008 x15: ffff0000011c21b0
[   67.347913] x14: 0000000000000000 x13: 3931312051524920 x12: 726f6620656c6261
[   67.347925] x11: 656820747563205b x10: 000000000000000a x9 : ffff80000999ba00
[   67.347938] x8 : ffff800009121068 x7 : ffff80000999b810 x6 : 00000000fffff17f
[   67.347950] x5 : ffff00007fb99b18 x4 : 0000000000000000 x3 : 0000000000000027
[   67.347962] x2 : ffff00007fb99b20 x1 : 50dd48f7f19deb00 x0 : 0000000000000000
[   67.347975] Call trace:
[   67.347980]  __enable_irq+0x4c/0x80
[   67.347989]  enable_irq+0x4c/0xa0
[   67.347999]  am65_cpsw_nuss_ndo_slave_open+0x4b0/0x568
[   67.348015]  am65_cpsw_nuss_resume+0x68/0x160
[   67.348025]  dpm_run_callback.isra.0+0x28/0x88
[   67.348040]  device_resume+0x78/0x160
[   67.348050]  dpm_resume+0xc0/0x1f8
[   67.348057]  dpm_resume_end+0x18/0x30
[   67.348063]  suspend_devices_and_enter+0x1cc/0x4e0
[   67.348075]  pm_suspend+0x1f8/0x268
[   67.348084]  state_store+0x8c/0x118
[   67.348092]  kobj_attr_store+0x18/0x30
[   67.348104]  sysfs_kf_write+0x44/0x58
[   67.348117]  kernfs_fop_write_iter+0x118/0x1a8
[   67.348127]  vfs_write+0x31c/0x418
[   67.348140]  ksys_write+0x6c/0xf8
[   67.348150]  __arm64_sys_write+0x1c/0x28
[   67.348160]  invoke_syscall+0x44/0x108
[   67.348172]  el0_svc_common.constprop.0+0x44/0xf0
[   67.348182]  do_el0_svc+0x2c/0xc8
[   67.348191]  el0_svc+0x2c/0x88
[   67.348201]  el0t_64_sync_handler+0xb8/0xc0
[   67.348209]  el0t_64_sync+0x18c/0x190
[   67.348218] ---[ end trace 0000000000000000 ]---

Fixes: fd23df7 ("net: ethernet: ti: am65-cpsw: Add suspend/resume support")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 16, 2022
The set channel operation "ethtool -L tx <n>" broke with
the recent suspend/resume changes.

Revert back to original driver behaviour of not freeing
the TX/RX IRQs at am65_cpsw_nuss_common_stop(). We will
now free them only on .suspend() as we need to release
the DMA channels (as DMA looses context) and re-acquiring
them on .resume() may not necessarily give us the same
IRQs.

Introduce am65_cpsw_nuss_remove_rx_chns() which is similar
to am65_cpsw_nuss_remove_tx_chns() and invoke them both in
.suspend().

At .resume() call am65_cpsw_nuss_init_rx/tx_chns() to
acquire the DMA channels.

To as IRQs need to be requested after knowing the IRQ
numbers, move am65_cpsw_nuss_ndev_add_tx_napi() call to
am65_cpsw_nuss_init_tx_chns().

Also fixes the below warning during suspend/resume on multi
CPU system.

[   67.347684] ------------[ cut here ]------------
[   67.347700] Unbalanced enable for IRQ 119
[   67.347726] WARNING: CPU: 0 PID: 1080 at kernel/irq/manage.c:781 __enable_irq+0x4c/0x80
[   67.347754] Modules linked in: wlcore_sdio wl18xx wlcore mac80211 libarc4 cfg80211 rfkill crct10dif_ce sch_fq_codel ipv6
[   67.347803] CPU: 0 PID: 1080 Comm: rtcwake Not tainted 6.1.0-rc4-00023-gc826e5480732-dirty torvalds#203
[   67.347812] Hardware name: Texas Instruments AM625 (DT)
[   67.347818] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   67.347829] pc : __enable_irq+0x4c/0x80
[   67.347838] lr : __enable_irq+0x4c/0x80
[   67.347846] sp : ffff80000999ba00
[   67.347850] x29: ffff80000999ba00 x28: ffff0000011c1c80 x27: 0000000000000000
[   67.347863] x26: 00000000000001f4 x25: ffff000001058358 x24: ffff000001059080
[   67.347876] x23: ffff000001058080 x22: ffff000001060000 x21: 0000000000000077
[   67.347888] x20: ffff0000011c1c80 x19: ffff000001429600 x18: 0000000000000001
[   67.347900] x17: 0000000000000080 x16: fffffc000176e008 x15: ffff0000011c21b0
[   67.347913] x14: 0000000000000000 x13: 3931312051524920 x12: 726f6620656c6261
[   67.347925] x11: 656820747563205b x10: 000000000000000a x9 : ffff80000999ba00
[   67.347938] x8 : ffff800009121068 x7 : ffff80000999b810 x6 : 00000000fffff17f
[   67.347950] x5 : ffff00007fb99b18 x4 : 0000000000000000 x3 : 0000000000000027
[   67.347962] x2 : ffff00007fb99b20 x1 : 50dd48f7f19deb00 x0 : 0000000000000000
[   67.347975] Call trace:
[   67.347980]  __enable_irq+0x4c/0x80
[   67.347989]  enable_irq+0x4c/0xa0
[   67.347999]  am65_cpsw_nuss_ndo_slave_open+0x4b0/0x568
[   67.348015]  am65_cpsw_nuss_resume+0x68/0x160
[   67.348025]  dpm_run_callback.isra.0+0x28/0x88
[   67.348040]  device_resume+0x78/0x160
[   67.348050]  dpm_resume+0xc0/0x1f8
[   67.348057]  dpm_resume_end+0x18/0x30
[   67.348063]  suspend_devices_and_enter+0x1cc/0x4e0
[   67.348075]  pm_suspend+0x1f8/0x268
[   67.348084]  state_store+0x8c/0x118
[   67.348092]  kobj_attr_store+0x18/0x30
[   67.348104]  sysfs_kf_write+0x44/0x58
[   67.348117]  kernfs_fop_write_iter+0x118/0x1a8
[   67.348127]  vfs_write+0x31c/0x418
[   67.348140]  ksys_write+0x6c/0xf8
[   67.348150]  __arm64_sys_write+0x1c/0x28
[   67.348160]  invoke_syscall+0x44/0x108
[   67.348172]  el0_svc_common.constprop.0+0x44/0xf0
[   67.348182]  do_el0_svc+0x2c/0xc8
[   67.348191]  el0_svc+0x2c/0x88
[   67.348201]  el0t_64_sync_handler+0xb8/0xc0
[   67.348209]  el0t_64_sync+0x18c/0x190
[   67.348218] ---[ end trace 0000000000000000 ]---

Fixes: fd23df7 ("net: ethernet: ti: am65-cpsw: Add suspend/resume support")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
rogerq added a commit to rogerq/linux that referenced this pull request Nov 17, 2022
The set channel operation "ethtool -L tx <n>" broke with
the recent suspend/resume changes.

Revert back to original driver behaviour of not freeing
the TX/RX IRQs at am65_cpsw_nuss_common_stop(). We will
now free them only on .suspend() as we need to release
the DMA channels (as DMA looses context) and re-acquiring
them on .resume() may not necessarily give us the same
IRQs.

Introduce am65_cpsw_nuss_remove_rx_chns() which is similar
to am65_cpsw_nuss_remove_tx_chns() and invoke them both in
.suspend().

At .resume() call am65_cpsw_nuss_init_rx/tx_chns() to
acquire the DMA channels.

To as IRQs need to be requested after knowing the IRQ
numbers, move am65_cpsw_nuss_ndev_add_tx_napi() call to
am65_cpsw_nuss_init_tx_chns().

Also fixes the below warning during suspend/resume on multi
CPU system.

[   67.347684] ------------[ cut here ]------------
[   67.347700] Unbalanced enable for IRQ 119
[   67.347726] WARNING: CPU: 0 PID: 1080 at kernel/irq/manage.c:781 __enable_irq+0x4c/0x80
[   67.347754] Modules linked in: wlcore_sdio wl18xx wlcore mac80211 libarc4 cfg80211 rfkill crct10dif_ce sch_fq_codel ipv6
[   67.347803] CPU: 0 PID: 1080 Comm: rtcwake Not tainted 6.1.0-rc4-00023-gc826e5480732-dirty torvalds#203
[   67.347812] Hardware name: Texas Instruments AM625 (DT)
[   67.347818] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   67.347829] pc : __enable_irq+0x4c/0x80
[   67.347838] lr : __enable_irq+0x4c/0x80
[   67.347846] sp : ffff80000999ba00
[   67.347850] x29: ffff80000999ba00 x28: ffff0000011c1c80 x27: 0000000000000000
[   67.347863] x26: 00000000000001f4 x25: ffff000001058358 x24: ffff000001059080
[   67.347876] x23: ffff000001058080 x22: ffff000001060000 x21: 0000000000000077
[   67.347888] x20: ffff0000011c1c80 x19: ffff000001429600 x18: 0000000000000001
[   67.347900] x17: 0000000000000080 x16: fffffc000176e008 x15: ffff0000011c21b0
[   67.347913] x14: 0000000000000000 x13: 3931312051524920 x12: 726f6620656c6261
[   67.347925] x11: 656820747563205b x10: 000000000000000a x9 : ffff80000999ba00
[   67.347938] x8 : ffff800009121068 x7 : ffff80000999b810 x6 : 00000000fffff17f
[   67.347950] x5 : ffff00007fb99b18 x4 : 0000000000000000 x3 : 0000000000000027
[   67.347962] x2 : ffff00007fb99b20 x1 : 50dd48f7f19deb00 x0 : 0000000000000000
[   67.347975] Call trace:
[   67.347980]  __enable_irq+0x4c/0x80
[   67.347989]  enable_irq+0x4c/0xa0
[   67.347999]  am65_cpsw_nuss_ndo_slave_open+0x4b0/0x568
[   67.348015]  am65_cpsw_nuss_resume+0x68/0x160
[   67.348025]  dpm_run_callback.isra.0+0x28/0x88
[   67.348040]  device_resume+0x78/0x160
[   67.348050]  dpm_resume+0xc0/0x1f8
[   67.348057]  dpm_resume_end+0x18/0x30
[   67.348063]  suspend_devices_and_enter+0x1cc/0x4e0
[   67.348075]  pm_suspend+0x1f8/0x268
[   67.348084]  state_store+0x8c/0x118
[   67.348092]  kobj_attr_store+0x18/0x30
[   67.348104]  sysfs_kf_write+0x44/0x58
[   67.348117]  kernfs_fop_write_iter+0x118/0x1a8
[   67.348127]  vfs_write+0x31c/0x418
[   67.348140]  ksys_write+0x6c/0xf8
[   67.348150]  __arm64_sys_write+0x1c/0x28
[   67.348160]  invoke_syscall+0x44/0x108
[   67.348172]  el0_svc_common.constprop.0+0x44/0xf0
[   67.348182]  do_el0_svc+0x2c/0xc8
[   67.348191]  el0_svc+0x2c/0x88
[   67.348201]  el0t_64_sync_handler+0xb8/0xc0
[   67.348209]  el0t_64_sync+0x18c/0x190
[   67.348218] ---[ end trace 0000000000000000 ]---

Fixes: fd23df7 ("net: ethernet: ti: am65-cpsw: Add suspend/resume support")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
pv added a commit to pv/linux that referenced this pull request Mar 11, 2023
If hci_conn_del gets called on a LE connection linked to a CIS
connection, subsequent hci_conn_del on the CIS connection results to
use-after-free [1] as cis->link still points to the deleted connection.

This occurs e.g. if hci_cmd_sync_queue fails in hci_le_create_cis.

Fix it by doing the same what is done with the SCO+ACL linked
connections.

[1]:
BUG: KASAN: use-after-free in hci_conn_del+0xa4/0x3e0
Write of size 8 at addr ffff8880013d2668 by task iso-tester/29

CPU: 0 PID: 29 Comm: iso-tester Not tainted 6.2.0-rc7-00024-g0e21956501c0-dirty torvalds#203
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x19/0x27
 print_report+0x160/0x484
 ? __virt_addr_valid+0xd4/0x150
 ? hci_conn_del+0xa4/0x3e0
 kasan_report+0xc7/0xf0
 ? hci_conn_del+0xa4/0x3e0
 hci_conn_del+0xa4/0x3e0
 hci_conn_hash_flush+0xea/0x130
 hci_dev_close_sync+0x34f/0x930
 hci_unregister_dev+0x104/0x2a0
 vhci_release+0x4c/0x90
 __fput+0x102/0x410
 task_work_run+0xfe/0x180
 ? __pfx_task_work_run+0x10/0x10
 exit_to_user_mode_prepare+0xfd/0x100
 syscall_exit_to_user_mode+0x1c/0x50
 do_syscall_64+0x4e/0x90
 entry_SYSCALL_64_after_hwframe+0x70/0xda
RIP: 0033:0x7f9880de0944
pv added a commit to pv/linux that referenced this pull request Jul 15, 2023
If hci_conn_del gets called on a LE connection linked to a CIS
connection, subsequent hci_conn_del on the CIS connection results to
use-after-free [1] as cis->link still points to the deleted connection.

This occurs e.g. if hci_cmd_sync_queue fails in hci_le_create_cis.

Fix it by doing the same what is done with the SCO+ACL linked
connections.

[1]:
BUG: KASAN: use-after-free in hci_conn_del+0xa4/0x3e0
Write of size 8 at addr ffff8880013d2668 by task iso-tester/29

CPU: 0 PID: 29 Comm: iso-tester Not tainted 6.2.0-rc7-00024-g0e21956501c0-dirty torvalds#203
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x19/0x27
 print_report+0x160/0x484
 ? __virt_addr_valid+0xd4/0x150
 ? hci_conn_del+0xa4/0x3e0
 kasan_report+0xc7/0xf0
 ? hci_conn_del+0xa4/0x3e0
 hci_conn_del+0xa4/0x3e0
 hci_conn_hash_flush+0xea/0x130
 hci_dev_close_sync+0x34f/0x930
 hci_unregister_dev+0x104/0x2a0
 vhci_release+0x4c/0x90
 __fput+0x102/0x410
 task_work_run+0xfe/0x180
 ? __pfx_task_work_run+0x10/0x10
 exit_to_user_mode_prepare+0xfd/0x100
 syscall_exit_to_user_mode+0x1c/0x50
 do_syscall_64+0x4e/0x90
 entry_SYSCALL_64_after_hwframe+0x70/0xda
RIP: 0033:0x7f9880de0944
IonAgorria pushed a commit to IonAgorria/linux that referenced this pull request Nov 23, 2023
…fuel gauge unbind

The charger manager obtained reference to fuel gauge power supply in probe
with power_supply_get_by_name() for later usage. However if fuel gauge
driver was removed and re-added then this reference would point to old
power supply (from driver which was removed).

This lead to accessing old (and probably invalid) memory which could be
observed with:
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/unbind
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/bind
$ cat /sys/devices/virtual/power_supply/battery/capacity
[  240.480084] INFO: task cat:1393 blocked for more than 120 seconds.
[  240.484799]       Not tainted 3.17.0-next-20141007-00028-ge60b6dd79570 torvalds#203
[  240.491782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.499589] cat             D c0469530     0  1393      1 0x00000000
[  240.505947] [<c0469530>] (__schedule) from [<c0469d3c>] (schedule_preempt_disabled+0x14/0x20)
[  240.514449] [<c0469d3c>] (schedule_preempt_disabled) from [<c046af08>] (mutex_lock_nested+0x1bc/0x458)
[  240.523736] [<c046af08>] (mutex_lock_nested) from [<c0287a98>] (regmap_read+0x30/0x60)
[  240.531647] [<c0287a98>] (regmap_read) from [<c032238c>] (max17042_get_property+0x2e8/0x350)
[  240.540055] [<c032238c>] (max17042_get_property) from [<c03247d8>] (charger_get_property+0x264/0x348)
[  240.549252] [<c03247d8>] (charger_get_property) from [<c0320764>] (power_supply_show_property+0x48/0x1e0)
[  240.558808] [<c0320764>] (power_supply_show_property) from [<c027308c>] (dev_attr_show+0x1c/0x48)
[  240.567664] [<c027308c>] (dev_attr_show) from [<c0141fb0>] (sysfs_kf_seq_show+0x84/0x104)
[  240.575814] [<c0141fb0>] (sysfs_kf_seq_show) from [<c0140b18>] (kernfs_seq_show+0x24/0x28)
[  240.584061] [<c0140b18>] (kernfs_seq_show) from [<c0104574>] (seq_read+0x1b0/0x484)
[  240.591702] [<c0104574>] (seq_read) from [<c00e1e24>] (vfs_read+0x88/0x144)
[  240.598640] [<c00e1e24>] (vfs_read) from [<c00e1f20>] (SyS_read+0x40/0x8c)
[  240.605507] [<c00e1f20>] (SyS_read) from [<c000e760>] (ret_fast_syscall+0x0/0x48)
[  240.612952] 4 locks held by cat/1393:
[  240.616589]  #0:  (&p->lock){+.+.+.}, at: [<c01043f4>] seq_read+0x30/0x484
[  240.623414]  #1:  (&of->mutex){+.+.+.}, at: [<c01417dc>] kernfs_seq_start+0x1c/0x8c
[  240.631086]  #2:  (s_active#31){++++.+}, at: [<c01417e4>] kernfs_seq_start+0x24/0x8c
[  240.638777]  #3:  (&map->mutex){+.+...}, at: [<c0287a98>] regmap_read+0x30/0x60

The charger-manager should get reference to fuel gauge power supply on
each use of get_property callback. The thermal zone 'tzd' field of
power supply should not be used because of the same reason.

Additionally this change solves also the issue with nested
thermal_zone_get_temp() calls and related false lockdep positive for
deadlock for thermal zone's mutex [1]. When fuel gauge is used as source of
temperature then the charger manager forwards its get_temp calls to fuel
gauge thermal zone. So actually different mutexes are used (one for
charger manager thermal zone and second for fuel gauge thermal zone) but
for lockdep this is one class of mutex.

The recursion is removed by retrieving temperature through power
supply's get_property().

In case external thermal zone is used ('cm-thermal-zone' property is
present in DTS) the recursion does not exist. Charger manager simply
exports POWER_SUPPLY_PROP_TEMP_AMBIENT property (instead of
POWER_SUPPLY_PROP_TEMP) thus no thermal zone is created for this power
supply.

[1] https://lkml.org/lkml/2014/10/6/309

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Fixes: 3bb3dbb ("power_supply: Add initial Charger-Manager driver")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants