Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify some array index operations #36

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Yankefei
Copy link

@Yankefei Yankefei commented Mar 2, 2022

Hi, Erik:
It's a great honor to see the lock free project you released on GitHub. The algorithm is very exquisite, and I've learned a lot. Thank you for your creation.
In the code of MPMC queue, I found two points that can further improve the performance in the processing of array index and the number of turns. I was inspired by the kfifo queue in the Linux kernel code. And my brief description is as follows:

// first point:  
constexpr size_t idx(size_t i) const noexcept { 
    return i % capacity_;
}

// second point:
constexpr size_t turn(size_t i) const noexcept {
    return i / capacity_;
}

​ Every time you access the array index, you need a '%' operation, If the capacity of the queue is an integer power of 2, when the subscript is increasing in the ring queue, a "&" operator can be used to locate the specific array position, which should be a faster operation. Like this:

constexpr size_t idx(size_t i) const noexcept {
    return i & (capacity_ - 1);
}
/*
	if capacity is 8, i is 20 sometimes. Expressed in binary:
	---------------------
	 8:   0 0 0 0  1 0 0 0
   8-1:   0 0 0 0  0 1 1 1
	20:   0 0 0 1  0 1 0 0
	----------------------
	that, 20 & ( 8 - 1) is 4, same as 20 % 8. 
	      0 0 0 0  0 1 0 0  
*/

​ The processing of the number of turns is the same, Each time the turn of the queue loop is calculated, the division method needs to be used. If the capacity keeps the above characteristics, the bit operation can be used to replace the division operation every time the turn is calculated. The process is as follows:

// capacity_ is an integer power of 2,Logarithmic operation:
size_t power_of_capacity_ = ::log2(capacity_); 
constexpr size_t turn(size_t i) const noexcept {
    return  i >> power_of_capacity_;
}
/*
	if capacity is 8, i is 35 sometimes. Expressed in binary:
	---------------------
	 8:   0 0 0 0  1 0 0 0  (power_of_capacity: 3)
	35:   0 0 1 0  0 0 1 1
	----------------------
	that: 35 >> 3 is 4, same as 35 / 8. 
	      0 0 0 0  0 1 0 0
*/

​ The above two points are what I want to express. At the same time, I have carefully modified your code ^_^. I tried to remove the capacity variable and replace it with its mask value. At the same time, some modifications have been made to the constructor. If the passed capacity parameter is less than the integer power of 2, I will round it up. In this way, the assignment of capacity has to be removed from the formal parameter list, and the const attribute of the capacity variable in the class is also removed.

​ These changes will have an impact: the user's capacity parameter is likely not to be the actual number of memory space requests in the queue. Please evaluate.

​ I have passed the test program under the src directory with the above modifications, which may need to be improved in other aspects.

​ These are some of my considerations. I hope to see your guidance and reply.

​ Ps:The Linux kernel code version I read is : linux-4.4.293. Here is the location of the kfifo code, a lightweight SPSC queue using memory barrier.

kfifo.h:    linux-4.4.293\include\linux\kfifo.h
kfifo.c:    linux-4.4.293\lib\kfifo.c

@rigtorp
Copy link
Owner

rigtorp commented Sep 18, 2023

I agree, ideally this should be configurable using a template parameter.

@Yankefei
Copy link
Author

Yankefei commented Mar 3, 2024

Template parameters are a good idea, I am thinking about how to optimize them~~

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants