Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpmsg: bring back zero copy transfer #162

Closed
wants to merge 2 commits into from

Conversation

xiaoxiang781216
Copy link
Collaborator

@xiaoxiang781216 xiaoxiang781216 commented Feb 17, 2019

Two simplify the review, I split two small independent patch to new PR:
#240
#241

@xiaoxiang781216
Copy link
Collaborator Author

@wjliang, this patch bring back @MarekNovakNXP's zero copy transfer, please review it.

lib/rpmsg/rpmsg_virtio.c Outdated Show resolved Hide resolved
@wjliang
Copy link
Collaborator

wjliang commented Feb 26, 2019

thanks for you patch, could you share your use case to have this feature?
From your implementation, you add restrictions to the internal implementation, e.g. you use fixed buffer size, this will limit the future extension. And thus, want to know the need to bring back this feature.

@xiaoxiang781216
Copy link
Collaborator Author

thanks for you patch, could you share your use case to have this feature?

We have several drivers use this feature:
1.rpmsg uart driver for console/gps/modem
2.rpmsg syslog driver for printf/log redirection
3.rpmsg netdev driver to talk with onchip mac layer(modem/wifi/ethernet)
4.rpmsg hostfs driver for file API redirection
5.rpmsg usrsock driver for sock API redirection
6.rpmsg clk driver for clk API redirection
Since these drivers:
1.Transfer the huge or variable length packet
2.The data transfer is the critical performance path
zero-copy could avoid one extra memory copy and then boost the performance.

From your implementation, you add restrictions to the internal implementation, e.g. you use fixed buffer size, this will limit the future extension. And thus, want to know the need to bring back this feature.

Ok, I will refine the implementation to save the length in some reserved field as before.

@wjliang
Copy link
Collaborator

wjliang commented Feb 26, 2019

Ok, I will refine the implementation to save the length in some reserved field as before.

My concern is these rpmsg APIs may only work with single buffer. E.g. it will not work with chained buffers.
However, the existing RPMsg implementation only use single buffer. Maybe we can restrict those new APIs to single buffer only.

Also please also post the zero-copy to kernel RPMsg, as you are using the reserved fields from the RPMsg header.

Thanks for you use cases list, I just have some questions:

  • RPMsg netdev, have you considered virtio net?
  • this feature looks like it is for either a system with a very small memory, or for high message passing performance. For huge data, just curious (I haven't done performance measurement), don't you want the payload to be page aligned and cacheable in some cases?

@xiaoxiang781216
Copy link
Collaborator Author

xiaoxiang781216 commented Feb 27, 2019

Ok, I will refine the implementation to save the length in some reserved field as before.

My concern is these rpmsg APIs may only work with single buffer. E.g. it will not work with chained buffers.
However, the existing RPMsg implementation only use single buffer. Maybe we can restrict those new APIs to single buffer only.

If we extend the rpmsg to use the chained buffer, we can extend this API too(e.g. return the chained buffer). Actually, each offical OpenAMP release isn't compatible with the previous one, so I think it isn't a big issue to change this API for the chained buffer.

Also please also post the zero-copy to kernel RPMsg, as you are using the reserved fields from the RPMsg header.

I will provide one, but I think this change is different from:
#155
#160
Since the reserved fields is reused as the temporary storage internally, it can work even the kernel side:
1.Reject the zero copy transfer enhancement
2.Kernel use the reserved to pass the side band info to remote

Thanks for you use cases list, I just have some questions:

  • RPMsg netdev, have you considered virtio net?

Yes, I have, but:
1.viritio support on OpenAMP is still not mature
2.netdev is used between two NuttX instance(not NuttX<->Linux), because NuttX:
a.Have the own and unique net driver model which is incompatible with Linux Kernel
b.NuttX don't support the virtio net driver at all.
3.virito net is used to simulate a ethernet card, but the backend of rpmsg netdev is real mac layer. So rpmsg netdev expose ioctl like interface to let the up net layer control/query the hardware depend feature.
So I have to define the netdev protocol on top of RPMsg. If the communication is between NuttX<->Linux, VirtioIO net maybe a better opiton.

  • this feature looks like it is for either a system with a very small memory, or for high message passing performance. For huge data, just curious (I haven't done performance measurement), don't you want the payload to be page aligned and cacheable in some cases?

I expect that the message embedded directly in rpmsg buffer is around 1KB, the huge data(several ten/hundred KB) should pass through as a pointer, because:
1.Many service run on rpmsg send the small packet, ony few of them need send the huge data in one transfer. It isn't efficient to reserve the big buffer for the rare usecase.
2.If the service need send the huge data, normally it's will use the special memory manage strategy(e.g. ION) and is impossible to push the data into the buffer directly.

@xiaoxiang781216 xiaoxiang781216 force-pushed the zero-copy branch 2 times, most recently from 6804841 to 4b29dcd Compare February 28, 2019 06:54
@xiaoxiang781216
Copy link
Collaborator Author

Rebase to the latest mainline, no function change.

@xiaoxiang781216
Copy link
Collaborator Author

@wjliang fix the build break.

lib/rpmsg/rpmsg_virtio.c Outdated Show resolved Hide resolved
lib/rpmsg/rpmsg_virtio.c Outdated Show resolved Hide resolved
@wjliang wjliang requested review from galak and arnopo March 3, 2019 08:02
@xiaoxiang781216 xiaoxiang781216 force-pushed the zero-copy branch 2 times, most recently from c24071d to 5e3ed51 Compare March 4, 2019 09:43
Copy link
Collaborator

@wjliang wjliang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with minor comments

lib/rpmsg/rpmsg_virtio.c Show resolved Hide resolved
Copy link
Collaborator

@arnopo arnopo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this patch answers to a limited use cases vs big data need:

  • DMA transfer for copies ( cyclic mode)
  • big buffers
  • buffer reuses (data processing on same buffer).
    ...
    I don't want to block it but just a concerns vs a global solution base on a service on top of rpmsg...
    i suppose that you define some shared buffers for the Sound Open Firmware use cases, and rpmsg are used for control, right?

Concerning the patch itself: seem correct. Just one remark:
Something may be missing to release all buffers on application crash or on service destruction ( local or remote). But this would means that reserved buffers should be associated to the ept...

@wjliang
Copy link
Collaborator

wjliang commented Mar 13, 2019

Concerning the patch itself: seem correct. Just one remark:
Something may be missing to release all buffers on application crash or on service destruction ( local or remote). But this would means that reserved buffers should be associated to the ept...

Good point, just in some cases of crashing, one end may not even able to do any cleaning.
Do you think it good to make this as another OpenAMP issue for recovery from failure?

@xiaoxiang781216
Copy link
Collaborator Author

xiaoxiang781216 commented Mar 13, 2019

For me this patch answers to a limited use cases vs big data need:

  • DMA transfer for copies ( cyclic mode)
  • big buffers
  • buffer reuses (data processing on same buffer).
    ...

Since many above cases normally involve the dedicated memory management, I expect:
1.Use rpmsg to pass the pointer to these special memory block or
2.Develop the driver on virtio layer directly to remove the buffer management limitation imposed by rpmsg.

I don't want to block it but just a concerns vs a global solution base on a service on top of rpmsg...
i suppose that you define some shared buffers for the Sound Open Firmware use cases, and rpmsg are used for control, right?

Yes, the audio data buffer is preallocated after the hardware parameter is lock down and recycle in the whole playback/capture stage. The buffer location exchange when the state change from stop/pause to run and read/write directly by DSP, from that the rpmsg is just used to notify the buffer position, no real audo data send/receive through IPC.

Concerning the patch itself: seem correct. Just one remark:
Something may be missing to release all buffers on application crash or on service destruction ( local or remote). But this would means that reserved buffers should be associated to the ept...

Yes, it's a problem if application get a buffer but crash before return back. But since the memory protection is very weak for most OpenAMP usage, the action can be taken is to reboot MCU/DSP if some application crash, so what we need here is to teardown all resource and restart if the peer happen reboot.

@xiaoxiang781216
Copy link
Collaborator Author

@arnop2, the new pull request split the change into two patches.

@arnopo
Copy link
Collaborator

arnopo commented Mar 13, 2019

protection is very weak for most OpenAMP usage, the action can be taken is to reboot MCU/DSP if some application crash, so what we need here is to teardown all resource and restart if the peer happen reboot.
This could be OK for a MCU/DSP dedicated for a feature. But OpenAMP can also run on the main MPU or MCU processor... with multi services on top of RPMsg. In this case you would have to stop all the services and reinit the rpmsg.

@xiaoxiang781216 xiaoxiang781216 changed the title [DON'T MERGE] rpmsg: bring back zero copy transfer rpmsg: bring back zero copy transfer Nov 1, 2020
@xiaoxiang781216 xiaoxiang781216 marked this pull request as draft November 1, 2020 03:29
@xiaoxiang781216 xiaoxiang781216 force-pushed the zero-copy branch 2 times, most recently from 31c6a30 to a05f227 Compare November 1, 2020 14:48
@xiaoxiang781216 xiaoxiang781216 marked this pull request as ready for review November 1, 2020 15:00
@xiaoxiang781216
Copy link
Collaborator Author

@arnopo the sgement fault is fixed. @kr-satish, thanks for your help.

@kr-satish
Copy link

@arnopo the sgement fault is fixed. @kr-satish, thanks for your help.

@xiaoxiang781216 Thanks for the patches.

@arnopo arnopo self-requested a review November 3, 2020 18:23
@arnopo
Copy link
Collaborator

arnopo commented Nov 3, 2020

From my point of view the feature and associated patch are Ok.
Now the remaining things is to limit the increase of the library footprint that is also a strong requirement for low memory systems.
I'm going to make an assessment of the memory footprint based on Zephyr sample to evaluate the impact.
i suppose that the addition of new ops in rpmsg_device_ops will be the main contributor of the footprint increase...

If you have some suggestions to minimize the impact don't hesitate...

@arnopo
Copy link
Collaborator

arnopo commented Nov 4, 2020

I performed measurements for some zehyr samples, building for the NXP expresso and STm32MP1 platforms.
The compiler is GCC with the optimization flag -Os.

I note a library increase of about 10% for the ROM and 16 bytes for the RAM.
That's seems to me very reasonable.

w/o zero copy:
libmetal:	994 bytes 
openamp:	2378 bytes
w zero copy:
libmetal:	994 bytes 
openamp:	2664 bytes (+286)
w/o zero copy:
libmetal:	806 bytes 
openamp:	2300 bytes
w zero copy:
libmetal:	806 bytes 
openamp:	2552 bytes (+252)
w/o zero copy:
libmetal:	968 bytes 
openamp:	2696 bytes
w zero copy:
libmetal:	968 bytes 
openamp:	2868 bytes (+172)

@xiaoxiang781216
Copy link
Collaborator Author

The absolute number is from 172 bytes to 252 bytes. Should we still need add a macro to enable/disable this feature?

@arnopo
Copy link
Collaborator

arnopo commented Nov 5, 2020

The absolute number is from 172 bytes to 252 bytes. Should we still need add a macro to enable/disable this feature?

I tested using macros, which saves about half the extra footprint but adds complexity.
From my point of view, I'll leave it like that and if optimization is needed later, we'll take care of it if necessary.

@arnopo
Copy link
Collaborator

arnopo commented Nov 5, 2020

Hi @galak
As the footprint is also a concern for the Zephyr project, could you give your opinion?
Is the increase of the memory footprint, is reasonable from your point of view?

@xiaoxiang781216 xiaoxiang781216 force-pushed the zero-copy branch 2 times, most recently from 8649c90 to e0f42a4 Compare November 9, 2020 16:45
@xiaoxiang781216
Copy link
Collaborator Author

@arnopo how about we merge this PR now? so it get more time to test before release.

@arnopo
Copy link
Collaborator

arnopo commented Nov 12, 2020

@arnopo how about we merge this PR now? so it get more time to test before release.

I waiting @edmooring and @galak feedback first. As everyone is busy on there own stuff we need to allow a reasonable time for review.

@arnopo
Copy link
Collaborator

arnopo commented Nov 24, 2020

Hi @galak, @edmooring
Gentle reminder :-)

Copy link
Contributor

@edmooring edmooring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @galak is OK with this, I think we're good to go.

anchao and others added 2 commits December 1, 2020 21:02
Commit-id: b16ca55
Adding RPMsg Extension layer implementing zero-copy send and receive.

Signed-off-by: Chao An <anchao@pinecone.net>
to demo the usage of zero copy API

Signed-off-by: Xiang Xiao <xiaoxiang@xiaomi.com>
@arnopo arnopo linked an issue Dec 3, 2020 that may be closed by this pull request
@arnopo
Copy link
Collaborator

arnopo commented Dec 4, 2020

Hi @xiaoxiang781216
I merged the series with 2 fixes of minor "check" point reported by the check-patch.
Thanks a lot for this major contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for zero copy send & receive
6 participants