How to get BT working on Pi3B. #1314

Closed
clivem opened this issue Mar 1, 2016 · 45 comments

@clivem

clivem commented Mar 1, 2016

I figured this was going to be really simple, with something like using hciattach to the UART, to make the device visible.....
Any instructions anywhere?

@JamesH65
Contributor

JamesH65 commented Mar 1, 2016

You really need to post this sort of question on the forum, but in short

sudo apt-get install pi-bluetooth

should install a load of stuff you need. Not sure how to get the panel icon
in LXDE though.


@pelwell
Contributor

pelwell commented Mar 1, 2016

sudo apt-get install blueman gets you the GUI (which is just about functional).

@clivem
Author

clivem commented Mar 1, 2016

Guys, I'm not using Raspbian. I never will be using Raspbian, I have no interest in Raspbian.

I can see that the BT section of chip is connected to UART from DT config. I don't want to install Debian packages. I am asking for the technical info to be able to get it working on any distribution.
How do I get the device recognised, so that "hcitool dev" actually shows there is a device available.

Don't want blueman, LXDE, any desktop gui..... blah, blah, blah. Just want the device to be recognised.

@pelwell
Contributor

pelwell commented Mar 1, 2016

But Raspbian is the supported way. All the information is out there for smart, motivated people to use, but don't expect to be spoon-fed.

@clivem
Author

clivem commented Mar 1, 2016

Sorry, Raspbian may be the only supported OS. I can understand that. What I need is the lower level info to make this work with an OS that is not Raspbian. That's the info I am looking for.

Sorry, I must be really stupid. Would you please educate a stupid person where he should look, to find the information I seek. I don't require spoon-feeding, just a pointer to the info. I have looked!

@clivem
Author

clivem commented Mar 1, 2016

@pelwell Phil, at least tell me what I am looking for. A Broadcom technical document? Documentation in the PiF documentation github?

@pelwell
Contributor

pelwell commented Mar 1, 2016

You need the modified BlueZ package found in Raspbian (source available) and the systemd script called hciattach.service from Raspbian.

@XECDesign
Contributor

http://archive.raspberrypi.org/debian/pool/main/b/bluez/bluez_5.23-2+rpi1.debian.tar.xz
(debian/patches/ 50-53)
/usr/bin/hciattach /dev/ttyAMA0 bcm43xx 921600 noflow -
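
For anyone wondering what each argument in that command does, here is a rough annotated sketch. The positional layout (tty, type, speed, flow|noflow, sleep|nosleep, bdaddr) is my reading of the hciattach usage text, and treating the trailing `-` as a placeholder for the sleep field is an assumption. `build_hciattach_cmd` is a hypothetical helper that only prints the command, it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) an hciattach command line.
# Positional layout assumed from the hciattach usage text:
#   hciattach <tty> <type> [speed] [flow|noflow] [sleep|nosleep] [bdaddr]
build_hciattach_cmd() {
    tty="${1:-/dev/ttyAMA0}"   # UART wired to the Bluetooth side of the chip
    type="${2:-bcm43xx}"       # init routine; this one flashes the .hcd firmware
    speed="${3:-921600}"       # baud rate requested for the controller
    flow="${4:-noflow}"        # no RTS/CTS hardware flow control
    # "-" is assumed to fill the sleep|nosleep slot without enabling sleep
    printf '%s %s %s %s %s %s\n' hciattach "$tty" "$type" "$speed" "$flow" -
}

# Print the command with the defaults taken from this thread
build_hciattach_cmd
```

Passing a different third argument, for example `build_hciattach_cmd /dev/ttyAMA0 bcm43xx 115200`, prints the same command at a lower baud rate.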

@clivem
Author

clivem commented Mar 1, 2016

@XECDesign Thank you for the help. Where are those params documented?

/usr/bin/hciattach /dev/ttyAMA0 bcm43xx 921600 noflow -
bcm43xx_init
Initialization timed out.

I have vague recollections of some sort of firmware upload needing to be done before the attach. (Think that was with Wandboard which used a BRCM chipset for wifi/BT). Does something like that need to happen with the Pi3B?

@ghollingworth

Have you tried typing hciattach into google?

@clivem
Author

clivem commented Mar 1, 2016

Yes, but it returns rather a lot of results. None of which seem to be telling me where to obtain the firmware file. Is it available from PiF github? Or only buried in a Raspbian package?

@clivem
Author

clivem commented Mar 1, 2016

I'm going to ask nicely...... Is it possible to make any firmware files needed to support Pi3B wifi/BT operation available directly from the PiF github, rather than them just being available via specific distribution packages?

@Cheong2K
Contributor

Cheong2K commented Mar 5, 2016

I am using a BT module on a Pi2. First I enable UART flow control with:
gpio -g mode 16 alt3
gpio -g mode 17 alt3
and then use the brcm-patchram-plus tool to load the BT firmware onto the BT chip.
Try googling it.

You also need to remove the console=ttyAMA0 entry from /boot/cmdline.txt.
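
A minimal sketch of that last step, assuming the console entry looks like `console=ttyAMA0` or `console=ttyAMA0,115200`. `strip_serial_console` is a hypothetical helper that filters stdin to stdout, so nothing on disk is touched until you choose to write the result back.

```shell
#!/bin/sh
# Hypothetical helper: reads a kernel command line on stdin and prints it
# with any console=ttyAMA0[,<baud>] argument removed. Other console=
# entries (e.g. console=tty1) are left alone.
strip_serial_console() {
    sed -E 's/(^| )console=ttyAMA0(,[0-9]+)?( |$)/\1/; s/ +$//'
}
```

To preview the change, run `strip_serial_console < /boot/cmdline.txt` and compare the output with the original before replacing the file.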

@Cheong2K
Contributor

Cheong2K commented Mar 5, 2016

@clivem,

I don't have a Pi3 at the moment, but I tried my Pi2 with NOOBS version 1.8.0 and BT works natively; you only need to write some scripts.

@davidpronk

@clivem

Did you manage to get wifi to work on the Pi3B without Raspbian? I'm trying to figure out how to do this but I'm kind of in the dark on this...

@clivem
Author

clivem commented Mar 5, 2016

@davidpronk Not yet. I got as far as grabbing those patches Phil made to modify the bluez hciattach from the Raspbian package, and rebuilt my Fedora bluez package..... Got the firmware, BCM43430A1.hcd, from Raspbian image..... At which point I was hoping it would just work, but still getting a timeout.....

$ sudo /usr/bin/hciattach -n /dev/ttyAMA0 bcm43xx 921600 noflow -
bcm43xx_init
Set Controller UART speed to 921600 bit/s
Flash firmware /lib/firmware/brcm/BCM43430A1.hcd
Initialization timed out.

I hope to find some time tomorrow to try and figure it out......

The last time I was fighting with Broadcom bluetooth on another SBC, it was still the "two shot" brcm-patchram-plus to initialize firmware, followed by hciattach..... At least now, it's just hciattach responsible for firmware upload and connecting the UART to the Bluetooth subsystem.

@Noltari
Contributor

Noltari commented Mar 6, 2016

@pelwell "All the information is out there for smart, motivated people to use, but don't expect to be spoon-fed".

Completely disagree with this one.
The information isn't provided anywhere except on several forums where people are asking how to get things working.
The official developers only made Raspbian compatible, and the information needed to get things working has to be extracted from there.
However, it should be properly documented somewhere...
The firmware files aren't available in linux-firmware...

@pelwell
Contributor

pelwell commented Mar 6, 2016

The information isn't provided anywhere except on several forums where people are asking how to get things working.
The official developers only made Raspbian compatible, and the information needed to get things working has to be extracted from there.

Do you not see the contradiction here?

And to reiterate:

You need the modified BlueZ package found in Raspbian (source available) and the systemd script called hciattach.service from Raspbian.

@Noltari
Contributor

Noltari commented Mar 6, 2016

Do you not see the contradiction here?

Not at all, because I don't consider that proper documentation.

@clivem
Author

clivem commented Mar 6, 2016

And to reiterate:

@pelwell Are you familiar with the concept of a BSP/SDK? That's what is being asked for. (Not being told that one can find out the necessary info from a Raspbian image.)

@pelwell
Contributor

pelwell commented Mar 6, 2016

Do you have any examples of previous RPi BSP/SDKs?

@clivem
Author

clivem commented Mar 6, 2016

Do you have any examples of previous RPi BSP/SDKs?

Did you ever release a board, before now, that provided Bt/wifi, that required proprietary firmware to make it work, (not hosted in linux-firmware), and required modification to userspace software, to make it work?

Phil, I'm not trying to wind you up. Eben's blog, named you .....

Phil Elwell developed the wireless LAN and Bluetooth software.

.... so it would seem that you might be the best person to put the information together in a format that would be suitable, for other people, who do not wish to use Raspbian, to be able to get it working on whatever other OS, they choose to use.

@Noltari
Contributor

Noltari commented Mar 6, 2016

@clivem
With 921600, or with no baud rate defined, I also get "Initialization timed out."
However, if I switch to 115200 I get the following:

root@OpenWrt:/# hciattach -n /dev/ttyAMA0 bcm43xx 115200 noflow -
[   29.461777] uart-pl011 3f201000.uart: no DMA platform data
bcm43xx_init
Set Controller UART speed to 115200 bit/s
Flash firmware /lib/firmware/BCM43430A1.hcd
Set Controller UART speed to 115200 bit/s
Device setup complete

And then it hangs...

By the way, try this with fresh boot, since it appears that device reset isn't properly working...

@Noltari
Contributor

Noltari commented Mar 6, 2016

Okay, didn't notice the "-n" option prevents detaching from controlling terminal.

It's working now, but only at 115200...

root@OpenWrt:/# hciattach /dev/ttyAMA0 bcm43xx 115200 noflow -
[   22.634305] uart-pl011 3f201000.uart: no DMA platform data
bcm43xx_init
Set Controller UART speed to 115200 bit/s
Flash firmware /lib/firmware/BCM43430A1.hcd
Set Controller UART speed to 115200 bit/s
Device setup complete
root@OpenWrt:/# hciconfig
hci0:   Type: BR/EDR  Bus: UART
        BD Address: B8:27:EB:XX:XX:XX  ACL MTU: 1021:8  SCO MTU: 64:1
        DOWN
        RX bytes:654 acl:0 sco:0 events:33 errors:0
        TX bytes:419 acl:0 sco:0 commands:33 errors:0
root@OpenWrt:/# hcitool scan
Scanning ...
        90:B9:31:XX:XX:XX       NoltariPhone

@pelwell
Contributor

pelwell commented Mar 6, 2016

Are you using the patched hciattach source from the Raspbian distribution?

@clivem
Author

clivem commented Mar 6, 2016

Phil, where does your patch set fit in with what Loic Poulain committed upstream to bluez back in 2014?

http://git.kernel.org/cgit/bluetooth/bluez.git/commit?id=beb4892e96785ba9ff429fb940126a815efd47fd

Have you submitted your patches upstream? Or is there no need to, because it has already been dealt with upstream and you just need to patch the Raspbian version?

@Noltari
Contributor

Noltari commented Mar 6, 2016

@pelwell Nope, but I'm using a patched version of the OpenWrt one:
openwrt/packages#2464
As you can see I added the raspbian patches there.
@clivem yeah I had issues with patch 0051 because of that:
https://github.com/openwrt-es/openwrt-packages/blob/bcd5ea4533f5ddb650b3a413152b03a63d6ab09c/utils/bluez/patches/301-bcm43xx-The-UART-speed-must-be-reset-after-the-firmw.patch

@clivem
Author

clivem commented Mar 6, 2016

This is madness..... If we need to modify userspace code, can we please have "reference" patches against current upstream bluez, not some antique version that Debian/Raspbian is using..... Pretty please?

@pelwell
Contributor

pelwell commented Mar 6, 2016

@pelwell
Contributor

pelwell commented Mar 6, 2016

Since the firmware is the most important data for a modem, increasing the speed before uploading the firmware on a modem without flow control is not a great idea. The other patches add h5 support, but this isn't being used yet.

@Noltari
Copy link
Contributor

Noltari commented Mar 6, 2016

I tested those patches and there are no more timeouts; however, with baud rates other than 115200 the device doesn't show up:

921600:

root@OpenWrt:/# hciattach /dev/ttyAMA0 bcm43xx-3wire 921600
[  316.863846] uart-pl011 3f201000.uart: no DMA platform data
bcm43xx_init
Flash firmware /lib/firmware/BCM43430A1.hcd
Set Controller UART speed to 921600 bit/s
Device setup complete
root@OpenWrt:/# hciconfig
root@OpenWrt:/# hciconfig hci0 up
Can't get device info: No such device

Default (3000000):

root@OpenWrt:/# hciattach /dev/ttyAMA0 bcm43xx-3wire
[   24.839780] uart-pl011 3f201000.uart: no DMA platform data
bcm43xx_init
Flash firmware /lib/firmware/BCM43430A1.hcd
Set Controller UART speed to 3000000 bit/s
Device setup complete
root@OpenWrt:/# hciconfig
root@OpenWrt:/# hciconfig hci0 up
Can't get device info: No such device

@clivem
Author

clivem commented Mar 6, 2016

I've patched the bluez 5.36 Fedora 23 package....

http://www.squeezecommunity.org/repo/fedora/23/testing/SRPMS/bluez-5.36-1.fc23.2.src.rpm

$ sudo hciattach /dev/ttyAMA0 bcm43xx 921600 noflow -
bcm43xx_init
Flash firmware /lib/firmware/brcm/BCM43430A1.hcd
Set Controller UART speed to 921600 bit/s
Device setup complete

$ sudo hciconfig hci0 up

$ sudo hciconfig dev
hci0: Type: BR/EDR Bus: UART
BD Address: B8:27:EB:4C:9B:56 ACL MTU: 1021:8 SCO MTU: 64:1
UP RUNNING
RX bytes:1350 acl:0 sco:0 events:72 errors:0
TX bytes:1178 acl:0 sco:0 commands:72 errors:0

$ sudo hcitool scan
Scanning ...
00:1E:37:EA:A3:FF ANIKO-LAPTOP

@clivem
Author

clivem commented Mar 6, 2016

@pelwell BTW, the gist patches.... Thank you, and not wishing to appear ungrateful, but the first three don't apply until the spaces are changed to tabs..... ;)

@Noltari
Contributor

Noltari commented Mar 6, 2016

@clivem I get the following when I use your hciattach parameters:

root@OpenWrt:/# hciattach /dev/ttyAMA0 bcm43xx 921600 noflow -
[   36.624301] uart-pl011 3f201000.uart: no DMA platform data
bcm43xx_init
Flash firmware /lib/firmware/BCM43430A1.hcd
Set Controller UART speed to 921600 bit/s
Device setup complete
root@OpenWrt:/# [   42.967027] Bluetooth: hci0 command 0x1003 tx timeout
[   44.967021] Bluetooth: hci0 command 0x1001 tx timeout
[   46.967022] Bluetooth: hci0 command 0x1009 tx timeout

root@OpenWrt:/# hciconfig
hci0:   Type: BR/EDR  Bus: UART
        BD Address: 00:00:00:00:00:00  ACL MTU: 0:0  SCO MTU: 0:0
        DOWN
        RX bytes:0 acl:0 sco:0 events:0 errors:0
        TX bytes:12 acl:0 sco:0 commands:3 errors:0

Timeouts, timeouts everywhere xD
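
Worth noting from the output above: hciattach still printed "Device setup complete" even though the controller never answered, and the giveaway is the all-zero BD Address in hciconfig. A rough way to script that sanity check is sketched below. `has_valid_bdaddr` is a hypothetical helper that reads hciconfig output on stdin; it assumes the `BD Address:` line format shown in this thread.

```shell
#!/bin/sh
# Hypothetical check: reads `hciconfig` output on stdin and succeeds only if
# it reports a BD Address other than 00:00:00:00:00:00. Empty output (no
# hci device at all) also counts as a failure.
has_valid_bdaddr() {
    addr=$(sed -n 's/.*BD Address: \([0-9A-Fa-f:]*\).*/\1/p' | head -n 1)
    [ -n "$addr" ] && [ "$addr" != "00:00:00:00:00:00" ]
}
```

On a real system this would be used as `hciconfig | has_valid_bdaddr && echo "controller responded"`.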

@pelwell
Contributor

pelwell commented Mar 7, 2016

Gist drag & drop doesn't work for me - why isn't there an upload button? - and the tabs got lost in the cut & paste. It should be fixed now.

@pelwell
Contributor

pelwell commented Mar 7, 2016

FYI there is a new overlay - pi3-miniuart-bt - that enables Bluetooth using the mini-UART. There may be throughput limitations, and it is even more important to clamp cpu_freq to 250, so caveat emptor. You will also need to edit the systemd service /lib/systemd/system/hciuart.service.

See 866cf94.
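
As a rough sketch of what that involves: my understanding is that the overlay moves Bluetooth onto the mini-UART, which shows up as /dev/ttyS0, so the ttyAMA0 reference in the service file has to change. The ttyS0 device name and the config.txt lines in the comments below are assumptions to check against the overlay README; in particular I take `core_freq=250` to be the setting meant by "cpu_freq" above. `retarget_hciuart` is a hypothetical helper that filters stdin to stdout.

```shell
#!/bin/sh
# Assumed /boot/config.txt additions (verify against the overlay README):
#   dtoverlay=pi3-miniuart-bt
#   core_freq=250
#
# Hypothetical helper: reads a service file on stdin and prints it with the
# UART device switched from ttyAMA0 to ttyS0 (the assumed mini-UART node).
retarget_hciuart() {
    sed 's#/dev/ttyAMA0#/dev/ttyS0#g'
}
```

Running `retarget_hciuart < /lib/systemd/system/hciuart.service` previews the edited unit without modifying the file in place.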

@clivem
Author

clivem commented Mar 7, 2016

With thanks to @pelwell, I packaged the wifi and BT stuff for Fedora 23.

BCM43430 wifi (Broadcom proprietary firmware bin and nvram txt)
SRC: http://www.squeezecommunity.org/repo/fedora/23/SRPMS/brcm43430-firmware-1.0-3.fc23.src.rpm
RPM: http://www.squeezecommunity.org/repo/fedora/23/armhfp/brcm43430-firmware-1.0-3.fc23.noarch.rpm

BRCM43438 bluetooth (systemd service, udev rule to "bring up" hci interface, and Broadcom proprietary firmware hcd)
SRC: http://www.squeezecommunity.org/repo/fedora/23/SRPMS/brcm43438-bluetooth-1.0-3.fc23.src.rpm
RPM: http://www.squeezecommunity.org/repo/fedora/23/armhfp/brcm43438-bluetooth-1.0-3.fc23.noarch.rpm

Modified bluez (PhilE patches for hciattach firmware upload)
SRC: http://www.squeezecommunity.org/repo/fedora/23/SRPMS/bluez-5.36-1.fc23.2.src.rpm
RPM: http://www.squeezecommunity.org/repo/fedora/23/armhfp/bluez-5.36-1.fc23.2.armv7hl.rpm

Once the license conditions surrounding those Broadcom wifi/BT firmware binaries are clarified, and if they are re-distributable, I will make a Fedora 23 image available.

@Noltari
Contributor

Noltari commented Mar 10, 2016

I did more tests yesterday and I was able to get WiFi working.
I was getting some exceptions on firmware loading that were fixed with these patches:
https://dev.openwrt.org/changeset/48959
(OpenWrt uses a v4.4 custom kernel)
I also sent a couple of patches to openwrt-devel which enable WiFi support on the Raspberry Pi 3:
http://patchwork.ozlabs.org/patch/595197/
http://patchwork.ozlabs.org/patch/595198/

On the other hand, I did some more tests and I couldn't get Bluetooth working at baud rates > 115200 (timeouts), but it was working perfectly with baud rates <= 115200.
It may be related to the kernel config used in OpenWrt:
https://github.com/openwrt-es/openwrt/blob/brcm2708-next/target/linux/brcm2708/bcm2710/config-4.4
I'm still investigating why this is happening.

BTW thanks for sharing those links @clivem.

@markab

markab commented Mar 15, 2016

Can I just ask, is the internal bluetooth implementation in the standard release for rpi3 supposed to work? I seem to have many compatibility issues connecting devices, but the same devices work on the same pi3 with an external bluetooth dongle.

@agherzan
Contributor

@pelwell I see a couple of patches you sent for bluez5. Are these patches intended to go upstream in bluez5 or are they rpi3 specific?

@pelwell
Contributor

pelwell commented Apr 13, 2016

They are Pi3 specific, at least in their current form. Any upstream patch would have to be more extensive, adding extra command-line parameters to make the behaviour configurable.

@WayneKeenan

@pelwell Thank you, I applied your patches to BlueZ 5.39, please look here for context: ukBaz/python-bluezero#30

There are a few other Bluetooth/BlueZ/BLE related issues going on, but without going into details here, I wondered if you happen to have any other Pi3 (or Pi2) specific BlueZ patches?

Thanks
Wayne

@Ruffio

Ruffio commented Aug 17, 2016

@clivem has your issue been resolved? If so, please close this issue. Thanks.

@oliv3r
Contributor

oliv3r commented Mar 2, 2019

With thanks to @pelwell, I packaged the wifi and BT stuff for Fedora 23.

BCM43430 wifi (Broadcom proprietary firmware bin and nvram txt)
SRC: http://www.squeezecommunity.org/repo/fedora/23/SRPMS/brcm43430-firmware-1.0-3.fc23.src.rpm
RPM: http://www.squeezecommunity.org/repo/fedora/23/armhfp/brcm43430-firmware-1.0-3.fc23.noarch.rpm
Once the license conditions surrounding those Broadcom wifi/BT firmware binaries are clarified, and if they are re-distributable, I will make a Fedora 23 image available.

@clivem did you ever manage to wrap this up? What is the status of the firmware.bin? Since the Raspberry Pi org is redistributing the files, they should already be implicitly redistributable? If so, what is the canonical source for these firmware files? They still haven't shown up in linux-firmware.git ...

@islight

islight commented Apr 26, 2021

I am using a BT module on Pi2, first I enable the UART flow control by:
gpio -g mode 16 alt3
gpio -g mode 17 alt3
and then using the brcm-patchram-plus tool to load the BT firmware to the BT chip.
Try google it.

You also need to remove the console=ttyAMA0 in /boot/cmdline.txt

it works for me

popcornmix pushed a commit that referenced this issue Mar 21, 2023