mmc0 Timeout waiting for hardware interrupt #2802

Open
btory opened this issue Jan 10, 2019 · 16 comments

@btory

btory commented Jan 10, 2019

Been getting the above error randomly for a few months but haven't been able to reliably reproduce it until now. Rootfs always gets unmounted when it occurs.

Interestingly, this only occurs with a 64 GB Samsung EVO microSD card. The card passes read-write tests and I haven't had any issues with it in other devices, so I don't believe there is anything wrong with it. I cloned the contents to another cheap 64 GB microSD card and that one works perfectly with no errors.

To reproduce
I can reproduce by doing a full system backup with rsync to the same card:
rsync -aAXvH --delete --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / /mnt/p3

After a few seconds the command fails due to MMC errors.
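
To confirm that a failed run hit this particular error, the kernel log can be searched for the message from the issue title. The following is only a minimal sketch: the helper name `count_mmc_timeouts` is made up for illustration, and the match is case-insensitive on the assumption that different MMC drivers capitalise the message differently.

```shell
# Count kernel log lines that report the MMC timeout.
# Reads log text on stdin, so it can be fed from dmesg, journalctl -k, or a saved log.
# count_mmc_timeouts is a hypothetical helper name, not part of any tool.
count_mmc_timeouts() {
    # -i: ignore case, -c: print the number of matching lines.
    # "|| true" keeps the exit status at 0 when there are no matches.
    grep -ic 'timeout waiting for hardware interrupt' || true
}
```

Typical use would be `dmesg | count_mmc_timeouts` right after the rsync command fails (reading the kernel log may need root on some systems).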

Expected behaviour
No MMC errors and rootfs stays mounted

Actual behaviour
MMC errors immediately followed by rootfs getting unmounted

System
Model: 3 B+
OS: Arch Linux ARM
Firmware: 1f3414729f43ef3b977a910a0d811a759562e1cf (clean) (release)
Kernel: Raspberry Pi
Linux rpi 4.14.87-1-ARCH #1 SMP Wed Dec 12 00:59:49 UTC 2018 armv7l GNU/Linux
Linux rpi 4.19.13-1-ARCH #1 SMP Wed Jan 9 18:02:38 DST 2019 armv7l GNU/Linux
Issue present with both of these versions

Logs
https://pastebin.com/raw/kJnyZX8P

@pelwell
Contributor

pelwell commented Jan 11, 2019

The SD controller was in the middle of writing 192 contiguous sectors close to the end of the card (around the 60GB mark) when it appears to have stalled. The driver waited 10 seconds for the operation to complete before timing out.

I'd be interested in seeing any other crash logs you have, to see if the position on the card is a common factor.
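
For scale, the figures above can be turned into bytes with a little shell arithmetic. This is a sketch that assumes the kernel's usual 512-byte sector unit and reads "60GB" as 60 × 10^9 bytes; both helper names are invented for illustration.

```shell
# Convert a sector count to bytes, assuming 512-byte sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Convert a byte offset to the 512-byte sector that contains it.
bytes_to_sector() {
    echo $(( $1 / 512 ))
}
```

Under those assumptions, `sectors_to_bytes 192` gives 98304 bytes (96 KiB) for the stalled write, and `bytes_to_sector 60000000000` gives sector 117187500 as the rough position on the card.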

@lategoodbye
Contributor

Is it possible that, if the controller on the SD card is doing wear-leveling, the timeout isn't sufficient?

@btory
Author

btory commented Jan 11, 2019

> I'd be interested in seeing any other crash logs you have, to see if the position on the card is a common factor.

It's happened many times while writing closer to the start of the card too (e.g. during package upgrades)

@btory
Author

btory commented Jan 12, 2019

@pelwell
Contributor

pelwell commented Jan 17, 2019

I'm sorry this issue has gone quiet, but I ran out of things to suggest. Your logs do indeed show failures writing to other parts of the card. I'm reluctant to blame a faulty card, but so far this is a one-off report.

I can't believe that any kind of internal maintenance operation by a card would make it unresponsive for 10 seconds - imagine if you were trying to record a video to it.

  1. Are you running with an adequate power supply? Under heavy load, internal voltages can drop without a 2.5A supply on a Pi 3 or 3+, but with a recent image your kernel logs will show under-voltage warnings.

  2. You could try underclocking the system to see if that improves stability. In config.txt add:

core_freq=250
arm_freq=900
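
Besides the kernel log warnings, the firmware's `vcgencmd get_throttled` command reports under-voltage as a bit mask (bit 0: under-voltage detected now; bit 16: under-voltage has occurred since boot). The following sketch decodes just those two bits; the function name `undervoltage_status` and its output strings are made up for illustration.

```shell
# Report under-voltage state from a get_throttled value such as 0x50005.
# undervoltage_status is a hypothetical helper; only bits 0 and 16 are examined.
undervoltage_status() {
    if [ $(( $1 & 0x1 )) -ne 0 ]; then
        # Bit 0: the supply is below the threshold right now.
        echo "under-voltage now"
    elif [ $(( $1 & 0x10000 )) -ne 0 ]; then
        # Bit 16: an under-voltage event has been recorded since boot.
        echo "under-voltage since boot"
    else
        echo "no under-voltage recorded"
    fi
}
```

It could be called as `undervoltage_status "$(vcgencmd get_throttled | cut -d= -f2)"`. On some distributions `vcgencmd` is not in the default PATH, so the full path to the binary may be needed.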

@lategoodbye
Contributor

@btory Is it possible to simplify this scenario to something like this:
sudo dd if=/dev/mmcblk0 of=/dev/null bs=1M

@btory
Author

btory commented Jan 20, 2019

@pelwell
I'm using the official 2.5A power supply. Tried underclocking as suggested and it didn't improve the stability, was still getting the same error just as fast.

@lategoodbye
Unfortunately not. That command causes the following output though:
https://pastebin.com/raw/98yjmqT7
Similar output when using the cheap card which doesn't suffer from the timeout error:
https://pastebin.com/raw/wuxRy3qi
Similar output again when writing to the Samsung card with dd if=/dev/zero of=~/test bs=1M
https://pastebin.com/raw/164q86YG

@btory
Author

btory commented Jan 20, 2019

Tried increasing the timeout (10 -> 30 seconds) and it still times out.

@lategoodbye
Contributor

Thanks for testing.

I don't think this is a power issue. According to the dump this seems to be related to #2810

Could you please try passing "brcm,force-pio" to the sdhost overlay, to see whether it has any influence on the rsync or the dd scenario?

@btory
Author

btory commented Jan 21, 2019

No difference with dtoverlay=sdhost,brcm,force-pio:
rsync - https://pastebin.com/raw/Dsm2gdKi
dd - https://pastebin.com/raw/VdidhMAY

@pelwell
Contributor

pelwell commented Jan 21, 2019

Try with the force_pio parameter:

dtoverlay=sdhost,force_pio

which will result in the brcm,force-pio property being added.

@btory
Author

btory commented Jan 21, 2019

dtoverlay=sdhost,force_pio fixed the warnings in the dd scenario but did nothing for rsync

@lategoodbye
Contributor

Yesterday I was able to produce a "Timeout waiting for hardware interrupt" error with the dd scenario within a few seconds.

Test setup: Raspberry Pi 3 B+, linux-next, Aarch64 defconfig, Samsung EVO+ 32 GB

@btory
Author

btory commented Jan 25, 2019

Tested the rsync scenario with the SD underclocked: at 20 MHz and lower, the timeouts do not occur at all. Tested a bunch of times to be sure.
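
The comment does not say how the clock was checked. One way to confirm the bus clock actually in effect is the MMC debugfs file `/sys/kernel/debug/mmc0/ios`, which includes a `clock:` line in Hz. This is a sketch only: it assumes debugfs is mounted and that `mmc0` is the SD host, and `mmc_clock_mhz` is a made-up helper name.

```shell
# Print the bus clock in MHz from the text of /sys/kernel/debug/mmcN/ios (on stdin).
# mmc_clock_mhz is a hypothetical helper name.
mmc_clock_mhz() {
    # Match only the line whose first field is exactly "clock:", which skips
    # the separate "actual clock:" line; the second field is the value in Hz.
    awk '$1 == "clock:" { print $2 / 1000000; exit }'
}
```

For example: `sudo cat /sys/kernel/debug/mmc0/ios | mmc_clock_mhz`.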

@JamesH65
Contributor

@btory Does this workaround for an apparently similar issue help at all?

#2810 (comment)

@Blindfreddy

I have the same issue with a SanDisk 16 GB card.
It's definitely not the Pi hardware: I moved the card from a Pi 3 Model B to a Pi 2 Model B v1.1 and the issue moved with it.
Underclocking it to 20 MHz as suggested above, using a modified version of the command found here ( https://www.jeffgeerling.com/blog/2016/how-overclock-microsd-card-reader-on-raspberry-pi-3 ), does not resolve the issue either: sudo bash -c 'printf "dtoverlay=sdhost,overclock_50=20\n" >> /boot/config.txt'
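
Note that the command above appends a new line every time it is run. A minimal sketch of a variant that adds the line only once follows; `append_once` is a made-up helper name, and the caller must have write access to the target file.

```shell
# Append a line to a config file only if that exact line is not already present.
# append_once is a hypothetical helper, not part of any Raspberry Pi tooling.
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex).
    # If grep finds no match (or the file does not exist yet), append the line.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Run from a root shell, `append_once /boot/config.txt "dtoverlay=sdhost,overclock_50=20"` would have the same effect as the original command on first use and do nothing on later runs.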
