mcp251xfd based dual can HAT and 5.10.95 #4902
Comments
I'm going to leave this open for others to contribute to, but without a dual mcp251xfd HAT and some CAN hardware it's not going to get any attention from me. Have you tried asking @marckleinebudde, whose repo you linked to? |
Hey @nmbath, the BTW the problem:
is fixed with v5.15. Coming back to your problem:
The driver only reads The |
Hi @marckleinebudde, Thank you for your response. The HAT is a waveshare dual mcp251xfd and I understand that it uses BCM pin 7 for CS. I have used their default 2xMCP2517FD.dtbo /dts-v1/; / {
}; Looking at this file they seem to be using GPIO26 for CS on the 2nd unit. I am now in an area I have not done much work on, and I do not have a scope to look at what is going on. GPIO26 is on pin 37, but reading their material CS_1 is physical pin 26, or BCM 7. I tried changing the CS from 26 to 7, but then I got [ 6.118410] pinctrl-bcm2835 3f200000.gpio: pin gpio7 already requested by 3f204000.spi; cannot claim for spi1.0 It is probably something very simple, I just can't see it and don't have enough experience with GPIOs! |
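The mix-up between BCM numbers and physical header pins in the comment above can be checked with a small lookup. This is a minimal sketch, assuming the standard 40-pin Raspberry Pi header; only the pins named in this thread are listed, and `physical_pin` is a helper name made up for illustration.

```python
# BCM GPIO number -> physical pin on the standard 40-pin header.
# Partial table: only the pins discussed in this thread are included.
BCM_TO_PHYSICAL = {
    7: 26,   # SPI0 CE1
    8: 24,   # SPI0 CE0
    26: 37,  # general-purpose GPIO
}


def physical_pin(bcm: int) -> int:
    """Return the physical header pin for a BCM GPIO number."""
    try:
        return BCM_TO_PHYSICAL[bcm]
    except KeyError:
        raise ValueError(f"BCM {bcm} is not in this partial table") from None


if __name__ == "__main__":
    for bcm in (7, 8, 26):
        print(f"BCM {bcm:2d} -> physical pin {physical_pin(bcm)}")
```

So "26" can mean either BCM 26 (physical pin 37) or physical pin 26 (BCM 7), depending on which numbering a given table uses.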
Ahhh waveshare - adding @pdp7 to the loop. |
You can release GPIO 7 by adding |
Once you have something that works I can help you tidy it up and get it into the standard kernel builds. |
Once it's working generate a "compiled" overlay with something like this (parameters have to be adjusted to the waveshare board):
See header of this file: |
Except with
|
ACK, that was a copy/paste from the |
And you can include the |
I have tried adding spi0-1cs, but didn't seem to do anything and still have the error. Will try again in the morning. |
I have just tried this configuration:
And had this..
Digging through Waveshare's details, that is correct. Reading the wiki site again, they have two modes: both sharing spi0, or one on spi0 and the other on spi1. The default mode is apparently separate SPIs. They have a different table on the wiki, and it looks like when sharing an SPI the CS for can1 is 7 and when using separate SPIs it is 26. So I changed the configuration to
and got this..
ip link set can1 type can bitrate 250000 and
So maybe it's not the overlay but something else. Just to double check, I tried this to make sure it was not on the other mode!
and got
Thoughts? |
This means both mcp251xfd are properly detected. But later, during ifup you got these error messages?
Strange. The driver only reads |
full dmesg:
Logged in at roughly 33.x Issued the following command, which triggered the log messages at 62.x
|
Which overlays are you using in this bootlog? This looks like the waveshare overlays? Which chips are on your waveshare hat? mcp2517fd or mcp2518fd? The driver says the firmware (the device tree + overlays) are specifying a mcp2517fd, but the driver has detected the mcp2518fd.
What's your
Later in the boot log we see these error messages:
This indicates that one driver instance is sending CAN messages to the Linux networking stack:
The question is, what happens between the interface is properly detected:
and
After the driver detects the chip, the chip is shut down again. When you issue a |
Yes I am using the waveshare overlay, and this is my full config.txt
can0 is automatically configured at 250kb and started; it is connected to a CAN bus with a device running. can1 has nothing attached to it at present and no automatic configuration set. I can connect it to an active port, but had not planned to until I actually get the device working. |
Which chips are on your waveshare hat? mcp2517fd or mcp2518fd? So let's recap:
The questions are:
|
I have looked at the CHIPs on the HAT and they are labelled MCP2518FD, which aligns with their Web site Also on their web site in the Wiki section is the attached table. This would imply that the details are spi0 CS0 is 8 In addition their published schematic supports this as well. When configured like this I get
With any other configuration of interrupt or cs0 I fail to get one or the other to initialise. |
I have managed to compile the mcp25xxfd driver from within the source code on the waveshare wiki Once I
No traffic as the can port is not connected, I also got this in It does seem to be a code/driver issue as opposed to overlay file. I will dig through and see if I can spot where the code differs between the example waveshare has provided and the based 5.10.y code. Also no |
If you don't receive any CAN messages you'll not get the |
I found something interesting:
Waveshare increased the timeout from 1 ms to 500ms. Can you increase the timeout and try the mainline driver? |
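To illustrate why the size of a poll timeout can matter here, below is a minimal sketch of a bounded poll loop. It is not the driver's code: `poll_until`, the simulated clock, and the 3 ms "ready" time of the fake device are all assumptions chosen for illustration, and the thread does not say which register the timeout applies to.

```python
from typing import Callable


def poll_until(read: Callable[[int], int], is_ready: Callable[[int], bool],
               timeout_us: int, sleep_us: int) -> bool:
    """Poll read(now_us) every sleep_us until is_ready(value) or timeout.

    Time is simulated (no real sleeping) so the result is deterministic.
    """
    now_us = 0
    while True:
        if is_ready(read(now_us)):
            return True
        if now_us >= timeout_us:
            return False
        now_us += sleep_us


# Hypothetical device whose status register reads 0 until 3 ms after start.
def fake_status(now_us: int) -> int:
    return 1 if now_us >= 3000 else 0


if __name__ == "__main__":
    # 1 ms budget: gives up before the fake device becomes ready.
    print(poll_until(fake_status, lambda v: v == 1, timeout_us=1_000, sleep_us=10))
    # 500 ms budget: succeeds once the fake device reports ready.
    print(poll_until(fake_status, lambda v: v == 1, timeout_us=500_000, sleep_us=10))
```

If the real hardware on some boards is slower to respond than the short timeout allows, a longer timeout would mask that, which is one possible reason the vendor driver behaves differently.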
In case I forget later, thank you for looking into this, Marc. |
I strongly suggest to use the mainline driver, as the waveshare driver is old and contains several bugs, that have been fixed on mainline. |
It's unfortunate that in |
@marckleinebudde, I would also like to add my thanks for your support.
I agree that going with mainline is the correct approach. Wanted to test what waveshare supplied to try and help identify the problem. Looks like we have found a code issue rather than overlay!
I have applied the following
The outcome was:
and dmesg gives the same as before:
Is there any debugging within the mcp251xfd driver that can be enabled to give more detail? |
You can add |
The next step would be to enable the kernel event tracing for the duration of the
|
I have installed raspi-gpio on the device. I am not sure if this is a red herring or not, but according to the tool it doesn't look like the SPI is properly configured in terms of pins.
There is no SPI1_MISO on pin 19 and SPIO_SCLK on pin 21 |
You suggested
I add this to my build:
Having done this, no extra messages are displayed. Doing a grep for DEBUG in the mcp251xfd folder, there are no other occurrences of DEBUG. |
Hey @nmbath, See predictable network interface names for the classical Can you test if this combined overlay works for the waveshare 0001-overlays-add-support-for-waveshare-can-fd-hat-rev2.1.patch.TXT
Can you describe the problem? |
@marckleinebudde Please see my comments above, I did do as you suggest...
BTW, 9 of 11 boards work just fine with waveshare config. @nmbath I have 50 V2.1 boards, of the 11 I have tried, 9 work. Maybe it is the base board, I use a rpi CM4 with a base board from Waveshare. |
Some working boards and some not working boards of the same type on the same board suggests a hardware problem. |
2-CH CAN FD HAT Rev2.1 ip -d link show dev can0 uname -a In my application the /var/log/syslog is filled up with: I have two slaves on my network to which it is difficult to establish a connection. With two others slaves it seems to work (But I didn't check the logs) |
Hey @DavidBoJ, can you create a new issue please. Please mention me, so I can answer your questions. Thanks, Marc |
@marckleinebudde, I took your point and agree it really did look like a hardware problem and may well be for the specific problem I had. I have since taken multiple new Waveshare 2-CH CAN FD HAT Rev2.1, CM4102032, Waveshare base board.
Any suggestions as to what I should try next? I am going to check for the same results on all devices. |
The |
Also check that you do not have anything else opening the GPIO pins that the old version uses. It does not use standard GPIO pins. I had issues with my old revision board and can1 due to something else wanting to use the GPIO pins that can1 required. |
Sorry ... how do I get the patch attached to #4902 (comment) |
Click on the link that says "0001-overlays-add-support-for-waveshare-can-fd-hat-rev2.1.patch.TXT" |
Sorry ... how/where do apply this??? |
Do you build your kernel on your own from git? If not, I'll send you the compiled overlay. Update: here's the compiled overlay: |
Thank you. While I could build the kernel, I am OK to take overlays, but a new kernel is a bit too far.
|
I think I've messed up the filename...I'll test here first |
Is there anything in |
Note I have added the "-marc" suffix Error found...
|
That's a really long filename, and until very recently there was a severe file path length limitation in the firmware overlay code. Try giving it a shorter name... |
Name length was the problem. This works: BTW Thanks both Can0 and Can1 work. I will test the faulty boards tomorrow. too late now! |
Does |
Hmmm ... new overlay allows the 2 "faulty" boards to work!!!
|
Can you figure out what length still works? |
Ok. I think I'll change the filename to Do you get a |
@marckleinebudde sorry for delay I have some production deadlines.
I just did a test on the previous Waveshare version prior to 2.1 (maybe 2.0) using latest overlay The result is can0 OK and can1 not present on ifconfig
|
The old waveshare hat without a revision number on the back only works with the It makes no sense to test the old card with the new overlay or vice versa. |
OK Thanks, I thought the mainline might be better for the pre 2.1 version. |
I just bought a rev2.1 board, it should arrive this week. In the meantime you can test if your rev2.1 board works with my rev2.1 overlay. Both CAN interfaces should be detected. |
I have tested the compiled overlay you sent me The results are above Is there another overlay I should test |
I have had a rev 2.1 board for several months. I attach the overlay I have been using on that. The only issue is that occasionally on boot can0 and can1 get swapped |
Describe the bug
The HAT has a dual channel mcp251xfd CAN chip, running on spi0.0 and spi1.0. While the driver loads and looks like it is working, a number of issues are occurring. A dmesg | grep can shows
[ 7.661604] mcp251xfd spi0.0 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
[ 7.724663] mcp251xfd spi1.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
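The init lines above pack the detected chip model and a list of `+`/`-` flags into one string. As a rough sketch, they can be split apart as below, assuming `+` marks a feature the driver reports as enabled and `-` one it reports as disabled. The function name `parse_init_line` is made up here, and the single-letter clock keys are kept as-is because the thread does not spell out their meaning.

```python
import re
from typing import Dict, Set, Tuple

LINE = ("mcp251xfd spi0.0 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG "
        "+CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) "
        "successfully initialized.")


def parse_init_line(line: str) -> Tuple[str, Set[str], Set[str], Dict[str, float]]:
    """Split an mcp251xfd init message into model, +flags, -flags and MHz values."""
    model = re.search(r"(MCP25\w+) rev", line).group(1)
    inside = re.search(r"\((.*?)\)", line).group(1)
    tokens = inside.split()
    enabled = {t[1:] for t in tokens if t.startswith("+")}
    disabled = {t[1:] for t in tokens if t.startswith("-")}
    clocks = {k: float(v) for k, v in re.findall(r"\b([a-z]):([\d.]+)MHz", inside)}
    return model, enabled, disabled, clocks


if __name__ == "__main__":
    model, enabled, disabled, clocks = parse_init_line(LINE)
    print(model, sorted(enabled), sorted(disabled), clocks)
```

Comparing the parsed model against the `compatible` string in the overlay is a quick way to spot a mismatch such as an overlay that names a mcp2517fd on a board carrying a MCP2518FD.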
When trying to send and receive can messages on can0 the following dmesg messages are received for a few seconds and then stops, but traffic is sent and received on can0
[ 23.398211] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.501499] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.581290] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.607407] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.619993] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.629911] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.653014] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.662845] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
[ 23.673721] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #8!!!
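The `handler #8` value in these messages can be decoded under the assumption that the kernel prints the pending-softirq bitmask in hex rather than a single softirq index. A minimal sketch follows; the name order is taken from the softirq enum in the kernel's `include/linux/interrupt.h`, and `decode_pending` is a name invented for this example.

```python
from typing import List

# Softirq names in kernel enum order (bit 0 first).
SOFTIRQ_NAMES = ["HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
                 "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU"]


def decode_pending(mask_hex: str) -> List[str]:
    """Decode a hex pending-softirq bitmask into softirq names."""
    mask = int(mask_hex, 16)
    return [name for bit, name in enumerate(SOFTIRQ_NAMES) if mask & (1 << bit)]


if __name__ == "__main__":
    print(decode_pending("8"))
```

Under that assumption, 0x08 is bit 3, i.e. NET_RX, which fits the remark in the comments that one driver instance is passing CAN messages to the Linux networking stack.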
However I can not get can1 running at all.
ifconfig can1 up
ifconfig: SIOCSIFFLAGS: Invalid argument
dmesg has these messages
[ 394.045735] mcp251xfd spi1.0 can1: Failed to detect MCP2518FD (osc=0x00000000).
[ 394.053266] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.064157] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.075031] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.085798] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
[ 394.096366] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.107287] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.118507] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.129372] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
[ 394.139661] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.150838] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.161713] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.172492] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
[ 394.182735] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.193606] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.204483] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[ 394.215283] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
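A long run of CRC messages like the one above is easier to read when summarized. The sketch below counts the final (non-retried) failures per register address and checks whether every read returned all-zero data; the regular expression is written against the message format shown in this log only, and `summarize` is a hypothetical helper, not part of any tool.

```python
import re
from collections import Counter

CRC_RE = re.compile(
    r"CRC read error at address (0x[0-9a-f]+) "
    r"\(length=\d+, data=((?:[0-9a-f]{2} ?)+), CRC=0x[0-9a-f]+\)( retrying)?\."
)

SAMPLE = [
    "[ 394.053266] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.",
    "[ 394.085798] mcp251xfd spi1.0 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).",
    "[ 394.129372] mcp251xfd spi1.0 can1: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).",
    "[ 394.045735] mcp251xfd spi1.0 can1: Failed to detect MCP2518FD (osc=0x00000000).",
]


def summarize(lines):
    """Count final CRC failures per address and report whether all data was zero."""
    failures = Counter()
    all_zero = True
    for line in lines:
        m = CRC_RE.search(line)
        if not m:
            continue  # not a CRC read error line
        addr, data, retrying = m.groups()
        if any(byte != "00" for byte in data.split()):
            all_zero = False
        if retrying is None:
            failures[addr] += 1
    return failures, all_zero


if __name__ == "__main__":
    failures, all_zero = summarize(SAMPLE)
    print(dict(failures), "all-zero data:", all_zero)
```

All-zero data and CRC on every read, together with `osc=0x00000000`, would be consistent with the SPI bus returning nothing from the chip (chip select, wiring, or the chip not responding) rather than occasional corruption. That is an interpretation, not something the log proves.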
I have installed various fixes and patches based on https://github.com/marckleinebudde/linux-rpi/commits/v5.10-rpi/backport-performance-improvements
Steps to reproduce the behaviour
Stock 5.10.95 pulled from https://github.com/raspberrypi/linux/tree/rpi-5.10.y and the can0 interface presented some crc error, which after installing this patch 0ecb42a can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register, stopped giving them.
Device (s)
Raspberry Pi 3 Mod. B
System
Linux raspberrypi2 5.10.95-v7 #1 SMP Fri Feb 18 07:45:17 UTC 2022 armv7l GNU/Linux
[ 0.080092] raspberrypi-firmware soc:firmware: Attached to firmware from 2022-01-20T13:58:22, variant start
[ 0.090104] raspberrypi-firmware soc:firmware: Firmware hash is bd88f66f8952d34e4e0613a85c7a6d3da49e13e2
Logs
No response
Additional context
I have a similar system using a basic mcp251x, this is also a dual can HAT. I have noticed that the mcp251xfd is also consuming far more CPU resources than the basic mcp251x, over double the resources ..
mcp251xfd:
1368 2 root SW 0 0% 10% [irq/199-spi0.0]
165 2 root RW 0 0% 6% [spi0]
compared to the mcp251x
29533 2 root SW 0 0% 4% [irq/166-mcp251x]
219 2 root SW 0 0% 2% [spi0]
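The "over double" figure can be checked directly from the four lines above. A small sketch, assuming a busybox-style `top` layout (`PID PPID USER STAT VSZ %VSZ %CPU COMMAND`), so that the seventh field is the CPU percentage; `cpu_total` is a name chosen for this example.

```python
MCP251XFD = [
    "1368 2 root SW 0 0% 10% [irq/199-spi0.0]",
    "165 2 root RW 0 0% 6% [spi0]",
]
MCP251X = [
    "29533 2 root SW 0 0% 4% [irq/166-mcp251x]",
    "219 2 root SW 0 0% 2% [spi0]",
]


def cpu_total(top_lines):
    """Sum the %CPU field (seventh column, assumed layout) over the given lines."""
    total = 0
    for line in top_lines:
        fields = line.split()
        total += int(fields[6].rstrip("%"))
    return total


if __name__ == "__main__":
    fd, classic = cpu_total(MCP251XFD), cpu_total(MCP251X)
    print(f"mcp251xfd: {fd}%  mcp251x: {classic}%  ratio: {fd / classic:.2f}")
```

With these snapshots the IRQ thread plus SPI thread add up to 16% for the mcp251xfd against 6% for the mcp251x. A single `top` snapshot is a rough measure, so the ratio is indicative only.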