DaynaPort networking no longer working after latest build of 'develop' #1306

Closed
benjamink opened this issue Nov 4, 2023 · 99 comments · Fixed by #1318

Comments

@benjamink
Collaborator

benjamink commented Nov 4, 2023

Info

  • Which version of Pi are you using:

Raspberry Pi 4

  • Which github revision of software:

develop branch (built 2023-11-04 @ ~12pm UTC)

  • Which board version:

Current

  • Which computer is the PiSCSI connected to:

Macintosh Performa 475 w/ System 7.5.3

Describe the issue

Raspberry Pi connected via wired ethernet (eth0) but wireless interface (wlan0) is also active & gets an IP from DHCP.

PiSCSI is configured to bridge eth0:

root@piscsi-perf475:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master piscsi_bridge state UP group default qlen 1000
    link/ether dc:a6:32:1f:8c:8b brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether dc:a6:32:1f:8c:8c brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.11/16 brd 192.168.255.255 scope global dynamic noprefixroute wlan0
       valid_lft 6257sec preferred_lft 5357sec
    inet6 fe80::3bfe:ab:8ed7:3cbc/64 scope link
       valid_lft forever preferred_lft forever
4: piscsi_bridge: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 7e:47:31:f0:27:b0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.14/16 brd 192.168.255.255 scope global dynamic piscsi_bridge
       valid_lft 6233sec preferred_lft 6233sec
    inet6 fe80::dea6:32ff:fe1f:8c8b/64 scope link
       valid_lft forever preferred_lft forever
6: piscsi0: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master piscsi_bridge state DOWN group default qlen 1000
    link/ether 7e:47:31:f0:27:b0 brd ff:ff:ff:ff:ff:ff

Netatalk appears to be configured properly.

/etc/netatalk/atalkd.conf (tried in both configurations):

piscsi_bridge -phase 2 -net 0-65534 -addr 65280.76
eth0 -phase 2 -net 0-65534 -addr 65280.76

root@piscsi-perf475:~# systemctl status atalkd
● atalkd.service - Netatalk AppleTalk daemon
     Loaded: loaded (/lib/systemd/system/atalkd.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2023-11-04 11:57:17 GMT; 2h 1min ago
      Tasks: 1 (limit: 472)
        CPU: 105ms
     CGroup: /system.slice/atalkd.service
             └─1728 /usr/local/sbin/atalkd

Nov 04 11:56:45 piscsi-perf475 systemd[1]: Starting Netatalk AppleTalk daemon...
Nov 04 11:56:45 piscsi-perf475 atalkd[1728]: Set syslog logging to level: LOG_DEBUG
Nov 04 11:56:45 piscsi-perf475 atalkd[1728]: restart (2.230302)
Nov 04 11:56:46 piscsi-perf475 atalkd[1728]: zip_getnetinfo for piscsi_bridge
Nov 04 11:56:56 piscsi-perf475 atalkd[1728]: zip_getnetinfo for piscsi_bridge
Nov 04 11:57:06 piscsi-perf475 atalkd[1728]: zip_getnetinfo for piscsi_bridge
Nov 04 11:57:16 piscsi-perf475 atalkd[1728]: config for no router
Nov 04 11:57:17 piscsi-perf475 atalkd[1728]: ready 0/0/0
Nov 04 11:57:17 piscsi-perf475 systemd[1]: Started Netatalk AppleTalk daemon.

/etc/netatalk/afpd.conf:

- -transall -uamlist uams_guest.so,uams_clrtxt.so,uams_dhx2.so -icon -setuplog "default log_maxdebug /var/log/afpd.log"

root@piscsi-perf475:~# systemctl status afpd
● afpd.service - Netatalk AFP fileserver for Macintosh clients
     Loaded: loaded (/lib/systemd/system/afpd.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2023-11-04 11:57:22 GMT; 2h 2min ago
      Tasks: 2 (limit: 472)
        CPU: 450ms
     CGroup: /system.slice/afpd.service
             └─1761 /usr/local/sbin/afpd -c 20

Nov 04 11:57:22 piscsi-perf475 systemd[1]: Starting Netatalk AFP fileserver for Macintosh clients...
Nov 04 11:57:22 piscsi-perf475 systemd[1]: Started Netatalk AFP fileserver for Macintosh clients.
Nov 04 11:57:22 piscsi-perf475 afpd[1761]: Set syslog logging to level: LOG_INFO

DaynaPort networking was working on the Performa before I rebuilt PiSCSI today, but it hasn't been used for a couple of months. The only new change is the domain used for my LAN, but this is provided by DHCP as expected.

I'll provide additional details including screenshots later when I'm able to get back on the Macintosh.

@uweseimet
Contributor

uweseimet commented Nov 4, 2023

@benjamink Can you please provide logs from v23.04.01 and the develop branch? Please restrict the logs to the SCSI ID of the daynaport by launching piscsi with "-L trace:SCSI_ID" and please use debug builds, not optimized builds.

Fixing daynaport issues can be quite an effort, so please ensure that this is definitely not a problem with your environment. Just replacing the respective piscsi binaries (v23.04.01 and develop) without changing anything else should be how you can switch between a working and non-working setup.

@benjamink
Collaborator Author

@uweseimet how can I build piscsi with debugging? I assume you're asking for something different than what I get when I run option #1 from easyinstall.sh?

I don't have specific logs for each version. I have a huge piscsi.log that goes back to August though so it includes logs from when things were working but not specific logs like you're asking for. I'm going to revert to v23.04.01 & see if I can get networking working again. Possibly with a bare-bones disk image & then re-build with current develop to see if it breaks again. If things are still broken when I go back to v23.04.01 then we know it's my setup & I'll need to get that working first.

I'll capture logs from both versions when I have things all setup.

@uweseimet
Contributor

uweseimet commented Nov 4, 2023

@benjamink In the cpp folder run these commands to compile with debug information:

DEBUG=1
make clean; make piscsi

The resulting binary will be placed in cpp/fullspec/bin. The binaries built this way are the ones to use for any testing. Please run any tests on a regular Raspberry Pi OS, bullseye or bookworm. But I assume you are already doing this, because the develop branch does not compile anymore on buster.
Please also see the build notes on https://github.com/PiSCSI/piscsi/wiki/Setup-Instructions. And please remember to configure logging to only log anything (on trace level) for the daynaport device.

@benjamink
Collaborator Author

I was able to revert back to v23.04.01 & things worked again. I did have to force an old version of the Werkzeug library (due to the older version of Flask) & rebuild the web-ui but that should be unrelated to anything I think. I did this by adding the following to python/web/requirements.txt, deleting the venv & running sudo ./start.sh to rebuild the venv:

Werkzeug==2.3.7

Attaching screenshot & relevant logs/configs from test run:

image

20231105-issue-1306.tar.gz

I'm re-compiling current develop now & will test that later when I have time.

@uweseimet
Contributor

uweseimet commented Nov 5, 2023

@benjamink OK. For testing please backup your 23.04.01 piscsi binary, and then replace it by the piscsi binary from the develop build. Do not change anything else (!!!) If the develop binary fails, in order to double-check replace it by the 23.04.01 binary, and do not change anything else (!!!). If things start working again we can be sure that this is a regression issue with the binary. Anything related to the UI is not relevant for proceeding further. We have to concentrate on the binary only.
Please ensure that you build both binaries with debug code enabled, i.e. set DEBUG to 1 before compiling. Then ensure that you run "make clean" before re-building with "make". Just to be 100% sure, please verify that the "-g" option is displayed as part of the compiler invocations.
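
One way to do the "-g" check is to save the build output and count the compiler invocations that lack the option. The following is a minimal sketch, not part of the PiSCSI tooling: the function name `count_missing_debug_flag` and the file name `build.log` are hypothetical, and it assumes compiler lines start with `g++` or `clang++`. Note that a bare `DEBUG=1` line only sets a shell variable; make sees it only if it is exported or passed on the make command line, and the usage comment assumes the Makefile honors the latter.

```shell
# Count compiler invocations in a saved build log that lack the -g option.
# Assumption: compiler lines start with "g++" or "clang++"; adjust the
# pattern if your toolchain prints something else.
count_missing_debug_flag() {
    grep -E '^(g\+\+|clang\+\+) ' "$1" | grep -vcE '(^| )-g( |$)' || true
}

# Hypothetical usage from the cpp folder (DEBUG passed on the make
# command line is an assumption about the Makefile):
#   make clean && make piscsi DEBUG=1 2>&1 | tee build.log
#   count_missing_debug_flag build.log
```

A result of 0 would mean every matched compiler line carried "-g"; any other number is the count of lines to inspect.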

@benjamink
Collaborator Author

I suspect that my real problem is with my netatalk configuration. When I originally set up this PiSCSI I was using wlan0, but I have since moved the PiSCSI externally & am using eth0. However, it appears that regardless of my configuration, AFP shares still only work over wlan0.

I didn't note your comment about compiling with the -g option until just now. I can do that & re-run my tests later if you'd like. DEBUG should be enabled in the builds I used below, however.

Again, with wlan0 enabled I have a networking config that looks like the following:

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master piscsi_bridge state UP group default qlen 1000
    link/ether dc:a6:32:1f:8c:8b brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether dc:a6:32:1f:8c:8c brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.11/16 brd 192.168.255.255 scope global dynamic noprefixroute wlan0
       valid_lft 7199sec preferred_lft 6299sec
    inet6 fe80::3bfe:ab:8ed7:3cbc/64 scope link
       valid_lft forever preferred_lft forever
4: piscsi_bridge: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether dc:a6:32:1f:8c:8b brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.14/16 brd 192.168.255.255 scope global dynamic piscsi_bridge
       valid_lft 7022sec preferred_lft 7022sec
    inet6 fe80::dea6:32ff:fe1f:8c8b/64 scope link
       valid_lft forever preferred_lft forever

I took wlan0 offline with this command & confirmed wlan0 was marked DOWN afterwards:

sudo ifconfig wlan0 down

I performed a number of different iterations of tests & attached the logs in a tarball. I ran my tests by running the piscsi command manually with the following command line (changed as appropriate for each iteration - trace device, disk image & daynaport all remaining the same):

sudo fullspec-v23.04.01-debug/piscsi -L trace:6 -id1 /home/pi/images/HD0-Performa-7.5.3.hda -id6 daynaport| tee piscsi-v23.04.01-fixatalkd-wlan0up.log

scsictl shows the following output:

$ scsictl -i 6 -c show
  6:0  SCDP  Dayna:SCSI/Link:1.4a  inet=10.10.20.1/24:interface=eth0,wlan0

atalkd mis-configured for 'wlan0'

atalkd.conf:

wlan0 -phase 2 -net 0-65534 -addr 65280.76

  • piscsi-v23.04.01-debug.log
    • Works as expected but the AppleShare mount is using the wlan0 IP instead of the eth0 IP
    • Both piscsi-perf475 & piscsi-macse shares are visible in Chooser
  • piscsi-v23.10.01-debug.log
    • Stored AppleShare mount hangs at boot & times out
    • nothing visible in Chooser

atalkd reconfigured to use piscsi_bridge

atalkd.conf:

piscsi_bridge -phase 2 -net 0-65534 -addr 65280.76

  • piscsi-v23.04.01-wlan0down.log
    • wlan0 forced down/offline
    • Stored AppleShare mount hangs at boot & times out
    • Only piscsi-macse is visible in Chooser
    • Attempting to connect to eth0 IP in Chooser times out/fails
  • piscsi-v23.04.01-fixatalkd-wlan0up.log
    • wlan0 online w/ expected IP address
    • Boots just fine w/ stored AppleShare mounted automatically on desktop
    • Both piscsi-perf475 & piscsi-macse shares are visible in Chooser
    • Attempt to mount PiSCSI Images from piscsi-perf475 share works but mount is using the wlan0 IP address instead of eth0
    • Attempt to manually use eth0 IP address in Chooser shows AppleShares but mounting a share still shows the share is using the wlan0 IP address
  • piscsi-v23.10.01-fixatalkd-wlan0down.log
    • wlan0 forced down/offline
    • Stored AppleShare mount hangs at boot & times out
    • nothing visible in Chooser
    • Attempting to connect to eth0 IP in Chooser times out/fails
  • piscsi-v23.10.01-fixatalkd-wlan0up.log
    • wlan0 online w/ expected IP address
    • Stored AppleShare mount hangs at boot & times out
    • nothing visible in Chooser
    • Attempting to connect to eth0 IP in Chooser times out/fails

piscsi-testing-logs-2023-11-06.tar.gz

@uweseimet
Contributor

uweseimet commented Nov 6, 2023

@benjamink Please note that if there is an issue with your setup and not with the binaries there is no need for logs, and I cannot help you I am afraid. I don't use a Mac. The logs usually only help if there is a bug in the piscsi code.

@rdmark Maybe you can assist with any setup issue?

@benjamink
Collaborator Author

@uweseimet Understood. I included all of the above more so because I found it interesting that v23.04.01 seems to tolerate whatever my netatalk config issues are just fine but v23.10.01 does not. I'm not sure why that makes a difference, but there is definitely something different about the two versions related to this.

@rdmark
Member

rdmark commented Nov 6, 2023 via email

@benjamink
Collaborator Author

@rdmark I have access to the Internet when I run the v23.04.01 build but not when I run the v23.10.01 build (with nothing else changed).

Works (v23.04.01):

sudo fullspec-v23.04.01-debug/piscsi -L trace:6 -id1 /home/pi/images/HD0-Performa-7.5.3.hda -id6 daynaport

Does not work (v23.10.01):

sudo fullspec/piscsi -L trace:6 -id1 /home/pi/images/HD0-Performa-7.5.3.hda -id6 daynaport

All I've changed is the piscsi binary being run. I was even able to switch from v23.10.01 not working to v23.04.01 working while the Mac was on & powered up. Chooser instantly populated when I switched the piscsi binaries under the covers.

@benjamink
Collaborator Author

Internet continues to work even after doing the following:

Mac running the whole time booted into System 7.5.3

  1. Stop piscsi daemon (ctrl-c from foreground process)
  2. Take down wlan0 w/ ifconfig wlan0 down
  3. Start piscsi daemon again w/ the same commands in the previous comment
  4. Navigate to a random URL I haven't been to before in iCab

@uweseimet
Contributor

uweseimet commented Nov 6, 2023

@benjamink Please provide separate logs for each piscsi binary. It is important that the logs are as short as possible and that logging is only enabled for ID 6. Limit the number of daynaport accesses to a minimum, so that the logs are as short as possible.

Your previous logs show that you tried to create two daynaport devices:

Nov  5 14:26:15 piscsi-perf475 PISCSI[16835]: [2023-11-05 14:26:15.406] [error] Duplicate ID 6, unit 0

This cannot work.
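
Error lines like the one quoted above can be pulled out of a long log before reading anything else. This is only a small sketch; the function name `show_log_errors` is hypothetical, and it assumes error lines carry the literal tag "[error]" as in the excerpt.

```shell
# Print only the error-level lines of a piscsi log file.
# Assumption: error lines contain the literal tag "[error]".
# The exit status is 1 when no such line is found (grep's convention).
show_log_errors() {
    grep -F '[error]' "$1"
}

# Hypothetical usage:
#   show_log_errors piscsi.log
```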

@benjamink
Collaborator Author

@uweseimet I don't understand where that duplicate came from. I'm not seeing that when I run the v23.10.01 binary right now with this command:

sudo fullspec/piscsi -L trace:6 -id1 /home/pi/images/HD0-Performa-7.5.3.hda -id6 daynaport | grep -i duplicate

I ran each binary command independently & sent the output to a separate log file each time. I ran the piscsi binary with the -L trace:6 (daynaport device) as you requested. It spews A LOT of output constantly so I'm not sure how to limit it without editing the log file & I wasn't sure if you'd get everything you needed if I did that.
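
One possible way to keep an attached log short without touching the running process is to trim a saved copy afterwards. A minimal sketch, assuming the first few hundred lines cover the startup and the first network accesses; the function name `trim_log` and the default of 500 lines are arbitrary choices, not a project convention.

```shell
# Print only the first N lines of a captured trace log (default 500).
# Trimming a saved copy leaves the running piscsi process untouched.
trim_log() {
    head -n "${2:-500}" "$1"
}

# Hypothetical usage:
#   trim_log piscsi-v23.10.01.log 500 > piscsi-v23.10.01-short.log
```

Piping piscsi directly into `head` is avoided here on purpose, since the writer would typically be terminated once `head` exits.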

@uweseimet
Contributor

@benjamink I suggest that you add the two logs to this ticket. I will have a look at them and then let's see what the next steps might be. Thank you for spending your time on this.

@benjamink
Collaborator Author

More info... I find this very odd. When I run the v23.04.01 binary & run nmap against what I believe is its virtual IP(?) the nmap completes in about 10s:

$ nmap -Pn 192.168.15.141
Starting Nmap 7.94 ( https://nmap.org ) at 2023-11-06 18:30 EST
Nmap scan report for 192.168.15.141
Host is up (0.036s latency).
All 1000 scanned ports on 192.168.15.141 are in ignored states.
Not shown: 1000 closed tcp ports (conn-refused)

Nmap done: 1 IP address (1 host up) scanned in 10.05 seconds

If I run the v23.10.01 binary with the same command & run nmap again it takes 85s:

$ nmap -Pn 192.168.15.141
Starting Nmap 7.94 ( https://nmap.org ) at 2023-11-06 18:31 EST
Nmap scan report for 192.168.15.141
Host is up (0.000031s latency).
All 1000 scanned ports on 192.168.15.141 are in ignored states.
Not shown: 1000 filtered tcp ports (no-response)

Nmap done: 1 IP address (1 host up) scanned in 85.30 seconds

I am watching tcpdump when I run the above nmaps & can confirm the nmap is running against piscsi:

tcpdump -i piscsi0 tcp

I'm not sure if that suggests anything but the difference seems odd to me.
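
For reference, nmap reports ports as closed (conn-refused) when the target actively refuses the connection, and as filtered (no-response) when the probes go unanswered. The second result would be consistent with traffic not getting through the emulated DaynaPort, though that is only one possible reading. If the two reports are saved to files, the relevant line can be compared with a small helper; the function name `nmap_port_summary` is hypothetical.

```shell
# Print the "Not shown:" summary line of a saved nmap report, which names
# the port state (closed/filtered) and the reason nmap recorded for it.
nmap_port_summary() {
    grep '^Not shown:' "$1"
}

# Hypothetical usage:
#   nmap -Pn 192.168.15.141 > nmap-v23.04.01.txt
#   nmap_port_summary nmap-v23.04.01.txt
```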

@benjamink
Collaborator Author

@uweseimet I'm happy to run whatever commands you'd like or reconfigure things however. Whatever I can do to help determine if this is a problem with the new piscsi version or if it is me. I'm in no rush & I'll keep digging as well.

@uweseimet
Contributor

uweseimet commented Nov 6, 2023

I cannot really comment on this, but it might be helpful to have an overview on the initial setup created by both binaries. Can you please do this:

  1. Ensure that piscsi is not automatically launched on your Pi, e.g. by systemd
  2. Reboot your Pi
  3. Launch the 23.04.01 binary with ID 6 daynaport and trace enabled, but do not do anything else, do not use your Mac in order to avoid any network traffic. Just disconnect the Pi from the Mac before running these steps maybe.
  4. Now run "ifconfig -a" to get the complete network information
  5. Save the piscsi log and the ifconfig output

Now do the same steps including the reboot with the develop binary.

Please attach the resulting outputs to this ticket. This is just to ensure that when setting up the bridge both piscsi binaries do the same. The logs should reflect that they do the same, and the ifconfig output should also reflect this.
If there is anything wrong with piscsi it can either be that the bridge is not set up correctly or that the daynaport emulation is broken. These are completely different things, and from the collected output one can see which part of piscsi may be affected.
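
When comparing the two ifconfig captures, the packet and error counters differ on every run and can hide a real configuration difference. A minimal sketch that drops those counter lines first; the function name `normalize_ifconfig` is hypothetical, and it assumes the counter lines are the indented ones beginning with "RX" or "TX", as in the ifconfig output shown in this thread.

```shell
# Strip packet/error counter lines from saved "ifconfig -a" output so that
# two captures can be compared on configuration alone.
# Assumption: counter lines are indented and start with "RX " or "TX ".
normalize_ifconfig() {
    grep -vE '^[[:space:]]+(RX|TX) ' "$1"
}

# Hypothetical usage with the two post-launch captures:
#   normalize_ifconfig v23.04.01-post-ifconfig-a.txt > old.txt
#   normalize_ifconfig v23.10.01-post-ifconfig-a.txt > new.txt
#   diff old.txt new.txt
```

Differences that remain after this step (flags, addresses, missing interfaces) would point at the bridge setup rather than at the DaynaPort emulation.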

@benjamink
Collaborator Author

Disabled piscsi & piscsi-web services with:

systemctl disable piscsi
systemctl disable piscsi-web

Rebooted

v23.04.01 tests

Ran ps ax before running anything else:

v23.04.01-ps-output.txt

Ran ifconfig -a before running anything else:

v23.04.01-pre-ifconfig-a.txt

Started piscsi with this command (Mac is off):

sudo fullspec-v23.04.01-debug/piscsi -L trace:6 -id1 /home/pi/images/HD0-Performa-7.5.3.hda -id6 daynaport | tee piscsi-v23.04.01.log

piscsi-v23.04.01.log

Ran ifconfig -a again from a different terminal while piscsi was running:

v23.04.01-post-ifconfig-a.txt

Rebooted

v23.10.01 tests

Ran ps ax before running anything else:

v23.10.01-ps-output.txt

Ran ifconfig -a before running anything else:

v23.10.01-pre-ifconfig-a.txt

Started piscsi with this command (Mac is off):

sudo fullspec/piscsi -L trace:6 -id1 /home/pi/images/HD0-Performa-7.5.3.hda -id6 daynaport | tee piscsi-v23.10.01.log

piscsi-v23.10.01.log

Ran ifconfig -a again from a different terminal while piscsi was running:

v23.10.01-post-ifconfig-a.txt

@uweseimet
Contributor

uweseimet commented Nov 7, 2023

The pre-ifconfig logs show that the piscsi_bridge is already present before you launch piscsi manually. Where does the bridge come from? Please ensure that nothing is running that creates the bridge before you manually launch piscsi.
As soon as you have ensured that there is no bridge at all after a reboot, please run steps 1.-5. again for each binary.
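
Whether the bridge already exists after a reboot can be checked before launching piscsi. A minimal sketch that relies on the Linux sysfs tree rather than any PiSCSI tool; the function name `iface_exists` is hypothetical.

```shell
# Return success if a network interface with the given name exists.
# Uses the Linux sysfs tree, so it needs no extra tools.
iface_exists() {
    [ -e "/sys/class/net/$1" ]
}

# Hypothetical pre-flight check before launching piscsi manually:
#   if iface_exists piscsi_bridge; then
#       echo "piscsi_bridge already exists; something else created it" >&2
#   fi
```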

@benjamink
Collaborator Author

It's being setup by the file in /etc/network/interfaces.d/piscsi_bridge that the installer creates:

pi@piscsi-perf475:/etc $ cat network/interfaces.d/piscsi_bridge
#
# Defines the 'piscsi_bridge' bridge that connects the PiSCSI network
# interface (ex DaynaPort SCSI/Link) to the outside world.
#
# Depending upon your system configuration, you may need to update this
# file to change 'eth0' to your Ethernet interface
#
# This file should be place in /etc/network/interfaces.d

auto piscsi_bridge
iface piscsi_bridge inet dhcp
	bridge_ports eth0

I'll disable that & re-run the tests.

@uweseimet
Contributor

@rdmark I don't think anything else than piscsi should deal with the bridge, because piscsi sets up the bridge automatically, depending on the interface parameter settings.

@benjamink
Collaborator Author

benjamink commented Nov 7, 2023

I shouldn't be surprised but commenting out the piscsi_bridge file means eth0 never gets an IP from DHCP (it's never brought up).

root@piscsi-perf475:/etc# ifconfig -a
eth0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether dc:a6:32:1f:8c:8b  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 64  bytes 5144 (5.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 64  bytes 5144 (5.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.3.11  netmask 255.255.0.0  broadcast 192.168.255.255
        inet6 fe80::3bfe:ab:8ed7:3cbc  prefixlen 64  scopeid 0x20<link>
        ether dc:a6:32:1f:8c:8c  txqueuelen 1000  (Ethernet)
        RX packets 1209  bytes 300942 (293.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 244  bytes 31841 (31.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

I've SSH'd in via wlan0 & will re-run the tests.

@benjamink
Collaborator Author

Same exact process as above but with the piscsi_bridge file commented out so it doesn't get setup.

v23.04.01 tests

v23.04.01-ps-output.txt
v23.04.01-pre-ifconfig-a.txt
piscsi-v23.04.01.log

v23.10.01 tests

v23.10.01-ps-output.txt
v23.10.01-pre-ifconfig-a.txt
piscsi-v23.10.01.log
v23.10.01-post-ifconfig-a.txt

@benjamink
Collaborator Author

Should I configure eth0 to get an IP address as a normal interface but leave the piscsi_bridge script commented out so that only the piscsi binary creates the bridge?

@rdmark
Member

rdmark commented Nov 7, 2023 via email

@rdmark
Member

rdmark commented Nov 7, 2023 via email

@benjamink
Collaborator Author

@rdmark I went through that page & confirmed the original setup was correct. I've also disabled wlan0 completely by setting dtoverlay=disable-wifi in /boot/config.txt. I re-enabled the bridge interface & restored things back to how the wiki describes. All networking in v23.04.01 works perfectly; in v23.10.01 it does not work at all.

/etc/network/interfaces.d/piscsi_bridge:

auto piscsi_bridge
iface piscsi_bridge inet dhcp
	bridge_ports eth0

/etc/dhcpcd.conf (at the bottom):

#interface eth0
#fallback static_eth0
denyinterfaces eth0

/etc/sysctl.conf still has forwarding commented out as expected:

#net.ipv4.ip_forward=1

Rebooted & ifconfig -a looks like this (remember, wlan0 is disabled in the /boot/config.txt):

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether dc:a6:32:1f:8c:8b  txqueuelen 1000  (Ethernet)
        RX packets 22833  bytes 1943809 (1.8 MiB)
        RX errors 0  dropped 34  overruns 0  frame 0
        TX packets 19917  bytes 6797434 (6.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 102  bytes 9161 (8.9 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 102  bytes 9161 (8.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

piscsi0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::14a2:fcff:fecb:6bc  prefixlen 64  scopeid 0x20<link>
        ether 16:a2:fc:cb:06:bc  txqueuelen 1000  (Ethernet)
        RX packets 219  bytes 18295 (17.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2266  bytes 315931 (308.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

piscsi_bridge: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.3.14  netmask 255.255.0.0  broadcast 192.168.255.255
        inet6 fe80::dea6:32ff:fe1f:8c8b  prefixlen 64  scopeid 0x20<link>
        ether 16:a2:fc:cb:06:bc  txqueuelen 1000  (Ethernet)
        RX packets 22780  bytes 1610311 (1.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 19123  bytes 6751008 (6.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

At the top of the logs when running piscsi is this:

[2023-11-07 01:16:51.879] [info] piscsi_bridge is already available
[2023-11-07 01:16:51.882] [info] Tap device piscsi0 created

Without the /etc/network/interfaces.d/piscsi_bridge in place eth0 will not get an IP address & thus nothing can work.

@uweseimet
Contributor

uweseimet commented Nov 7, 2023

@benjamink In the last set of logs the 23.04.01 post-ifconfig is missing. Can you please add it? No need for the ps output, by the way.

@rdmark Would you agree (please double-check) that the commands for setting up the bridge (the log items starting with ">" in the develop logfile) are identical in both logs? These items reflect the respective command line actions (this is why there is a leading ">") executed programmatically by piscsi.
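
The ">" items can be extracted from a log so they are easier to compare side by side. This is a minimal sketch under assumptions: the function name `bridge_commands` is hypothetical, and it assumes each log line has the "[timestamp] [level] message" layout seen in the excerpts in this thread. The older binary may log these actions in a different form, in which case only the develop log can be processed this way.

```shell
# Print the messages of a piscsi log that start with ">", with the leading
# "[timestamp] [level] " prefix removed so that two logs can be diffed.
# Assumption: log lines follow the "[...] [...] message" layout.
bridge_commands() {
    sed -E 's/^\[[^]]*\] \[[^]]*\] //' "$1" | grep '^>' || true
}

# Hypothetical usage:
#   bridge_commands piscsi-v23.10.01.log
```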

@uweseimet
Contributor

uweseimet commented Nov 7, 2023

@benjamink Besides adding the missing logfile (see my previous comment), please also run this test:

  • Disconnect your Mac
  • Reboot the Pi and ensure that the bridge is not set up (no need to add logfiles for this)
  • Launch the develop piscsi manually and stop it again. The bridge should now be available, please verify with ifconfig. No need for attaching a log.
  • Without a reboot now launch the 23.04.01 piscsi
  • Please add logs of this piscsi startup sequence and with the ifconfig output after launching the 23.04.01 piscsi
  • Check whether accessing the network with your Mac works. Does this work?

This way we have the develop piscsi set up the bridge and have the 23.04.01 piscsi use it.

@uweseimet
Contributor

@rdmark I just noticed that the template for new issues does not ask for the Raspberry Pi OS version, e.g. bullseye or bookworm. Can you please add this?
@benjamink Are you using buster or bullseye?

@benjamink
Collaborator Author

@uweseimet testing_daynaport at commit fb00033cfa3f0ee6ec65428068295e81e770bb79 works

@uweseimet
Contributor

@benjamink Thanks. The next change set is available :).

@benjamink
Collaborator Author

@uweseimet is there any benefit to me doing make clean between these builds or should I just do make piscsi?

@uweseimet
Contributor

@benjamink No, there is no benefit, unless there are compile-time errors. With the latest change set instead of "make piscsi" it is sufficient to run "make". Only the piscsi binary will be compiled.

@benjamink
Collaborator Author

@uweseimet testing_daynaport at commit 29655958834eaa49a8be208c96ac18bc80e2feef works

@uweseimet
Contributor

@benjamink OK. There is a new change set, please test.

@uweseimet
Contributor

@benjamink I think I just found something that is wrong. I suggest you stop your current test, update your source, and then re-test.

@benjamink
Collaborator Author

Ok, I already had the last commit built & ready so I tested it anyhow, commit 06b4140adc00e12271fc6d43e3b8b70ad1b6081f worked. Building the new one now.

@uweseimet
Contributor

@benjamink Thank you. After testing with the latest sources, if it still fails, this time please attach a logfile created with trace:6. It is sufficient if the log contains some of the initial data when the Mac tries to access the network, i.e. just limit its size to something reasonable.

@benjamink
Collaborator Author

commit 84f94d6b09303f64f8be1f92a5693a70a2632431 works

@uweseimet
Contributor

@benjamink The next change set is available.

@benjamink
Collaborator Author

commit 9b1ce0adaa4fd88f3705a52c521dd1544830ce38 works

@uweseimet
Contributor

@benjamink OK. The next change set is already available. Please test.

@benjamink
Collaborator Author

commit d7b6ca4ea78f318a263ad206f2bda603e8e5172c works

@uweseimet
Contributor

@benjamink OK. I just pushed the next set of changes.

@uweseimet
Contributor

@benjamink Again I think I have just found something suspicious. Please update once again before testing.

@benjamink
Collaborator Author

benjamink commented Nov 8, 2023

commit 0fdd9d80ac6c0e448cb3ad13bebf7b22a3572891 DID NOT work!

I tried to cut out a bunch of the log where it was sitting waiting for the mount to fail at boot. I kept the lines leading up to the hang while it tries to mount the AFP share, then removed some lines off the end after it started the desktop. Let me know if you want the complete log file instead.

piscsi-testing-daynaport-branch.log.gz

NOTE: I rebooted between EVERY build/run including this one.

@uweseimet
Contributor

@benjamink No need for a log. I'm quite sure I found what was wrong. Just pushed a new change set, which I guess will work again. Please test.

@benjamink
Collaborator Author

SUCCESS! commit bc8d69d24a0442a41b686fb3c0fd2f7dd411114e works

@uweseimet
Contributor

uweseimet commented Nov 8, 2023

@benjamink Great! The bug was caused by a condition that was inverted with "!", but should not have been. After stumbling upon this when reviewing the code I also found proof of that in the logs. This gets easier when you know what you have to look for ;-).
I have already prepared a branch "issue_1306", to be applied to the develop branch in order to fix this issue. Before creating a PR I would like to ask you for testing the issue_1306 branch. Please check it out, and this time run a "make clean" before building.

uweseimet linked a pull request Nov 8, 2023 that will close this issue

@benjamink
Collaborator Author

@uweseimet confirmed the issue_1306 branch at commit c493ec7c396dd7fcada3ab38acdb734132024890 works fine

@uweseimet
Contributor

@benjamink Very good news! Thank you for spending your time on this! Without your help it would have been impossible to find out what's wrong. I will create the final PR soon, and once it has been merged the develop branch should be fine.

@benjamink
Collaborator Author

Very happy to help! Do you want me to go ahead & do an easyinstall.sh update using the current PR branch? Does that provide any value?

@uweseimet
Contributor

uweseimet commented Nov 8, 2023

@benjamink No, it wouldn't. The changes only affect the C++ code which you have already tested. Everything else remains unchanged.


3 participants