
Fix Post Update Release Lock Email Issue #352

Closed
wants to merge 7 commits

Conversation

ExtremeFiretop
Owner

Untested at this time.

Currently there is an issue where the email does not always send post-reboot after an upgrade.
It was initially reported by Tom (visortgw) and I was able to replicate it.

In short, the issue seems to be that services-start can be triggered before there is an active internet connection, or before amtm (which is required to send the email) is ready.

@ExtremeFiretop
Owner Author

ExtremeFiretop commented Nov 13, 2024

Suggested fix so far:

  1. Move the hook from services-start to wan-event "connected" (not post-mount since not all MerlinAU installs use a USB)
  2. Increase the potential delay/sleep timers
  3. Add a check to confirm that AMTM files are configured and accessible before we try to send the email
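
As a rough illustration of item 1, a wan-event hook could look something like this untested sketch. Asuswrt-Merlin invokes /jffs/scripts/wan-event with the WAN unit and event name as positional arguments; the echo stands in for the real MerlinAU.sh call:

```shell
#!/bin/sh
# Hypothetical /jffs/scripts/wan-event hook (an untested sketch, not MerlinAU's
# actual code). Asuswrt-Merlin invokes this script with the WAN unit as $1 and
# the event name as $2.
handle_wan_event()
{
    _event="$2"
    if [ "$_event" = "connected" ]
    then
        # On the router this branch would launch the email step in the
        # background, e.g.: sh /jffs/scripts/MerlinAU.sh postUpdateEmail &
        echo "WAN connected: trigger postUpdateEmail"
        return 0
    fi
    return 1   # ignore all other events (disconnected, stopped, etc.)
}

handle_wan_event 0 connected   # prints "WAN connected: trigger postUpdateEmail"
```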

@ExtremeFiretop
Owner Author

ExtremeFiretop commented Nov 13, 2024

@Martinski4GitHub

This is untested still; just a theory on a potential solution to the problem.
We can of course go another way and instead do our own check for internet/WAN access ourselves while continuing to use services-start?

Now, the reason I have trouble believing it’s the WAN connection is the check for “nvram get ntp_ready” which already exists. This is why I also included a check for AMTM and increased the delays.
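
For reference, the existing gate amounts to a wait loop on that NVRAM flag. A minimal sketch, with the check command parameterized since nvram only exists on the router:

```shell
#!/bin/sh
# Sketch of an NTP-readiness wait loop (illustrative only; on the router the
# check command would be "nvram get ntp_ready", which prints "1" once NTP has
# synced the clock).
wait_for_ntp()
{
    _check_cmd="$1"    # command that prints "1" when NTP is ready
    _max_tries="$2"
    _tries=0
    while [ "$($_check_cmd)" != "1" ] && [ "$_tries" -lt "$_max_tries" ]
    do
        _tries=$((_tries + 1))
        sleep 1
    done
    [ "$($_check_cmd)" = "1" ]
}

# On the router this would be: wait_for_ntp "nvram get ntp_ready" 60
```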

Your opinion is appreciated.

@ExtremeFiretop
Owner Author

Upon further review, I believe I found the root cause.
The router is rebooting itself before it gets to the "ReleaseLock".

Which is currently only reached after the 3-minute wait; if the router reboots itself after 1 or 2 minutes, then the lock hasn't been released and the email won't send upon reboot, since the lock file only goes stale after 10 minutes.
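
To make the timing concrete, a stale-lock check along these lines (the path and helper name are illustrative, not MerlinAU's actual code) would only treat the lock as stale once it is 10 minutes old:

```shell
#!/bin/sh
# Illustrative stale-lock check (hypothetical lock path; 600 seconds is the
# 10-minute staleness window described above).
LOCK_FILE="${TMPDIR:-/tmp}/merlinau_demo.lock"
MAX_AGE_SECS=600

lock_is_stale()
{
    [ -f "$LOCK_FILE" ] || return 1   # no lock file at all
    _now="$(date +%s)"
    _mtime="$(stat -c %Y "$LOCK_FILE" 2>/dev/null || date -r "$LOCK_FILE" +%s)"
    [ $((_now - _mtime)) -ge "$MAX_AGE_SECS" ]
}

touch "$LOCK_FILE"                               # a freshly created lock...
lock_is_stale && echo "stale" || echo "fresh"    # ...prints "fresh"
rm -f "$LOCK_FILE"
```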

@ExtremeFiretop ExtremeFiretop changed the title Fix Post Update Timing Issue with Services-Start Fix Post Update Release Lock Email Issue Nov 13, 2024
@ExtremeFiretop ExtremeFiretop added the bug Something isn't working label Nov 13, 2024
@Martinski4GitHub
Collaborator

Untested at this time.

Currently there is an issue where the email does not always send post-reboot after an upgrade. It was initially reported by Tom (visortgw) and I was able to replicate it.

In short, the issue seems to be that services-start can be triggered before there is an active internet connection, or before amtm (which is required to send the email) is ready.

Yes, that's correct. The "services-start" script could be executed before the WAN connection is active, and I thought checking for the "ntp_ready" NVRAM var would be sufficient, but perhaps there might be cases (outliers?) where that's not enough.

@Martinski4GitHub
Collaborator

Suggested fix so far:

1. Move the hook from services-start to wan-event "connected" (not **post-mount** since not all MerlinAU installs use a USB)

I don't think the "wan-event" script would have been a good spot to hook our post-reboot script calls.

2. Increase the potential delay/sleep timers

Yeah, that would help.

3. Add a check to confirm that AMTM files are configured and accessible before we try to send the email

The AMTM email config file is in the JFFS partition filesystem which is mounted very early on before services can ever be started. If the JFFS filesystem is not mounted or is mounted as read-only, the user has bigger problems to deal with.
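
If a guard were still wanted, checking the mount state is cheap. A hedged sketch (the helper name is illustrative; on the router the mount point of interest would be /jffs) that checks /proc/mounts for a read-write mount:

```shell
#!/bin/sh
# Illustrative read-write mount check via /proc/mounts. Both arguments are
# parameterized here so the function can be exercised against a sample mount
# table; on the router you would simply call: mounted_rw /jffs
mounted_rw()
{
    _mnt="$1"
    _tab="${2:-/proc/mounts}"
    # Field 2 is the mount point, field 4 the comma-separated mount options.
    awk -v m="$_mnt" '$2 == m && $4 ~ /(^|,)rw(,|$)/ { ok = 1 } END { exit !ok }' "$_tab"
}
```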

@Martinski4GitHub
Collaborator

@Martinski4GitHub

This is untested still; just a theory on a potential solution to the problem. We can of course go another way and instead do our own check for internet/WAN access ourselves while continuing to use services-start?

Now, the reason I have trouble believing it’s the WAN connection is the check for “nvram get ntp_ready” which already exists.

Yes, that's what I thought too, but there might be some edge cases where that check is not sufficient.

@Martinski4GitHub
Collaborator

Martinski4GitHub commented Nov 13, 2024

Upon further review, I believe I found the root cause. The router is rebooting itself before it gets to the "ReleaseLock".

Which is currently only reached after the 3-minute wait; if the router reboots itself after 1 or 2 minutes, then the lock hasn't been released and the email won't send upon reboot, since the lock file only goes stale after 10 minutes.

No, the Lock file cannot possibly be the root cause because the file is located in the temporary virtual disk (i.e. tmpfs, which lives in RAM), where everything is wiped during a reboot, so the Lock file gets deleted and is not found after reboot.

@ExtremeFiretop
Owner Author

Untested at this time.
Currently there is an issue where the email does not always send post-reboot after an upgrade. It was initially reported by Tom (visortgw) and I was able to replicate it.
In short, the issue seems to be that services-start can be triggered before there is an active internet connection, or before amtm (which is required to send the email) is ready.

Yes, that's correct. The "services-start" script could be executed before the WAN connection is active, and I thought checking for the "ntp_ready" NVRAM var would be sufficient, but perhaps there might be cases (outliers?) where that's not enough.

It happens pretty consistently after a reboot, from what I can tell. I've only tested once, but looking back I can see that I didn't receive the "successful" email for the last production release either.

@ExtremeFiretop
Owner Author

ExtremeFiretop commented Nov 13, 2024

I don't think the "wan-event" script would have been a good spot to hook our post-reboot script calls.

I'm not sure of a better location; it's either services-start (after everything has started), a WAN event (once WAN has started), or when AMTM is available (JFFS available). As I mentioned, we can't use post-mount for obvious reasons.

If we want to move the hook call to "wan" we can specifically use wan-event; and for AMTM, it would probably need to be "init-start"?

The theory/benefit of moving it to one of those is that we would be sure those steps have been actioned (JFFS loaded and WAN connected), and we could also rule out services-start not being triggered due to a possible false error from a service at boot time. (That is, if services-start is only triggered once ALL other services are started, then to me a ddns failure at boot due to some firmware error would stop the email from sending, even though we don't rely on the ddns service, for example.)

Yeah, that would help.

I see you included those changes in your PR :) Happy that stuck around!

Although I noticed you didn't increase the sleep timers from 30 to 60. I figured it didn't hurt to add an additional static 30 seconds to the wait after a connection is ready. A lot of the time with AiMesh I notice the primary's connection "flip-flops" while the nodes are reconnecting post-reboot. That is, the primary is first to come up and connect to WAN, and I can test connectivity from the primary or my desktop and confirm access; then, as the nodes come online and connect to the primary, we get little "blips" of no network connectivity at all (even on the primary) until all the nodes are connected. This gets worse the more nodes you have. It's quick; both my nodes connect within seconds; but that's seconds of network "unavailable" for 2 nodes.

The AMTM email config file is in the JFFS partition filesystem which is mounted very early on before services can ever be started. If the JFFS filesystem is not mounted or is mounted as read-only, the user has bigger problems to deal with.

Fair enough; agreed. I just didn't want us to try to call those files and exit prematurely if they weren't available.

@ExtremeFiretop
Owner Author

@Martinski4GitHub
This is untested still; just a theory on a potential solution to the problem. We can of course go another way and instead do our own check for internet/WAN access ourselves while continuing to use services-start?
Now, the reason I have trouble believing it’s the WAN connection is the check for “nvram get ntp_ready” which already exists.

Yes, that's what I thought too, but there might be some edge cases where that check is not sufficient.

I see this is the route you took in your PR :) I'll review it in a moment.

@ExtremeFiretop
Owner Author

ExtremeFiretop commented Nov 13, 2024

Upon further review, I believe I found the root cause. The router is rebooting itself before it gets to the "ReleaseLock".
Which is currently only reached after the 3-minute wait; if the router reboots itself after 1 or 2 minutes, then the lock hasn't been released and the email won't send upon reboot, since the lock file only goes stale after 10 minutes.

No, the Lock file cannot possibly be the root cause because the file is located in the temporary virtual disk (i.e. tmpfs is in RAM) where everything is wiped during a reboot so the Lock file gets deleted and not found after reboot.

Re-tested and confirmed this morning.
I may have been getting tired last night; it was a busy day. It looks like I rebooted and didn't refresh my WinSCP window, so the lock file still existed in my "view", but I should have just refreshed the screen lol.

Actually, overnight I was thinking about this: when /opt/etc/init.d/rc.unslung is called with a "start", it loads a bunch of environment variables and paths for AMTM to function. Is it possible that we need to validate those are ready/available?

[screenshot]

I've had a project before where, post-reboot, the environment variables didn't load and caused odd behavior, which is why I thought of it. After a reboot, the environment in which the script runs may not have all the necessary environment variables, paths, or functions loaded, right?

We can also check the profile environment variables? [ -f /etc/profile ] && source /etc/profile

[screenshot]

@ExtremeFiretop ExtremeFiretop deleted the ExtremeFiretop-PostUpdateEmailFix branch November 13, 2024 14:29
@Martinski4GitHub
Collaborator

I don't think the "wan-event" script would have been a good spot to hook our post-reboot script calls.

I'm not sure of a better location; it's either services-start (after everything has started), a WAN event (once WAN has started), or when AMTM is available (JFFS available). As I mentioned, we can't use post-mount for obvious reasons.

If we want to move the hook call to "wan" we can specifically use wan-event; and for AMTM, it would probably need to be "init-start"?

The theory/benefit of moving it to one of those is that we would be sure those steps have been actioned (JFFS loaded and WAN connected), and we could also rule out services-start not being triggered due to a possible false error from a service at boot time. (That is, if services-start is only triggered once ALL other services are started, then to me a ddns failure at boot due to some firmware error would stop the email from sending, even though we don't rely on the ddns service, for example.)

That's not how the "services-start" script works. The script is called in a non-blocking mode after all built-in services have been called and returned, regardless of success or failure. Remember that users don't always have all built-in services enabled (e.g. FTP, Samba, NFS, AiCloud, UPnP Media Server, etc.) so when called they do not have to succeed.

Although I noticed you didn't increase the sleep timers from 30 to 60. I figured it didn't hurt to add an additional static 30 seconds to the wait after a connection is ready. A lot of the time with AiMesh I notice the primary's connection "flip-flops" while the nodes are reconnecting post-reboot. That is, the primary is first to come up and connect to WAN, and I can test connectivity from the primary or my desktop and confirm access; then, as the nodes come online and connect to the primary, we get little "blips" of no network connectivity at all (even on the primary) until all the nodes are connected. This gets worse the more nodes you have. It's quick; both my nodes connect within seconds; but that's seconds of network "unavailable" for 2 nodes.

Got you!!

@Martinski4GitHub
Collaborator

Actually, overnight I was thinking about this: when /opt/etc/init.d/rc.unslung is called with a "start", it loads a bunch of environment variables and paths for AMTM to function. Is it possible that we need to validate those are ready/available?

I'm not sure what you're referring to here. Our script does not have any Entware dependencies at all, whether Entware environment vars or paths. For email notifications, we need only a valid AMTM email configuration file - nothing else from AMTM.

I've had a project before where post-reboot the environment variables didn't load and caused odd behavior which is why I thought of it. After a reboot, the environment in which the script runs may not have all the necessary environment variables, paths, or functions loaded right?

We can also check the profile environment variables? [ -f /etc/profile ] && source /etc/profile

The /etc/profile is used only for interactive shell sessions. When the script is executed following a post-reboot setup, it's launched in a non-interactive shell process.
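
One way to see this distinction from inside a script is the shell's flag string $-, which contains "i" only in interactive sessions. This is a generic POSIX-shell check, not MerlinAU code:

```shell
#!/bin/sh
# Report whether this shell is interactive. A boot-time hook such as
# services-start runs non-interactively, so /etc/profile is never sourced
# for it automatically.
shell_mode()
{
    case "$-" in
        *i*) echo "interactive" ;;
        *)   echo "non-interactive" ;;
    esac
}

shell_mode   # when run as a script, prints "non-interactive"
```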

@ExtremeFiretop
Owner Author

ExtremeFiretop commented Nov 14, 2024

Actually, overnight I was thinking about this: when /opt/etc/init.d/rc.unslung is called with a "start", it loads a bunch of environment variables and paths for AMTM to function. Is it possible that we need to validate those are ready/available?

I'm not sure what you're referring to here. Our script does not have any Entware dependencies at all, whether Entware environment vars or paths. For email notifications, we need only a valid AMTM email configuration file - nothing else from AMTM.

Right; but what does AMTM need to send the email? It likely relies on/uses some of those environment variables, which we must wait to be loaded for it to function; correct?

I've had a project before where, post-reboot, the environment variables didn't load and caused odd behavior, which is why I thought of it. After a reboot, the environment in which the script runs may not have all the necessary environment variables, paths, or functions loaded, right?
We can also check the profile environment variables? [ -f /etc/profile ] && source /etc/profile

The /etc/profile is used only for interactive shell sessions. When the script is executed following a post-reboot setup, it's launched in a non-interactive shell process.

Exactly!!! When I run this line in the interactive console:

[ -f /jffs/scripts/MerlinAU.sh ] && sh /jffs/scripts/MerlinAU.sh postUpdateEmail &

it sent.

But when it runs early and non-interactively, it crashes and creates a weird file, as I mentioned in your PR.
Could this be because the AMTM or profile environment variables are not loaded at the time services-start is called? That was my overnight thought.

@ExtremeFiretop
Owner Author

That's not how the "services-start" script works. The script is called in a non-blocking mode after all built-in services have been called and returned, regardless of success or failure. Remember that users don't always have all built-in services enabled (e.g. FTP, Samba, NFS, AiCloud, UPnP Media Server, etc.) so when called they do not have to succeed.

Genius, thank you for clarifying!

@ExtremeFiretop
Owner Author

ExtremeFiretop commented Nov 14, 2024

Cough Cough ;)

Nov 14 01:47:37 custom_script: Running /jffs/scripts/services-start
Nov 14 01:47:37 dbg: =====================
Nov 14 01:47:37 dbg: command: cat /proc/meminfo
Nov 14 01:47:37 dbg: MemTotal:        2048044 kB
Nov 14 01:47:37 dbg: MemFree:         1296780 kB
Nov 14 01:47:37 dbg: MemAvailable:    1296144 kB
Nov 14 01:47:37 dbg: Buffers:           13280 kB
Nov 14 01:47:37 dbg: Cached:            40540 kB
Nov 14 01:47:37 dbg: SwapCached:            0 kB
Nov 14 01:47:37 dbg: Active:            67124 kB
Nov 14 01:47:37 dbg: Inactive:          21052 kB
Nov 14 01:47:37 dbg: Active(anon):      34712 kB
Nov 14 01:47:37 dbg: Inactive(anon):      896 kB
Nov 14 01:47:37 dbg: Active(file):      32412 kB
Nov 14 01:47:37 dbg: Inactive(file):    20156 kB
Nov 14 01:47:37 dbg: Unevictable:           0 kB
Nov 14 01:47:37 dbg: Mlocked:               0 kB
Nov 14 01:47:37 dbg: SwapTotal:             0 kB
Nov 14 01:47:37 dbg: SwapFree:              0 kB
Nov 14 01:47:37 dbg: Dirty:                 8 kB
Nov 14 01:47:37 dbg: Writeback:             0 kB
Nov 14 01:47:37 dbg: AnonPages:         34576 kB
Nov 14 01:47:37 dbg: Mapped:            17340 kB
Nov 14 01:47:37 dbg: Shmem:              1256 kB
Nov 14 01:47:37 dbg: Slab:             149340 kB
Nov 14 01:47:37 dbg: SReclaimable:       6232 kB
Nov 14 01:47:37 dbg: SUnreclaim:       143108 kB
Nov 14 01:47:37 dbg: KernelStack:        2928 kB
Nov 14 01:47:37 dbg: PageTables:         1940 kB
Nov 14 01:47:37 dbg: NFS_Unstable:          0 kB
Nov 14 01:47:37 dbg: Bounce:                0 kB
Nov 14 01:47:37 dbg: WritebackTmp:          0 kB
Nov 14 01:47:37 dbg: CommitLimit:     2048044 kB
Nov 14 01:47:37 dbg: Committed_AS:      94544 kB
Nov 14 01:47:37 dbg: VmallocTotal:   263061440 kB
Nov 14 01:47:37 dbg: VmallocUsed:           0 kB
Nov 14 01:47:37 dbg: VmallocChunk:          0 kB
Nov 14 01:47:37 dbg: Percpu:              752 kB
Nov 14 01:47:37 dbg: CmaTotal:         483328 kB
Nov 14 01:47:37 dbg: CmaFree:           64232 kB
Nov 14 01:47:37 dbg: command: free
Nov 14 01:47:37 dbg:              total       used       free     shared    buffers     cached
Nov 14 01:47:37 dbg: Mem:       2048044     751768    1296276       1256      13280      40724
Nov 14 01:47:37 dbg: -/+ buffers/cache:     697764    1350280
Nov 14 01:47:37 dbg: Swap:            0          0          0
Nov 14 01:47:37 dbg: command: cat /proc/slabinfo
Nov 14 01:47:37 dbg: slabinfo - version: 2.1
Nov 14 01:47:37 dbg: # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
Nov 14 01:47:37 dbg: bcm_mcast_host_client_cache      0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_if_netdev_cache    512    512     32  128    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: bcm_mcast_rxinfo_node_cache      0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_clientinfo_node_cache      0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_grpinfo_node_cache      0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_fc_flowkey_cache      0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_fc_flowhdl_cache      0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_mld_exception_cache      0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_mld_rep_cache      0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_mld_grp_cache      0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_igmp_exception_cache      0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_igmp_rep_cache      0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcm_mcast_igmp_grp_cache      0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: cmdlist_buffer         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcmVlan_blogRule_id      0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcmVlan_flowPath       0      0     24  170    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcmVlan_flowDev        0      0     24  170    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bcmvlan_tableEntry     36     36    448   18    2 : tunables    0    0    0 : slabdata      2      2      0
Nov 14 01:47:37 dbg: bcmvlan_realDev       50     50    648   25    4 : tunables    0    0    0 : slabdata      2      2      0
Nov 14 01:47:37 dbg: bcmvlan_vlanDev      306    306     40  102    1 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: bpmbuf              7010   7016  11136    1    4 : tunables    0    0    0 : slabdata   7016   7016      0
Nov 14 01:47:37 dbg: ubi_wl_entry_slab   2048   2048     32  128    1 : tunables    0    0    0 : slabdata     16     16      0
Nov 14 01:47:37 dbg: ubifs_inode_slab     345    345    704   23    4 : tunables    0    0    0 : slabdata     15     15      0
Nov 14 01:47:37 dbg: SCTPv6                22     22   1472   22    8 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: SCTP                   0      0   1344   24    8 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: sctp_chunk             0      0    256   16    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: sctp_bind_bucket       0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bridge_fdb_cache     128    128    128   32    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: nf-frags               0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: xfrm6_tunnel_spi       0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ip6-frags              0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: fib6_nodes           128    128     64   64    1 : tunables    0    0    0 : slabdata      2      2      0
Nov 14 01:47:37 dbg: ip6_dst_cache         16     16    256   16    1 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: ip6_mrt_cache          0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: PINGv6                 0      0   1152   28    8 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: RAWv6                112    112   1152   28    8 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: UDPLITEv6              0      0   1216   26    8 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: UDPv6                 52     52   1216   26    8 : tunables    0    0    0 : slabdata      2      2      0
Nov 14 01:47:37 dbg: tw_sock_TCPv6          0      0    224   18    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: request_sock_TCPv6      0      0    296   27    2 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: TCPv6                  0      0   2304   14    8 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: nf_conncount_rb        0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: nf_conncount_tuple      0      0     72   56    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: nf_conntrack_expect      0      0    216   18    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: nf_conntrack         288    288    448   18    2 : tunables    0    0    0 : slabdata     16     16      0
Nov 14 01:47:37 dbg: sgpool-128             8      8   4096    8    8 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: sgpool-64             16     16   2048   16    8 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: sgpool-32             16     16   1024   16    4 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: sgpool-16             16     16    512   16    2 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: sgpool-8              16     16    256   16    1 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: fuse_request           0      0    400   20    2 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: fuse_inode             0      0    768   21    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_inode_cache      0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_node_frag        0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_refblock         0      0    240   17    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_tmp_dnode        0      0     56   73    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_raw_inode        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_raw_dirent       0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_full_dnode       0      0     24  170    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jffs2_i                0      0    664   24    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: squashfs_inode_cache    900    900    640   25    4 : tunables    0    0    0 : slabdata     36     36      0
Nov 14 01:47:37 dbg: jbd2_transaction_s      0      0    256   16    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jbd2_inode             0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jbd2_journal_handle      0      0     48   85    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jbd2_journal_head      0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jbd2_revoke_table_s      0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: jbd2_revoke_record_s      0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ext4_inode_cache       0      0   1016   16    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ext4_free_data         0      0     56   73    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ext4_allocation_context      0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ext4_prealloc_space      0      0    104   39    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ext4_system_zone       0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ext4_io_end            0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ext4_extent_status      0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: mbcache                0      0     56   73    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: dnotify_mark           0      0     80   51    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: dnotify_struct         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: dio                    0      0    640   25    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: bio-1                 16     16    256   16    1 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: fasync_cache           0      0     48   85    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: posix_timers_cache      0      0    240   17    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: UNIX                 102    102    960   17    4 : tunables    0    0    0 : slabdata      6      6      0
Nov 14 01:47:37 dbg: ip4-frags              0      0    208   19    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ip_mrt_cache           0      0    192   21    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: UDP-Lite               0      0   1024   16    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: tcp_bind_bucket      256    256     64   64    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: inet_peer_cache       42     42    192   21    1 : tunables    0    0    0 : slabdata      2      2      0
Nov 14 01:47:37 dbg: secpath_cache          0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: xfrm_state             0      0    768   21    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ip_fib_trie          340    340     48   85    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: ip_fib_alias         292    292     56   73    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: ip_dst_cache          84     84    192   21    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: PING                   0      0    896   18    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: RAW                   68     68    960   17    4 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: UDP                   64     64   1024   16    4 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: tw_sock_TCP           54     54    224   18    1 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: request_sock_TCP      81     81    296   27    2 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: TCP                   60     60   2112   15    8 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: eventpoll_pwq        168    168     72   56    1 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: eventpoll_epi         96     96    128   32    1 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: inotify_inode_mark    153    153     80   51    1 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: blog_rule              0      0    256   16    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: request_queue         15     15   2184   15    8 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: blkdev_requests        0      0    296   27    2 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: blkdev_ioc           156    156    104   39    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: bio-0                416    416    256   16    1 : tunables    0    0    0 : slabdata     26     26      0
Nov 14 01:47:37 dbg: biovec-max            24     24   4096    8    8 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: biovec-128             0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: biovec-64              0      0   1024   16    4 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: biovec-16              0      0    256   16    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ksm_mm_slot            0      0     48   85    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ksm_stable_node        0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: ksm_rmap_item          0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: uid_cache             21     21    192   21    1 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: sock_inode_cache     300    300    640   25    4 : tunables    0    0    0 : slabdata     12     12      0
Nov 14 01:47:37 dbg: skbuff_cb_store_cache      0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: skbuff_fclone_cache     78     78   1216   26    8 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: skbuff_head_cache   6468   6468    576   28    4 : tunables    0    0    0 : slabdata    231    231      0
Nov 14 01:47:37 dbg: configfs_dir_cache      0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
Nov 14 01:47:37 dbg: file_lock_cache       80     80    200   20    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: file_lock_ctx        292    292     56   73    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: fsnotify_mark_connector    510    510     24  170    1 : tunables    0    0    0 : slabdata      3      3      0
Nov 14 01:47:37 dbg: shmem_inode_cache    744    744    664   24    4 : tunables    0    0    0 : slabdata     31     31      0
Nov 14 01:47:37 dbg: proc_dir_entry       735    735    192   21    1 : tunables    0    0    0 : slabdata     35     35      0
Nov 14 01:47:37 dbg: pde_opener           408    408     40  102    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: proc_inode_cache    1000   1000    648   25    4 : tunables    0    0    0 : slabdata     40     40      0
Nov 14 01:47:37 dbg: seq_file             128    128    128   32    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: sigqueue             100    100    160   25    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: bdev_cache            19     19    832   19    4 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: kernfs_node_cache  21090  21090    136   30    1 : tunables    0    0    0 : slabdata    703    703      0
Nov 14 01:47:37 dbg: mnt_cache             84     84    384   21    2 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: filp                2016   2016    256   16    1 : tunables    0    0    0 : slabdata    126    126      0
Nov 14 01:47:37 dbg: inode_cache         2492   2492    576   28    4 : tunables    0    0    0 : slabdata     89     89      0
Nov 14 01:47:37 dbg: dentry              6363   6363    192   21    1 : tunables    0    0    0 : slabdata    303    303      0
Nov 14 01:47:37 dbg: names_cache           32     32   4096    8    8 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: key_jar               16     16    256   16    1 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: buffer_head        13377  13377    104   39    1 : tunables    0    0    0 : slabdata    343    343      0
Nov 14 01:47:37 dbg: nsproxy               73     73     56   73    1 : tunables    0    0    0 : slabdata      1      1      0
Nov 14 01:47:37 dbg: vm_area_struct      6666   6666    184   22    1 : tunables    0    0    0 : slabdata    303    303      0
Nov 14 01:47:37 dbg: mm_struct            108    108    896   18    4 : tunables    0    0    0 : slabdata      6      6      0
Nov 14 01:47:37 dbg: fs_cache             256    256     64   64    1 : tunables    0    0    0 : slabdata      4      4      0
Nov 14 01:47:37 dbg: files_cache          115    115    704   23    4 : tunables    0    0    0 : slabdata      5      5      0
Nov 14 01:47:37 dbg: signal_cache         425    425    960   17    4 : tunables    0    0    0 : slabdata     25     25      0
Nov 14 01:47:37 dbg: sighand_cache        195    195   2112   15    8 : tunables    0    0    0 : slabdata     13     13      0
Nov 14 01:47:37 dbg: task_struct          274    300   2688   12    8 : tunables    0    0    0 : slabdata     25     25      0
Nov 14 01:47:37 dbg: cred_jar             756    756    192   21    1 : tunables    0    0    0 : slabdata     36     36      0
Nov 14 01:47:37 dbg: anon_vma_chain      6464   6464     64   64    1 : tunables    0    0    0 : slabdata    101    101      0
Nov 14 01:47:37 dbg: anon_vma            3042   3042    104   39    1 : tunables    0    0    0 : slabdata     78     78      0
Nov 14 01:47:37 dbg: pid                  416    416    128   32    1 : tunables    0    0    0 : slabdata     13     13      0
Nov 14 01:47:37 dbg: pool_workqueue        32     32    256   16    1 : tunables    0    0    0 : slabdata      2      2      0
Nov 14 01:47:37 dbg: radix_tree_node     1036   1036    584   28    4 : tunables    0    0    0 : slabdata     37     37      0
Nov 14 01:47:37 dbg: kmalloc-8192         307    316   8192    4    8 : tunables    0    0    0 : slabdata     79     79      0
Nov 14 01:47:37 dbg: kmalloc-4096        1704   1704   4096    8    8 : tunables    0    0    0 : slabdata    213    213      0
Nov 14 01:47:37 dbg: kmalloc-2048         608    608   2048   16    8 : tunables    0    0    0 : slabdata     38     38      0
Nov 14 01:47:37 dbg: kmalloc-1024         887    896   1024   16    4 : tunables    0    0    0 : slabdata     56     56      0
Nov 14 01:47:37 dbg: kmalloc-512         1264   1264    512   16    2 : tunables    0    0    0 : slabdata     79     79      0
Nov 14 01:47:37 dbg: kmalloc-256         1824   1824    256   16    1 : tunables    0    0    0 : slabdata    114    114      0
Nov 14 01:47:37 dbg: kmalloc-128        37519  40704    128   32    1 : tunables    0    0    0 : slabdata   1272   1272      0
Nov 14 01:47:37 dbg: kmem_cache_node      192    192    128   32    1 : tunables    0    0    0 : slabdata      6      6      0
Nov 14 01:47:37 dbg: kmem_cache           176    176    256   16    1 : tunables    0    0    0 : slabdata     11     11      0
Nov 14 01:47:37 dbg: command: bpm status
Nov 14 01:47:37 kernel: ------------------ BPM Config : Dynamic ----------------
Nov 14 01:47:37 kernel: bpm_max_buf_count                = 28926
Nov 14 01:47:37 kernel: bpm_initial_buf_count            = 5785
Nov 14 01:47:37 kernel: bpm_buf_avail_lower_th           = 512
Nov 14 01:47:37 kernel: bpm_buf_expand_step_size         = 128
Nov 14 01:47:37 kernel: bpm_buf_shrink_step_size         = 32
Nov 14 01:47:37 kernel: bpm_buf_expn_pool_fast_shrink_th = 5785
Nov 14 01:47:37 kernel: bpm_max_skb_count                = 28926
Nov 14 01:47:37 kernel: bpm_initial_skb_count            = 5785
Nov 14 01:47:37 kernel: bpm_skb_avail_lower_th           = 512
Nov 14 01:47:37 kernel: bpm_skb_expand_step_size         = 128
Nov 14 01:47:37 kernel: bpm_skb_shrink_step_size         = 32
Nov 14 01:47:37 kernel: bpm_soak_timer_duration          = 10000
Nov 14 01:47:37 kernel: ----------------------- BPM Status --------------------------
Nov 14 01:47:37 kernel: base pool tot_buf       = 5785
Nov 14 01:47:37 kernel: cur                     = 5767
Nov 14 01:47:37 kernel: exp pool tot_buf        = 1223
Nov 14 01:47:37 kernel: exp pool cur            = 663
Nov 14 01:47:37 kernel: avail                   = 578
Nov 14 01:47:37 kernel: no_buf_err              = 0
Nov 14 01:47:37 kernel: cum_alloc               = 9728
Nov 14 01:47:37 kernel: cum_free                = 3298
Nov 14 01:47:37 kernel: exp cum_alloc           = 1278
Nov 14 01:47:37 kernel: exp cum_free            = 615
Nov 14 01:47:37 kernel: min_availability        = 0
Nov 14 01:47:37 kernel: max_inuse_watermark     = 6448
Nov 14 01:47:37 kernel: max_watermark           = 7040
Nov 14 01:47:37 kernel: buf_expansion_count     = 10
Nov 14 01:47:37 kernel: buf_shrink_count        = 1
Nov 14 01:47:37 kernel: buf_exp_fail_count      = 0
Nov 14 01:47:37 kernel: buf_inthr_alloc_count   = 231
Nov 14 01:47:37 kernel: skb_inthr_alloc_count   = 0
Nov 14 01:47:37 kernel: max_dyn                 = 28926
Nov 14 01:47:37 kernel: -------------------- BPM SKB Pool Status -----------------------
Nov 14 01:47:37 kernel: skb_avail                       = 5785
Nov 14 01:47:37 kernel: skb_total                       = 5785
Nov 14 01:47:37 kernel: skb_bound                       = 28926
Nov 14 01:47:37 kernel: min_skb_avail_cnt               = 5785
Nov 14 01:47:37 kernel: max_skb_wm                      = 5785
Nov 14 01:47:37 kernel: max_inuse_skb_wm                = 0
Nov 14 01:47:37 kernel: skb_alloc_cnt                   = 0
Nov 14 01:47:37 kernel: skb_bpm_alloc_cnt               = 0
Nov 14 01:47:37 kernel: skb_free_cnt                    = 0
Nov 14 01:47:37 kernel: skb_fast_free_cnt               = 0
Nov 14 01:47:37 kernel: skb_fail_cnt                    = 0
Nov 14 01:47:37 kernel: skb_grow_cnt                    = 0
Nov 14 01:47:37 dbg: ==================
Nov 14 01:47:37 kernel: -------------------------------------------------------------
Nov 14 01:47:37 kernel: link down LAN1
Nov 14 01:47:37 scMerlin: Waiting for NTP to sync...
Nov 14 01:47:38 YazDHCP: **INFO**:  Backup directory [/opt/var/SavedUserIcons] NOT FOUND.
Nov 14 01:47:38 YazDHCP: Trying again with directory [/opt/var/SavedUserIcons]
Nov 14 01:47:38 YazDHCP: **INFO**:  Backup directory [/opt/var/SavedUserIcons] NOT FOUND.
Nov 14 01:47:38 YazDHCP: Trying again with directory [/opt/var/SavedUserIcons]
Nov 14 01:47:38 YazDHCP: **INFO**:  Backup directory [/opt/var/SavedUserIcons] NOT FOUND.
Nov 14 01:47:38 YazDHCP: Trying again with directory [/jffs/configs/SavedUserIcons]
Nov 14 01:47:38 YazDHCP: *WARNING*: Temporary Backup directory [/jffs/configs/SavedUserIcons]
Nov 14 01:47:38 [MerlinAU.sh] 5076: **ERROR**: The shell script 'MerlinAU.sh' is already running [Lock file: 0 secs.]
Nov 14 01:47:38 [MerlinAU.sh] 5076: Exiting...
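For context, a lock-file age check like the one producing the "[Lock file: 0 secs.]" message above can be sketched roughly as follows. The file path, threshold, and function names here are illustrative assumptions, not MerlinAU's actual code:

```shell
#!/bin/sh
# Illustrative sketch of a stale-lock check; path, threshold, and
# names are assumptions, not MerlinAU's actual implementation.
LOCK_FILE="/tmp/demo_script.lock"
LOCK_MAX_AGE=600   # 10 minutes, matching the stale threshold discussed above

# Age of the lock file in seconds (file mtime vs. now).
lock_age_secs()
{
    now="$(date +%s)"
    created="$(date -r "$LOCK_FILE" +%s)"
    echo $(( now - created ))
}

echo "$$" > "$LOCK_FILE"   # simulate a lock left behind by another instance

age="$(lock_age_secs)"
if [ "$age" -lt "$LOCK_MAX_AGE" ]
then
    echo "**ERROR**: script already running [Lock file: ${age} secs.] Exiting..."
else
    echo "Stale lock (${age} secs); removing it and continuing."
    rm -f "$LOCK_FILE"
fi
```

With a freshly created lock the reported age is 0 seconds, which matches the log line above: if the router reboots before the lock is released, the post-reboot instance finds a file younger than the 10-minute stale threshold and exits without sending the email.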

@Martinski4GitHub

Collaborator

Actually, overnight I was thinking about this: when /opt/etc/init.d/rc.unslung is called with "start", it loads a bunch of environment variables and paths that AMTM needs to function. Is it possible that we need to validate those are ready/available?

I'm not sure what you're referring to here. Our script does not have any Entware dependencies at all, whether Entware environment vars or paths. For email notifications, we need only a valid AMTM email configuration file - nothing else from AMTM.

Right, but what does AMTM need to send the email? It likely relies on some of those environment variables, which means we must wait until they are loaded before it can function, correct?

AMTM does not send the email; our script does all the work. I don't want to repeat myself but:

For email notifications, we need only a valid AMTM email configuration file - nothing else from AMTM.

Again, all we need is to read a text file where variables have been defined by AMTM - that's it. Our script does all the heavy lifting to send emails. We don't rely on or depend on any other third-party script or environment vars & paths.
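The "read a text file where variables have been defined by AMTM" step could be guarded with a small readiness check along these lines. The variable names (FROM_ADDRESS, TO_ADDRESS, SMTP) are assumptions for illustration, and the demo uses a temporary stand-in file rather than the real AMTM config path:

```shell
#!/bin/sh
# Illustrative readiness check before attempting to send an email;
# the variable names checked here are assumptions, not AMTM's exact set.
amtm_email_config_ready()
{
    conf="$1"
    [ -f "$conf" ] || return 1
    . "$conf"   # AMTM defines the variables; we only read them
    [ -n "$FROM_ADDRESS" ] && [ -n "$TO_ADDRESS" ] && [ -n "$SMTP" ]
}

# Demo with a temporary stand-in for the real AMTM email config file:
tmpConf="/tmp/demo_email.conf"
{
    echo 'FROM_ADDRESS="router@example.com"'
    echo 'TO_ADDRESS="admin@example.com"'
    echo 'SMTP="smtp.example.com"'
} > "$tmpConf"

if amtm_email_config_ready "$tmpConf"
then confOK=1 ; echo "Email config is ready; safe to send."
else confOK=0 ; echo "Email config missing or incomplete; skip sending."
fi
rm -f "$tmpConf"
```

A check like this would let the script skip (or retry) the email step gracefully instead of failing when the config file is absent or incomplete.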

@Martinski4GitHub
Collaborator

Cough Cough ;)

...
Nov 14 01:47:38 YazDHCP: **INFO**:  Backup directory [/opt/var/SavedUserIcons] NOT FOUND.
Nov 14 01:47:38 YazDHCP: Trying again with directory [/opt/var/SavedUserIcons]
Nov 14 01:47:38 YazDHCP: **INFO**:  Backup directory [/opt/var/SavedUserIcons] NOT FOUND.
Nov 14 01:47:38 YazDHCP: Trying again with directory [/opt/var/SavedUserIcons]
Nov 14 01:47:38 YazDHCP: **INFO**:  Backup directory [/opt/var/SavedUserIcons] NOT FOUND.
Nov 14 01:47:38 YazDHCP: Trying again with directory [/jffs/configs/SavedUserIcons]
Nov 14 01:47:38 YazDHCP: *WARNING*: Temporary Backup directory [/jffs/configs/SavedUserIcons]
Nov 14 01:47:38 [MerlinAU.sh] 5076: **ERROR**: The shell script 'MerlinAU.sh' is already running [Lock file: 0 secs.]
Nov 14 01:47:38 [MerlinAU.sh] 5076: Exiting...

I think you're looking at a "red herring." One clue is that the Lock file "age" is ZERO seconds, which means it was just created right before the MerlinAU script shown in the log started to execute. The file "age" would be at least 30 seconds if the Lock file had been created by a previous execution that went through the F/W Update process and then the router rebooted. That takes time - much longer than ZERO seconds.

Are you absolutely sure that there is not another instance of the script that was launched before the instance that got terminated?

I would triple-check and make sure that the services-start script is not launching more than one instance of the script.

I really think something else is going on and it's not related to the Lock file.
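One way to rule out duplicate launches from services-start is a PID-file guard in front of the invocation. A minimal sketch, with all names illustrative rather than MerlinAU's actual code:

```shell
#!/bin/sh
# Illustrative PID-file guard so services-start can only ever launch
# one live instance; names are assumptions, not MerlinAU's actual code.
PID_FILE="/tmp/demo_merlinau.pid"
rm -f "$PID_FILE"   # start the demo from a clean state

acquire_instance()
{
    if [ -f "$PID_FILE" ]
    then
        oldPID="$(cat "$PID_FILE")"
        # Refuse to start if the recorded PID is still alive.
        if kill -0 "$oldPID" 2>/dev/null ; then return 1 ; fi
    fi
    echo "$$" > "$PID_FILE"
    return 0
}

if acquire_instance ; then firstOK=1 ; else firstOK=0 ; fi
# A second launch attempt while the first is alive must be rejected:
if acquire_instance ; then secondOK=1 ; else secondOK=0 ; fi

echo "first launch acquired: $firstOK, second launch acquired: $secondOK"
rm -f "$PID_FILE"
```

Note that `kill -0` only checks whether the process still exists; on the router, a guard like this would sit at the top of the script itself so that any second copy started by services-start exits immediately instead of fighting over the lock file.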

@ExtremeFiretop
Owner Author

I think you're looking at a "red herring." One clue is that the Lock file "age" is ZERO seconds, which means it was just created right before the MerlinAU script shown in the log started to execute. The file "age" would be at least 30 seconds if the Lock file had been created by a previous execution that went through the F/W Update process and then the router rebooted. That takes time - much longer than ZERO seconds.

Are you absolutely sure that there is not another instance of the script that was launched before the instance that got terminated?

I would triple-check and make sure that the services-start script is not launching more than one instance of the script.

I think you just solved this for us!
