Add warning about FS incompatibility to the log #1188

Closed
2 of 3 tasks
rufengsuixing opened this issue Nov 20, 2019 · 26 comments
@rufengsuixing

rufengsuixing commented Nov 20, 2019

Prerequisites

  • I am running the latest version
  • I checked the documentation and found no answer
  • I checked to make sure that this issue has not already been filed

Issue Details

  • Version of AdGuard Home server:
    0.99.2
  • How did you setup DNS configuration:
    Router manual config
  • If it's a router or IoT, please write device model:
    newifi d1 with mipsel mt7621
  • Operating system and version:
    openwrt 18.06

Expected Behavior

start the software

Actual Behavior

root@newifi-d1-home:/usr/bin/AdGuardHome/data# /usr/bin/AdGuardHome/AdGuardHome -v
2019/11/20 16:16:32 1118#1 [info] AdGuard Home, version v0.99.2, channel release

2019/11/20 16:16:32 1118#1 [debug] Current working directory is /usr/bin/AdGuardHome
2019/11/20 16:16:33 1118#36 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): '127.0.0.1' -> 'localhost' [1]
2019/11/20 16:16:33 1118#36 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): '::1' -> 'localhost' [2]
2019/11/20 16:16:33 1118#36 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): 'ff02::1' -> 'ip6-allnodes' [3]
2019/11/20 16:16:33 1118#36 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): 'ff02::2' -> 'ip6-allrouters' [4]
2019/11/20 16:16:33 1118#36 [debug] Added 4 client aliases from /etc/hosts
2019/11/20 16:16:33 1118#36 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).addFromSystemARP(): executing arp [arp -a]
2019/11/20 16:16:33 1118#36 [debug] command arp has failed: exec: "arp": executable file not found in $PATH code:-1
2019/11/20 16:16:33 1118#36 [debug] Added 0 client aliases from DHCP
2019/11/20 16:16:33 1118#1 [debug] github.com/AdguardTeam/AdGuardHome/home.upgradeConfig(): got schema version 5
2019/11/20 16:16:33 1118#1 [debug] Reading config file: /usr/bin/AdGuardHome/AdGuardHome.yaml
2019/11/20 16:16:33 1118#1 [debug] Writing YAML file: /usr/bin/AdGuardHome/AdGuardHome.yaml
2019/11/20 16:16:33 1118#1 [debug] github.com/AdguardTeam/AdGuardHome/stats.(*statsCtx).dbOpen(): db.Open...
2019/11/20 16:16:33 1118#1 [error] Stats: open DB: /usr/bin/AdGuardHome/data/stats.db: invalid argument
2019/11/20 16:16:33 1118#1 [fatal] Couldn't initialize statistics module

Screenshots

Additional Information

https://github.com/etcd-io/bbolt

@szolin
Contributor

szolin commented Nov 20, 2019

As a workaround we can resolve this issue by not loading Stats module if it's disabled in config (statistics_interval: 0)
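
The proposal above could look roughly like the sketch below. This is not AdGuard Home's actual code: the `config` struct, the `statsOpener` hook and `initStats` are hypothetical names introduced for illustration. The only detail taken from the thread is that `statistics_interval: 0` means statistics are disabled.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the relevant part of AdGuardHome.yaml.
type config struct {
	StatisticsInterval uint32 // 0 means statistics are disabled
}

// statsOpener is a hypothetical hook that opens the statistics database.
type statsOpener func(path string) error

// initStats opens the database only when statistics are enabled.
// It reports whether the module was loaded.
func initStats(conf config, path string, open statsOpener) (bool, error) {
	if conf.StatisticsInterval == 0 {
		// Disabled in config: never touch the database file.
		return false, nil
	}
	if err := open(path); err != nil {
		return false, fmt.Errorf("stats: open DB: %s: %w", path, err)
	}
	return true, nil
}

func main() {
	// An opener that always fails, imitating the mmap() problem.
	failing := func(string) error { return fmt.Errorf("invalid argument") }

	loaded, err := initStats(config{StatisticsInterval: 0}, "data/stats.db", failing)
	fmt.Println(loaded, err)

	loaded, err = initStats(config{StatisticsInterval: 1}, "data/stats.db", failing)
	fmt.Println(loaded, err)
}
```

With the interval set to 0 the failing opener is never called, so startup would proceed without the Stats module.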

@rufengsuixing
Author

No, it still cannot start; same problem.

@szolin
Contributor

szolin commented Nov 20, 2019

This isn't implemented yet, I'm just proposing one solution (for us to think about), so in next versions users can overcome similar problems.

@ameshkov
Member

@szolin it would be better to fix this properly instead of disabling the whole module

@rufengsuixing
Author

rufengsuixing commented Nov 20, 2019

I ran ln -s /tmp/stats.db /usr/bin/AdGuardHome/data/stats.db and it started.
Maybe this is a bug related to the file system, similar to etcd-io/bbolt#102.
My file system info:
root@newifi-d1-home:/etc# mount
/dev/root on /rom type squashfs (ro,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
cgroup on /sys/fs/cgroup type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset,cpu,cpuacct,blkio,memory,devices,freezer,net_cls,perf_event,pids)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noatime)
/dev/mtdblock6 on /overlay type jffs2 (rw,noatime)
overlayfs:/overlay on / type overlay (rw,noatime,lowerdir=/,upperdir=/overlay/upper,workdir=/overlay/work)
tmpfs on /dev type tmpfs (rw,nosuid,relatime,size=512k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,mode=600,ptmxmode=000)
debugfs on /sys/kernel/debug type debugfs (rw,noatime)

@ameshkov
Member

Hmm, maybe the stats DB was corrupted? What happens if you restart the service?

@szolin
Contributor

szolin commented Nov 20, 2019

[error] Stats: open DB: /usr/bin/AdGuardHome/data/stats.db: invalid argument

bolt.Open returns this error and it's not easy to determine what could have caused this. Almost all system methods can return this type of error.

@ameshkov
Member

bolt.Open returns this error and it's not easy to determine what could have caused this. Almost all system methods can return this type of error.

"invalid argument"? This is weird, they don't provide any details?

@rufengsuixing
Author

With the code below I can use it, but all databases are lost on reboot:

    # Move each database to tmpfs (/tmp) and leave a symlink in the data directory.
    touch /usr/bin/AdGuardHome/data/stats.db
    if [ ! -L /usr/bin/AdGuardHome/data/stats.db ]; then
        mv -f /usr/bin/AdGuardHome/data/stats.db /tmp/stats.db
        ln -s /tmp/stats.db /usr/bin/AdGuardHome/data/stats.db
    fi
    touch /usr/bin/AdGuardHome/data/sessions.db
    if [ ! -L /usr/bin/AdGuardHome/data/sessions.db ]; then
        mv -f /usr/bin/AdGuardHome/data/sessions.db /tmp/sessions.db
        ln -s /tmp/sessions.db /usr/bin/AdGuardHome/data/sessions.db
    fi

@ameshkov
Member

So this is a bug of boltdb then.

@rufengsuixing could you please archive & attach one of these databases? We'd like to check what exactly is wrong with them.

@szolin
Contributor

szolin commented Nov 21, 2019

archive & attach one of these databases

I think it's not necessary, because most likely this FS isn't supported (tested by bbolt devs):

overlayfs:/overlay on / type overlay

@ameshkov
Member

@szolin hmm, can we maybe handle this type of issues automatically?

@szolin
Contributor

szolin commented Nov 21, 2019

Unless we know the exact line of code where this error is returned in bbolt source code - we can't do anything about it except stop loading the whole Stats module and continue.
We also can file a bug or try to reproduce the issue ourselves.
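
One way to learn which call returned the error is to wrap it with the name of the failing step at each level. The sketch below only illustrates that general Go technique; `wrapStep` and `sampleErr` are hypothetical names, and nothing here is taken from the bbolt source.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// sampleErr stands in for the bare errno that surfaces as "invalid argument".
var sampleErr error = syscall.EINVAL

// wrapStep prefixes err with the name of the step that failed.
// It keeps the original error reachable through errors.Is.
func wrapStep(step string, err error) error {
	if err == nil {
		return nil
	}
	return fmt.Errorf("%s: %w", step, err)
}

func main() {
	err := wrapStep("db.mmap()", sampleErr)
	fmt.Println(err)                            // db.mmap(): invalid argument
	fmt.Println(errors.Is(err, syscall.EINVAL)) // true
}
```

Because `%w` preserves the wrapped value, callers can still test for `syscall.EINVAL` while the message names the failing step.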

ameshkov changed the title from "mipsel go bolt package bug can not start software" to "Incompatibility with overlayfs" on Nov 21, 2019
@ameshkov
Member

because most likely this FS isn't supported (tested by bbolt devs):

Is there any link to the test or boltdb issue?

@szolin
Contributor

szolin commented Dec 3, 2019

@rufengsuixing I improved logging in bbolt a little bit - it will point us to the line of code where the error is returned.
I built a binary for your CPU (mipsle) - AdGuardHome.gz
It's a v0.99.3 with bbolt patch, nothing else. It will also fail with the same error, but now with a more specific message.

  1. Make a backup of your current binary - mv AdGuardHome AdGuardHome.release
  2. Unpack it: gunzip AdGuardHome.gz
  3. Start it: ./AdGuardHome

@rufengsuixing
Author

root@newifi-d1-home:~# /tmp/AdGuardHome -c /etc/AdGuardHome.yaml -w /root -v
2019/12/03 16:08:06 4843#1 [info] AdGuard Home, version v0.99.3, channel release

2019/12/03 16:08:06 4843#1 [debug] Current working directory is /root
2019/12/03 16:08:06 4843#38 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): '127.0.0.1' -> 'localhost' [1]
2019/12/03 16:08:06 4843#38 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): '::1' -> 'localhost' [2]
2019/12/03 16:08:06 4843#38 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): 'ff02::1' -> 'ip6-allnodes' [3]
2019/12/03 16:08:06 4843#38 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): 'ff02::2' -> 'ip6-allrouters' [4]
2019/12/03 16:08:06 4843#38 [debug] Added 4 client aliases from /etc/hosts
2019/12/03 16:08:06 4843#38 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).addFromSystemARP(): executing arp [arp -a]
2019/12/03 16:08:06 4843#38 [debug] command arp has failed: exec: "arp": executable file not found in $PATH code:-1
2019/12/03 16:08:06 4843#38 [debug] Added 0 client aliases from DHCP
2019/12/03 16:08:06 4843#1 [debug] github.com/AdguardTeam/AdGuardHome/home.upgradeConfig(): got schema version 5
2019/12/03 16:08:06 4843#1 [debug] Reading config file: /etc/AdGuardHome.yaml
2019/12/03 16:08:06 4843#1 [debug] Writing YAML file: /etc/AdGuardHome.yaml
2019/12/03 16:08:06 4843#1 [debug] github.com/AdguardTeam/AdGuardHome/stats.(*statsCtx).dbOpen(): db.Open...
2019/12/03 16:08:06 4843#1 [error] Stats: open DB: /root/data/stats.db: db.mmap(): invalid argument
2019/12/03 16:08:06 4843#1 [fatal] Couldn't initialize statistics module

@szolin
Contributor

szolin commented Dec 3, 2019

Aha, so it's mmap()...

I'm gonna ask you to update the testing binary once again:
AdGuardHome.gz
Now it will print the details of mmap() call.

Also, please execute this system command:
getconf PAGESIZE
It will just print the system page size.

And this command:
stat /root/data/stats.db
will print the size of IO Block for this file.

@rufengsuixing
Author

root@newifi-d1-home:~# /tmp/AdGuardHome -c /etc/AdGuardHome.yaml -w /root -v
2019/12/03 16:49:23 9944#1 [info] AdGuard Home, version v0.99.3, channel release

2019/12/03 16:49:23 9944#1 [debug] Current working directory is /root
2019/12/03 16:49:23 9944#51 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): '127.0.0.1' -> 'localhost' [1]
2019/12/03 16:49:23 9944#51 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): '::1' -> 'localhost' [2]
2019/12/03 16:49:23 9944#51 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): 'ff02::1' -> 'ip6-allnodes' [3]
2019/12/03 16:49:23 9944#51 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).AddHost(): 'ff02::2' -> 'ip6-allrouters' [4]
2019/12/03 16:49:23 9944#51 [debug] Added 4 client aliases from /etc/hosts
2019/12/03 16:49:23 9944#51 [debug] github.com/AdguardTeam/AdGuardHome/home.(*clientsContainer).addFromSystemARP(): executing arp [arp -a]
2019/12/03 16:49:23 9944#51 [debug] command arp has failed: exec: "arp": executable file not found in $PATH code:-1
2019/12/03 16:49:23 9944#51 [debug] Added 0 client aliases from DHCP
2019/12/03 16:49:23 9944#1 [debug] github.com/AdguardTeam/AdGuardHome/home.upgradeConfig(): got schema version 5
2019/12/03 16:49:23 9944#1 [debug] Reading config file: /etc/AdGuardHome.yaml
2019/12/03 16:49:24 9944#1 [debug] Writing YAML file: /etc/AdGuardHome.yaml
2019/12/03 16:49:25 9944#1 [debug] github.com/AdguardTeam/AdGuardHome/stats.(*statsCtx).dbOpen(): db.Open...
2019/12/03 16:49:25 9944#1 [error] Stats: open DB: /root/data/stats.db: db.mmap(): syscall.Mmap(len=32768): invalid argument; pagesize=4096
2019/12/03 16:49:25 9944#1 [fatal] Couldn't initialize statistics module

There is no getconf or stat on OpenWrt; I will try to install them with opkg.

@szolin
Contributor

szolin commented Dec 3, 2019

A 32k mapping length and a 4k page size are OK.
If stat and getconf also print normal results, it might be that mmap() isn't supported on overlayfs (?)
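
To check this hypothesis outside AdGuard Home, a small standalone probe can repeat the mmap() call on a scratch file in the directory under test. This is a minimal sketch for Linux and other Unix systems: `probeMmap` is a hypothetical helper, the 32768 length is taken from the log above, and the flags (`PROT_READ`, `MAP_SHARED` on a file opened read-write) are an assumption about what bbolt passes.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// probeMmap creates a scratch file in dir, grows it to size bytes and tries
// to map it read-only and shared. It returns the mmap error, if any.
func probeMmap(dir string, size int) error {
	name := filepath.Join(dir, "mmap-probe.tmp")
	f, err := os.OpenFile(name, os.O_RDWR|os.O_CREATE, 0644)
	if err != nil {
		return fmt.Errorf("create probe file: %w", err)
	}
	// Deferred calls run in reverse order: close first, then remove.
	defer os.Remove(name)
	defer f.Close()

	if err := f.Truncate(int64(size)); err != nil {
		return fmt.Errorf("truncate: %w", err)
	}
	data, err := syscall.Mmap(int(f.Fd()), 0, size, syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		return fmt.Errorf("syscall.Mmap(len=%d): %w; pagesize=%d", size, err, os.Getpagesize())
	}
	return syscall.Munmap(data)
}

func main() {
	dir := "." // directory to test; pass another one as the first argument
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	if err := probeMmap(dir, 32768); err != nil {
		fmt.Println("mmap probe failed:", err)
		os.Exit(1)
	}
	fmt.Println("mmap probe succeeded in", dir)
}
```

Running the probe against /root/data and against /tmp on the same device would show whether the failure follows the file system rather than the database file.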

@rufengsuixing
Author

rufengsuixing commented Dec 3, 2019

@rufengsuixing
Author

root@newifi-d1-home:~# stat /root/data/stats.db
File: /root/data/stats.db
Size: 16384 Blocks: 32 IO Block: 4096 regular file
Device: 1f06h/7942d Inode: 5120 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2019-12-04 00:07:59.000000000 +0800
Modify: 2019-12-04 00:07:59.000000000 +0800
Change: 2019-12-04 00:07:59.000000000 +0800
Birth: -

I can't find a tool that provides getconf PAGESIZE on OpenWrt; maybe I can find it in /proc?

@szolin
Contributor

szolin commented Dec 3, 2019

I can't find a tool that provides getconf PAGESIZE

Well, it will probably print 4k, the same as we printed from Go code (I just wanted to be sure).

Now I think the only thing we can do is to disable Stats module and file a new issue where we should either stop using bbolt or patch it so it won't use mmap().

@rufengsuixing
Author

Maybe moving the DB into /tmp is a better choice, or let users choose.

@szolin
Contributor

szolin commented Dec 3, 2019

or let users choose

You can already set the database path with the --work-dir argument of the ./AdGuardHome binary.

The problem is that every user who installs AGH on overlayfs will face the same issue.

rufengsuixing changed the title from "Incompatibility with overlayfs" to "Incompatibility with jffs2" on Dec 4, 2019
@rufengsuixing
Author

Someone found that the unsupported file system is the one underneath the overlay; in my case it is jffs2. Other file systems that support mmap are fine.

ameshkov changed the title from "Incompatibility with jffs2" to "Add warning about FS incompatibility to the log" on Dec 4, 2019
@ameshkov
Member

ameshkov commented Dec 4, 2019

Here's what we're going to do:

  1. Add a wiki article about this kind of incompatibility and possible solutions
  2. Add a link to that article to the log message when AGH fails to start
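
The second step of the plan above might be sketched as follows. This is not the change that was eventually made: `startupHint` and `errMmap` are hypothetical names, `wikiURL` is a placeholder because the thread does not give the article's address, and treating every `EINVAL` as a possible file system problem is a simplifying assumption.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// wikiURL is a hypothetical placeholder; the thread does not give the real link.
const wikiURL = "https://example.org/agh-fs-compat"

// errMmap imitates the failure seen in the log earlier in this thread.
var errMmap = fmt.Errorf("db.mmap(): %w", syscall.EINVAL)

// startupHint appends a pointer to the wiki article when the error looks
// like the mmap() incompatibility (EINVAL). Other errors pass through as-is.
func startupHint(err error) string {
	if errors.Is(err, syscall.EINVAL) {
		return fmt.Sprintf("%v; the file system may not support mmap(), see %s", err, wikiURL)
	}
	return err.Error()
}

func main() {
	fmt.Println("[fatal] Couldn't initialize statistics module:", startupHint(errMmap))
}
```

Keeping the hint in one helper means the same text can be reused for both stats.db and sessions.db failures.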
