"juicefs gc --delete" fall into infinite loop #5335

Closed
frostwind opened this issue Dec 2, 2024 · 11 comments

frostwind commented Dec 2, 2024

What happened:
When running gc with the "--delete" option, e.g.
"juicefs gc postgres://jfs_admin:'xxxx'@jfs_meta_url:5432/jfs --delete"
it falls into an infinite loop, as shown below.

2024/12/02 08:35:24.920074 juicefs[18479] <WARNING>: Get directory parent of inode 11496018: no such file or directory [quota.go:347]
2024/12/02 08:35:24.920132 juicefs[18479] <WARNING>: Get directory parent of inode 11496018: no such file or directory [quota.go:347]
2024/12/02 08:35:24.920480 juicefs[18479] <WARNING>: Get directory parent of inode 11496018: no such file or directory [quota.go:347]
2024/12/02 08:35:24.920999 juicefs[18479] <WARNING>: Get directory parent of inode 11496018: no such file or directory [quota.go:347]

What you expected to happen:
It should stop, or skip the non-existent inode.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?
I am using the mount options below for most of my tests:
juicefs mount -d -o allow_other --writeback --backup-meta 0 --buffer-size 2000 --cache-partial-only

jfs=# select * from jfs_node where inode=11496018;
 inode | type | flags | mode | uid | gid | atime | mtime | ctime | atimensec | mtimensec | ctimensec | nlink | length | rdev | parent | access_acl_id | default_acl_id 
-------+------+-------+------+-----+-----+-------+-------+-------+-----------+-----------+-----------+-------+--------+------+--------+---------------+----------------
(0 rows)

jfs=# select * from jfs_node where parent=11496018;
  inode   | type | flags | mode | uid | gid |      atime       |      mtime       |      ctime       | atimensec | mtimensec | ctimensec | nlink | length | rdev |  parent  | access_acl_id | default_acl_id 
----------+------+-------+------+-----+-----+------------------+------------------+------------------+-----------+-----------+-----------+-------+--------+------+----------+---------------+----------------
 25152661 |    2 |     0 |  493 |   0 |   0 | 1728496776274948 | 1728496776274948 | 1728496776274948 |       943 |       943 |       943 |   249 |   4096 |    0 | 11496018 |             0 |              0
(1 row)
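
As a more general check (just a rough sketch against the jfs_node schema shown above; I am assuming parent values of 0 and the root inode 1 should be excluded, which may not be exactly how JuiceFS uses them), something like the following should list every node whose parent row is missing:

SELECT n.inode, n.type, n.parent
FROM jfs_node AS n
LEFT JOIN jfs_node AS p ON p.inode = n.parent
WHERE n.parent > 1        -- skip root (inode 1) and parent=0 entries, per the assumption above
  AND p.inode IS NULL;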

[root@xxx jfs]# juicefs info -i 11496018
2024/12/02 08:40:42.871177 juicefs[18913] <FATAL>: info: no such file or directory [info.go:152]

[root@xxx jfs]# juicefs info -i 25152661
25152661 :
  inode: 25152661
  files: 0
   dirs: 19
 length: 1.75 MiB (1840395 Bytes)
   size: 2.73 MiB (2863104 Bytes)
   path: unknown

CREATE TABLE broken_records AS
WITH RECURSIVE c AS (
   SELECT 11496018::bigint AS inode, 0::bigint AS parent
   UNION ALL
   SELECT sa.inode, sa.parent
   FROM jfs_node AS sa
   JOIN c ON c.inode = sa.parent
)
SELECT * FROM c;

SELECT 1212423

I used the SQL above to dump the broken directory structure, and it produced about 1.2M records.

From the timestamp, e.g. "mtime" = 1728496776274948, which falls on Oct 9 2024 (see the conversion below), these records seem to belong to a directory created by "juicefs clone". For example, as shown further down, inode 25358790 is a directory with 100 files under it, and 25358800 is one of those files; this layout matches how I created the test directories, each containing 100 empty files. During my test I created a directory "dir1" with roughly 20 million files in total: each layer has many subdirectories, and each subdirectory has 100 empty files directly under it. After building "dir1", I used "juicefs clone" to clone dir1 to dir2. My best guess is that these broken inodes are somehow related to the cloned directory.
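
(For reference, I am assuming the mtime/atime/ctime columns store microseconds since the Unix epoch; converting the value in psql lands exactly on that date:)

SELECT to_timestamp(1728496776274948 / 1000000.0);
-- 2024-10-09 17:59:36.274948+00  (shown in UTC; the display depends on the session timezone)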
I also tried "juicefs fsck --path / --repair --recursive", but it does not seem to fix the issue.

jfs=# select count(*) from broken_records where parent=25358790;
 count 
-------
   100
(1 row)

jfs=# select parent,inode from broken_records where parent=25358790 limit 1;
  parent  |  inode   
----------+----------
 25358790 | 25358800
(1 row)


[root@xxx jfs]# juicefs info -i 25358790
25358790 :
  inode: 25358790
  files: 100
   dirs: 1
 length: 200 Bytes
   size: 404.00 KiB (413696 Bytes)
   path: unknown
[root@xxx jfs]# juicefs info -i  25358800
25358800 :
  inode: 25358800
  files: 1
   dirs: 0
 length: 2 Bytes
   size: 4.00 KiB (4096 Bytes)
   path: unknown
 objects:
+------------+---------------------------------+------+--------+--------+
| chunkIndex |            objectName           | size | offset | length |
+------------+---------------------------------+------+--------+--------+
|          0 | myjfs/chunks/6/6875/6875461_0_2 |    2 |      0 |      2 |
+------------+---------------------------------+------+--------+--------+

### Further check on the distribution of the broken directories: most parents (11998) have exactly 100 direct files/dirs under them, which closely matches how I generated "dir1" with about 20 million files.
jfs=# select count(*),child_num from (select count(*) as child_num ,parent from broken_records group by parent order by count(*)) t group by child_num;
 count | child_num 
-------+-----------
     3 |         1
     1 |         7
     1 |        10
     1 |        13
     1 |        15
     1 |        18
     1 |        22
     1 |        23
     1 |        26
     2 |        35
     2 |        59
     1 |        60
     1 |        63
     1 |        68
     1 |        93
 11998 |       100
     2 |       603
     5 |       604
     1 |       605
     2 |       606
     1 |       607
     1 |       665
     1 |       672
     1 |       673
     1 |       675
     1 |       679
     2 |      1000
(27 rows)

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version: juicefs version 1.2.1+2024-08-30.cd871d19

  • Cloud provider or hardware configuration running JuiceFS: on-prem hardware with ceph storage backend.

  • OS (e.g cat /etc/os-release): CentOS Linux release 7.9.2009 (Core)

  • Kernel (e.g. uname -a): 5.4.206-200.el7.x86_64 #1 SMP Thu Jul 28 14:58:01 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

  • Object storage (cloud provider and region, or self maintained): Ceph , self hosted

  • Metadata engine info (version, cloud provider managed or self maintained): PostgreSQL 17.2

  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage): all local network in the same datacenter.

  • Others:

frostwind added the kind/bug (Something isn't working) label Dec 2, 2024
@jiefenghuang (Contributor) commented:

Does the edge exist? It seems like the clone process is not complete.

frostwind commented Dec 3, 2024

@jiefenghuang

jfs=# select * from jfs_edge where parent=11496018;
 id | parent | name | inode | type 
----+--------+------+-------+------
(0 rows)

jfs=# select * from jfs_edge where inode=11496018;
 id | parent | name | inode | type 
----+--------+------+-------+------
(0 rows)


It seems the edge does not exist either?

@frostwind (Author) commented:

jfs=# delete from jfs_node where inode in (select inode from broken_records);
DELETE 1141955

It seems that after cleaning up the orphan records from the DB, "gc --delete" can succeed, and I am seeing the object count in Ceph go down.
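
(In case anyone else ends up doing the same manual cleanup: it is probably worth double-checking first that none of the orphan inodes are still referenced by a directory entry. A rough sketch against the jfs_edge table shown earlier; both counts should come back as 0. I have not checked whether other metadata tables, e.g. chunk or symlink data, would also need attention.)

SELECT
  (SELECT count(*) FROM jfs_edge WHERE inode  IN (SELECT inode FROM broken_records)) AS as_child,
  (SELECT count(*) FROM jfs_edge WHERE parent IN (SELECT inode FROM broken_records)) AS as_parent;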

jiefenghuang commented Dec 3, 2024

So these seem to be detached nodes left over from the clone process. CleanupDetachedNodesBefore in the GC tool should remove them. The quota warning log is not the critical issue. You can check the detachedNode table.

@frostwind (Author) commented:

jfs=# select * from jfs_detached_node;
  inode   |   added    
----------+------------
 25152661 | 1728496776
(1 row)

It seems the table does contain the orphan inode 25152661. How do I call CleanupDetachedNodesBefore? I don't see any option in "juicefs gc" related to this function.

@jiefenghuang (Contributor) commented:

gc with the --delete flag will call CleanupDetachedNodesBefore.

@frostwind (Author) commented:

But the original problem is that "gc --delete" hits the error below and falls into an infinite loop. "gc --delete" could only clean up those detached nodes after I manually deleted the orphan inodes from the jfs_node table. This suggests that the "clone" feature needs to be used with care.

2024/12/02 08:35:24.920074 juicefs[18479] <WARNING>: Get directory parent of inode 11496018: no such file or directory [quota.go:347]

zhijian-pro self-assigned this Dec 4, 2024
@zhijian-pro (Contributor) commented:

@frostwind I don't think clone caused the problem.
2024/12/02 08:35:24.920074 juicefs[18479] <WARNING>: Get directory parent of inode 11496018: no such file or directory [quota.go:347]
Is this log the output of the gc command? What is the console output of the 'gc' command while it loops?

@frostwind (Author) commented:

@zhijian-pro
The log above is from the "juicefs gc --delete" output. It printed the same log line repeatedly without stopping. Sorry, I don't have a record of plain "juicefs gc", but I remember that "juicefs gc" could do its job without reporting errors. I have already manually deleted the orphan inodes from the DB and run "gc --delete" without error.

zhijian-pro removed the kind/bug (Something isn't working) label Dec 6, 2024
@zhijian-pro (Contributor) commented:

I think this may just be a large amount of data making gc slow. In addition, the repeated log makes you think you have entered a dead loop, but in fact gc is running; once you manually deleted a large amount of data, the operation became faster.

@frostwind (Author) commented:

Makes sense. Thanks!
