From 96d42e12f8cd89e05041ac3e050c710169477f5e Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Thu, 30 Oct 2025 17:35:46 +1300 Subject: [PATCH 1/9] Added initial GUFI documentation to nesi support --- README.md | 6 +- .../Supported_Applications/GUFI.md | 77 +++++++++++++++++++ 2 files changed, 82 insertions(+), 1 deletion(-) create mode 100644 docs/Scientific_Computing/Supported_Applications/GUFI.md diff --git a/README.md b/README.md index 51982f138..8c7661521 100644 --- a/README.md +++ b/README.md @@ -56,7 +56,11 @@ pip-compile --allow-unsafe > requirements.txt pip install -r requirements.txt ``` -Make sure to test it on a GitHub runner (not just locally), as this is the actual build environment. +Make sure to test it on a GitHub runner (not just locally), as this is the actual build environment. To run locally: + +```sh +mkdocs serve +``` ## Migration diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md new file mode 100644 index 000000000..9e4d58189 --- /dev/null +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -0,0 +1,77 @@ +--- +created_at: '2025-10-30T17:00:00Z' +tags: [] +title: GUFI +vote_count: 0 +vote_sum: 0 +--- + +GUFI (Grand Unified File Index) is a file system metadata indexing tool designed for large-scale data centers to enable fast, secure, and comprehensive searches of files and directories. It works by creating a hierarchical index that preserves file access permissions, allowing users to efficiently find and characterize data across multiple, potentially disparate file systems. This results in significantly faster search times compared to traditional methods. + +There are two commands that GUFI provides: + +* `gufi_find`: For finding files and subfolders in a directory +* `gufi_du`: For obtaining the size of files and folders + +!!! warning + This method uses a database that is updated on a weekly basis. 
It may not find or measure the size of files that were created or moved about mahuika within a week of them being created or moved. + +## Finding Files and Folders using GUFI + +The usual method for searching for a file using the terminal is: + +```sh +find path/to/folder/to/search -name "*NameOfFileToSearchFor*" +``` + +In GUFI, you provide the same arguments to `gufi_find` as you do with `find`: + +```sh +module load .gufi +gufi_find path/to/folder/to/search -name "*NameOfFileToSearchFor*" +``` + +If you want to find the largest file in your folder: + +```sh +gufi_find home/johndoe -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 +gufi_find project/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 +gufi_find nobackup/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 +``` + +!!! warning + You must give the path name beginning from `project` or `nobackup`. Local addresses starting with `.` will also not work. For example: + + * `nobackup/nesi12345/a_folder` is acceptible, but + * `/nesi/nobackup/nesi12345/a_folder`, `nesi12345/a_folder`, or `./nesi12345/a_folder` will not work. + + +## Obtaining the Size of Files and Folders using GUFI + +The usual method for obtaining the size of a file or folder using the terminal is: + +```sh +du -s path/to/file/or/folder +``` + +In GUFI, you provide the same arguments to `gufi_du` as you do with `du`: + +```sh +module load .gufi +gufi_du -s path/to/file/or/folder +``` + +If you want to obtain the number of files in a folder, such as `nobackup/nesi12345/a_folder`, you would do the following: + +```sh +gufi_du --inodes -s nobackup/nesi12345/a_folder +``` + +For more options, see `gufi_du --help` + +!!! warning + You must give the path name beginning from `project` or `nobackup`. Local addresses starting with `.` will also not work. 
For example: + + * `nobackup/nesi12345/a_folder` is acceptible, but + * `/nesi/nobackup/nesi12345/a_folder`, `nesi12345/a_folder`, or `./nesi12345/a_folder` will not work. + From bb4175ff6efbc0364d1ccdde454a93283f370f52 Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Thu, 30 Oct 2025 17:40:45 +1300 Subject: [PATCH 2/9] minor --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 8c7661521..39cf47c59 100644 --- a/README.md +++ b/README.md @@ -59,7 +59,7 @@ pip install -r requirements.txt Make sure to test it on a GitHub runner (not just locally), as this is the actual build environment. To run locally: ```sh -mkdocs serve +mkdocs serve -c ``` ## Migration From c938ad14573ba1033a3075b51f23db66ad84dbef Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Fri, 31 Oct 2025 12:47:02 +1300 Subject: [PATCH 3/9] Updated GUFI documentation --- .gitignore | 1 + .../Supported_Applications/GUFI.md | 94 +++++++++++++------ 2 files changed, 65 insertions(+), 30 deletions(-) diff --git a/.gitignore b/.gitignore index c19f27416..a03f8f03c 100644 --- a/.gitignore +++ b/.gitignore @@ -3,3 +3,4 @@ production/* **.pyc .venv/* dictionary.dic +.DS_Store diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md index 9e4d58189..283d3b84b 100644 --- a/docs/Scientific_Computing/Supported_Applications/GUFI.md +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -13,65 +13,99 @@ There are two commands that GUFI provides: * `gufi_find`: For finding files and subfolders in a directory * `gufi_du`: For obtaining the size of files and folders +!!! note + The filesystems that `gufi_find` and `gufi_du` work on are: + + * `\home` + * `\projects` + * `\nobackup` + + For `gufi_find` and `gufi_du` to work, you must give the path name beginning from `project` or `nobackup`. Local addresses starting with `.` will also not work. 
For example: + + * `nobackup/nesi12345/a_folder` and `/nobackup/nesi12345/a_folder` are acceptible, but + * `/nesi/nobackup/nesi12345/a_folder`, `nesi12345/a_folder`, or `./nesi12345/a_folder` will not work. + !!! warning This method uses a database that is updated on a weekly basis. It may not find or measure the size of files that were created or moved about mahuika within a week of them being created or moved. -## Finding Files and Folders using GUFI +## Prerequisite: Must Load `gufi` Module -The usual method for searching for a file using the terminal is: +To use `gufi_find` and `gufi_du`, you must load them by entering in the following command in mahuika: ```sh -find path/to/folder/to/search -name "*NameOfFileToSearchFor*" +module load .gufi ``` -In GUFI, you provide the same arguments to `gufi_find` as you do with `find`: +Without this, `gufi_find` and `gufi_du` will not be loaded. + + +## Finding Files and Folders using GUFI + +The usual method for searching for a file using the terminal is: ```sh -module load .gufi -gufi_find path/to/folder/to/search -name "*NameOfFileToSearchFor*" +find path/to/folder/to/search -name "*NameOfFileToSearchFor*" ``` -If you want to find the largest file in your folder: +In GUFI, you provide the same arguments to `gufi_find` as you do with `find` (but starting with either `home`, `projects`, or `nobackup`): ```sh -gufi_find home/johndoe -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 -gufi_find project/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 -gufi_find nobackup/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 +gufi_find path/to/folder/to/search -name "*NameOfFileToSearchFor*" ``` -!!! warning - You must give the path name beginning from `project` or `nobackup`. Local addresses starting with `.` will also not work. For example: +!!! 
example + If you want to find `.bashrc` in your home directory: + + ```sh + gufi_find home/USERNAME -name .bashrc + ``` - * `nobackup/nesi12345/a_folder` is acceptible, but - * `/nesi/nobackup/nesi12345/a_folder`, `nesi12345/a_folder`, or `./nesi12345/a_folder` will not work. + If you want to find the largest file in your folder: + ```sh + gufi_find home/USERNAME -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 + gufi_find project/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 + gufi_find nobackup/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 + ``` ## Obtaining the Size of Files and Folders using GUFI The usual method for obtaining the size of a file or folder using the terminal is: ```sh -du -s path/to/file/or/folder -``` - -In GUFI, you provide the same arguments to `gufi_du` as you do with `du`: - -```sh -module load .gufi -gufi_du -s path/to/file/or/folder +du -hs path/to/file/or/folder ``` -If you want to obtain the number of files in a folder, such as `nobackup/nesi12345/a_folder`, you would do the following: +In GUFI, you provide the same arguments to `gufi_du` as you do with `du` (but starting with either `home`, `projects`, or `nobackup`): ```sh -gufi_du --inodes -s nobackup/nesi12345/a_folder +gufi_du -hs path/to/file/or/folder ``` For more options, see `gufi_du --help` -!!! warning - You must give the path name beginning from `project` or `nobackup`. Local addresses starting with `.` will also not work. For example: - - * `nobackup/nesi12345/a_folder` is acceptible, but - * `/nesi/nobackup/nesi12345/a_folder`, `nesi12345/a_folder`, or `./nesi12345/a_folder` will not work. - +!!! 
example + If you want to find the size of your in your home directory: + + ```sh + gufi_du -hs home/USERNAME + ``` + + If you want to find the size of your `.bashrc` in your home directory: + + ```sh + gufi_du -hs home/USERNAME/.bashrc + ``` + + If you want to find the size of your project folder in `projects` or `nobackup`: + + ```sh + gufi_du -hs project/nesi12345 + gufi_du -hs nobackup/nesi12345 + ``` + + If you want to obtain the number of files in a folder, such as `nobackup/nesi12345/a_folder`, you would do the following: + + ```sh + gufi_du --inodes -s nobackup/nesi12345/a_folder + ``` From cc0dd02008c28c37346dd59fa1802db07af98ee0 Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Fri, 31 Oct 2025 13:07:47 +1300 Subject: [PATCH 4/9] Updates from Blair --- docs/Scientific_Computing/Supported_Applications/GUFI.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md index 283d3b84b..600ba7cb8 100644 --- a/docs/Scientific_Computing/Supported_Applications/GUFI.md +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -6,13 +6,16 @@ vote_count: 0 vote_sum: 0 --- -GUFI (Grand Unified File Index) is a file system metadata indexing tool designed for large-scale data centers to enable fast, secure, and comprehensive searches of files and directories. It works by creating a hierarchical index that preserves file access permissions, allowing users to efficiently find and characterize data across multiple, potentially disparate file systems. This results in significantly faster search times compared to traditional methods. +GUFI (Grand Unified File Index) is a file system metadata indexing tool designed for large-scale data centers to enable fast, secure, and comprehensive searches of files and directories. 
It works by creating a hierarchical index that preserves file access permissions, allowing users to efficiently find and characterize data across multiple, potentially disparate file systems. This results in significantly faster search times and lessens impact/load on parallel filesystems compared to traditional methods. There are two commands that GUFI provides: * `gufi_find`: For finding files and subfolders in a directory * `gufi_du`: For obtaining the size of files and folders +!!! warning + This method uses a database that is updated on a weekly basis. It may not find or measure the size of files that were created or moved about mahuika within a week of them being created or moved. + !!! note The filesystems that `gufi_find` and `gufi_du` work on are: @@ -25,8 +28,6 @@ There are two commands that GUFI provides: * `nobackup/nesi12345/a_folder` and `/nobackup/nesi12345/a_folder` are acceptible, but * `/nesi/nobackup/nesi12345/a_folder`, `nesi12345/a_folder`, or `./nesi12345/a_folder` will not work. -!!! warning - This method uses a database that is updated on a weekly basis. It may not find or measure the size of files that were created or moved about mahuika within a week of them being created or moved. ## Prerequisite: Must Load `gufi` Module From 2ce6fce002033cdad4d3073e7434b40e985bdbf0 Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Fri, 31 Oct 2025 13:18:41 +1300 Subject: [PATCH 5/9] Updates from MattB --- .../Supported_Applications/GUFI.md | 20 +++++++++---------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md index 600ba7cb8..e3d296bbc 100644 --- a/docs/Scientific_Computing/Supported_Applications/GUFI.md +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -14,7 +14,7 @@ There are two commands that GUFI provides: * `gufi_du`: For obtaining the size of files and folders !!! 
warning - This method uses a database that is updated on a weekly basis. It may not find or measure the size of files that were created or moved about mahuika within a week of them being created or moved. + This method uses a database that is updated on a weekly basis. It may not find or measure the size of files that were created or moved around Mahuika within the last week. !!! note The filesystems that `gufi_find` and `gufi_du` work on are: @@ -29,42 +29,40 @@ There are two commands that GUFI provides: * `/nesi/nobackup/nesi12345/a_folder`, `nesi12345/a_folder`, or `./nesi12345/a_folder` will not work. -## Prerequisite: Must Load `gufi` Module +## Prerequisite: Must Load `gufi` Module -To use `gufi_find` and `gufi_du`, you must load them by entering in the following command in mahuika: +To use `gufi_find` and `gufi_du`, you must load them by entering in the following command in Mahuika: ```sh module load .gufi ``` -Without this, `gufi_find` and `gufi_du` will not be loaded. - ## Finding Files and Folders using GUFI The usual method for searching for a file using the terminal is: ```sh -find path/to/folder/to/search -name "*NameOfFileToSearchFor*" +find path/to/folder/to/search -name ``` In GUFI, you provide the same arguments to `gufi_find` as you do with `find` (but starting with either `home`, `projects`, or `nobackup`): ```sh -gufi_find path/to/folder/to/search -name "*NameOfFileToSearchFor*" +gufi_find path/to/folder/to/search -name ``` !!! 
example If you want to find `.bashrc` in your home directory: ```sh - gufi_find home/USERNAME -name .bashrc + gufi_find home/$USER -name .bashrc ``` If you want to find the largest file in your folder: ```sh - gufi_find home/USERNAME -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 + gufi_find home/$USER -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 gufi_find project/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 gufi_find nobackup/nesi12345 -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 1 ``` @@ -89,13 +87,13 @@ For more options, see `gufi_du --help` If you want to find the size of your in your home directory: ```sh - gufi_du -hs home/USERNAME + gufi_du -hs home/$USER ``` If you want to find the size of your `.bashrc` in your home directory: ```sh - gufi_du -hs home/USERNAME/.bashrc + gufi_du -hs home/$USER/.bashrc ``` If you want to find the size of your project folder in `projects` or `nobackup`: From 8d1c0e2a682d253c91855231e6373450f17a6f53 Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Mon, 3 Nov 2025 11:22:09 +1300 Subject: [PATCH 6/9] Added troubleshooting to GUFI --- .../Supported_Applications/GUFI.md | 34 +++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md index e3d296bbc..889b4f295 100644 --- a/docs/Scientific_Computing/Supported_Applications/GUFI.md +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -108,3 +108,37 @@ For more options, see `gufi_du --help` ```sh gufi_du --inodes -s nobackup/nesi12345/a_folder ``` + +## Troubleshooting + +### My file or folder exists, but `GUFI` does not include it during its search + +If your file or folder exists but `GUFI` does not find it, it is likely that your folder has not been indexed by `GUFI` yet. You will need to wait until the end of the week for those files and folders to be indexed by `GUFI`. 
+ +* In the meantime, you can either use `find` or `du` instead of `gufi_find` or `gufi_du` as alternatives for these functions. + +### I get the message: `Does "XYZ" have treesummary data?` + +If you get a message like this: + +```sh +gufi_du --inodes -s nobackup/nesi99991 +Error: Skipping directory "/search/nobackup/nesi99991": Permission denied (13) +0 nobackup/nesi99991 +Warning: Did not get any results from gufi_query. +Does "nobackup/nesi99991" have treesummary data? +``` + +This means that `gufi_find` or `gufi_du` was not able to find any information about the path you gave it to search. This could be because: + +1. The folder you are searching in doesn't exist, +2. You don't have permissions to look at the files and folders that you were trying to search in, or +3. Your files and folders were created after the most recent weekly index update, so they are not yet in the `GUFI` database. + +If 3 applies to you, you will need to wait until the end of the week for those files and folders to be indexed by `GUFI`. + +### I get the message: `Error: Skipping directory "nobackup/XYZ": Permission denied (13)` + +If you see this error, this is because you do not have the correct permissions to view this directory. + +* If you want to use `GUFI` on this directory, you will need to get read permissions from the person who created this directory. 
From 9e67c29a21dfcd33fcc154b379509ace4722e328 Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Mon, 3 Nov 2025 17:15:59 +1300 Subject: [PATCH 7/9] Added examples to troubleshooting --- .../Supported_Applications/GUFI.md | 30 +++++++++++++++---- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md index 889b4f295..3188df02f 100644 --- a/docs/Scientific_Computing/Supported_Applications/GUFI.md +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -72,13 +72,13 @@ gufi_find path/to/folder/to/search -name The usual method for obtaining the size of a file or folder using the terminal is: ```sh -du -hs path/to/file/or/folder +du -s path/to/file/or/folder ``` In GUFI, you provide the same arguments to `gufi_du` as you do with `du` (but starting with either `home`, `projects`, or `nobackup`): ```sh -gufi_du -hs path/to/file/or/folder +gufi_du -s path/to/file/or/folder ``` For more options, see `gufi_du --help` @@ -87,20 +87,20 @@ For more options, see `gufi_du --help` If you want to find the size of your in your home directory: ```sh - gufi_du -hs home/$USER + gufi_du -s home/$USER ``` If you want to find the size of your `.bashrc` in your home directory: ```sh - gufi_du -hs home/$USER/.bashrc + gufi_du -s home/$USER/.bashrc ``` If you want to find the size of your project folder in `projects` or `nobackup`: ```sh - gufi_du -hs project/nesi12345 - gufi_du -hs nobackup/nesi12345 + gufi_du -s project/nesi12345 + gufi_du -s nobackup/nesi12345 ``` If you want to obtain the number of files in a folder, such as `nobackup/nesi12345/a_folder`, you would do the following: @@ -117,6 +117,13 @@ If your file or folder exists but `GUFI` does not find it, it is likely that you * In the meantime, you can either use `find` or `du` instead of `gufi_find` or `gufi_du` as alternatives for these functions. +!!! 
example + + ```sh + john.doe@login03:~$ gufi_find home/new_folder + Could not get realpath of "/search/home/new_folder": No such file or directory (2) + ``` + ### I get the message: `Does "XYZ" have treesummary data?` If you get a message like this: @@ -142,3 +149,14 @@ If 3 applies to you, you will need to wait until the end of the week for those f If you see this error, this is because you do not have the correct permissions to view this directory. * If you want to use `GUFI` on this directory, you will need to get read permissions from the person who created this directory. + +!!! example + + ```sh + john.doe@login03:/nesi/nobackup/nesi12345$ gufi_du nobackup/nesi12345/test.txt + Error: Skipping directory "nobackup/nesi12345/test.txt": Permission denied (13) + ``` + +### I can not use tab to autocomplete, and sometimes autocompleting using tab logs me out of my Mahuika login + +This is a known problem. We are currently looking for a fix for this. \ No newline at end of file From 5017233b43359c33b660a67ac358c947ecea78c9 Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Mon, 10 Nov 2025 16:24:57 +1300 Subject: [PATCH 8/9] meow --- docs/Scientific_Computing/Supported_Applications/GUFI.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md index 3188df02f..fc2bdd25d 100644 --- a/docs/Scientific_Computing/Supported_Applications/GUFI.md +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -159,4 +159,4 @@ If you see this error, this is because you do not have the correct permissions t ### I can not use tab to autocomplete, and sometimes autocompleting using tab logs me out of my Mahuika login -This is a known problem. We are currently looking for a fix for this. \ No newline at end of file +This is a known problem. We are currently looking for a fix for this. 
From 0f904bd9088f30a859417abb7e93af654c8c6415 Mon Sep 17 00:00:00 2001 From: geoffreyweal Date: Mon, 10 Nov 2025 16:29:29 +1300 Subject: [PATCH 9/9] test --- docs/Scientific_Computing/Supported_Applications/GUFI.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/Scientific_Computing/Supported_Applications/GUFI.md b/docs/Scientific_Computing/Supported_Applications/GUFI.md index fc2bdd25d..c946974a7 100644 --- a/docs/Scientific_Computing/Supported_Applications/GUFI.md +++ b/docs/Scientific_Computing/Supported_Applications/GUFI.md @@ -6,6 +6,8 @@ vote_count: 0 vote_sum: 0 --- +test + GUFI (Grand Unified File Index) is a file system metadata indexing tool designed for large-scale data centers to enable fast, secure, and comprehensive searches of files and directories. It works by creating a hierarchical index that preserves file access permissions, allowing users to efficiently find and characterize data across multiple, potentially disparate file systems. This results in significantly faster search times and lessens impact/load on parallel filesystems compared to traditional methods. There are two commands that GUFI provides:
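
The path rule that GUFI.md documents (paths must start at `home`, `project`, or `nobackup`, and `/nesi/nobackup/...` or `./...` will not work) could be wrapped in a small shell helper. The sketch below is illustrative only and is not part of these patches: `to_gufi_path` is a hypothetical name, and the `/nesi/project` and `/nesi/nobackup` prefixes are assumptions taken from the example paths in the documentation.

```sh
# Hypothetical helper (not shipped with GUFI): rewrite an absolute path into
# the relative form that the documentation says gufi_find and gufi_du expect.
# Assumption: data lives under /home, /nesi/project and /nesi/nobackup.
to_gufi_path() {
    case "$1" in
        # Drop the leading "/nesi/" so the path starts at project/ or nobackup/.
        /nesi/project/*|/nesi/nobackup/*)
            printf '%s\n' "${1#/nesi/}" ;;
        # Drop only the leading "/" so the path starts at the filesystem name.
        /home/*|/project/*|/nobackup/*)
            printf '%s\n' "${1#/}" ;;
        # Already in the documented form: pass through unchanged.
        home/*|project/*|nobackup/*)
            printf '%s\n' "$1" ;;
        # Anything else (for example ./a_folder) cannot be mapped safely.
        *)
            printf 'to_gufi_path: cannot map %s\n' "$1" >&2
            return 1 ;;
    esac
}

# Example use on Mahuika, after `module load .gufi`:
#   gufi_du -s "$(to_gufi_path /nesi/nobackup/nesi12345/a_folder)"
```

The helper only rewrites strings; it does not check that the path exists or that it has been indexed yet, so the weekly-update caveat in the documentation still applies.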