ZIM Filesystem Fuse Module #79

Open
rgaudin opened this issue Mar 3, 2023 · 9 comments · May be fixed by openzim/zim-tools#400

@rgaudin
Member

rgaudin commented Mar 3, 2023

This ticket tracks the Kiwix GSoC 2023 Project ZIM Filesystem Fuse Module until code contributions and/or specific tickets require creating its own dedicated repository (or until it falls into zim-tools).

Candidates, contributors, this ticket is the preferred location to discuss this project. Ask your questions here and mentors (@mgautierfr, @kelson42) shall respond.

Mandatory reads:


ZIM Filesystem Fuse Module

Objective: we need to create a filesystem fuse module that enables access to the content of a ZIM file, allowing users to view entries as files without using zimdump.

Technologies: C++, Linux internals, FUSE

Description:

Kiwix provides offline access to Wikipedia and other educational content through its ZIM file format. Inspecting ZIM files is very useful for developers and ZIM creators. While the zimdump tool exists, it is not as convenient and easy to use as a filesystem. Therefore, we want to make it easier for users to access the contents of a ZIM file by creating a (read-only) filesystem fuse module.

The ZIM filesystem fuse module will be written in C++ and will use the libzim and FUSE libraries to enable access to the contents of a ZIM file as if it were a regular (yet read-only) filesystem. The module will allow users to view the ZIM entries as a tree of folders and files, the latter being readable as regular files. This will make it easier for users to access the content of a ZIM file and will provide a more user-friendly interface for exploring its contents.
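
As a rough illustration of what "read-only" and "readable as regular files" could mean in practice, here is a minimal sketch, assuming the module follows the usual FUSE convention of returning 0 or a negative errno value from its callbacks. The helper names `check_open_flags` and `read_entry` are hypothetical and do not come from libzim or libfuse; a real module would call similar logic from its `open` and `read` callbacks, with the entry content fetched through libzim.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <string>
#include <fcntl.h>

// Hypothetical helper: decide whether an open() request is acceptable on a
// read-only filesystem. Returns 0 on success or a negative errno value,
// which is the convention FUSE callbacks use.
int check_open_flags(int flags) {
    if ((flags & O_ACCMODE) != O_RDONLY) {
        return -EROFS;  // any write access is refused
    }
    return 0;
}

// Hypothetical helper: copy at most `size` bytes of `content`, starting at
// `offset`, into `buf`. Returns the number of bytes copied, or 0 when the
// offset is at or past the end of the content.
int read_entry(const std::string& content, char* buf,
               std::size_t size, std::size_t offset) {
    assert(buf != nullptr);
    if (offset >= content.size()) {
        return 0;
    }
    const std::size_t n = std::min(size, content.size() - offset);
    std::memcpy(buf, content.data() + offset, n);
    return static_cast<int>(n);
}
```

With these helpers, opening with `O_RDONLY` succeeds while `O_WRONLY` or `O_RDWR` is rejected, and a read that runs past the end of an entry is simply truncated, which is how regular files behave.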

Key Deliverables:

  • A C++ implementation of the ZIM filesystem fuse module.
  • Documentation for how to install and use the module.
  • Test cases to verify the correct functioning of the module.

Skills required:

  • Strong knowledge of C++
  • Familiarity with Linux internals, specifically the FUSE library
  • Knowledge of filesystems and file access methods.

Difficulty: Hard. Expect 350 hours of work.

@lyc8503

lyc8503 commented Mar 8, 2023

Hello, I am a student from GSoC and I am interested in Kiwix's idea of persisting web pages offline.

I am familiar with C++ and Linux development, and I also have some open-source repos of my own on GitHub. Now I want to join a bigger open-source project to experience the open-source community and help with the project.

I have already read the implementation of the zimdump tool, and I think I can help implement the FUSE module. I would appreciate it if I could get the opportunity to work on this idea.

@lyc8503

lyc8503 commented Mar 9, 2023

https://github.com/lyc8503/zimfuse
I spared some time and wrote a tiny demo that uses libzim and libfuse3 to implement the readdir and getattr functions, which lets users use cd and ls to see the structure of a ZIM file. There's still much work to do, but I think I can dig deeper and write a better and more complete implementation if given enough time when participating in GSoC.
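
To make the readdir/getattr idea concrete, here is a minimal sketch that is not taken from the linked demo. It assumes ZIM entries are addressed by flat, slash-separated paths such as `A/Earth` (the example paths are made up) and derives the directory tree that `ls` and `cd` would need. The `ZimTree` class and its methods are hypothetical names; in a real module the paths would come from iterating the archive with libzim, and the FUSE callbacks would strip the leading `/` before calling `kind` (for getattr) or `list` (for readdir).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

enum class NodeKind { Missing, Directory, File };

// Hypothetical in-memory index: turns flat entry paths into a tree.
// The root directory is represented by the empty string.
class ZimTree {
public:
    // Register one entry path; every parent prefix becomes a directory.
    void add_entry(const std::string& path) {
        assert(!path.empty() && path.front() != '/');
        std::string parent;
        std::size_t start = 0;
        while (true) {
            const std::size_t slash = path.find('/', start);
            const std::string name =
                path.substr(start, slash == std::string::npos
                                       ? std::string::npos
                                       : slash - start);
            children_[parent].insert(name);
            const std::string full =
                parent.empty() ? name : parent + "/" + name;
            if (slash == std::string::npos) {
                files_.insert(full);  // last component is the entry itself
                break;
            }
            dirs_.insert(full);
            parent = full;
            start = slash + 1;
        }
    }

    // What a getattr-like callback needs: is this a directory, a file,
    // or nothing? A path that is both is reported as a directory here.
    NodeKind kind(const std::string& path) const {
        if (path.empty() || dirs_.count(path)) return NodeKind::Directory;
        if (files_.count(path)) return NodeKind::File;
        return NodeKind::Missing;
    }

    // What a readdir-like callback needs: sorted child names of a directory.
    std::vector<std::string> list(const std::string& dir) const {
        const auto it = children_.find(dir);
        if (it == children_.end()) return {};
        return {it->second.begin(), it->second.end()};
    }

private:
    std::set<std::string> dirs_;
    std::set<std::string> files_;
    std::map<std::string, std::set<std::string>> children_;
};
```

Keeping the whole index in memory is the simplest choice for a sketch; for very large ZIM files, looking up children lazily may be a better fit.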

@Darkcoder011

Hey all, I'm Abhijit Dengale (2nd year B.Tech in CSE). I am here because I want to take part in this project.
I have already solved more than 90 DSA problems in C++
and I have worked on Linux for more than a year.
So, can I take part?

@lyc8503

lyc8503 commented Mar 22, 2023

Hi, I have submitted my proposal on GSoC platform and I am willing to hear any suggestions and improve it. I am not very familiar with Slack but I will check messages there regularly. I am wondering what's the preferred way to get in touch with a mentor, or should I just wait for a mentor to contact me?

@juuz0

juuz0 commented Mar 29, 2023

I've submitted my proposal for this on the GSoC website :>

@opk12

opk12 commented Apr 4, 2023

libarchive is a well-known archive and compression library, used by a number of free software projects. archivemount is a FUSE filesystem based on libarchive.

What about integrating with libarchive? The user would be able to

  • run archivemount file.zim /directory/ without you developing FUSE
  • reuse the ZIM format in any libarchive-based software, regardless of whether it's FUSE related or not

@rgaudin
Member Author

rgaudin commented Apr 4, 2023

That's an interesting possibility. Can you share examples of such libarchive-using software?

@opk12

opk12 commented Apr 4, 2023

@rgaudin A quick look at libarchive's website gives LibarchiveUsers, where you can find Arch Linux's pacman(!), gvfs, and ark. Any regular GNU/Linux user will recognize more than a few names in the 88 packages listed by running apt-cache rdepends libarchive13 on the current Debian stable (bullseye). Others are on the Internet but not in Debian.

@rgaudin
Member Author

rgaudin commented Apr 4, 2023

Well, I don't see any in this list that would benefit more than they would from just a mounted FUSE fs. I thus don't see the value for libzim.

FUSE modules are very simple and easy to distribute, deploy and use. Integrating into libarchive would definitely be more difficult and, more importantly, we'd be on the libarchive release schedule, which would be less flexible.

Just my uninformed opinion though.
