Ability to store metaInfo files #302

Open
2 tasks done
LordMike opened this issue Aug 14, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@LordMike

  • I have checked the existing issues to avoid duplicates
  • I have redacted any info hashes and content metadata from any logs or screenshots attached to this issue

Is your feature request related to a problem? Please describe

I have a project that acquires torrent metafiles en masse, and I've been on the lookout for a DHT crawler. I'm trying out bitmagnet, but having reviewed the database schema, it doesn't seem like bitmagnet keeps the torrent metadata files on hand after they've been ingested. It'd be cool to have the option to persist the original metadata files either on disk or in the Postgres database.

Describe the solution you'd like

An option that could be enabled to persist metainfo files, possibly on disk (e.g. in a tree structure like 00/11/22/001122.....torrent).

Once a new torrent is identified and it's stored to the DB (at which point I assume it's "new" and doesn't exist on disk), the binary blob that is the torrent file could additionally be saved to disk.
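A minimal sketch of the sharded layout described above, assuming a 40-character hex SHA-1 info hash and two hex characters per directory level. The names `torrent_path` and `save_metainfo` are hypothetical and not part of bitmagnet.

```python
from pathlib import Path


def torrent_path(root: Path, info_hash_hex: str, levels: int = 3) -> Path:
    """Return root/00/11/22/<info_hash>.torrent for a 40-char hex info hash."""
    h = info_hash_hex.lower()
    if len(h) != 40 or any(c not in "0123456789abcdef" for c in h):
        raise ValueError("expected a 40-character hex SHA-1 info hash")
    # One directory level per leading byte of the hash (two hex characters).
    shards = [h[2 * i : 2 * i + 2] for i in range(levels)]
    return root.joinpath(*shards, f"{h}.torrent")


def save_metainfo(root: Path, info_hash_hex: str, blob: bytes) -> Path:
    """Write the raw metainfo bytes once; an existing file is left untouched."""
    path = torrent_path(root, info_hash_hex)
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "xb" fails if the file already exists, so nothing is overwritten.
        with open(path, "xb") as f:
            f.write(blob)
    except FileExistsError:
        pass
    return path
```

With three levels, each leaf directory holds roughly 1/16.7 million of the files, which keeps directory listings small even for very large collections.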

Describe alternatives you've considered

N/A

Additional context

N/A

@LordMike LordMike added the enhancement New feature or request label Aug 14, 2024
@DerBunteBall

That's available via the save_pieces option of the DHT crawler.

Valid torrents can be assembled from the DB. Only in a few special situations are you limited to producing "dummy torrents".

@LordMike
Author

LordMike commented Aug 14, 2024 via email

@DerBunteBall

Bitmagnet stores all the needed information, except in a few very specific cases.

https://en.wikipedia.org/wiki/Torrent_file#File_struct

All the needed information is stored in the database when you store the pieces. The file list with length, path and so on is stored. You can write a simple Python script that generates the torrent, e.g. with Torf, and the result has a valid info dict hash.

This only fails in cases where clients put more or less unspecified data in the info dict, which is the part that gets hashed. That only happens when optional md5sums or .utf-8 keys are stored. In those cases the generated torrent wouldn't have the correct info dict hash (torrent checksum). But these cases are relatively rare. For them you can only generate dummy torrents, e.g. for checking data validity on disk.
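To illustrate why extra keys matter, here is a minimal sketch that rebuilds an info dict, bencodes it and takes its SHA-1. It uses a small hand-written bencode encoder instead of Torf, and the field values are hypothetical placeholders for what would be read from the database.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencode encoder for int, bytes, str, list and dict."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    """The info hash is the SHA-1 of the bencoded info dict."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Hypothetical values standing in for rows read from the database.
info = {
    "name": "example",
    "piece length": 16384,
    "pieces": bytes(20),  # placeholder for one 20-byte piece hash
    "files": [{"length": 5, "path": ["a.txt"]}],
}
rebuilt = info_hash(info)

# The same torrent as a client might have published it, with an optional md5sum.
published = dict(info, files=[{"length": 5, "path": ["a.txt"], "md5sum": "0" * 32}])

# Any key that was hashed originally but is missing from the rebuild changes the hash.
print(rebuilt == info_hash(published))
```

Because the hash covers every byte of the bencoded dict, a rebuild only matches the original info hash if every key the publishing client wrote is reproduced exactly.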

@LordMike
Author

LordMike commented Aug 14, 2024 via email

@LordMike
Author

Note: I implemented the feature here. If it has any interest at all, I can make a PR back.

@leofidus

Another advantage of storing the metainfo as files is that it removes load and storage requirements from the Postgres database compared to save_pieces. For example, I have my bitmagnet Postgres data on an SSD, but would prefer storing .torrent files on cheap spinning rust.

I know I can technically already do that with tablespaces, but dumping the metainfo as files in a directory structure would be more user-friendly and make it easier to integrate with other software.

@Dobatymo

Dobatymo commented Nov 8, 2024

@DerBunteBall

.utf-8 keys

Actually these are quite common in the Chinese community in my experience. So losing all these torrents is bad. Torrents with these keys are also currently discarded due to the utf8 check on the normal keys, which usually fails on torrents with these extra keys.

So interpreting these keys when they exist would be preferred imo.
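A rough sketch of what interpreting these keys could look like, assuming the info dict has already been bdecoded into a Python dict with `bytes` keys. The helper names are hypothetical, and this is not how bitmagnet currently handles such torrents.

```python
def pick(d: dict, key: bytes):
    """Return d[key + b'.utf-8'] when present, otherwise d[key]."""
    return d.get(key + b".utf-8", d.get(key))


def decode(raw: bytes) -> str:
    # Replace undecodable bytes instead of rejecting the whole torrent.
    return raw.decode("utf-8", errors="replace")


def file_list(info: dict) -> list[tuple[str, int]]:
    """Return (path, length) pairs, preferring the .utf-8 variants of name and path."""
    name = decode(pick(info, b"name"))
    if b"files" not in info:  # single-file torrent
        return [(name, info[b"length"])]
    return [
        ("/".join([name] + [decode(p) for p in pick(f, b"path")]), f[b"length"])
        for f in info[b"files"]
    ]


# Example: the plain "name" key is GBK-encoded, the "name.utf-8" key is valid UTF-8.
info = {
    b"name": b"\xd6\xd0\xce\xc4",
    b"name.utf-8": "中文".encode("utf-8"),
    b"files": [{b"length": 5, b"path": [b"a.txt"]}],
}
print(file_list(info))
```

Preferring the `.utf-8` variant means the legacy-encoded plain key never has to pass a UTF-8 check, so the torrent does not need to be discarded.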


4 participants