Progress of wildcard download #38

Open
Bandler opened this issue Jun 22, 2022 · 3 comments

Comments

@Bandler

Bandler commented Jun 22, 2022

If I use DownloadWildcard() together with a ProgressCallback, the reported progress always covers only the file currently being downloaded, not the overall transfer. So if the wildcard matches several files, there is no way of telling the overall progress.

@embeddedmz
Owner

Following your question, I tested an approach to fix this; unfortunately, there is no simple solution to this problem. We need to know the size of the folder to accurately calculate the progress rate in the callback. (Needless to say, you would also have to manage that size outside DownloadWildcard, since that method is recursive; there are solutions for this.) But FTP servers usually can't give you this information. I tried the 'Info' method and it doesn't work with folders. Maybe there's a solution with SFTP servers: https://stackoverflow.com/questions/45614242/bash-script-get-folder-size-through-ftp I think I could send the 'du' command to the server and it might send me a reply, but this command can take a long time to execute, so it might time out; I don't know.

Another interesting resource https://curl.se/mail/lib-2009-05/0225.html

@Bandler
Author

Bandler commented Jun 23, 2022

Thanks for your reply.

In my case, no recursion is needed, because I only address specific files in one directory: let's say, all *.txt files in the given download folder. Would it be possible to support this special (easier) case? It would only require getting the list of files in the directory that match the wildcard expression and adding up their sizes.

@embeddedmz
Owner

embeddedmz commented Jun 23, 2022

@Bandler I think we will need a new method for that. The current DownloadWildcard method uses libcurl features and is complex. I prefer not to touch it.

The best way is to fetch the list of file names that match an expression (one that uses the asterisk and maybe '?' too), compute the total size of all these files, and pass that size to the structure you get from the first parameter of the progress callback (cast the void* to ProgressFnStruct). If that size is not equal to zero, you know that you must use it to compute the progress rate.

When you use List() with "bOnlyNames" set to "false", this is what it returns in a std::string:

drwxrwxrwx   1 user     group           0 Apr 26  2021 download_wildcard
drwxrwxrwx   1 user     group           0 Jun  2 15:23 nom avec des espaces
-rw-rw-rw-   1 user     group         161 Jun  2 15:26 test espace.txt
drwxrwxrwx   1 user     group           0 Jun 23 09:29 upload

And with bOnlyNames set to true (some FTP servers add '.' and '..' at the beginning of both lists, but they are directories, so it's fine):

download_wildcard
nom avec des espaces
test espace.txt
upload

You can use these two lists to build a list of the regular files that match an expression: first identify the indexes of the lines that begin with '-' in the detailed list, then use these indexes to pick the names of all regular files from the second, non-detailed list, and finally keep the items that match the expression (you can use a regex or some simple code; just google 'c++ wildcard').
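The filtering described above could look roughly like the sketch below. It is only an illustration: SplitLines, WildcardMatch and FilterRegularFiles are hypothetical helper names, not part of the library, and the sketch assumes both listings come back as newline-separated text in the same order. As an extra safety check (my assumption, not something the library does), it verifies that each regular-file line of the detailed list ends with the name found at the same index.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a listing into lines, dropping a trailing '\r' and empty lines.
std::vector<std::string> SplitLines(const std::string& text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (!line.empty()) lines.push_back(line);
    }
    return lines;
}

// Wildcard match: '*' matches any run of characters, '?' exactly one.
bool WildcardMatch(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0, mark = 0;
    std::size_t star = std::string::npos;
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;   // remember the star and try matching zero characters
            mark = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (star != std::string::npos) {
            p = star + 1; // backtrack: let the last star absorb one more char
            t = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}

// Fills `out` with the names of regular files matching `pattern`.
// Returns false when the two listings do not line up (for example because
// the directory changed between the two calls), so the caller can retry.
bool FilterRegularFiles(const std::string& detailed,
                        const std::string& namesOnly,
                        const std::string& pattern,
                        std::vector<std::string>& out) {
    const std::vector<std::string> d = SplitLines(detailed);
    const std::vector<std::string> n = SplitLines(namesOnly);
    if (d.size() != n.size()) return false;

    std::vector<std::string> result;
    for (std::size_t i = 0; i < d.size(); ++i) {
        if (d[i][0] != '-') continue;  // keep regular files only
        // Sanity check: the detailed line should end with the same name.
        if (d[i].size() < n[i].size() ||
            d[i].compare(d[i].size() - n[i].size(), n[i].size(), n[i]) != 0) {
            return false;
        }
        if (WildcardMatch(pattern, n[i])) result.push_back(n[i]);
    }
    out = result;
    return true;
}
```

With the two sample listings above and the pattern "*.txt", this would keep only "test espace.txt".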

Always check that the indexes are not out of bounds, since the administrator of the FTP server can delete files between the two calls to List(); in that case, report an error or use a retry mechanism.

Since the detailed list is tricky to parse, you can iterate over the filtered elements, use the Info() method to get the size of each file, and sum them all.
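A minimal sketch of that summing step, kept independent of the library: SizeLookup is a hypothetical stand-in for whatever per-file query you use (for instance a small wrapper around Info()), and SumFileSizes is a hypothetical helper name.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-file size query (e.g. a wrapper around
// Info()): returns true and fills `size` on success, false on failure.
using SizeLookup =
    std::function<bool(const std::string& remotePath, std::uint64_t& size)>;

// Sums the sizes of all given remote files.
// Returns false as soon as one lookup fails, so the caller can report an
// error or retry instead of showing a progress bar based on a wrong total.
bool SumFileSizes(const std::vector<std::string>& remotePaths,
                  const SizeLookup& lookup,
                  std::uint64_t& total) {
    std::uint64_t sum = 0;
    for (const std::string& path : remotePaths) {
        std::uint64_t size = 0;
        if (!lookup(path, size)) return false;
        sum += size;
    }
    total = sum;
    return true;
}
```

Stopping at the first failed lookup is a design choice for this sketch; a file deleted between the listing and the size query would otherwise silently shrink the total.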

In the struct ProgressFnStruct, add a field "size_t filesTotalSize". It should be initialized to zero in the constructor of the struct and, before performing a CURL operation, updated with the total file size.

Iterate over the list to download the items. Do not forget to prepend the folder the files belong to: if the file is located in the root folder, you don't have to add anything, but if it's in a sub-directory, prepend the full path to that folder. It's the path you have already used with the List() method, e.g. "myfolder/myfile.txt".
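For the path handling, something like the following would do. JoinRemotePath is a hypothetical helper name, and the sketch assumes '/' is the separator used by the server.

```cpp
#include <string>

// Builds the remote path of a file from the folder that was listed and the
// file name. An empty folder means the root folder: nothing is added.
std::string JoinRemotePath(const std::string& folder,
                           const std::string& fileName) {
    if (folder.empty()) return fileName;
    if (folder.back() == '/') return folder + fileName;
    return folder + "/" + fileName;
}
```

For example, JoinRemotePath("myfolder", "myfile.txt") gives "myfolder/myfile.txt", and an empty folder leaves the file name unchanged.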

Be careful with curl_easy_reset(m_pCurlSession), since it clears the options you set with curl_easy_setopt(). It must be called only once, before setting the options and calling Perform().

In the progress callback, if filesTotalSize is different from zero you can use it; otherwise it's a single file, so use another logic to compute the progress rate. DO NOT use the same callback with this new method and the existing ones: the progress structure is currently not reset, so if you then use the existing methods (upload, download) you might compute an incorrect progress rate (and be careful of divisions by zero :p).

That's why I'm reluctant to manage the progress bar for DownloadWildcard. It would create many issues, and the methods are not reentrant if they are used together: e.g. if a new method updates the total file size and then calls another method, that other method can set it back to zero to notify the progress bar that it is performing a single transfer = problems! So it's better to use the new field filesTotalSize only for this new method. Otherwise, the progress callback also has the parameter void *pOwner: you can set it with SetProgressFnCallback and use it to get the total file size.
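The progress computation itself could be sketched like this. BatchProgress is a hypothetical structure used only for illustration (it is not the library's ProgressFnStruct), and bytesFromFinishedFiles is an extra field I am assuming you would update after each file completes, since the callback only reports the current file.

```cpp
#include <cstdint>

// Hypothetical state for a multi-file download (not the library's struct).
struct BatchProgress {
    std::uint64_t filesTotalSize = 0;          // 0 means "single transfer"
    std::uint64_t bytesFromFinishedFiles = 0;  // sum of files already done
};

// Returns the progress as a percentage in [0, 100].
// currentFileTotal / currentFileNow are the values the progress callback
// receives for the file being transferred right now.
double ComputeProgressPercent(const BatchProgress& state,
                              double currentFileTotal,
                              double currentFileNow) {
    double total = currentFileTotal;
    double done = currentFileNow;
    if (state.filesTotalSize != 0) {
        // Multi-file mode: measure against the size of the whole batch.
        total = static_cast<double>(state.filesTotalSize);
        done = static_cast<double>(state.bytesFromFinishedFiles) + currentFileNow;
    }
    if (total <= 0.0) return 0.0;  // avoids a division by zero
    const double percent = 100.0 * done / total;
    if (percent < 0.0) return 0.0;
    return percent > 100.0 ? 100.0 : percent;
}
```

Keeping this state in a separate object (for instance reached through pOwner) rather than in the shared progress structure is one way to avoid the reset problem described above.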

I noticed that we can't disable the progress callback once it has been set. I will push a fix for that.

I hope this helps you.
