Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REQ] Add percentage using log lines like Chia GUI #297

Open
Fmstrat opened this issue May 9, 2021 · 8 comments
Open

[REQ] Add percentage using log lines like Chia GUI #297

Fmstrat opened this issue May 9, 2021 · 8 comments
Labels
enhancement New feature or request

Comments

@Fmstrat
Copy link
Contributor

Fmstrat commented May 9, 2021

The Chia GUI app estimates percentages based on the number of lines in the log files. This has been detailed here: https://www.reddit.com/r/chia/comments/mx3iy1/gui_plotting_and_percent_done/

For a u=128 plot for instance, it totals out at 2626 lines in the log file.

With that in mind, you can calculate out the percent progress in bash:

wc -l /opt/chia/logs/2021-05-09-16:46:30.log |awk '{print $1}'); PROGRESS=$(bc <<< "scale=5;100/2626*$LINES"); echo $(printf %.$2f $PROGRESS

Since it's python, the code to mirror this could be pulled from the GUI: https://github.com/Chia-Network/chia-blockchain-gui/blob/main/src/modules/plotQueue.ts#L11

Specifically (with log being the text of the log):

const FINISHED_LOG_LINES = 2626; // 128
const lines = log.trim().split(/\r\n|\r|\n/).length;
progress = lines > FINISHED_LOG_LINES ? 1 : lines / FINISHED_LOG_LINES;

Could we get this same calculation into interactive and status modes?

@Fmstrat
Copy link
Contributor Author

Fmstrat commented May 9, 2021

Well, that proved easy. The PR I submitted is working for me

@Fmstrat Fmstrat closed this as completed May 9, 2021
@altendky
Copy link
Collaborator

Lines in the log doesn't strike me as a particularly interesting measure of completion. Is there a useful relationship between that and time through the plot? Even presuming that plots are progressing at a consistent rate from one to the next.

@altendky altendky reopened this May 12, 2021
@Fmstrat
Copy link
Contributor Author

Fmstrat commented May 12, 2021

@altendky The timing does seem mostly linear and in testing fairly accurate. I'm sure this is why the Chia team chose this method in the official chia-blockchain-gui

@altendky
Copy link
Collaborator

Interesting. I just did a quick glance and it seems like my jobs tend to spend nearly half their time in phase 1 which is only 800 of 2600 or so lines. It seems like this would depend on the thread count specified. But, perhaps it is enough to be a useful addition.

@BasilHorowt BasilHorowt added the enhancement New feature or request label May 13, 2021
@BasilHorowt
Copy link
Collaborator

My jobs generally spend half their time in phase 1 too.

About this approach in general: what happens when the drive is full or disappears and it just keeps writing "retrying" lines? Won't it eventually incorrectly approach 100%? I would be quite dismissive of this idea except for the fact that the official gui is apparently doing it....

@hackerzhuli
Copy link
Contributor

hackerzhuli commented May 16, 2021

I think it is reasonable to get a percentage this way, using the phases:

in job.py:

class Phase:
...

    def get_percentage(self):
        if self.major == 1:
            return (self.minor - 1) / 7 * 0.45    
        elif self.major == 2:
            return (self.minor - 1) / 7 * 0.15 + 0.45    
        elif self.major == 3:
            return (self.minor - 1) / 6 * 0.35 + 0.60
        elif self.major == 4:
            return 0.95    
        return 0      

example log file from my computer, many lines are skipped for brevity:

2021-05-16T02:53:01.697  chia.plotting.create_plots       : �[32mINFO    �[0m Creating 1 plots of size 32, pool public key:  aa6a801dfb2df9306dde50f1fad5d71e05d2b095c2b7642a20658fdb9ceca379c3059b9022144aa31a49f108139ad207 farmer public key: af0b5de5e0810acb0a83d5caadf9626ae1c72a7491b1e1a7870a250f7fa0bc5e4ff7a2c7dd8edd58c7d80283468b5616�[0m
2021-05-16T02:53:01.706  chia.plotting.create_plots       : �[32mINFO    �[0m Memo: aa6a801dfb2df9306dde50f1fad5d71e05d2b095c2b7642a20658fdb9ceca379c3059b9022144aa31a49f108139ad207af0b5de5e0810acb0a83d5caadf9626ae1c72a7491b1e1a7870a250f7fa0bc5e4ff7a2c7dd8edd58c7d80283468b56165c325f765816a6b1f17e95614a6c547d1decefeb2a113d27c0ae7c7420a0e016�[0m
2021-05-16T02:53:01.706  chia.plotting.create_plots       : �[32mINFO    �[0m Starting plot 1/1�[0m

Starting plotting progress into temporary dirs: E:\Chia\Plots\Temp and E:\Chia\Plots\Temp
ID: 2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1
Plot size is: 32
Buffer size is: 4608MiB
Using 128 buckets
Using 2 threads of stripe size 65536

Starting phase 1/4: Forward Propagation into tmp files... Sun May 16 02:53:01 2021
Computing table 1
F1 complete, time: 287.472 seconds. CPU (68.05%) Sun May 16 02:57:49 2021
Computing table 2
...
Forward propagation table time: 1905.169 seconds. CPU (119.920%) Sun May 16 03:29:35 2021
Computing table 3
...
Forward propagation table time: 2567.561 seconds. CPU (99.850%) Sun May 16 04:12:22 2021
Computing table 4
...
Forward propagation table time: 3078.870 seconds. CPU (98.130%) Sun May 16 05:03:41 2021
Computing table 5
...
Forward propagation table time: 2923.353 seconds. CPU (100.560%) Sun May 16 05:52:25 2021
Computing table 6
...
Forward propagation table time: 2492.062 seconds. CPU (111.940%) Sun May 16 06:33:57 2021
Computing table 7
...
Forward propagation table time: 2043.967 seconds. CPU (118.820%) Sun May 16 07:08:01 2021
Time for phase 1 = 15299.114 seconds. CPU (106.040%) Sun May 16 07:08:01 2021

Starting phase 2/4: Backpropagation into tmp files... Sun May 16 07:08:01 2021
Backpropagating on table 7
scanned table 7
scanned time =  321.256 seconds. CPU (9.330%) Sun May 16 07:13:23 2021
sorting table 7
Backpropagating on table 6
scanned table 6
scanned time =  208.608 seconds. CPU (62.060%) Sun May 16 07:30:40 2021
sorting table 6
sort time =  793.295 seconds. CPU (63.270%) Sun May 16 07:43:54 2021
Backpropagating on table 5
scanned table 5
scanned time =  243.433 seconds. CPU (51.740%) Sun May 16 07:47:57 2021
sorting table 5
sort time =  779.925 seconds. CPU (62.280%) Sun May 16 08:00:57 2021
Backpropagating on table 4
scanned table 4
scanned time =  256.417 seconds. CPU (48.990%) Sun May 16 08:05:14 2021
sorting table 4
sort time =  763.123 seconds. CPU (62.100%) Sun May 16 08:17:57 2021
Backpropagating on table 3
scanned table 3
scanned time =  253.902 seconds. CPU (49.090%) Sun May 16 08:22:12 2021
sorting table 3
sort time =  773.777 seconds. CPU (61.430%) Sun May 16 08:35:06 2021
Backpropagating on table 2
scanned table 2
scanned time =  235.054 seconds. CPU (52.960%) Sun May 16 08:39:01 2021
sorting table 2
sort time =  750.129 seconds. CPU (62.420%) Sun May 16 08:51:31 2021
table 1 new size: 3425200882
Time for phase 2 = 6209.960 seconds. CPU (52.060%) Sun May 16 08:51:32 2021
Wrote: 268

Starting phase 3/4: Compression from tmp files into "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" ... Sun May 16 08:51:32 2021
Compressing tables 1 and 2
...
Total compress table time: 2007.046 seconds. CPU (58.210%) Sun May 16 09:24:59 2021
Compressing tables 2 and 3
...

Total compress table time: 2178.968 seconds. CPU (62.550%) Sun May 16 10:01:18 2021
Compressing tables 3 and 4
...
Total compress table time: 2147.227 seconds. CPU (63.660%) Sun May 16 10:37:05 2021

Compressing tables 4 and 5
...
Total compress table time: 2174.614 seconds. CPU (63.850%) Sun May 16 11:13:20 2021

Compressing tables 5 and 6

Total compress table time: 2240.648 seconds. CPU (64.680%) Sun May 16 11:50:41 2021
Compressing tables 6 and 7

Total compress table time: 2357.186 seconds. CPU (49.820%) Sun May 16 12:29:58 2021
Time for phase 3 = 13106.198 seconds. CPU (60.360%) Sun May 16 12:29:59 2021

Starting phase 4/4: Write Checkpoint tables into "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" ... Sun May 16 12:29:59 2021
	Starting to write C1 and C3 tables
...
	Finished writing C1 and C3 tables
	Writing C2 table
	Finished writing C2 table
	Final table pointers:
	P1: 0x10c
	P2: 0x374851cd2
	P3: 0x6b5fa750b
	P4: 0x9fdca084e
	P5: 0xd55d00a36
	P6: 0x10d99a2b4b
	P7: 0x14ea3a8cda
	C1: 0x190a3aefda
	C2: 0x190a5526c6
	C3: 0x190a552776
Time for phase 4 = 1260.113 seconds. CPU (48.890%) Sun May 16 12:50:59 2021
Approximate working space used (without final file): 269.355 GiB
Final File size: 101.361 GiB
Total time = 35875.386 seconds. CPU (78.000%) Sun May 16 12:50:59 2021
Copied final file from "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" to "G:\\dst\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp"
Copy time = 950.196 seconds. CPU (8.080%) Sun May 16 13:06:50 2021
Removed temp2 file "E:\\Chia\\Plots\\Temp\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp"? 1
Renamed final file from "G:\\dst\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot.2.tmp" to "G:\\dst\\plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot"
2021-05-16T13:06:50.491  chia.plotting.create_plots       : �[32mINFO    �[0m Summary:�[0m
2021-05-16T13:06:50.491  chia.plotting.create_plots       : �[32mINFO    �[0m Created a total of 1 new plots�[0m
2021-05-16T13:06:50.491  chia.plotting.create_plots       : �[32mINFO    �[0m plot-k32-2021-05-16-02-53-2a2ae9c134e9ba1f344172c1fc758bbffd6fa573a40945ada01bcb9bc75c54d1.plot�[0m

We can see that each minor phase in the same major phase will take approximately the same time, with only one exception, phase 1 table 1 takes only a fraction of time of other tables.

phase name time percent characteristic
1 Forward propagation 15299 43% cpu bound
2 Backpropagating 6209 17% cpu bound (mostly)
3 Compression 13106 37% cpu bound
4 Write Checkpoint tables 1260 4% io bound ( reading from disk )
total 35875 101%

Note:

  1. Apparently copying the final file to destination directory is not counted here
  2. This plot job uses an temp directory on an HDD
  3. the characteristic in the table is a guess

@altendky
Copy link
Collaborator

Progress is a hard thing. The only way to get it correct here, even in a steady state situation on a single computer, is to do detailed log timing analysis and use that as the reference. Remember, you have to account for the threads in stage one.
I run a single thread on one system and ran 4 on another for awhile. Either that is a silly thing to do or it has a relevant effect on this problem.

I can't say I understand how this brings significant value, especially given the associated complexity. What is it that people will do with this information that they can't do similarly with the major:minor phase status? Is it mostly an interest in more fine grained status that updates more often? We could have major:minor:micro phases based on the buckets or whatever is the next layer in each phase.

Don't get me wrong, I have a basic interest in percentage progress feedback as well. I just don't understand how it helps beyond mental satisfaction or how we can make it not a big lie and also not a major investment of time.

@hackerzhuli
Copy link
Contributor

hackerzhuli commented May 16, 2021

I guess we can only make the percentage as close to linear relationship with time as we can get. Of course it is imposible. I guess showing a percentage is much easier to understand for people who don't know what phases mean and just want to know the "progess". I think just use a simple estimation is good enough for now.

I added it in my own fork of plotman, so people who use it don't have to learn what phases mean to learn progress of jobs:

# plot id k tmp dst wall phase tmp pid stat mem user sys io
0 ad4d9060 32 E:\Chia\Plots\Temp G:\dst 2:47 1:3 13% 125G 15300 RUN 5.1G 1:14 0:02 -
1 65902766 32 E:\Chia\Plots\Temp G:\dst 5:33 1:6 32% 177G 9832 RUN 5.1G 3:30 0:06 -

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants