Bug fixes in adding columns #9091

N-thony · 2024-07-31T17:00:42Z

Fixes #9086
@rdstern @Patowhiz I don't remember the reason that code has been changed but this seems now to be fixed. It will be good to test this PR seriously and see if there is no regression. I did some test and all seems to work fine. Over to you.

N-thony · 2024-08-04T11:29:21Z

Fixes #9086 @rdstern @Patowhiz I don't remember the reason that code has been changed but this seems now to be fixed. It will be good to test this PR seriously and see if there is no regression. I did some test and all seems to work fine. Over to you.

@rdstern is everything okay after testing this?

rdstern · 2024-08-04T11:59:30Z

Thanks, @N-thony I'll check soon, but am assuming @Patowhiz will check first, as it resolves a problem he found. So easy for him to check that it now, at least, solved the original problem he found.

Patowhiz

@N-thony

Here are a few comments and suggestions:

Regarding item 1 of the issue:

It is solved, but I suspect your current solution may cause regressions elsewhere. I think the issue relates to:
1. Line 685-694 of the data_object_R6.R file. I'm not sure when line 685 was added/modified and why the check (length(col_name) != 1) is relevant. Ideally, the check in line 685 should stop any further execution of the function.
2. The script produced by the dialog:
```
X <- data_book$get_columns_from_data(data_name="flags_metadata", col_names="X", use_current_filter=FALSE)
wind_cols <- stringr::str_split_fixed(string=X, pattern=stringr::coll(pattern=","), n=5)
data_book$add_columns_to_data(data_name="flags_metadata", col_name="wind_cols", col_data=wind_cols, before=FALSE, adjacent_column="X")
rm(list=c("wind_cols", "X")) 
```
Notice that wind_cols will contain multiple columns, yet the dialog sets the col_name parameter. I think the correct solution would be to make the dialog set the use_col_name_as_prefix parameter to true and use the passed col_name as the prefix of all the newly generated column names. If use_col_name_as_prefix is set to true, you can make changes at line 685 that will make the length of col_name equal to the value of num_cols, which means you won't need to make inconsistent changes as currently done in line 715.

Regarding item 2 of the issue:

I checked with the new Climsoft import dialog in PR Changes Import from Climsoft Functionalities #9070 but still got the error. Note that I explained where I think the bug is in this comment. In the PR, I have fixed it by not using the data book R functions for factor conversion, so if you want to test my original issue, use my branch named climsoft_updated_dialog_metadata_bug.

Patowhiz · 2024-08-06T07:19:55Z

instat/static/InstatObject/R/data_object_R6.R

@@ -712,7 +712,7 @@ DataSheet$set("public", "add_columns_to_data", function(col_name = "", col_data,
      if(require_correct_length) stop("Length of new column must be divisible by the length of the data frame")
      else curr_col <- rep(curr_col, length.out = self$get_data_frame_length())
    }
-    if(use_col_name_as_prefix) curr_col_name = self$get_next_default_column_name(col_name[i])
+    if(use_col_name_as_prefix) curr_col_name = self$get_next_default_column_name(col_name)
    else curr_col_name = col_name[[i]]
    curr_col_name <- make.names(iconv(curr_col_name, to = "ASCII//TRANSLIT", sub = "."))


@N-thony kindly explain to me why you think this solves the problem. My understanding is, the col_name will always have the length of num_cols and also notice line 716 still uses col_name[[i]] because of this assumption. Shouldn't the change apply to line 716 as well?

I'm not sure and I can't remember why this change was made. If the problem has been fixed, we can close this PR. I did a comparison of the code with one of my old branch and realized the change where I did revert it. As you said, if it is fixed then this PR is not needed anymore

@N-thony kindly explain to me why you think this solves the problem. My understanding is, the col_name will always have the length of num_cols and also notice line 716 still uses col_name[[i]] because of this assumption. Shouldn't the change apply to line 716 as well?

This is done now

N-thony · 2024-08-06T09:46:28Z

@rdstern I will close this PR since the problem has been fixed already.

Patowhiz · 2024-08-06T11:48:00Z

@N-thony

Note that closing this PR means the first item is not yet fixed.

Regarding the second item in the issue, it is not specific to the Climsoft import but applies generally to any similar operations. I hope you intend to fix that in another PR as well?

I'm accepting to close this PR and will reopen the issue issue it attempts to solve.

On a separate note, I made a few comments above. They are quite critical and generally apply to other dialogs with similar operations. It's essential that you consider the items I have outlined when attempting to fix the issue.

Thanks.

rdstern · 2024-08-07T15:37:38Z

@N-thony and @Patowhiz can anything be merged yet? It would be good to include this in the Friday version, ready for @jkmusyoka checks in the Malawi workshop next week.

Patowhiz · 2024-08-07T17:32:06Z

@rdstern Considering my review comments above, which explain in detail where the problem might be and also point out the potential regression that may arise due to the solution provided here, from my side, this is not ready to merge.

I'm happy for @lilyclements to take over, review, and merge this one. The R function has gone through several modifications, and I may be missing something that @N-thony is already familiar with, which might render my review irrelevant.

N-thony · 2024-08-14T06:49:45Z

@rdstern Considering my review comments above, which explain in detail where the problem might be and also point out the potential regression that may arise due to the solution provided here, from my side, this is not ready to merge.

I'm happy for @lilyclements to take over, review, and merge this one. The R function has gone through several modifications, and I may be missing something that @N-thony is already familiar with, which might render my review irrelevant.

@lloyddewit can you review this when possible?

lloyddewit · 2024-08-15T06:49:17Z

@lloyddewit can you review this when possible?

@N-thony Thanks, but I don't think I have anything to add. I suggest that one of the normal approvers does the review/approval.

N-thony · 2024-08-15T07:02:48Z

@lloyddewit can you review this when possible?

@N-thony Thanks, but I don't think I have anything to add. I suggest that one of the normal approvers does the review/approval.

@lloyddewit sorry, I mean @lilyclements.

lilyclements · 2024-08-15T14:00:10Z

Looking at the code history, this [[i]] was added in by "Bug fix by pasting multiple columns", which relates to PR #8954 on issue #8815

If we want to merge this, I suggest we reopen #8954 with an explanation as to why it's reopen and a link to this PR.

N-thony · 2024-08-17T10:10:26Z

Looking at the code history, this [[i]] was added in by "Bug fix by pasting multiple columns", which relates to PR #8954 on issue #8815

If we want to merge this, I suggest we reopen #8954 with an explanation as to why it's reopen and a link to this PR.

@lilyclements I believe the issue has been resolved. The problem occurred because we were using the default prefix when attempting to add multiple columns, which was incorrect in this context. We should use the default prefix only when adding multiple columns based on a single provided column name.

N-thony · 2024-08-17T10:12:50Z

@rdstern @Patowhiz you can have a look now, I believe now we will not have any problem from climsoft too even tho it has been fixed other way.

rdstern

@N-thony this seems fine now on the few checks I made. I checked an example on splu=it and also ranks on the row statistics. Both produce multiple variables.

I am also really happy that:
a) There seem fewer regression bugs
b) It is other than me finding them. This was an important @Patowhiz discovery and @rachelkg has also been making suggestions recently. It will be good if (at least the) rest of the development team become more proficient users/testers too.
@Patowhiz can you approve and merge please

N-thony · 2024-09-20T06:08:37Z

@N-thony this seems fine now on the few checks I made. I checked an example on splu=it and also ranks on the row statistics. Both produce multiple variables.

I am also really happy that: a) There seem fewer regression bugs b) It is other than me finding them. This was an important @Patowhiz discovery and @rachelkg has also been making suggestions recently. It will be good if (at least the) rest of the development team become more proficient users/testers too. @Patowhiz can you approve and merge please

@rdstern @Patowhiz I'm aiming for the test release at the end of today. Can we get this merged?

rdstern · 2024-09-21T11:36:32Z

@N-thony I have approved. @Patowhiz back to you. I agree it would be great to merge this one in the forthcoming release

Patowhiz · 2024-09-21T11:37:14Z

@rdstern my comments and questions above are yet to be responded to by @N-thony . I suspect we may regress other parts if I merge this without knowing the answers to my questions above. In my above comments, I've also given an alternative solution that involves solving the problem at the dialog level (this has far less risk).

I'm happy for you and @lilyclements or someone else to review and merge this.

rdstern · 2024-09-21T11:48:51Z

@N-thony are you reasonably confident that your solution is ok and will not cause the regressions @Patowhiz is concerned about? If so, then I suggest we ask @ChrisMarsh82 to check and merge? It worked nicely for the examples I looked at - which is seems were a problem before. So I am keen to progress.

N-thony · 2024-09-21T12:29:07Z

@rdstern my comments and questions above are yet to be responded to by @N-thony . I suspect we may regress other parts if I merge this without knowing the answers to my questions above. In my above comments, I've also given an alternative solution that involves solving the problem at the dialog level (this has far less risk).

I'm happy for you and @lilyclements or someone else to review and merge this.

@rdstern @Patowhiz if the code looks okay, another way to make sure this works well is through tests. It will be good to test this as much as we want to be sure before merging.

rdstern · 2024-09-21T16:50:00Z

@N-thony I am unhappy delaying the merging of this bug-fix. I have tested the Prepare > Text > Split with data from tidyr (world populations) and it works fine with your fix.
I have now looked at the same example in the latest version and it doesn't work. That's what started this off with Patrick's issue on July 30. So, unless we merge, we are waiting 2 months for the fix of an obvious problem.

I also checked the only other examples I could think of, where multiple variables are produced and added to an existing data frame. They are Prepare > Numeric > Random samples. That doesn't work on the merged version and seems fine with your fix.

And the Prepare > Numeric > Row Statistics (with ranks) is the only other example I can think of for now. That also works fine with your fix.

So I find it curious that the 2 best programmers won't take responsibility in this case. I would like this to be merged in the forthcoming version, because it is the only known (obvious) bugs. Assuming @Patowhiz is still not prepared to merge, then please @ChrisMarsh82 could you do so. Whether you check the code or just merge I don't care any more - it is time this fix was included.

Dismissing this review to facilitate quick merging of this PR so that the dialogs that depend on it can be functional

Patowhiz

Merging this to facilitate further testing and new release that needs the bug fixed.

bug fixes in adding columns

9462617

Patowhiz previously requested changes Aug 6, 2024

View reviewed changes

N-thony closed this Aug 6, 2024

Patowhiz reopened this Aug 6, 2024

Patowhiz closed this Aug 6, 2024

Patowhiz mentioned this pull request Aug 6, 2024

Addition of columns #9086

Closed

N-thony reopened this Aug 6, 2024

minor code change

6c162f4

effectively fixed the bug in adding multipl columns

5e67018

rdstern approved these changes Sep 19, 2024

View reviewed changes

Patowhiz approved these changes Sep 21, 2024

View reviewed changes

Patowhiz merged commit fb93fb5 into IDEMSInternational:master Sep 21, 2024
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug fixes in adding columns #9091

Bug fixes in adding columns #9091

N-thony commented Jul 31, 2024

N-thony commented Aug 4, 2024

rdstern commented Aug 4, 2024

Patowhiz left a comment

Patowhiz Aug 6, 2024

N-thony Aug 6, 2024

N-thony Aug 7, 2024

N-thony commented Aug 6, 2024

Patowhiz commented Aug 6, 2024

rdstern commented Aug 7, 2024 •

edited

Loading

Patowhiz commented Aug 7, 2024

N-thony commented Aug 14, 2024

lloyddewit commented Aug 15, 2024

N-thony commented Aug 15, 2024

lilyclements commented Aug 15, 2024

N-thony commented Aug 17, 2024

N-thony commented Aug 17, 2024

rdstern left a comment •

edited

Loading

N-thony commented Sep 20, 2024

rdstern commented Sep 21, 2024

Patowhiz commented Sep 21, 2024

rdstern commented Sep 21, 2024

N-thony commented Sep 21, 2024

rdstern commented Sep 21, 2024

Patowhiz left a comment

Bug fixes in adding columns #9091

Bug fixes in adding columns #9091

Conversation

N-thony commented Jul 31, 2024

N-thony commented Aug 4, 2024

rdstern commented Aug 4, 2024

Patowhiz left a comment

Choose a reason for hiding this comment

Patowhiz Aug 6, 2024

Choose a reason for hiding this comment

N-thony Aug 6, 2024

Choose a reason for hiding this comment

N-thony Aug 7, 2024

Choose a reason for hiding this comment

N-thony commented Aug 6, 2024

Patowhiz commented Aug 6, 2024

rdstern commented Aug 7, 2024 • edited Loading

Patowhiz commented Aug 7, 2024

N-thony commented Aug 14, 2024

lloyddewit commented Aug 15, 2024

N-thony commented Aug 15, 2024

lilyclements commented Aug 15, 2024

N-thony commented Aug 17, 2024

N-thony commented Aug 17, 2024

rdstern left a comment • edited Loading

Choose a reason for hiding this comment

N-thony commented Sep 20, 2024

rdstern commented Sep 21, 2024

Patowhiz commented Sep 21, 2024

rdstern commented Sep 21, 2024

N-thony commented Sep 21, 2024

rdstern commented Sep 21, 2024

Patowhiz left a comment

Choose a reason for hiding this comment

rdstern commented Aug 7, 2024 •

edited

Loading

rdstern left a comment •

edited

Loading