Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added new option to Rename Columns dialog #8485

Merged
merged 17 commits into from
Aug 24, 2023

Conversation

MeSophie
Copy link
Contributor

@MeSophie MeSophie commented Aug 9, 2023

Fixes #8439
@rdstern, @lloyddewit I have added a new option Rename call Edit on rename columns dialog (Rename Columns>Rename With ). Please have a look.

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeSophie that's a great start. It already seems to work which is great.
Could you quickly re-instate the selector and receiver? I am assuming the used makes the multiple selection from the columns they specify - of course that may be all the columns. Or is that difficult to do?
Once that's done @lilyclements will be happy, but may have other ideas on further (multiple) edits that could be included. For example I could easily change L to Lily, but how could I add Lily to the front of each name. I suspect if we allowed regex I could change perhaps , or backslash_dot to Lily. But I am not good at that sort of thing.
@lilyclements and @MeSophie I also wonder about the layout of the controls in the dialog. It has become quite large, but also with empty space, particularly under the data selector.

@MeSophie
Copy link
Contributor Author

MeSophie commented Aug 9, 2023

@rdstern I have put the selector and the receiver back. But is not useful as you can see on the picture the is no parameter to specify the columns just the string to edit. Reason why I removed them before. I also try to adjust the dialog size.

image

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeSophie here are 2 things I noticed about the current dialog.
image

a) First is that the receiver has disappeared, when I went back to the dialog.
b) Second is the error message it has given. You see I am not looking for 10 at the start of the name. I was asking for Ends With. But it doesn't let me continue.

Incidentally I am testing with the rockart data from DAAG. I wonder how you are testing yourself? When you do tests, of improvements I would be happy for you to describe what you have done.

You will see, above, I made a selection, because I am also keen to see how your rename works on a select.

c) I now see your problem with the fact that you don't know how to apply this option to selected columns. At the same time I have discovered an oddity with the other 3 options. They can each be used for the whole data frame, just by leaving the receiver blank. (Unlike other dialogs, Ok is enabled when the receiver is left empty and the command (e.g. to Lower, is then applied to the whole data frame. We need to change that.

  1. The dialog Visualise data has a control you can use.
image

In the rename dialo, put those radio buoons side by side.

Then you can start - as you hoped - with the whole data frame. The selector and the multiple receiver, are not visible.

  1. Then you can implement the option of Edit with selected columns. For this, you can follow the code when a select is made. In defining a select you can have 2 conditions, with one being getting a set of columns and the other being the condition. Then they are combined with &. So you will then have 2 conditions, just as the select.

d) And also I expect there is a bit more that may be possible to alter the names construcively. @lilyclements I also wonder whether it would be easy to have a small report in the output window saying perhaps just xxx names in rockart changed Is that easy to find out?

@lilyclements
Copy link
Contributor

@rdstern to (d), we have two options to implement this

  1. We could add this into our function, rather than the dialog. This might be useful for when the functions are used outside of R-Instat. E.g., something like ncol(rockArt %>% dplyr::select(starts_with("SS"))) (but of course we check how many starts_with("SS") before we make the changes, and print in the output).

  2. Or, we can do this by adding an option cat in the dialog code - we do something similar to this in dlgSurvivalObject

I think 1. would make more sense personally, which I assume would be me changing that bit of R code
(and then @MeSophie would need to set ucrBase.clsRsyntax.iCallType = 2 in the dialog code, I think, when the new "Edit" option is selected (and iCallType to be what it currently is when this "Edit" option isn't selected)

@rdstern
Copy link
Collaborator

rdstern commented Aug 11, 2023

@lilyclements many thanks for your suggestion on point d). I suggest @MeSophie could usefully sort out the other points - I had a useful discussion with her. Then this could be considered, either here or as a separate issue.

@MeSophie
Copy link
Contributor Author

@lilyclements Here are two codes i hope you can help. The aim is to combine these two codes in order to select the columns to be edited. I've tried running the selection code and then the rename columns code but the changes still apply to all columns starting with f.

data_book$add_column_selection(data_name="survey", name="selection", column_selection=list(C0=list(operation="base::match", parameters=list(x=c("fert","village","field"))), C1=list(operation="tidyselect::starts_with", parameters=list(match="f", ignore.case=FALSE))), and_or="&")
and
data_book$rename_column_in_data(data_name="survey", type="rename_with", .fn=stringr::str_replace, .cols=tidyselect::starts_with("f")t, pattern="f", replacement="G")

Talking to Antoine, he suggested this third code:
data_book$rename_column_in_data(data_name="survey", type="rename_with", .fn=stringr::str_replace, pattern="f", replacement="G", .cols=c("fert","field","village"))

But the problem with this third code is it didn't take to account the start_with, end_with, contains, or match function. So the changes will take place wherever the corresponding string is located, whether at the beginning, in the middle or at the end of the column name.

@lilyclements
Copy link
Contributor

@MeSophie good question! This can be achieved really simply using the vars argument.

E.g. .cols=tidyselect::starts_with("f", vars = c("fert", "size", "field", "village"))

However, I am finding some issues in the vars parameter in general in R. I think this is not a problem with R-Instat, but with the R code.

E.g.

# Code run from Script Window
var_names <- c("fert", "size", "field", "village")
data_book$rename_column_in_data(data_name="survey", type="rename_with",
                                .fn=stringr::str_replace,
                                .cols=tidyselect::starts_with("f", vars = (var_names)),
                                pattern="f", replacement="G")
head(data_book$get_data_frame("survey"))
# Nothing is replaced

var_names <- c("fert", "field", "village")
data_book$rename_column_in_data(data_name="survey", type="rename_with",
                                .fn=stringr::str_replace,
                                .cols=tidyselect::starts_with("f", vars = (var_names)),
                                pattern="f", replacement="G")
head(data_book$get_data_frame("survey"))
# Field is replaced, but nothing else

Very strange!

I suggest for now, implementing this, and ignore the vars argument. We can open a separate issue to add that in in due course when this bug is fixed in the R code (I will investigate it further too - it is uncommon for a bug in this package to go unnoticed. It is more likely I am misusing it, but I can't see how here!)

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeSophie some excellent progress and I used it well. Then I got to here:

image

Notice the receiver has gone. It was there before. And I still have the message when I press Ok, though my search is Contains. So it isn't starting with.

Another problem, I tried abbreviate at the start - with no columns selected (the receiver was there then. That used to work and assume all the columns. Now it gives an error. I would prefer, when the receiver is there and empty, that Ok is disabled - as for the other dialogs.

@MeSophie
Copy link
Contributor Author

@rdstern Please can you check the dialog again. This message is not an error just a notification telling you that you cannot start filling and input with _ and it this the same with other special character (&,$,#...etc)
image

@rdstern
Copy link
Collaborator

rdstern commented Aug 22, 2023

@MeSophie I am just about to check your new rename code. But I want to start by responding to your comment above, where you say:
"This message is not an error just a notification telling you that you cannot start filling and input with _ and it this the same with other special character (&,$,#...etc)"
Now here your "other special characters" doesn't really apply, because they are also not allowed in a variable name. It is only underscore and dot (_ or .) that can be used in a name. So I should be able to search for them. However, they are special characters and I think that the default is to use regex. In that case the dot is short for any character.

I am also happy to assume that a variable name cannot start with an underscore, but I often use it as a separator in a name. And a name can start, or include dot. So, in searching for them in a string, we either have to use escape characters, which I think is the default in stringr, or we need to turn regex off as our default - which @N-thony has done in our other string handling dialogs.

Now to test the new code.

By the way, and this is for discussion with @lilyclements too. I find this to be a really important feature you are adding. With Patrick's new work we are now better able to handle wide data sets - so data sets with many variables. Now R-Instat is designed to make R easy to use. With a wide dataset this must include making changes to the names really easy. And this is what you are doing. It is part of the case we are trying to build up, that stats packages (including R-Instat) have a spreadsheet view of the data that is even better than in a spreadsheet. And ordinary spreadsheets don't have special facilities for coping with variable names. It is important to have these - and to make them easy!

@rdstern
Copy link
Collaborator

rdstern commented Aug 22, 2023

Just two comments:
a) Cosmetic - the size of the dialog changes with each of the 3 radio buttons. You keep it the same for both options of the Rename With, so with and without the selected columns. It looks pretty empty each way. I wonder if there is something you could do easily? For example:

  1. Move the group box up when for the whole data-frame - so without the multiple receiver. Then bove it down to its current position when the multiple receiver is there? It is smaller then because the Edit option is not there. So the dialog can remain of the same size.
  2. Or keep it as it is, for the whole data frame, and make the dialog smaller when you select the columns, because the Edit is absent.

b) Now the last one I hope. It is back to the underscore and dot that we have to solve.

image

Here is an example, with a surprising result. I am replacing .geo by nothing. As a literal string .geo does not exist in the variables you see here. But it works and the result is here:

image

Now that is because - with regex - which is the default - dot matches any character. So here it has matched the underscore (which is actually what I wanted here! But I would prefer to change the underscore directly.

I checked your code and I think the solution is easy. Here is your command, changed as I think you need to do.:
data_book$rename_column_in_data(data_name="finalresponses1", type="rename_with", .fn=stringr::str_replace, pattern=stringr::fixed(".geo"), replacement="")
I have added stringr::fixed around your string. Otherwise it will use regex.

And at least for now, get rid of the check for odd symbols in the pattern. If you give them, they won't be in the names, so the names just won't change.

Now a powerful feature to add, and easily. I suggest there are occasions when we will wish to use the stringr default, i.e. to use regex. Could you add a checkbox at the bottom, with label Use Regular Expressions. The default is unchecked, so you add the fixed option above. If checked, then keep the replace function as you had it, so without the fixed option.
(If you look at the Prepare > Column: Text > Find/Replace Replace button you will see we use that option here - and add a keyboard. I suggest we don't need the keyboard here.

@MeSophie
Copy link
Contributor Author

@rdstern the dialog is ready to be test.

rdstern
rdstern previously approved these changes Aug 23, 2023
Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeSophie this is great - very well done. @lloyddewit as soon as you are ok I would really like to see this merged.

Copy link
Contributor

@lloyddewit lloyddewit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeSophie Thank you, this looks good, just 2 suggestions

instat/dlgName.vb Outdated Show resolved Hide resolved
instat/dlgName.vb Outdated Show resolved Hide resolved
Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeSophie I am now getting an error sometimes when I have the Regular Expressions on. I thought it was working before, and my R isn't good enough to see what's wrong. I have been trying to replace using dot as any character.
If you can fix this easily, fine. Otherwise I would really like to get this merged, so make the option invisible for now.

@MeSophie
Copy link
Contributor Author

@rdstern I have since tried to reproduce this error without success, so I have removed the checkbox.

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lloyddewit I'd be happy for this to be merged.

@lloyddewit lloyddewit changed the title Added a New Option on Rename Columns Dialog Added new option to Rename Columns dialog Aug 24, 2023
@lloyddewit lloyddewit merged commit 0a44081 into IDEMSInternational:master Aug 24, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A new option to make renaming columns easier
4 participants