Feature Request for Shave #24

Open
shambhu112 opened this issue Jun 7, 2021 · 1 comment

Comments

@shambhu112

It would be great to have a function that shaves off the rows and columns of poorly correlated variables, based on a threshold.

i.e. something like
shave(min = -0.2, max = 0.2)

This would shave off (i.e. not show) variables whose correlations with all other variables fall within the range above.

@r-link
Owner

r-link commented Jun 9, 2021

This is actually pretty easy to implement. Basically, you'd have to subset the numeric columns of the dataset with something like data[ , sapply(1:ncol(data), function(i) max(abs(cor(data)[-i, i]))) > threshold], which keeps only the columns whose strongest absolute correlation with any other column exceeds the threshold.
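
For illustration, the subsetting idea could be sketched as a small standalone helper. This is only a sketch under stated assumptions: shave_weak() and its threshold argument are hypothetical names, not part of corrmorant, and a single absolute threshold stands in for the symmetric min/max range from the request. The correlation matrix is computed once instead of once per column.

```r
# Hypothetical helper, NOT part of corrmorant: keep only the numeric columns
# whose strongest absolute correlation with any other column exceeds `threshold`.
shave_weak <- function(data, threshold = 0.2) {
  # restrict to numeric columns, keeping the data frame structure
  num <- data[, vapply(data, is.numeric, logical(1)), drop = FALSE]
  # absolute pairwise correlations, computed once
  cm <- abs(cor(num, use = "pairwise.complete.obs"))
  diag(cm) <- 0  # ignore each variable's correlation with itself
  keep <- apply(cm, 2, max, na.rm = TRUE) > threshold
  num[, keep, drop = FALSE]
}

# Toy data: x and y are almost perfectly correlated, z is uncorrelated with both
x <- 1:8
d <- data.frame(
  x = x,
  y = x + 0.1 * c(1, -1, 1, -1, 1, -1, 1, -1),
  z = c(1, -1, -1, 1, 1, -1, -1, 1)
)
names(shave_weak(d, threshold = 0.5))
```

In this toy data set only x and y survive, because z is constructed to be uncorrelated with both of them.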

I am not sure if I really want to add such a feature because it does not really fit with the philosophy behind corrmorant. My idea was to provide a versatile tool for data inspection, but to make it extra complicated to use it for data dredging and p-hacking. If you ever wondered why there is no built-in function to add p-values to the correlations, that's the reason (you can do it with add_funtext(), but if you know enough R to find out how, you probably also know why that's not a good idea).

I see why shave() may be useful, but I am not really fond of the idea that people might use it to remove the variables that are not strongly correlated with anything and then publish a paper based on the reduced dataset without mentioning it.
