ValueError: redefine phi relevance function: all points are 0 #13

Open
sungreong opened this issue Sep 21, 2020 · 4 comments

sungreong commented Sep 21, 2020

I have an issue with the "redefine phi relevance function" error.
My data has a quirk: some rows have the same x but different y values.

rg_mtrx = [

    [35000,  1, 0],  ## over-sample ("minority")
    [125000, 0, 0],  ## under-sample ("majority")
    [200000, 0, 0],  ## under-sample
    [250000, 0, 0],  ## under-sample
]

oversample_2 = smogn.smoter(
    data = data_merge, 
    y = target,
    k = 7,                    ## positive integer (k < n)
    pert = 0.01,              ## real number (0 < R < 1)
    samp_method = 'balance',  ## string ('balance' or 'extreme')
    drop_na_col = True,       ## boolean (True or False)
    drop_na_row = True,       ## boolean (True or False)
    replace = False,        
    rel_thres = 0.10,         ## real number (0 < R < 1)
    rel_method = 'manual',    ## string ('auto' or 'manual')
    rel_ctrl_pts_rg = rg_mtrx ## 2d array (format: [x, y])
)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-71-92b5331e2604> in <module>
     23 #     rel_xtrm_type = 'both', ## unused (rel_method = 'manual')
     24 #     rel_coef = 0.001,        ## unused (rel_method = 'manual')
---> 25     rel_ctrl_pts_rg = rg_mtrx ## 2d array (format: [x, y])
     26 )

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/smogn/smoter.py in smoter(data, y, k, pert, samp_method, under_samp, drop_na_col, drop_na_row, replace, rel_thres, rel_method, rel_xtrm_type, rel_coef, rel_ctrl_pts_rg)
    178 
    179     if all(i == 1 for i in y_phi):
--> 180         raise ValueError("redefine phi relevance function: all points are 0")
    181     ## ---------------------------------------------------------------------- ##
    182 

ValueError: redefine phi relevance function: all points are 0

import smogn
housing_smogn_2 = smogn.smoter(
    data = housing_smogn, 
    y = "SalePrice"
)
housing_smogn_3 = smogn.smoter(
    data = housing_smogn_2.reset_index(drop=True), 
    y = "SalePrice"
)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-67-a679777bf14e> in <module>
      2 housing_smogn_3 = smogn.smoter(
      3     data = housing_smogn_2.reset_index(drop=True),
----> 4     y = "SalePrice"
      5 )

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/smogn/smoter.py in smoter(data, y, k, pert, samp_method, under_samp, drop_na_col, drop_na_row, replace, rel_thres, rel_method, rel_xtrm_type, rel_coef, rel_ctrl_pts_rg)
    175     ## phi relevance quality check
    176     if all(i == 0 for i in y_phi):
--> 177         raise ValueError("redefine phi relevance function: all points are 1")
    178 
    179     if all(i == 1 for i in y_phi):

ValueError: redefine phi relevance function: all points are 1


Please fix this :)

@Bahar1978

Hello, thanks for SMOGN. Unfortunately, I have the same issue. Could you please guide us on how to solve it?

@jeongwhanchoi

Hello, I also have the same issue. Could you solve this corner case?

@payalmohapatra

I am facing the same issue as well. Is there a fix released?

@nickkunz
Owner

nickkunz commented Dec 8, 2022

Thank you for using this Python implementation of SMOGN. I apologize for the delay. It appears that the distribution of your y response variable may not contain any box plot extremes, which the Φ function needs in order to automatically determine which range of values to over-sample.
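
A quick way to test that explanation before calling `smogn.smoter` is to check whether the target has any values outside the box plot whiskers. The sketch below is only an approximation: `boxplot_extremes` is a hypothetical helper (not part of smogn), it uses plain quartile fences at `coef` times the IQR, and smogn's internal box plot statistics may differ slightly. The sample values are made up for illustration.

```python
import statistics

def boxplot_extremes(y, coef=1.5):
    """Return (low, high) lists of values outside the box plot whiskers.

    Hypothetical helper: uses the fences Q1 - coef*IQR and Q3 + coef*IQR,
    which approximates, but may not exactly match, smogn's own statistics.
    """
    q1, _, q3 = statistics.quantiles(y, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - coef * iqr, q3 + coef * iqr
    low = [v for v in y if v < lo_fence]
    high = [v for v in y if v > hi_fence]
    return low, high

# Made-up target values, for illustration only
y_no_tail = [100, 110, 120, 130, 140, 150, 160]
y_with_tail = y_no_tail + [400]

print(boxplot_extremes(y_no_tail))            # no extremes on either side
print(boxplot_extremes(y_with_tail))          # 400 is flagged as a high extreme
print(boxplot_extremes(y_no_tail, coef=0.1))  # a smaller coefficient flags the tails
```

If both lists come back empty for your target, the automatic relevance function has nothing to mark as rare. The second snippet in the report re-runs `smoter` on data that has already been balanced once, which may be one way to end up in that situation.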

Please consider either reducing the rel_coef argument's default value or manually specifying the range of values to over-sample and under-sample, as exhibited here: https://github.com/nickkunz/smogn/blob/master/examples/smogn_example_3_adv.ipynb
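
When specifying the ranges manually, it may also help to confirm that the control points actually bracket your own target values. The `rg_mtrx` in the report uses values on the scale of house prices, so it might not fit a different target. The sketch below is a hypothetical check (not part of smogn). It reads the first column of each row as the y value and the second as its relevance, as the inline comments in the report suggest, and it assumes the relevance stays flat outside the range of the control points, which I have not verified against smogn's source.

```python
def check_control_points(y, ctrl_pts):
    """Return a list of likely problems with manual relevance control points.

    Hypothetical helper: each row is read as [y_value, relevance, ...].
    Assumes relevance does not change outside the control point range.
    """
    xs = [row[0] for row in ctrl_pts]
    relevances = {row[1] for row in ctrl_pts}
    problems = []
    if max(y) <= min(xs):
        problems.append("all y values lie at or below the lowest control point")
    if min(y) >= max(xs):
        problems.append("all y values lie at or above the highest control point")
    if len(relevances) < 2:
        problems.append("control points use a single relevance value")
    return problems

# Control points copied from the report
rg_mtrx = [
    [35000,  1, 0],
    [125000, 0, 0],
    [200000, 0, 0],
    [250000, 0, 0],
]

# Made-up targets, for illustration only
print(check_control_points([10, 20, 30], rg_mtrx))
print(check_control_points([30000, 150000, 260000], rg_mtrx))
```

An empty list means the control points at least span the data. Note also that, going by the tracebacks in the report, the two messages appear to be attached to the opposite conditions: the `all(i == 1 ...)` check raises "all points are 0" and the `all(i == 0 ...)` check raises "all points are 1". So the first error likely means every point received relevance 1, and the second that every point received relevance 0.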
