
Shifu 0.2.5 Support Missing Value As a Bin

Zhang Pengshan (David) edited this page Apr 3, 2015 · 1 revision

Add Missing Value Bin

After running 'shifu stats', check ColumnConfig.json: ColumnStats:binCountPos and ColumnStats:binCountNeg each contain one extra bin at the end. This last bin is the missing value bin.

Please note that the size of binCountPos and binCountNeg is therefore the number of bin boundaries plus one.
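The layout described above can be illustrated with a minimal sketch. It is not Shifu's implementation. It assumes that each boundary is the inclusive lower bound of its bin, that the first boundary is negative infinity, and that a missing value has already been mapped to `None`. The function and parameter names are hypothetical.

```python
from bisect import bisect_right


def count_bins(values, labels, boundaries):
    """Count positives and negatives per bin, with missing values last.

    values:     floats, or None for a missing value (assumption)
    labels:     1 for positive, 0 for negative
    boundaries: sorted lower bounds of the bins (assumed convention)
    """
    n = len(boundaries)
    # One slot per regular bin plus one trailing slot for missing values.
    bin_count_pos = [0] * (n + 1)
    bin_count_neg = [0] * (n + 1)
    for value, label in zip(values, labels):
        if value is None:
            idx = n  # the extra bin at the end
        else:
            # Index of the last boundary that is <= value, clamped to bin 0.
            idx = max(0, bisect_right(boundaries, value) - 1)
        if label == 1:
            bin_count_pos[idx] += 1
        else:
            bin_count_neg[idx] += 1
    return bin_count_pos, bin_count_neg
```

With three boundaries, both returned lists have four entries, and the fourth entry holds the counts of records whose value is missing.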

New KS, IV Value

The new KS and IV values are computed from the new binCountNeg and binCountPos, which include the missing value counts.
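As a rough illustration of how the missing value bin enters these statistics, the sketch below computes KS and IV from the two count lists using the common textbook definitions. The exact formulas, scaling, and smoothing used by Shifu are not given on this page, so the `eps` smoothing term and the function name are assumptions.

```python
import math


def ks_and_iv(bin_count_pos, bin_count_neg, eps=1e-10):
    """Return (KS, IV) for per-bin counts; the last bin may be the missing bin."""
    total_pos = sum(bin_count_pos)
    total_neg = sum(bin_count_neg)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes need at least one record")

    cum_pos = 0.0
    cum_neg = 0.0
    ks = 0.0
    iv = 0.0
    for pos, neg in zip(bin_count_pos, bin_count_neg):
        pos_rate = pos / total_pos
        neg_rate = neg / total_neg
        # KS: largest gap between the two cumulative distributions.
        cum_pos += pos_rate
        cum_neg += neg_rate
        ks = max(ks, abs(cum_pos - cum_neg))
        # IV: sum of (rate difference) * WOE; eps avoids log(0) (assumption).
        iv += (pos_rate - neg_rate) * math.log((pos_rate + eps) / (neg_rate + eps))
    return ks, iv
```

Because the totals include the missing value bin, adding that bin changes every per-bin rate, and so it changes both statistics even for the bins that were already present.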

What is missing value?

In Shifu 0.2.5, the definition of a missing value is simple. For a numeric column, 'null', empty, and non-numeric values are treated as missing values. In a future version, we may define missing values more precisely and may even let users specify the rules.
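The rule for numeric columns can be sketched as a small predicate. This is only an illustration of the rule as stated here, not Shifu's code. The function name is hypothetical, and treating whitespace-only strings as empty is an assumption.

```python
def is_missing_numeric(raw):
    """Return True if a raw field of a numeric column counts as missing."""
    if raw is None:
        return True
    text = str(raw).strip()
    # 'null' and empty fields are missing by definition.
    if text == "" or text == "null":
        return True
    # Anything that does not parse as a number is also missing.
    # Note: Python's float() accepts strings such as "nan" and "inf";
    # whether Shifu treats those as numbers is not stated on this page.
    try:
        float(text)
    except ValueError:
        return True
    return False
```

For example, `is_missing_numeric("abc")` and `is_missing_numeric("")` are both true, while `is_missing_numeric("3.14")` is false.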

'binWeightNeg' and 'binWeightPos'

Two lists are added to ColumnStats for the weighted negative and positive values. As a result, each column has two WOE values: one WOE based on counts and one WOE based on weights.
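The difference between the two WOE values can be shown by applying the same formula to two pairs of lists. The sketch below uses the common definition ln(positive rate / negative rate) per bin. The sign convention, the `eps` smoothing, and the function name are assumptions, since this page does not state the formula Shifu uses.

```python
import math


def woe_per_bin(bin_pos, bin_neg, eps=1e-10):
    """Return one WOE value per bin from positive and negative totals per bin.

    Pass counts to get the count-based WOE, or weight sums to get the
    weight-based WOE; the formula is the same in both cases.
    """
    total_pos = sum(bin_pos)
    total_neg = sum(bin_neg)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes need a non-zero total")
    return [
        math.log((pos / total_pos + eps) / (neg / total_neg + eps))
        for pos, neg in zip(bin_pos, bin_neg)
    ]
```

Calling `woe_per_bin(binCountPos, binCountNeg)` would give the count-based values, and calling it with the binWeightPos and binWeightNeg lists would give the weight-based values. The two results differ whenever records carry unequal weights.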
