#34 adds support for random effects via the pipe operator. While going through the example in the PR, I noticed that terms written as x2|m were printed as m:x2, because formulaic sorts the factors within a term. Apart from the : that should be a | (this is not a major issue), order matters here: x2|m is not the same as m|x2.
This issue suggests that the Operator class could have a commutative property that indicates whether the order of its operands can be changed.
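To make the suggestion concrete, here is a minimal sketch of how such a flag could affect printing. `SketchOperator` and `render_term` are hypothetical stand-ins invented for illustration; they are not formulaic's actual `Operator` class or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchOperator:
    """Hypothetical stand-in for an operator; not formulaic's Operator class."""
    symbol: str
    commutative: bool = True


def render_term(operator, factors):
    """Join factor names, sorting them only when the operator is commutative."""
    names = sorted(factors) if operator.commutative else list(factors)
    return operator.symbol.join(names)


interaction = SketchOperator(":", commutative=True)
random_effect = SketchOperator("|", commutative=False)

print(render_term(interaction, ["x2", "m"]))    # m:x2
print(render_term(random_effect, ["x2", "m"]))  # x2|m
```

With the flag set to `False`, the operands keep the order in which they were written, so x2|m and m|x2 stay distinct.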
But maybe there's a better approach? Let's have this space for discussion.
Hi @tomicapretto! I thought about this a bit over the weekend, and here is my proposal. Let me know what you think.
Rather than adding a commutative option to Operator objects, I propose to instead allow Term instances to override how resulting columns are named. This would allow you to subclass Term into, say, RandomEffectTerm. Random effect operators would then generate RandomEffectTerm instances instead of Term instances, but honour the same API. The result might look a bit like this (untested code):
    import pandas
    from formulaic import model_matrix
    from formulaic.parser.types import Term

    data = pandas.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c']})

    class RandomEffectTerm(Term):
        def __init__(self, group_factors, group):
            self.group_factors = set(group_factors)
            self.group = group

        @property
        def factors(self):
            return self.group_factors | {self.group}

        def get_column_name(self, features):
            # Proposed hook (does not exist yet). features would look something like:
            # {'a': 'a', 'b': 'b[T.a]'}
            return ":".join(sorted(v for k, v in features.items() if k != self.group.expr)) + "|" + features[self.group.expr]

    # model_matrix("(a|b)", data) -> columns of {"a|b[T.a]", "a|b[T.b]", ...}
If this looks good to you, I'll code it up when I get a chance. The last piece would then be to formalise how to surface the different segments of the model matrix (or perhaps how to store multiple of them), in order to get the "fixed effect" and "random effect" parts... but we can deal with that as a separate thread.
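For anyone who wants to try the naming idea without touching formulaic, the following self-contained sketch mimics it with plain Python classes. `SketchTerm`, `SketchRandomEffectTerm`, and the shape of the `features` mapping are assumptions made for illustration; they only approximate the proposal above and are not part of the library.

```python
class SketchTerm:
    """Hypothetical stand-in for a term; not formulaic's Term class."""

    def __init__(self, factors):
        self.factors = frozenset(factors)

    def get_column_name(self, features):
        # Default naming is order-insensitive: sort the encoded factor names.
        return ":".join(sorted(features[f] for f in self.factors))


class SketchRandomEffectTerm(SketchTerm):
    """Keeps the grouping factor on the right-hand side of the pipe."""

    def __init__(self, group_factors, group):
        super().__init__(set(group_factors) | {group})
        self.group_factors = frozenset(group_factors)
        self.group = group

    def get_column_name(self, features):
        # Sort only the left-hand factors; the group always comes last.
        left = ":".join(sorted(features[f] for f in self.group_factors))
        return left + "|" + features[self.group]


# Assumed mapping from factor expression to its encoded column label.
features = {"a": "a", "b": "b[T.a]"}

print(SketchTerm(["b", "a"]).get_column_name(features))              # a:b[T.a]
print(SketchRandomEffectTerm(["a"], "b").get_column_name(features))  # a|b[T.a]
```

The subclass overrides only the naming method, so anything that consumes terms through `factors` and `get_column_name` would not need to know which kind of term it is handling.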