I have a Vowpal Wabbit file with two namespaces, for example:
1.0 |A snow |B ski:10
0.0 |A snow |B walk:10
1.0 |A clear |B walk:10
0.0 |A clear |B walk:5
1.0 |A clear |B walk:100
1.0 |A clear |B walk:15
Using -q AB, I can get the interaction terms. Is there any way for me to keep only the interaction terms and ignore the linear terms?
In other words, vw sample.vw -q AB --invert_hash sample.model currently produces this:
....
A^clear:24861:0.153737
A^clear^B^walk:140680:0.015292
A^snow:117127:0.126087
A^snow^B^ski:21312:0.015803
A^snow^B^walk:28234:-0.010592
B^ski:107733:0.015803
B^walk:114655:0.007655
Constant:116060:0.234153
I would like it to be something like this:
....
A^clear^B^walk:140680:0.015292
A^snow^B^ski:21312:0.015803
A^snow^B^walk:28234:-0.010592
Constant:116060:0.234153
The --keep and --ignore options do not produce the desired effect because they appear to be applied before the quadratic terms are generated. Is it possible to do this with vw, or do I need a custom preprocessing step that creates all of the combinations?
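For reference, the preprocessing fallback mentioned above might look roughly like the sketch below. It assumes the simple line layout from the sample (a label followed by exactly the namespaces A and B, no tags, importance weights, or namespace scales). The function names and the output namespace AB are made up for illustration, and the crossed feature names will not hash to the same indices that -q AB would produce.

```python
def parse_namespace(chunk):
    """Parse a chunk such as 'B ski:10' into ('B', {'ski': 10.0})."""
    tokens = chunk.split()
    name, features = tokens[0], {}
    for token in tokens[1:]:
        feature, _, value = token.partition(":")
        # A feature without an explicit value defaults to 1.
        features[feature] = float(value) if value else 1.0
    return name, features


def cross_line(line, ns_a="A", ns_b="B", out_ns="AB"):
    """Rewrite one line so it holds only the A x B product features."""
    label, *chunks = line.split("|")
    namespaces = dict(parse_namespace(chunk) for chunk in chunks)
    a, b = namespaces[ns_a], namespaces[ns_b]
    # The value of an interaction feature is the product of the two values.
    crossed = [
        f"{fa}^{fb}:{va * vb:g}"
        for fa, va in a.items()
        for fb, vb in b.items()
    ]
    return f"{label.strip()} |{out_ns} " + " ".join(crossed)


if __name__ == "__main__":
    print(cross_line("1.0 |A snow |B ski:10"))
```

The rewritten file would then be trained without -q, since the interactions are already explicit features.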
John Langford (the main author of VW) wrote:
There is not a good way to do this at present. The easiest approach 
would be to make --ignore apply to the foreach_feature<> template in the 
source code.
One trick is to transform each original example into four new examples:
1  |first:1  foo bar gah |second:1  loo too rah
-1 |first:1  foo bar gah |second:-1 loo too rah
1  |first:-1 foo bar gah |second:-1 loo too rah
-1 |first:-1 foo bar gah |second:1  loo too rah
This makes the quadratic features all be perfectly correlated with the label, but the linear features have zero correlation with the label. Hence a mild l1 regularization should kill off the linear features.
I'm skeptical that this will improve performance enough to care (hence the design), but if you do find that it's useful, please tell us about it.
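A minimal sketch of that four-example expansion, for anyone who wants to try it. It assumes labels are encoded as -1/1 (the 0.0/1.0 labels from the question would need remapping first) and that the two feature strings are already split out. The quoted post only shows the case of an original label of 1; multiplying the label by both sign factors is my generalization of it, and the function name is hypothetical.

```python
# Sign pairs for the two namespace scales, in the order used in the post.
SIGN_PAIRS = [(1, 1), (1, -1), (-1, -1), (-1, 1)]


def expand_example(label, first_features, second_features,
                   ns_a="first", ns_b="second"):
    """Turn one example into four sign-flipped copies.

    Each namespace is scaled by +1 or -1, and the label is multiplied by
    both signs. Every quadratic feature (scale sa * sb) then agrees with
    the new label, while each linear feature (scale sa or sb alone)
    cancels out across the four copies.
    """
    lines = []
    for sa, sb in SIGN_PAIRS:
        new_label = label * sa * sb
        lines.append(
            f"{new_label} |{ns_a}:{sa} {first_features} "
            f"|{ns_b}:{sb} {second_features}"
        )
    return lines


if __name__ == "__main__":
    for line in expand_example(1, "foo bar gah", "loo too rah"):
        print(line)
```

The expanded file would then be trained with -q on the two namespaces plus a small --l1 penalty; how large the penalty needs to be is something to find by experiment.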
See the original posts:
https://groups.yahoo.com/neo/groups/vowpal_wabbit/conversations/topics/2964
https://groups.yahoo.com/neo/groups/vowpal_wabbit/conversations/topics/4346