In KDB, I have the following table:
q)tab:flip `items`sales`prices!(`nut`bolt`cam`cog`bolt`screw;6 8 0 3 0n 0n;10 20 15 20 0n 0n)
q)tab
items sales prices
------------------
nut   6     10
bolt  8     20
cam   0     15
cog   3     20
bolt
screw
In this table, the item 'bolt' appears twice. Since the first 'bolt' row contains more information, I would like to remove the 'lesser' bolt row.
FINAL RESULT:
items sales prices
------------------
nut   6     10
bolt  8     20
cam   0     15
cog   3     20
screw
As far as I understand, if I used the 'distinct' function, the result would not be deterministic?
One way to do it is to fill forward by item, so that the second bolt row inherits the values of the first.
q)update fills sales,fills prices by items from tab
items sales prices
------------------
nut   6     10
bolt  8     20
cam   0     15
cog   3     20
bolt  8     20
screw
This can also be done in functional form where you can pass the table and by columns:
{![x;();(!). 2#enlist(),y;{x!fills,/:x}cols[x]except y]}[tab;`items]
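The fill on its own leaves two identical bolt rows in the table. As a minimal sketch of the remaining step, assuming the filled duplicate really does end up identical to the first row in every column, distinct can be applied to the result:

```q
/ fill forward within each item, then drop rows that are now exact duplicates
q)distinct update fills sales,fills prices by items from tab
```

Note that fills only propagates values forward, so a sparse row that comes before the complete one would not be filled and would survive, as would a duplicate whose non-null values differ from the first row.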
If "more information" means "fewest nulls", then you could count the number of nulls in each row and, for each item, return only the rows that contain the fewest:
q)select from @[tab;`n;:;sum each null tab] where n=(min;n)fby items
items sales prices n
--------------------
nut   6     10     0
bolt  8     20     0
cam   0     15     0
cog   3     20     0
screw              2
However, I would not recommend this approach, as it requires working with rows rather than columns.
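If the row-wise count is the concern, the same per-row null count can probably be built from the columns instead. The sketch below is one possible variant, not a tested replacement: the names t and r are hypothetical intermediates, and the column list is written out by hand for this particular table.

```q
/ null is atomic, so it yields one boolean vector per column;
/ sum then adds those vectors element-wise, giving a null count per row
q)t:update n:sum null(items;sales;prices) from tab
/ keep, for each item, only the rows with the fewest nulls
q)r:select from t where n=(min;n)fby items
/ drop the helper column
q)delete n from r
```

Unlike the fill-forward approach, this keeps the original values untouched and simply discards the sparser rows, though rows that tie on null count within an item are all retained.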