I am dealing with a poorly designed table, somewhat like this
create table (
entity_key integer,
tag1 varchar(10),
tag2 varchar(10),
tag3 varchar(10),
...
tag25 varchar(10)
);
An entity can have 0 or more tags indicated by the number of non-null columns. Tags are all the same type, and there should be a seperate "tags" table to which we can join the primary entities.
However, I'm stuck with this (quite large) table.
I want to run a query that gives me the distinct tags and a count of each.
If we had the normed "tags" table we could simply write
select tag, count(tag) from tags group by tag;
However, I haven't yet come up with a good approach for this query given the current table structure.
You can this by using an array and the unnest:
select x.tag, count(*)
from tags
cross join lateral unnest(array[tag1, tag2, tag3, tag4, tag5, tag6, tag7, ...]) as x(tag)
where x.tag is not null --<< git rid of any empty tags
group by x.tag;
This will group by the contents of the tag columns unlike Prdp's answer which groups by the "position" in the column list.
For this sample data:
insert into tags (entity_key, tag1, tag2, tag3, tag4, tag5)
values
(1, 'sql', 'dbms', null, null, null),
(2, 'sql', 'dbms', null, null, 'dml'),
(3, 'sql', null, null, 'ddl', null);
This will return this:
tag | count
-----+------
dml | 1
ddl | 1
sql | 3
dbms | 2
You can unpivot the data and do the count
select tag,count(data)
from
(
select tag1 as data,'tag1' as tag
from yourtable
Union All
select tag2,'tag2' as tag
from yourtable
Union All
..
select tag25,'tag25' as tag
from yourtable
) A
Group by tag
If postgresql supports Unpivot operator then you can use that
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With