I'm designing a keyspace in Cassandra that will hold information about groups of users. Some info on it:
I have two designs that I'm considering for this.
select * from table where GroupID = {GroupID}
and would return as many rows as there are users in the group.
select * from table where GroupID = {GroupID}
and would return a single row with the set of user IDs contained in its UserIDs set column.

I can't find much documentation on which design is better for this scenario. Any thoughts, or pros and cons of either design?
For a group of 20k user IDs, I would absolutely avoid using collections. Collections are a convenience feature, but they're not nearly as performant as a traditional CQL data model with PRIMARY KEY (GroupID, UserID), where all users of a group are stored, ordered by UserID, in a single partition. That model is easy to reason about, easy to query (you can either SELECT a single partition and page through all group members, or SELECT ... WHERE GroupID=X AND UserID=Y to determine whether a user is in the group), and very performant.
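As a rough sketch of that model: the table name group_members, the bigint column types, and the literal IDs below are assumptions for illustration only, since the question doesn't give them.

```cql
-- Hypothetical table name and column types; adjust to your actual ID types.
-- GroupID is the partition key and UserID the clustering column, so all
-- members of one group live in a single partition, ordered by UserID.
CREATE TABLE IF NOT EXISTS group_members (
    GroupID bigint,
    UserID  bigint,
    PRIMARY KEY (GroupID, UserID)
);

-- Add a user to a group. Writes are upserts, so repeating this is harmless.
INSERT INTO group_members (GroupID, UserID) VALUES (42, 1001);

-- List all members of a group; the driver pages through the partition.
SELECT UserID FROM group_members WHERE GroupID = 42;

-- Check whether one user is in the group (returns one row or none).
SELECT UserID FROM group_members WHERE GroupID = 42 AND UserID = 1001;

-- Remove a single user without touching the rest of the group.
DELETE FROM group_members WHERE GroupID = 42 AND UserID = 1001;
```

With this layout, adding or removing a member touches only that member's row, and the membership check is a lookup on the full primary key.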