HBase access and index

Question

I have an HBase table with about 50 million rows, each with several columns. I want to retrieve the rows that have a given value in a given column, e.g. rows whose column 'col_1' has value 'val_1'.

I see two options:

1. Scan the table from beginning to end and check each row to decide whether it should be retrieved.
2. Build an index on the table (e.g. on the values in column 'col_1'). Then, for a given column value 'val_1', look up all the row keys stored under that index entry and fetch the corresponding rows one by one. As I understand it, this involves random access to the original HBase table.
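To make the two access patterns concrete, here is a minimal in-memory sketch in Python. It is not HBase client code: the dictionaries and sorted list only stand in for tables, and the names `main_table`, `build_index`, `full_scan`, `index_lookup` and the `SEP` separator are all hypothetical. The index layout (one entry per row, keyed as indexed value + separator + row key) is one possible design, assumed here for illustration.

```python
from bisect import bisect_left

# In-memory stand-in for the main table (hypothetical data, not the HBase API).
# Maps row key -> {column: value}.
main_table = {
    "row1": {"col_1": "val_1", "col_2": "a"},
    "row2": {"col_1": "val_2", "col_2": "b"},
    "row3": {"col_1": "val_1", "col_2": "c"},
    "row4": {"col_1": "val_3", "col_2": "d"},
}

# Assumed separator that never appears in values or row keys.
SEP = "\x00"


def build_index(table, column):
    """Build a sorted list of index keys of the form <value><SEP><row key>."""
    return sorted(
        row[column] + SEP + key for key, row in table.items() if column in row
    )


def full_scan(table, column, value):
    """Option 1: read every row and keep the ones that match."""
    return {key: row for key, row in table.items() if row.get(column) == value}


def index_lookup(index, table, value):
    """Option 2: prefix scan over the index, then one point read per match."""
    prefix = value + SEP
    i = bisect_left(index, prefix)  # jump to the first entry for this value
    result = {}
    while i < len(index) and index[i].startswith(prefix):
        row_key = index[i][len(prefix):]
        result[row_key] = table[row_key]  # random access into the main table
        i += 1
    return result


index = build_index(main_table, "col_1")
by_scan = full_scan(main_table, "col_1", "val_1")
by_index = index_lookup(index, main_table, "val_1")
```

Both calls return the same rows. The difference is in how much is read: the scan touches every row in the table, while the index path touches only the matching index entries plus one read per matching row.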

Can anyone suggest which option is faster, or whether there is a better one?

Thanks a lot!

Xodarap · Accepted Answer

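Are you asking whether adding an index will make it faster? The answer is of course yes: a full scan has to read all 50 million rows, while an index lookup reads only the matching row keys and then fetches those rows. See the HBase wiki for thoughts on secondary indexes.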
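The faster read comes at a cost on the write side, because the index has to be kept in step with the main table. The following is a rough in-memory sketch of that bookkeeping, not HBase client code. The names `put`, `main_table`, `index_table` and `SEP` are hypothetical, and the index key layout (indexed value + separator + row key) is an assumption.

```python
# Assumed separator that never appears in values or row keys.
SEP = "\x00"

main_table = {}      # row key -> {column: value}
index_table = set()  # index keys of the form <value><SEP><row key>


def put(row_key, column, value, indexed_column="col_1"):
    """Write one cell and keep the index entry for the indexed column current."""
    row = main_table.setdefault(row_key, {})
    if column == indexed_column:
        old = row.get(column)
        if old is not None:
            # Drop the stale index entry that pointed at the previous value.
            index_table.discard(old + SEP + row_key)
        index_table.add(value + SEP + row_key)
    row[column] = value


put("row1", "col_1", "val_1")
put("row1", "col_2", "x")      # not indexed, index is untouched
put("row1", "col_1", "val_2")  # replaces the old index entry
```

In this sketch the two updates happen in one process. In a real cluster the main table and the index would be written separately, and HBase only guarantees atomicity within a single row, so the application would have to deal with a failure between the two writes.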

Tags:

indexing

hbase

RecSys_2010
