How to merge matches from two distinct (not sharded) Lucene Indexes

Question

I have two separate indexes holding different fields that together contain all the searchable fields for my documents. For example, the first index holds the indexed text for all documents, and the second holds the tags for every document.

Note: the example below is a bit wonky, as I've changed the names of the entities.

Index1: text, document-id

Index2: tag-name: "very important", user: "Fred's id"

I would like to keep the indexes separate as it seems wasteful to continually update a single index whenever a user adds/removes a tag.

So far I think I might need to process the two search results and merge them manually (in code). Any other suggestions?

I do not want to merge separate/sharded indexes.

erickson · Accepted Answer

Lucene has a type of IndexReader to support this arrangement: ParallelReader.

It can be a little tricky to use, as the Lucene document identifier for a record must be the same in both indexes. In practice, this means adding documents in the same order to both indexes. I have read that in some cases, document deletion and index optimization can cause Lucene to reassign these document identifiers, but I haven't experimented to find out if this is true. Extra care may be needed if existing records are modified. If only new records are appended, there should be no trouble.

This approach is generally called "vertical partitioning," as opposed to "horizontal partitioning," or sharding.
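If keeping the Lucene document identifiers aligned turns out to be too fragile, the manual merge mentioned in the question is a possible fallback. Below is a minimal sketch in plain Java, not Lucene API. It assumes each index stores its own document-id value with every record and that you have already pulled those ids out of the two result sets; the class name `MergeByDocId` and the method `intersect` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MergeByDocId {

    /**
     * Returns the ids that appear in both hit lists.
     * The result keeps the order of textHits, so the text index's
     * relevance ranking is preserved.
     */
    public static List<String> intersect(List<String> textHits, List<String> tagHits) {
        // Hash the tag hits once so each membership check is O(1).
        Set<String> tagged = new HashSet<>(tagHits);
        List<String> merged = new ArrayList<>();
        for (String id : textHits) {
            if (tagged.contains(id)) {
                merged.add(id);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical ids, as if read from a stored document-id field.
        List<String> textHits = List.of("doc-7", "doc-2", "doc-9");
        List<String> tagHits = List.of("doc-9", "doc-7", "doc-4");
        System.out.println(intersect(textHits, tagHits)); // prints [doc-7, doc-9]
    }
}
```

The trade-off is that both searches must return every candidate hit before the intersection, so paging and scoring have to be handled in your own code rather than by Lucene.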

Tags:

lucene

mP.
