Lucene: How does Lucene store fields with the same name?

Tags: java, lucene

I have the following code:

Document doc = new Document();
String description = "This is a description text";
Field descField = new StringField("description", description, Field.Store.YES);
doc.add(descField);
doc.add(new TextField("location", "Berlin", Field.Store.YES));
doc.add(new TextField("location", "Munich", Field.Store.YES));
doc.add(new TextField("location", "Vienna", Field.Store.YES));
writer.addDocument(doc);

How is the field 'location' physically stored in Lucene? Is it mapped to one single field (with offsets kept internally) or are there actually 3 fields with the same name separately stored in the inverted index?

The actual question I have is this: are there performance issues (run time or space) or other issues if I dynamically generate these location fields (e.g. from a data source at run time), compared with concatenating the values into a single field so that the document always has just two fields (description and location)?

If somebody has a pointer or an answer off the top of their head, it would be appreciated.

asked Oct 28 '25 03:10 by RalfB


1 Answer

It will be mapped to a single field. This:

doc.add(new TextField("location", "Berlin", Field.Store.YES));
doc.add(new TextField("location", "Munich", Field.Store.YES));
doc.add(new TextField("location", "Vienna", Field.Store.YES));

Is very much the same as:

doc.add(new TextField("location", "Berlin Munich Vienna", Field.Store.YES));

(Assuming you are using StandardAnalyzer, which splits the combined string into the same three tokens.)

Which you choose should make no discernible difference at index time, and no difference whatsoever in terms of which search results you get back.

The difference between the two is in their stored representation. When adding them separately, you will be able to get the stored result back in an array, rather than a string:

IndexableField[] locations = doc.getFields("location");
for (IndexableField location : locations) {
    String value = location.stringValue(); // one stored value per added field
    // Do stuff
}
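
To make the distinction concrete, here is a toy model in plain Java. It is only a sketch and not Lucene's actual data structures or API: the class ToyMultiValuedField and its methods are hypothetical names, the "analyzer" is just lowercase plus whitespace splitting, and it assumes a position gap of 0 between values (which, as far as I know, is the default). It shows that the indexed terms end up in one per-field structure either way, while the stored values stay separate only when you add them separately.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only: NOT Lucene's real data structures or API.
class ToyMultiValuedField {
    // field name -> term -> positions of that term within the field
    private final Map<String, Map<String, List<Integer>>> postings = new LinkedHashMap<>();
    // field name -> next free token position (assumes a gap of 0 between values)
    private final Map<String, Integer> nextPosition = new LinkedHashMap<>();
    // field name -> stored values, one entry per add() call
    private final Map<String, List<String>> stored = new LinkedHashMap<>();

    public void add(String field, String value) {
        // The stored representation keeps each added value as its own entry.
        stored.computeIfAbsent(field, k -> new ArrayList<>()).add(value);

        // The indexed representation appends tokens to the field's existing postings.
        Map<String, List<Integer>> terms =
                postings.computeIfAbsent(field, k -> new LinkedHashMap<>());
        int position = nextPosition.getOrDefault(field, 0);
        // Crude stand-in for an analyzer: lowercase and split on whitespace.
        for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            terms.computeIfAbsent(token, k -> new ArrayList<>()).add(position++);
        }
        nextPosition.put(field, position);
    }

    public Map<String, List<Integer>> postings(String field) {
        return postings.getOrDefault(field, Collections.emptyMap());
    }

    public List<String> storedValues(String field) {
        return stored.getOrDefault(field, Collections.emptyList());
    }

    public static void main(String[] args) {
        ToyMultiValuedField separate = new ToyMultiValuedField();
        separate.add("location", "Berlin");
        separate.add("location", "Munich");
        separate.add("location", "Vienna");

        ToyMultiValuedField combined = new ToyMultiValuedField();
        combined.add("location", "Berlin Munich Vienna");

        // Both print {berlin=[0], munich=[1], vienna=[2]}
        System.out.println(separate.postings("location"));
        System.out.println(combined.postings("location"));

        // Prints [Berlin, Munich, Vienna]
        System.out.println(separate.storedValues("location"));
        // Prints [Berlin Munich Vienna]
        System.out.println(combined.storedValues("location"));
    }
}
```

In this model the search side cannot tell the two documents apart, which is the point made above; only the stored values differ. If you configured a non-zero gap between values in a real analyzer, positions would differ and phrase queries spanning two values could behave differently.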
answered Oct 29 '25 20:10 by femtoRgon