I have a Lucene index in which one of the fields is mapped to Sitecore's rich text field.
Since this field contains HTML content for most of the items sharing the template, I expected HTML to be returned when fetching the item's field value from the index. However, I noticed that the returned value is stripped of all HTML tags.
I tried changing the indexType to "UNTOKENIZED", but this did not solve the problem. I understand that Lucene does this to allow searching on that field, but that is not a requirement in my case, and I want to override this behavior.
It happens because there is a RichTextFieldReader assigned to the html and rich text fields:
<fieldReader 
    fieldTypeName="html|rich text"                                     
    fieldNameFormat="{0}"
    fieldReaderType="Sitecore.ContentSearch.FieldReaders.RichTextFieldReader, Sitecore.ContentSearch" />
In Sitecore 8.1 it's defined in Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config.
It strips out all the tags using HtmlField.GetPlainText().
You can try adding another section at the same level as the <mapFieldByTypeName hint="raw:AddFieldReaderByFieldTypeName"> section, something like:
<mapFieldByFieldName hint="raw:AddFieldReaderByFieldName">
    <fieldReader 
        fieldName="yourFieldName"
        fieldReaderType="Sitecore.ContentSearch.FieldReaders.DefaultFieldReader, Sitecore.ContentSearch" />
</mapFieldByFieldName>
Mapping by field name has higher priority than mapping by field type, so the index will use the field reader specified for your field instead of the one specified for the field's type.
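Rather than editing the default configuration file directly, the same mapping could be added through a patch include file. The following is only a sketch: the file name is hypothetical, the element path assumes the Sitecore 8.1 default Lucene configuration (defaultLuceneIndexConfiguration with a fieldReaders node), the raw: prefix on the hint is assumed to mirror the existing mapFieldByTypeName section, and yourFieldName is a placeholder for your actual field name.

```xml
<!-- Hypothetical patch file, e.g. App_Config/Include/zzz/RawRichTextField.config -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <indexConfigurations>
        <!-- Assumed node name from the Sitecore 8.1 default Lucene configuration -->
        <defaultLuceneIndexConfiguration>
          <fieldReaders>
            <!-- Sibling of the existing mapFieldByTypeName section -->
            <mapFieldByFieldName hint="raw:AddFieldReaderByFieldName">
              <!-- DefaultFieldReader returns the field value as-is, so HTML tags are kept -->
              <fieldReader
                  fieldName="yourFieldName"
                  fieldReaderType="Sitecore.ContentSearch.FieldReaders.DefaultFieldReader, Sitecore.ContentSearch" />
            </mapFieldByFieldName>
          </fieldReaders>
        </defaultLuceneIndexConfiguration>
      </indexConfigurations>
    </contentSearch>
  </sitecore>
</configuration>
```

Values already stored in the index were written by the old field reader, so you will most likely need to rebuild the index before the raw HTML shows up.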