Solr - JSON index query string of the database?

advertisements

I would like to know if it is possible to index data that contains a JSON string that can be decoded and each JSON value to be indexed with the separate values.

I am using the DIH to connect to a MySQL database and able to index the individual columns. The result would look like the following:

<response name="response" numFound="1" start="0" maxScore="2.7143538">
    ...
    <result name="response" numFound="1" start="0" maxScore="2.7143538">
        <doc>
        <float name="score">2.7143538</float>
        <str name="id">82</str>
        <str name="name">jorge</str>
        <str name="otherinfo">{"day":15,"year":1989,"month":"January"}</str>
    </doc>
</result>
</response>

The problem is that "otherinfo" is a JSON string that I would like to decode and have something like the following in my index:

<response name="response" numFound="1" start="0" maxScore="2.7143538">
    ...
    <result name="response" numFound="1" start="0" maxScore="2.7143538">
        <doc>
        <float name="score">2.7143538</float>
        <str name="id">82</str>
        <str name="name">jorge</str>
        <str name="day">15</str>
        <str name="year">1989</str>
        <str name="month">January</str>
    </doc>
</result>
</response>

Would this be possible to do at all with Solr?

Thanks in advance


I commented on this. I decided that I should answer instead.

The fix for your issue isn't at the Solr level. You shouldn't be storing your data this way in the DB to begin with. In the long run, it would be better to fix this problem there, as opposed to trying to hack this at the Solr indexing level.

Your question proves that someone, probably an end user, is interested in searching by this data. This implies that it should probably be stored in the database as an actual Date or Timestamp field so that it can be properly selected or sorted on.

I'm sure people won't like that this doesn't exactly answer your question, but someone needs to tell you this.