Sunday, January 31, 2010

Faceting and Multifaceting syntax in Solr 1.4

Introduction


So you've installed Solr 1.4 and you've managed to get some data indexed. Now you're ready to have some fun with faceting. Faceting basically groups the results of a search into categories with counts, so you can filter on them without affecting the relevance score. Sites such as eBay use faceting to help narrow down the results of a generic search like "TV", giving you options such as 32" wide-screen LCD.


 

It makes for a very pleasant user experience. It's one of the main reasons to use Solr and Solr makes this process very easy. There are actually 3 types of faceting. In this post I'm only going to talk about "field faceting". Once you've mastered "field faceting", the other 2 types ("query faceting" and "date faceting") are very easy and the basic Solr Wiki will be enough for you to get going.


Field Faceting


Now when you're indexing data in Solr, it's best to use the "text" field type for searching, as this applies lots of filters which help sort out plurals, split on white space etc. However, when it comes to faceting, the values returned will be the values after the filters have been applied. Therefore it's best to create a second, unfiltered "string" version of any field you wish to facet on.



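You can use the copyField directive in schema.xml to make this process very easy.

For example, if you have a field called county of type text, add a second field called fcounty of type string, plus a copyField rule that copies county into fcounty.

Now, when you index county, Solr will use the text field for searching and the string field for displaying the facet results.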

Let's get faceting

First up you'll need to turn on the faceting as it's turned off by default.

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on

To facet on a field, all you have to do is add the facet.field=fieldname parameter:

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=fcounty
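If you'd rather build these URLs from code than type them by hand, here is a minimal Python sketch. It assumes the same host, port and fcounty field as the examples in this post, and it adds wt=json (not used elsewhere in this post) so the response is easy to parse. In its default JSON output Solr returns each facet field as a flat list of alternating values and counts, which the second function pairs up. The function names and the sample data are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL, matching the examples in this post.
SOLR_SELECT = "http://localhost:8080/solr/select/"


def facet_url(field, extra=None):
    """Build a facet-only query URL (rows=0, so no documents come back)."""
    params = [("q", "*:*"), ("rows", 0), ("facet", "on"),
              ("facet.field", field), ("wt", "json")]
    if extra:
        params.extend(extra.items())
    return SOLR_SELECT + "?" + urlencode(params)


def facet_counts(response, field):
    """Turn Solr's flat [value, count, value, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Made-up response fragment, shaped like Solr's default JSON facet output.
sample = {"facet_counts": {"facet_fields": {"fcounty": ["Kent", 11, "Essex", 4]}}}

print(facet_url("fcounty", {"facet.limit": 3}))
print(facet_counts(sample, "fcounty"))
```

Note that urlencode percent-encodes the *:* query for you, so there is no need to write %3A by hand.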

You can add extra parameters to limit the number of facet values returned

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=fcounty&facet.limit=3

or even page the results

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=fcounty&facet.limit=5&facet.offset=5
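Paging is just arithmetic on facet.limit and facet.offset. A small sketch, assuming zero-based page numbers (the helper name is hypothetical, not part of Solr):

```python
def facet_page_params(page, page_size):
    """Return facet.limit / facet.offset for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return {"facet.limit": page_size, "facet.offset": page * page_size}


# Page 1 with 5 values per page is the second page: facet values 6 to 10.
print(facet_page_params(1, 5))  # {'facet.limit': 5, 'facet.offset': 5}
```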
  

You can sort the results by count, descending (the default)

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=fcounty&facet.sort=count

or alphabetically (index order)

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=fcounty&facet.sort=index
 

You can also tell Solr to bring back only the facet values with at least a minimum number of results:

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=fcounty&facet.mincount=10
  

For more information, see the Solr wiki.

Multifaceting

On a website I've worked on we used multifaceting. Solr has a bit of a clunky syntax to support this, but it works and works well, so who cares :) Let's say I want to apply a filter (fq) on Kent as a county

http://localhost:8080/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=fcounty&fq=fcounty:Kent
 

When you apply the above filter you can see that the other counties get a facet count of 0. This is single faceting and works well. To get multifaceting to work you need to tell Solr to exclude the filter (fq) when it works out the facet counts. To do this you label the filter with the tag local parameter and exclude that label with the ex local parameter on facet.field. When you do this you'll notice that the counts for the other facets come back now, and the filter has still been applied to the results (numFound is still 11).

http://localhost:8080/solr/select/?q=*%3A*&version=2.2&start=0&rows=0&indent=on&facet=on&facet.field={!ex=fcounty}fcounty&fq={!tag=fcounty}fcounty:Kent
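The tag and ex parts have to use the same label, which is easy to get wrong by hand. Here is a rough Python sketch that builds both parameters together. The helper name is made up, and using the field name as the tag label is just the convention from the URL above, not a Solr requirement.

```python
from urllib.parse import urlencode


def multiselect_params(field, value, tag=None):
    """Build query params that filter on a value but keep all facet counts."""
    tag = tag or field  # reuse the field name as the tag, as in this post
    return [
        ("q", "*:*"), ("rows", 0), ("facet", "on"),
        # ex=<tag>: ignore the tagged filter when counting this facet
        ("facet.field", "{!ex=%s}%s" % (tag, field)),
        # tag=<tag>: label the filter so the facet can exclude it
        ("fq", "{!tag=%s}%s:%s" % (tag, field, value)),
    ]


params = multiselect_params("fcounty", "Kent")
print(urlencode(params))
```

urlencode will percent-encode the braces and the exclamation mark, which Solr decodes again on its side.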
 

You can apply as many filter values as you want now by ORing them together. So if I wanted to filter on Kent OR Cambridgeshire, I can do that as follows. Put the values in brackets so that both apply to fcounty; without the brackets, Cambridgeshire would be matched against the default search field instead.

http://localhost:8080/solr/select/?q=*%3A*&version=2.2&start=0&rows=0&indent=on&facet=on&facet.field={!ex=fcounty}fcounty&fq={!tag=fcounty}fcounty:(Kent%20OR%20Cambridgeshire)

See how the numFound has gone up from 11 to 18.
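On a real site the selected values come from whatever boxes the user has ticked, so the fq has to be assembled in code. A minimal sketch, assuming every selected value belongs to the same string field (the function name is hypothetical). It quotes each value so that a county with a space in its name stays a single term, and wraps the values in brackets so they all apply to the one field.

```python
def or_filter(field, values, tag=None):
    """Build a tagged fq that matches any of the given values of one field."""
    if not values:
        raise ValueError("need at least one value")
    tag = tag or field
    # Escape backslashes and double quotes, then quote each value.
    quoted = ['"%s"' % v.replace("\\", "\\\\").replace('"', '\\"')
              for v in values]
    return "{!tag=%s}%s:(%s)" % (tag, field, " OR ".join(quoted))


print(or_filter("fcounty", ["Kent", "Cambridgeshire"]))
# {!tag=fcounty}fcounty:("Kent" OR "Cambridgeshire")
```

Remember to URL-encode the resulting string before putting it in the query string.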

Conclusion

Now using Solr you can create a really rich user experience with searches and give eBay a run for its money. Enjoy.

7 comments:

  1. Great! Saved me some research time on this. Especially helpful with the multifaceting.

  2. Thanks N. Glad you liked it. Faceting is great fun, and it's a really cool way to see what your index looks like.

  3. Really good Post on how to do Multifaceting with SolrNet here: http://bugsquash.blogspot.com/2010/03/low-level-solrnet.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+bugsquash+%28Bug+squash%29

  4. Congratulations for your post!!!

    I think it is really useful, but I am having problems getting a "distinct" query to work with faceting. I have run some tests with faceting but they don't work correctly. I would like to know whether it is possible to use faceting to do a distinct in a query. Could you help me, please?

    Thank you so much.

  5. Hi, this Stack Overflow post might help you out:

    http://stackoverflow.com/questions/2814000/how-to-select-distinct-field-values-using-solr

  6. Yes, thank you for your reply. I had seen that post before, and later yours, but I still have problems doing a distinct in Solr. It doesn't work correctly using faceting. I suppose I am not following the correct process. As the example says, this query: //localhost:8983/solr/select/?q=*%3A*&rows=0&facet=on&facet.field=txt
    should return the same as a query like SELECT DISTINCT txt FROM my_table;
    but this isn't the case, because the request to Solr doesn't return the field text in any case.
    Could you help me? Any suggestion would be appreciated.
    Thanks

  7. I have the task of integrating Solr with OFBiz. Even though I have tried many times, I have not been successful. Could you please guide me through the steps?
