
eZ Barcamp, Lyon: advanced searching and navigation topic January 17, 2007

Posted by Paul Borgermans in eZ publish, Lucene, Searching, Solr.

During the international partner meeting of eZ systems in Lyon (January 24-26), there will also be a Barcamp. I will use this occasion to talk about a new advanced search and navigation engine for eZ publish that is in the works.

This search engine goes by the name Aurora, as it builds on the Apache Solr (pronounced "solar") incubator project and, well, the home of eZ publish is in Norway, where auroras can be seen more often than here in Belgium :-)

In fact, the Aurora plugin/engine is a follow-up to the Lucene-based plugin Kristof and I developed some time ago. The features I wanted to add, like faceted search/navigation, keyword highlighting, high-performance caching and more, are all built into the Solr backend, which is itself based on the Lucene Java IR libraries. So instead of writing this myself with Lucene and the PHP-Java bridge, I can concentrate more on fundamental aspects. There is also no need to install a PHP-Java bridge extension: the Solr backend is used over an HTTP connection. I think this is good news for all those who complained about that aspect (but you still need Java 1.5, aka J2SE 5.0, installed).
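As a rough illustration of what "over an HTTP connection" means, the sketch below builds the kind of URL a client would request from Solr. It is not code from the Aurora plugin. The host, port, and the title and category field names are assumptions for illustration; q, wt, facet, facet.field and hl are standard Solr request parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class SolrQueryUrl {

    /** URL-encode one query-string component as UTF-8. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Append /select and the encoded parameters to a Solr base URL. */
    static String build(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl).append("/select");
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep).append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "title:aurora");        // hypothetical field name
        params.put("wt", "json");               // response writer to use
        params.put("facet", "true");            // switch on faceting
        params.put("facet.field", "category");  // hypothetical facet field
        params.put("hl", "true");               // keyword highlighting
        // Assumed local Solr instance; adjust host and port as needed.
        System.out.println(build("http://localhost:8983/solr", params));
    }
}
```

The client then fetches that URL with any HTTP library and decodes the body according to the chosen response writer.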

The effort for this new search plugin also has a few benefits for the larger PHP world:

  • I created a new Solr response writer in Java for PHP: no XML result parsing is necessary, as the results are returned as a string which can be eval’ed as a multidimensional PHP array (so PHP now joins the Ruby, Python, XSLT, XML and JSON response writers already available). The code is not in Solr yet, but will be in the coming weeks.
  • A core PHP utility class/library for Solr is in the works which will form the basis for a component in the eZ components PHP library (if the eZ team accepts this of course).
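To give an idea of what such a response writer has to produce, here is a minimal sketch that renders nested Java maps and lists as a PHP array() expression. It is not the actual Solr code: the class and method names are hypothetical, and a real writer would work on Solr's named lists rather than plain java.util collections. Strings are emitted in single quotes, where PHP only treats the backslash and the single quote as special, so those are the only two characters escaped.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PhpArrayLiteral {

    /** Single-quoted PHP string: escape backslash first, then the quote. */
    static String quote(String s) {
        return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /** Render a Map/List/scalar structure as PHP source text. */
    static String render(Object value) {
        if (value == null) return "null";
        if (value instanceof Boolean) return ((Boolean) value) ? "true" : "false";
        if (value instanceof Integer || value instanceof Long) return value.toString();
        if (value instanceof Map) {
            StringBuilder sb = new StringBuilder("array(");
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) sb.append(',');
                // Keys are always written as quoted strings in this sketch.
                sb.append(quote(String.valueOf(e.getKey())))
                  .append("=>")
                  .append(render(e.getValue()));
                first = false;
            }
            return sb.append(')').toString();
        }
        if (value instanceof List) {
            StringBuilder sb = new StringBuilder("array(");
            boolean first = true;
            for (Object item : (List<?>) value) {
                if (!first) sb.append(',');
                sb.append(render(item));
                first = false;
            }
            return sb.append(')').toString();
        }
        // Everything else (including floating point) is quoted as a string here.
        return quote(value.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("title", "O'Reilly");
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("numFound", 1);
        response.put("docs", Arrays.asList(doc));
        // prints array('numFound'=>1,'docs'=>array(array('id'=>'42','title'=>'O\'Reilly')))
        System.out.println(render(response));
    }
}
```

On the PHP side the returned text would be passed to eval, prefixed with an assignment, to obtain the array.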

And a note to the PHP lovers who do not like Java: the object/class persistence and caching for Java web applications (like Solr, which runs inside a servlet container) has no counterpart in the PHP world. The speed is simply amazing.

Cheers!


Comments»

1. Kristof - January 18, 2007

Hi Paul

Do you really mean eval’ed (with the eval PHP function) or do you mean unserialized (with the unserialize function)?

2. Paul Borgermans - January 18, 2007

Hi Kristof

I mean eval’ed; that was the easiest to do, as I based the PHP response writer on the JSON response writer, which it extends. Producing output for unserialize instead could be done, however: the thing to add in the Java code is a lookup of the size of each array and of the size and type of each array element (in Java these are of course the named lists).
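A minimal sketch of why those sizes are needed: PHP's serialize format prefixes every array with its element count and every string with its length in bytes. The class below is hypothetical and only covers null, booleans, integers, strings, lists and maps; it uses plain java.util collections as a stand-in for Solr's named lists.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class PhpSerializeSketch {

    /** Produce text in the format read by PHP's unserialize(). */
    static String serialize(Object v) {
        if (v == null) return "N;";
        if (v instanceof Boolean) return "b:" + (((Boolean) v) ? 1 : 0) + ";";
        if (v instanceof Integer || v instanceof Long) return "i:" + v + ";";
        if (v instanceof Map) {
            Map<?, ?> m = (Map<?, ?>) v;
            // The element count must be known before the first entry is written.
            StringBuilder sb = new StringBuilder("a:" + m.size() + ":{");
            for (Map.Entry<?, ?> e : m.entrySet()) {
                sb.append(serialize(String.valueOf(e.getKey())))
                  .append(serialize(e.getValue()));
            }
            return sb.append('}').toString();
        }
        if (v instanceof List) {
            List<?> l = (List<?>) v;
            StringBuilder sb = new StringBuilder("a:" + l.size() + ":{");
            for (int i = 0; i < l.size(); i++) {
                // Lists become arrays with consecutive integer keys.
                sb.append("i:").append(i).append(';').append(serialize(l.get(i)));
            }
            return sb.append('}').toString();
        }
        // Anything else is written as a string; the length is counted in UTF-8 bytes.
        String s = v.toString();
        int byteLength = s.getBytes(StandardCharsets.UTF_8).length;
        return "s:" + byteLength + ":\"" + s + "\";";
    }

    public static void main(String[] args) {
        // prints a:2:{i:0;s:1:"a";i:1;s:2:"bc";}
        System.out.println(serialize(Arrays.asList("a", "bc")));
    }
}
```

Because the lengths are stated up front, no escaping of string contents is required, in contrast to the eval approach.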

It is also easy to add an option to indicate which format should be returned for a PHP response (JSON does this too, where you have an option to return arrays, maps or a mixture of them).

Do you expect unserialize to be faster?

3. Kristof - January 18, 2007

No idea about the speed. I’m more concerned about the security aspect of doing an eval on data you retrieve from an external system.

4. Paul Borgermans - January 18, 2007

I don’t see a security issue, other than securing the Solr backend, which is easy. And even then I don’t see a way to trick the response writer into returning anything but an array construct with strings (the strings themselves are in single quotes). You’ll see when you are back in the office ;-)

5. ansonlee - January 21, 2007

hi paul, i’m also working on solr and php. currently, i just let solr output data in JSON and then have php do json_decode to get the data. just curious: how does your method compare to the json_decode method performance-wise?

6. Paul Borgermans - January 21, 2007

hi ansonlee, I have not done performance comparisons yet. The reason I wanted a PHP response writer is that the JSON decoder is only bundled with PHP from 5.2 onwards. In the PHP Solr utility class in development, JSON will be an option if it is available as an extension or built in.

I will benchmark in a few weeks.

7. ansonlee - January 22, 2007

that’s fine, i guess there won’t be a big difference. i just wanted to see whether you had run a test; if not, never mind.

glad to find a neighbor like you working on the same thing. i’m also using solr to implement facet browsing, with php for the front end.

fyi, for facet browsing, i’m also looking into the flamenco project (http://flamenco.berkeley.edu), but my gut feeling is that the final solution will still be solr

8. thomas - March 14, 2007

Hello Paul

I would like to ask whether you have finished the PHP response writer, or where I can obtain the source code. I’m strongly interested, because this is exactly the response writer we need for our purpose.
If you are not finished yet, could you please send me the most current state? I’m willing to collaborate and finish this very useful response writer.

Thanks for your help

Thomas

