Apache Solr Mastery: Adding custom search paths with hook_menu

Recently, I've been working on the search interface for McGill University's course catalog. The University wants to allow students to browse courses at friendly URLs like:

  • arts/undergraduate/courses
  • science/undergraduate/courses
  • arts/graduate/courses

Instead of unreadable URLs like:

  • search/apachesolr_search/?filters=type%3Acatalog%20ss_faculty%3AAR%20sm_level%3AUndergraduate
  • search/apachesolr_search/?filters=type%3Acatalog%20ss_faculty%3ASC%20sm_level%3AUndergraduate
  • search/apachesolr_search/?filters=type%3Acatalog%20ss_faculty%3AAR%20sm_level%3AGraduate
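The long URLs are hard to read mostly because the "filters" value is URL-encoded. As a rough illustration only, here is a sketch in plain PHP (runnable outside Drupal) that decodes one of the values above into field/value pairs. The helper name example_parse_filters is made up for this post and is not part of the apachesolr module; it also assumes that no filter value contains a space.

```php
<?php
// Hypothetical helper, not part of the apachesolr module: decode a "filters"
// query value and split it into field => value pairs.
// Assumption: filter values contain no spaces (no quoted phrases).
function example_parse_filters($encoded) {
  $filters = array();
  // rawurldecode() turns %3A back into ":" and %20 back into " ".
  foreach (explode(' ', rawurldecode($encoded)) as $pair) {
    // Split on the first colon only, so values may contain colons.
    list($field, $value) = explode(':', $pair, 2);
    $filters[$field] = $value;
  }
  return $filters;
}

$filters = example_parse_filters('type%3Acatalog%20ss_faculty%3AAR%20sm_level%3AUndergraduate');
print_r($filters);
```

So the first unreadable URL is simply asking for catalog entries with faculty "AR" at the "Undergraduate" level, which is what "arts/undergraduate/courses" is meant to express.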

First, let's use hook_menu to define a new search path, "arts/undergraduate/courses", with the page callback mcgill_courses_search:

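function mcgill_menu() {
  $items = array();
  $items['arts/undergraduate/courses'] = array(
    'page callback' => 'mcgill_courses_search',
    // Pin the callback's $type argument. Drupal passes extra path parts
    // (such as keywords) to the callback; without this, they would land in $type.
    'page arguments' => array('apachesolr_search'),
    'access arguments' => array('search content'),
    'type' => MENU_CALLBACK,
  );
  return $items;
}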

Now, let's define that page callback. mcgill_courses_search will be a copy of apachesolr_search_view from apachesolr_search.module, with two important modifications: it will replace search_get_keys with mcgill_get_keys, and it will replace search_data($keys, $type) with search_data($keys, 'mcgill'). A minor modification is setting $type to "apachesolr_search" in the function signature; this is just to get the page callback working with the fewest modifications. Once we define the page callback here, we'll have a closer look at search_get_keys and search_data.

// START MINOR MODIFICATION
function mcgill_courses_search($type = 'apachesolr_search') {
// END
  $results = '';
  if (!isset($_POST['form_id'])) {
    if (empty($type)) {
      drupal_goto('search/apachesolr_search');
    }
    // START MODIFICATION
    $keys = trim(mcgill_get_keys());
    // END
    $filters = '';
    if ($type == 'apachesolr_search' && isset($_GET['filters'])) {
      $filters = trim($_GET['filters']);
    }
    if ($keys || $filters) {
      // Log the search keys:
      $log = $keys;
      if ($filters) {
        $log .= 'filters='. $filters;
      }
      watchdog(
        'search',
        '%keys (@type).', array(
          '%keys' => $log,
          '@type' => t('Search')
        ),
        WATCHDOG_NOTICE,
        l(t('results'), 'search/'. $type .'/'. $keys));
 
      // START MODIFICATION
      $results = search_data($keys, 'mcgill');
      // END
 
      if ($results) {
        $results = theme('box', t('Search results'), $results);
      }
      else {
        $results = theme('box', t('Your search yielded no results'),
          variable_get('apachesolr_search_noresults',
            apachesolr_search_noresults()));
      }
    }
  }
  return drupal_get_form('search_form', NULL, $keys, $type) .
    $results;
}

The page callback above looks big and scary, but remember: apart from the default value in the function signature, we're changing just two lines of apachesolr_search_view.


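So, why don't we use Drupal's search_get_keys? The reason is that search_get_keys splits the path into at most three parts and treats the third part as the keywords. In other words, it assumes your search path looks like "foo/bar/keywords", and that's not the case here: for "arts/undergraduate/courses", it would return "courses" as the keywords. So, in mcgill_courses_search, we replaced search_get_keys with mcgill_get_keys. Let's implement mcgill_get_keys, and then explain the code: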

function mcgill_get_keys() {
  static $return;
  if (!isset($return)) {
    $parts = explode('/', $_GET['q']);
    if (count($parts) == 4) {
      $return = array_pop($parts);
    }
    else {
      $return = empty($_REQUEST['keys']) ? '' : $_REQUEST['keys'];
    }
  }
  return $return;
}

mcgill_get_keys inspects the path. If it has exactly four parts (for example, "arts/undergraduate/courses/english"), it extracts the last part as the keywords (in this case, "english"). Otherwise, it behaves like search_get_keys and falls back to $_REQUEST['keys'], which in most cases means returning the empty string. Keep in mind that, if we add more search paths, we may need to tweak mcgill_get_keys above.
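To try this logic outside Drupal, here is a small sketch of a pure variant. The name example_keys_from_path is hypothetical; unlike mcgill_get_keys, it takes the path and the fallback keys as arguments instead of reading $_GET['q'] and $_REQUEST['keys'], and it has no static cache.

```php
<?php
// Hypothetical pure variant of mcgill_get_keys() for experimenting outside
// Drupal. $path stands in for $_GET['q'], $request_keys for $_REQUEST['keys'].
function example_keys_from_path($path, $request_keys = '') {
  $parts = explode('/', $path);
  if (count($parts) == 4) {
    // Exactly four parts: the last one holds the keywords.
    return array_pop($parts);
  }
  // Any other number of parts: fall back to the request keys.
  return $request_keys;
}

var_dump(example_keys_from_path('arts/undergraduate/courses/english')); // string(7) "english"
var_dump(example_keys_from_path('arts/undergraduate/courses'));         // string(0) ""
```

One consequence of the exact count check: a path with five or more parts, such as "arts/undergraduate/courses/a/b", also takes the fallback branch, so keywords that contain a slash are not picked up from the path.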


search_data($keys, $type) hands the search off to the module named by $type: it invokes that module's hook_search implementation with the "search" operation. So, in apachesolr_search_view, if $type is "apachesolr_search", search_data will invoke apachesolr_search_search. What's wrong with that? The problem is that apachesolr_search_search assumes your search path looks like "search/something/keywords", and that's not the case here. So, in mcgill_courses_search, we force search_data to invoke mcgill_search by passing it "mcgill" as the second argument. Let's implement mcgill_search, which will be a copy of apachesolr_search_search from apachesolr_search.module, with one modification:

function mcgill_search($op = 'search', $keys = NULL) {
  switch ($op) {
    case 'name':
      return t('Search');
 
    case 'reset':
      apachesolr_clear_last_index('apachesolr_search');
      return;
 
    case 'status':
      return apachesolr_index_status('apachesolr_search');
 
    case 'search':
      $filters = isset($_GET['filters']) ? $_GET['filters'] : '';
      $solrsort = isset($_GET['solrsort']) ? $_GET['solrsort'] : '';
      $page = isset($_GET['page']) ? $_GET['page'] : 0;
      try {
        $results = apachesolr_search_execute(
          $keys,
          $filters,
          $solrsort,
          // START MODIFICATION
          'arts/undergraduate/courses', $page);
          // END
        return $results;
      }
      catch (Exception $e) {
        watchdog(
          'Apache Solr',
          nl2br(check_plain($e->getMessage())),
          NULL, WATCHDOG_ERROR);
        apachesolr_failure(t('Solr search'), $keys);
      }
      break;
  } // switch
}

The one modification is in the fourth argument to apachesolr_search_execute, the base path of the search page: mcgill_search replaces 'search/' . arg(1) with 'arts/undergraduate/courses'. Of course, if we add more search paths, we will have to modify this line. For now, let's keep things simple.
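If more search paths are added later, one possible way to avoid hard-coding the string is to derive the base path from the current path. The sketch below is only an illustration under one assumption: every custom search path has exactly three parts, optionally followed by keywords. The helper name example_base_path is hypothetical; in mcgill_search, its input would be $_GET['q'].

```php
<?php
// Hypothetical helper: keep the first three parts of a path and drop the rest.
// Assumption: every custom search path has exactly three parts.
function example_base_path($path) {
  return implode('/', array_slice(explode('/', $path), 0, 3));
}

echo example_base_path('science/undergraduate/courses/biology'), "\n";
echo example_base_path('arts/graduate/courses'), "\n";
```

With keywords in the path, the helper returns only the three-part prefix; without keywords, the path comes back unchanged.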

At this point, we have all we need for the search to run at the custom search path. We will of course want to add default filters based on the path. For example, if the path is "arts/undergraduate/courses", we want to filter the list of courses down to those within the Arts faculty at the undergraduate level. For that, come to my session at DrupalCon SF!

Comments

This is useful. However, if a user navigates to the custom path (for instance, "arts/undergraduate/courses") and wants to refine their search, this code will take them to search/apachesolr_search/whatever. Is there a way to retain the original path?

Very useful tutorial. It's a little different from how I'd done it, and a bit better. Question: your bio says you are a maintainer of AJAX Solr. Is that a typo (Apache => AJAX), or is it actually a "predictive search" thingy using Solr?

AJAX Solr is a JavaScript library for creating search interfaces to Solr: http://github.com/evolvingweb/ajax-solr

Demo: http://evolvingweb.github.com/ajax-solr/examples/reuters/index.html

Are you sure this code doesn't retain the original path? It should retain it for filters. It may not retain it when using the search form. However, I have some code that would correct that, too; you just need to alter the form to submit to the correct URL and to redirect to the correct URL in the submit handler.

Thanks for sharing this code! This has helped me a lot on a project I'm working on.

Hi. Great articles on using Apache Solr. However, I am struggling with one aspect: the keyword search.

A client wants multiple search fields, for example keyword, colour, and size. Is there any way to achieve this?

I assume there is, since the course calendar uses a similar drop-down beside the keyword search field.

I have changed the search path to:

http://staging.artpistol.co.uk/art-gallery/browse/

So for example if I type in:

http://staging.artpistol.co.uk/art-gallery/browse/black

It does load all the black results, but the filter down the left is not ticked.

Am I on the right track with this?

Thanks
Robert

One frustrating thing about this workaround is that when users hit Search and do a form POST, they automatically get redirected to the default Apache Solr search URL, search/apache-search.

Great post! It helped me a lot. If the point the others mentioned could be worked in (the path changes when you try to refine the search), this would be a perfect use-case tutorial!