Saturday, April 6, 2013

Search and Highlight a Portion of Article

If you want your search results page to show only a few sentences from each article, and you want each of those sentences to be different, you will find this tutorial very useful.


The variable $num_of_sent sets the maximum number of sentences we want to show in the search result.

In the PHP block we split the search string into an array of words, $search_terms. Then we split the article into sentences and compare each search word with every word of each sentence.
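The splitting step can also be sketched on its own. The helper names split_terms and split_sentences are hypothetical and do not appear in the listing; this version splits the query on runs of whitespace and keeps the sentence-ending punctuation, which a plain explode('.') discards.

```php
<?php
// Hypothetical helpers, not part of the tutorial's listing.

// Split a search string into words, ignoring repeated spaces.
function split_terms(string $query): array
{
    return preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
}

// Split plain text into sentences after ., ! or ? followed by whitespace,
// keeping the punctuation attached to each sentence.
function split_sentences(string $text): array
{
    $plain = trim(strip_tags($text));
    return preg_split('/(?<=[.!?])\s+/', $plain, -1, PREG_SPLIT_NO_EMPTY);
}

$terms = split_terms('  php   css ');
$sentences = split_sentences('The HTML and CSS make your web site nice. You can keep it in a CSS file.');
```

Here $terms holds the two words php and css, and $sentences holds two sentences, each still ending in its full stop.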

Each match increments the variable $rel by one. Every word of the article is compared with the search keywords, but the same sentence won't appear twice in the search result.

Once the number of collected sentences reaches $num_of_sent, the array $os that is sent to highlighting won't receive any more sentences, while $rel continues to increase with every hit.
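The counting rule can be sketched as a small function: sentences are collected until the limit is reached, while relevance keeps growing with every hit. The name build_extract and its return shape are assumptions made for illustration. Matching here is case-insensitive and ignores punctuation around a word, which differs from the listing's approach of adding upper-case variants of each term.

```php
<?php
// Hypothetical helper: returns the extract sentences and the relevance score.
function build_extract(array $sentences, array $terms, int $max_sentences): array
{
    $terms = array_map('strtolower', $terms);
    $extract = [];
    $rel = 0;
    foreach ($sentences as $sentence) {
        foreach (explode(' ', trim($sentence)) as $word) {
            // Normalize: lower-case and strip punctuation around the word
            $word = strtolower(trim($word, ".,;:!?\"'()"));
            if ($word === '' || !in_array($word, $terms, true)) {
                continue;
            }
            $rel++; // every hit counts toward relevance
            // Collect the sentence only while below the limit, and only once
            if (count($extract) < $max_sentences && !in_array($sentence, $extract, true)) {
                $extract[] = $sentence;
            }
        }
    }
    return ['sentences' => $extract, 'rel' => $rel];
}

$result = build_extract(
    ['CSS is nice', 'Keep CSS in a CSS file', 'PHP runs on the backend', 'Update the CSS file'],
    ['css'],
    2
);
```

With a limit of 2, $result['sentences'] holds only the first two matching sentences, while $result['rel'] is 4 because the later hits are still counted.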
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Search, Highlight and Show Extract</title>
</head>
<body>
Enter keywords<form action='extract.php' method='post'>
<input type='text' name='search' value=''/>
<input type='submit'/>
</form>
<p>The HTML and CSS make your web site nice. So Roger knowledge of the PHP code that runs on backend machine is also good. Most people choose to use <u>databases with PHP's functions</u>, which are php the focus of this course. You can include CSS with include file or keep it in CSS file. It is impotrant that you have a single file to upadate the styles. It is not recomended to change php file on the fly. It is impotrant a single file to upadate the styles.</p>
<?php
$num_of_sent = 3;
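// Run the search only when the form was submitted with a non-empty query
if (isset($_POST['search']) && trim($_POST['search']) !== "") {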
 $os = array();
 $rel = 0; // relevance
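// Split the query on whitespace, dropping empty entries
 $search_terms = preg_split('/\s+/', trim($_POST['search']), -1, PREG_SPLIT_NO_EMPTY);
// Fix the loop bound up front, because the loop appends to $search_terms
 $num_terms = count($search_terms);
 for ($i = 0; $i < $num_terms; $i++) {
// Only all-lowercase terms get extra case variants
  if (strcmp($search_terms[$i], strtolower($search_terms[$i])) != 0) continue;
// Add an all-uppercase variant
  array_push($search_terms, strtoupper($search_terms[$i]));
// Add a variant with the first character in uppercase
  array_push($search_terms, ucfirst($search_terms[$i]));
 }
// Remove duplicates (e.g. for terms without letters) so hits are not counted twice
 $search_terms = array_values(array_unique($search_terms));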
// text to be searched
 $text = "The HTML and CSS make your web site nice. So Roger knowledge of the PHP code that runs on backend machine is also good. Most people choose to use <u>databases with PHP's functions</u>, which are php the focus of this course. You can include CSS with include file or keep it in CSS file. It is impotrant that you have a single file to upadate the styles. It is not recomended to change php file on the fly. It is impotrant a single file to upadate the styles.";
 echo "<hr/>";
 $text = strip_tags($text);
// Split the text into sentences on the full stop
 $sent = explode('.',$text);
 for ($i = 0; $i < count($sent); $i++) {
  $words = explode(' ',trim($sent[$i]));
  for ($j = 0; $j < count($words); $j++) {
   for ($k = 0; $k < count($search_terms); $k++) {
    if ($search_terms[$k] === $words[$j]) {
// Add relevance
     $rel++;
// $num_of_sent is the max number of
// highlighted sentences we want to see
// in search result
     if (count($os) < $num_of_sent) {
// if the sentence isn't in the array yet, put it there
      if (!in_array($sent[$i],$os)) array_push($os,$sent[$i]);
     }
    }
   }
  }
 }
// Convert array $os to a string in order to highlight it,
// restoring the full stops that explode('.') removed
 $oss = $os ? implode('. ', array_map('trim', $os)) . '.' : '';
 echo "<h3>Highlighted extract</h3>";
// highlight
 $colors = array('FFFF00','FF9900','FF0000','FF00FF','99FF33','33FFCC','FF99FF','00CC33');
 $patterns = array();
 $replacements = array();
 for ($i = 0, $j = count($search_terms); $i < $j; $i++) {
// \b anchors the term at word boundaries
  $patterns[$i] = '/\b'.preg_quote($search_terms[$i], '/').'\b/';
  $replacements[$i] = '<b style="color:black;background-color:#' .
  $colors[$i % 8] .'">' . $search_terms[$i] . '</b>';
 }
 if ($j) {
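  while ($oss) {
   if (!preg_match('{^([^<]*)?(</?[^>]+?>)?(.*)$}s',$oss,$matches)) break;
// Stop if nothing was consumed, otherwise the loop would never end
   if ($matches[1] === '' && $matches[2] === '') {
    print $oss;
    break;
   }
   print preg_replace($patterns,$replacements,$matches[1]);
   print $matches[2];
   $oss = $matches[3];
  }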
 } else {
  print $oss;
 }
 echo "<h1>Relevance $rel</h1>";
}
?>
</body>
</html>
The variable $rel is meant for sorting: it should be computed for every article so that the search results can be ordered by relevance.
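As a sketch of that sorting step, assume each article has already been scored and stored as an array with 'title' and 'rel' keys. That structure and the sample data are hypothetical; the tutorial does not define them. The results can then be ordered with usort:

```php
<?php
// Hypothetical result set: one entry per article, with its relevance score.
$articles = [
    ['title' => 'Styling with CSS', 'rel' => 2],
    ['title' => 'PHP and databases', 'rel' => 5],
    ['title' => 'HTML basics', 'rel' => 0],
];

// Drop articles with no hits, then sort by relevance, highest first.
$hits = array_values(array_filter($articles, function ($a) {
    return $a['rel'] > 0;
}));
usort($hits, function ($a, $b) {
    return $b['rel'] - $a['rel'];
});

foreach ($hits as $article) {
    echo $article['title'], ' (relevance ', $article['rel'], ")\n";
}
```

Articles with a relevance of zero are left out here, on the assumption that a results page should list only articles with at least one hit.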
