code everywhere
technology, web services and applications

PHP Boolean Text Search Parser

posted on June 28, 2013, 6:43 pm in php

This is a simple PHP class that will parse a text query into a logical group of AND, OR, and EXACT searches. It works recursively and can display the query breakdown.

Complex queries can take the form "(a or b) and (c or d)", and positional (EXACT) queries must be wrapped in double quotes ("..."). Queries cannot be ambiguous: "a and b or c and d" will not work because it mixes AND and OR at the same level. Use "a and (b or (c and d))" instead.
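The ambiguity rule can be tried on its own. Below is a minimal sketch that mirrors the strpos() test the class applies to each parenthesis-free group; the function name isAmbiguousGroup is hypothetical and not part of the class.

```php
<?php
// Hypothetical helper, not part of the parser class.
// A group is treated as ambiguous when it mixes " and " with " or "
// at the same level (i.e. without parentheses separating them).
function isAmbiguousGroup($group)
{
    return strpos($group, ' and ') !== false
        && strpos($group, ' or ') !== false;
}

var_dump(isAmbiguousGroup('a and b or c and d')); // bool(true)
var_dump(isAmbiguousGroup('a and s0'));           // bool(false)
```

The second call uses "s0" as a stand-in for an already reduced group such as "(b or (c and d))", which is how the unambiguous form looks by the time it is checked.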

Examples

Example query 1)

a and ("hello people" or (c and d))
s0    hello people    Positional Search
s1    c and d         Intersection
s2    s0 or s1        Union
s3    a and s2        Intersection

Example query 2)

( (abc and m and c) or (d or (e)) or g ) and (h and (i or j))
s0    abc and m and c    Intersection
s1    e                  Retrieve
s2    i or j             Union
s3    d or s1            Union
s4    h and s2           Intersection
s5    s0 or s3 or g      Union
s6    s5 and s4          Intersection
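One way to read these breakdowns: each row is a set operation over the results of earlier rows. Here is a rough sketch of how the rows from example query 1 could be evaluated. It assumes a hypothetical in-memory index ($index) that maps each term, and for simplicity each whole phrase, to a list of document IDs. The function evaluateSet and the sample data are illustrative only; the class itself does no searching.

```php
<?php
// Sketch only: $index is a hypothetical "term => document IDs" map.
// Rows are processed in order, so a row may refer to any earlier row (s0, s1, ...).
function evaluateSet(array $set, array $index)
{
    $results = array();
    foreach ($set as $key => $expr) {
        if (strpos($expr, ' and ') !== false) {
            $operands = explode(' and ', $expr);
            $isAnd = true;
        } elseif (strpos($expr, ' or ') !== false) {
            $operands = explode(' or ', $expr);
            $isAnd = false;
        } else {
            // Single term or phrase: looked up directly in the index.
            $operands = array($expr);
            $isAnd = false;
        }

        $docs = null;
        foreach ($operands as $operand) {
            $operand = trim($operand);
            if (isset($results[$operand])) {
                $ids = $results[$operand];   // result of an earlier row
            } elseif (isset($index[$operand])) {
                $ids = $index[$operand];     // plain term or phrase
            } else {
                $ids = array();              // unknown term matches nothing
            }

            if ($docs === null) {
                $docs = $ids;
            } elseif ($isAnd) {
                $docs = array_intersect($docs, $ids); // Intersection
            } else {
                $docs = array_merge($docs, $ids);     // Union
            }
        }

        $docs = array_unique($docs);
        sort($docs); // also reindexes the array from 0
        $results[$key] = $docs;
    }
    return $results;
}

$set = array(
    's0' => 'hello people',
    's1' => 'c and d',
    's2' => 's0 or s1',
    's3' => 'a and s2',
);
$index = array(
    'a'            => array(1, 2, 3),
    'c'            => array(2, 4),
    'd'            => array(2, 5),
    'hello people' => array(3),
);

$results = evaluateSet($set, $index);
print_r($results['s3']); // documents 2 and 3
```

A real positional search would need word positions rather than a phrase lookup, so treat the 'hello people' entry as a placeholder for that step.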

The Code

Copy the PHP class file and import it into your app:

<?php
/* File:    PHP Boolean Text Search Parser
 * Date:    June 28, 2013
 * License: MIT (http://opensource.org/licenses/MIT)
 * Copyright (C) 2013 by http://codeeverywhere.ca
 */
class parser{
    private $last;          // index following the most recently stored set
    private $set = array(); // "sN" => sub-query text

    public function parse($q, $s = 0, $arr = null)
    {
        $q = stripcslashes($q);

        //Replace Quotes
        if( preg_match_all('/"[a-zA-Z0-9\s]+"/', $q, $matches) )
        {
            $replace = array();
            $matches = $matches[0];
            for($x=0; $x<count($matches); $x++)
            {
                $replace[$x] = "s$s";
                $this->set["s$s"] = $matches[$x];
                $s++;
                $this->last = $s;
            }
            $q = str_replace($matches, $replace, $q);
        }

        // Match innermost groups: "(term)" or "(term and|or term ...)"
        $regex = '/(\(\s*[a-zA-Z0-9-]+\s*(\s(and|or)\s+[a-zA-Z0-9-]+\s*)*\))/';
        preg_match_all($regex, $q, $matches);
        $matches = $matches[1];
        $replace = array();

        for($x=0; $x<count($matches); $x++)
        {
            $replace[$x] = "s$s";
            $this->set["s$s"] = $matches[$x];
            $s++;
            $this->last = $s;
        }

        // Swap each matched group for its sN placeholder
        $q = str_replace($matches, $replace, $q);

        if( count($matches) == 0 )
        {
            // Nothing left to reduce: store the top-level query, then clean up every set
            $this->set["s$s"] = $q;
            foreach($this->set as $key => $val)
            {
                $this->set[$key] = trim(str_replace(array(')','(','"'), '', $val));
                if(strpos($val,' and ') !== false and strpos($val,' or ') !== false)
                    return "Error - The query '$val' is ambiguous.";
            }
            return $this->set;
        }
        else
            return $this->parse($q, $s, $arr);
    }
    
    public function display()
    {
        echo "<table border=\"1\"  cellpadding=\"5\">";
        foreach($this->set as $s => $q)
        {
            if(strpos($q,' and ') !== false)
                $type = 'Intersection';
            elseif(strpos($q,' or ') !== false)
                $type = 'Union';
            elseif(strpos($q,' ') !== false)
                $type = 'Positional Search';
            else
                $type = 'Retrieve';
            echo "<tr>";
            echo "<th>$s</th><td>".$q."</td><td>$type</td>";
            echo "</tr>";
        }
        echo "</table>";
    }
}
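
The core of parse() is a reduction pass that it repeats until no parentheses remain: find every innermost group, store it, and swap it for an sN placeholder. A standalone sketch of one such pass follows, using the same regular expression as the class. The function name reduceOnce is hypothetical, and quoted phrases are left out for brevity.

```php
<?php
// Hypothetical helper showing a single reduction pass; not part of the class.
// Returns the rewritten query and the groups that were extracted.
function reduceOnce($q, $s = 0)
{
    // "(term)" or "(term and|or term ...)" with no nested parentheses inside.
    $regex = '/(\(\s*[a-zA-Z0-9-]+\s*(\s(and|or)\s+[a-zA-Z0-9-]+\s*)*\))/';
    preg_match_all($regex, $q, $matches);

    $set = array();
    $replace = array();
    foreach ($matches[1] as $group) {
        $replace[] = "s$s";
        $set["s$s"] = $group;
        $s++;
    }
    return array(str_replace($matches[1], $replace, $q), $set);
}

list($q, $set) = reduceOnce('a and (b or (c and d))');
echo $q, "\n";  // a and (b or s0)
print_r($set);  // s0 => (c and d)
```

Running the pass again on the result, starting at index 1, would extract "(b or s0)" as s1 and leave "a and s1", which is the point where the class stops recursing.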

Use the class:

<?php
$p = new parser();
echo $q = '( (abc and m and c) or (("banana" or "orange") or (e)) or g ) 
    and ("fruit juice" and h and (i or j))';
$p->parse($q);
$p->display();
?>
