How to make search filter strictly?

Permalink
I'm running into a problem on a client site where a search for "Men's Health" will also return results for "Women's Health." These are both values for a select attribute that is searchable.

What I'm seeing from the query is that it outputs this:

OR ak_show_topic like '%Men\'s Health%'

I need it to look like this:

OR ak_show_topic like '%\nMen\'s Health\n%'

I was able to update that in another place on the site where we filter a page list through dropdown menus and just use the filter method on the page list like this:

$pl->filter(false, 'ak_show_topic LIKE "%\n' . urldecode($topic) . '\n%"');


I tracked down where it's filtering when you search, it seems to be from the searchKeywords function in /libraries/attribute/controller.php (5.5.1)

$qkeywords = $db->quote('%' . $keywords . '%');


When I try to add the newline characters there, I get them escaped, so the query looks like this:

OR ak_show_topic like '%\\nMen\'s Health\\n%'

This doesn't give me the correct page list, either.

Is there any way to set this up? It seems like maybe when I'm filtering directly of the $_GET URL parameter I'm actually exposing myself to sql injection? I'm not really sure how that works. I could pull the $db->quote() out of the filterKeywords function but I don't want to do that if it's a security risk.

Does anyone know how to make front-end site search filter by keywords strictly?

hereNT
 
mkly replied on at Permalink Reply
mkly
I think this is more about the MATCH() function in mysql. I'm not really well versed in that stuff, but you can try to do a "simple" keyword search which uses a LIKE instead.

Something like
$query = $this->get('query');
$pl = new PageList();
/**
 * public function filterByKeywords($keywords, $simple = false)
 * @param string $keywords query string to match against
 * @param bool $simple set to true to use MySQL LIKE instead of MATCH
*/
$pl->filterByKeywords($query, true);
$pages = $pl->getPage();
hutman replied on at Permalink Reply
hutman
That actually doesn't really do what I want either. It's already using like for the attributes, the problem is that it's using like without the \n to make it strict, so it will match anything that has the string.

The simple setting also only applies to cName, cDescription and content, in the non-simple version these are using match, but it's only using match on the cName, cDescription and content strings. The attributes search string is unchanged between the two versions.
hutman replied on at Permalink Reply
hutman
Oh, and this is actually hereNT, just logged into my work account now :)
mkly replied on at Permalink Reply
mkly
Oh and on that above one, if $topic is user input you should probably be escaping that afaik. I believe DatabaseItemList::filter() just passes the query in as is(unescaped) if column is set to false.

$db = Loader::db();
$topic = $this->get('topic');
$pl->filter(false, 'ak_show_topic LIKE "%\n' . $db->quote($topic) . '\n%"');
hutman replied on at Permalink Reply
hutman
It's generated from a select dropdown filled with values gotten from the select attribute, but then passed in to the page controller from a get string. The reason I'm using get is that the page list results were not paginating correctly when you searched - you would search, then hit one of the buttons to navigate to a different page of results an the keyword filter would be lost. So I'm not sure if I can pass it any other way, but prev/next/1/2/3 buttons in the page list don't see the post variables.

We finished this site several months ago and the client has been populating content, now they are pointing out things like the fact that our filter wasn't returning the correct results (men's and women's both returned when you search for men's) and that search does the same thing.

This is where I got the code for doing the filter from:
http://www.concrete5.org/community/forums/customizing_c5/pagelist_f...

There Andrew is showing it just using $pl->filter without the $db->quote so I figured it was OK.

Seems like there's just no way to do this. Which would really suck, I'm not sure how to fix it and this is a pretty big deal to the client.
mkly replied on at Permalink Reply
mkly
with @andrew's code
$pl->filterByAttribute('color', "%\nGreen\n%', 'like');

Is passing in a column name. This means that DatabaseItemList will parse this and escape it
$pl->filter(false, 'ak_show_topic LIKE "%\n' . $blah . '\n%"');

Will just pass the string in as is, because the column(first parameter) is false

See my code below, you would still have to override that method though. It's possible that in your override you could check for the attribute handle to filter differently or something.

That said, I could be missing something much easier that I just don't know about.
mkly replied on at Permalink Reply
mkly
Oh... hmmm, that wasn't really what you were asking, I didn't really read it all the way. Well, I'm not sure if this is the best(or works), but maybe you can create a new attribute type based on select and add the searchKeyword method

class MulitSearchableSelectAttributeTypeController extends SelectAttributeTypeController {
  public function searchKeywords($keywords, $list = false) {
    $db = Loader::db();
    /**
     * Adding the '\n'
     */
    $qkeywords = '%\n' . $db->quote($keywords) . '\n%';
    return 'ak_' . $this->attributeKey->getAttributeKeyHandle() . ' like '.$qkeywords.' ';
  }
}
hutman replied on at Permalink Reply
hutman
Sorry, I thought I was being pretty clear. I'm trying to search and have it return results on the attributes that are strict matches to the query, not partial matches.

I tried moving the '%\n' outside of the $db->quote() function with the parent searchKeywords function on the controller in the library. It causes mySQL errors and the query doesn't complete.

I also can't make another attribute type - I could override the built in select type but there are already hundreds of pages with these attributes.
mkly replied on at Permalink Reply
mkly
ok, dug around adodb
$qkeywords = '\'%\n' . $db->escape($keywords) . '%\n\'';

and
$pl->filter(false, 'ak_show_topic LIKE \'%\n' . $db->escape($topic) . '\n%\'');

...maybe?
hutman replied on at Permalink Reply
hutman
It took some tweaking, but I got that working in both places:

$qkeywords = '"%\n' . $db->escape($keywords) . '\n%"';


and

$pl->filter(false, 'ak_show_topic LIKE "%\n' . $db->escape(urldecode($topic)) . '\n%"');


So when I debug the query, I see both of those with this for the query:

ak_show_topic LIKE "%\nMen\'s Health\n%"

So the \n to make it strict is there. But on the actual page search it's not making a difference - it's still returning hits for women's heath and not just those for men's health.

It's probably better that the filter is now being escaped on the page with the drop downs, but I don't think that doing it for search will actually work :(

Thanks for all your help!
mkly replied on at Permalink Reply
mkly
Are you sure that's not because you are escaping single quotes and wrapping in double quotes?

I don't think
LIKE "%\nMen\'s Health\n%"

Matches the phrase
Men's Health

afaik
LIKE '%\nMen\'s Health\n%'

does match
Men's Health

Are you sure there isn't something else(like a tag or something) that is matching the term health?
hutman replied on at Permalink Reply
hutman
I'm not honestly sure. I originally tried it with single quotes but that wasn't working right because of the escaping. I looked at the debug output from the page list on both the dropdown filtered page list and the search block page list and noticed that the dropdown had double quotes. Then I changed the search to match.

It does seem to match properly when using the dropdown filter with the double quotes.
mkly replied on at Permalink Reply
mkly
Ok, learning me a MySQL.

It appears this needs to be
LIKE '%\\nMen\'s Health\\n%'


So that kind of brings us back to

$pl->filter(false, 'ak_show_topic LIKE '.$db->quote('%\n'.$topic.'\n%'));


Sorry my mysql is awful.
hutman replied on at Permalink Reply
hutman
My mysql is not the best, either.

When I had it like that, it only returned 3 pages, and one of them was Women's. There should 8 of them with Men's Health. If you have the %\n in double quotes, then it shows up with only one \ in front of the n when you debug, but then returns both entries. If you use single quotes you get the \\n format, but it only returns the 3 pages that are women's.