CFLib.org – Common Function Library Project

solrClean(input)

Last updated December 30, 2011

Version: 2 | Requires: ColdFusion 9 | Library: UtilityLib

 
Rated 1 time(s). Average Rating: 5.0

Description:
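Like VerityClean, massages text input to make it Solr compatible: it replaces commas with " or ", strips special characters, collapses runs of whitespace, removes leading wildcard characters (* and ?), and normalizes keyword case. NOTE: requires the uCaseWordsForSolr UDF.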

Return Values:
Returns a string.

Example:

<cfset cleanSolrSearchText = solrClean(userSearchText) />
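A slightly fuller usage sketch follows. It assumes a Solr collection named "siteDocs" and a form field named "q"; both names are hypothetical and not part of the original example. Because solrClean can return an empty string (for instance when the input consists only of wildcard characters), the sketch checks the length before searching.

```cfml
<!--- Hypothetical names: the "siteDocs" collection and the form field "q" are assumptions --->
<cfparam name="form.q" default="" />
<cfset userSearchText = form.q />
<cfset cleanSolrSearchText = solrClean(userSearchText) />

<cfif len(cleanSolrSearchText)>
    <!--- Pass the cleaned text to cfsearch as the search criteria --->
    <cfsearch collection="siteDocs" criteria="#cleanSolrSearchText#" name="searchResults" maxrows="25" />
    <cfoutput query="searchResults">
        #htmlEditFormat(searchResults.title)# (#searchResults.score#)<br />
    </cfoutput>
<cfelse>
    No usable search terms were entered.
</cfif>
```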

Parameters:

Name     Description              Required
input    String to run against    Yes

Full UDF Source:

<!---
 Like VerityClean, massages text input to make it Solr compatible.
 v2 by Daria Norris to deal with wildcard characters used as the first letter of the search
 
 @param input      String to run against (Required)
 @return Returns a string. 
 @author Sami Hoda (sami@bytestopshere.com) 
 @version 2, December 30, 2011 
--->

<cffunction name="solrClean" access="public" output="false" returntype="Any" >
    <cfargument name="input" type="string" default="" required="true" hint="String to run against" />

    <cfset var cleanText = trim(arguments.input) />

    <!--- // List of special characters to remove --->
    <cfset var reBadChars = "\\|@|'|<|>|\(|\)|!|=|\[|\]|\{|\}|\#chr(44)#|`" />


    <cfscript>
    //=-=-=-=-=-=-=-=-
    // Replace comma with OR
    //=-=-=-=-=-=-=-=-
    cleanText = replace(cleanText, "," , " or " , "all");

    //=-=-=-=-=-=-=-=-
    // Strip double spaces
    //=-=-=-=-=-=-=-=-
    cleanText = reReplace(cleanText," {2,}"," ","all");

    //=-=-=-=-=-=-=-=-=-
    // Strip bad characters
    //=-=-=-=-=-=-=-=-=
    cleanText = reReplace(cleanText,reBadChars," ","all");

    //=-=-=-=-=-=-=-=-
    // Clean up sequences of space characters
    //=-=-=-=-=-=-=-=-
    cleanText = reReplace(cleanText,"[[:space:]]+"," ","all");

    // clean up wildcard characters as first characters
    cleanText = reReplace(cleanText,'(^[\*\?]{1,})','');

    //=-=-=-=-=-=-=-=-=-
    // Upper-case operator keywords (and -> AND, etc.) and lower-case the rest; Solr treats mixed-case keywords as case-sensitive
    //=-=-=-=-=-=-=-=-=
    cleanText = uCaseWordsForSolr(cleanText);
    
    </cfscript>

    <cfreturn trim(cleanText) />
</cffunction>
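
The source of uCaseWordsForSolr is not shown on this page. To try solrClean in isolation, a minimal stand-in might look like the sketch below. It is an assumption for illustration only, not the actual CFLib UDF: it lower-cases the input and upper-cases an assumed operator list (and, or, not) wherever those words appear on their own.

```cfml
<!--- Hypothetical stand-in for uCaseWordsForSolr; NOT the CFLib implementation --->
<cffunction name="uCaseWordsForSolr" access="public" output="false" returntype="string">
    <cfargument name="input" type="string" required="true" />

    <cfset var result = lCase(arguments.input) />
    <cfset var op = "" />

    <!--- Assumed operator list; upper-case each one where it appears as a whole word --->
    <cfloop list="and,or,not" index="op">
        <cfset result = reReplace(result, "\b#op#\b", uCase(op), "all") />
    </cfloop>

    <cfreturn result />
</cffunction>

<!---
 Walk-through for the sample input "*Apache,  Solr (search)":
   comma replaced with " or "   -> "*Apache or   Solr (search)"
   double spaces stripped       -> "*Apache or Solr (search)"
   bad characters to spaces     -> "*Apache or Solr  search "
   whitespace runs collapsed    -> "*Apache or Solr search "
   leading wildcard removed     -> "Apache or Solr search "
   stand-in case normalization  -> "apache OR solr search "
   final trim                   -> "apache OR solr search"
--->
<cfoutput>#solrClean("*Apache,  Solr (search)")#</cfoutput>
```

With this stand-in in place, the call above should print apache OR solr search. The real uCaseWordsForSolr may handle a different set of keywords, so results with the CFLib version can differ.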

Created by Raymond Camden / Design by Justin Johnson