9.13. Text Search Functions and Operators

Table 9.41, Table 9.42 and Table 9.43 summarize the functions and operators that are provided for full text searching. See Chapter 12 for a detailed explanation of PostgreSQL's text search facility.

Table 9.41. Text Search Operators

OperatorReturn TypeDescriptionExampleResult
@@booleantsvector matches tsquery ?to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat')t
@@@booleandeprecated synonym for @@to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat')t
||tsvectorconcatenate tsvectors'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector'a':1 'b':2,5 'c':3 'd':4
&&tsqueryAND tsquerys together'fat | rat'::tsquery && 'cat'::tsquery( 'fat' | 'rat' ) & 'cat'
||tsqueryOR tsquerys together'fat | rat'::tsquery || 'cat'::tsquery( 'fat' | 'rat' ) | 'cat'
!!tsquerynegate a tsquery!! 'cat'::tsquery!'cat'
<->tsquerytsquery followed by tsqueryto_tsquery('fat') <-> to_tsquery('rat')'fat' <-> 'rat'
@>booleantsquery contains another ?'cat'::tsquery @> 'cat & rat'::tsqueryf
<@booleantsquery is contained in ?'cat'::tsquery <@ 'cat & rat'::tsqueryt

Note

The tsquery containment operators consider only the lexemes listed in the two queries, ignoring the combining operators.

In addition to the operators shown in the table, the ordinary B-tree comparison operators (=, <, etc) are defined for types tsvector and tsquery. These are not very useful for text searching but allow, for example, unique indexes to be built on columns of these types.

Table 9.42. Text Search Functions

FunctionReturn TypeDescriptionExampleResult
array_to_tsvector(text[])tsvectorconvert array of lexemes to tsvectorarray_to_tsvector('{fat,cat,rat}'::text[])'cat' 'fat' 'rat'
get_current_ts_config()regconfigget default text search configurationget_current_ts_config()english
length(tsvector)integernumber of lexemes in tsvectorlength('fat:2,4 cat:3 rat:5A'::tsvector)3
numnode(tsquery)integernumber of lexemes plus operators in tsquery numnode('(fat & rat) | cat'::tsquery)5
plainto_tsquery([configregconfig , ] querytext)tsqueryproduce tsquery ignoring punctuationplainto_tsquery('english', 'The Fat Rats')'fat' & 'rat'
phraseto_tsquery([configregconfig , ] querytext)tsqueryproduce tsquery that searches for a phrase, ignoring punctuationphraseto_tsquery('english', 'The Fat Rats')'fat' <-> 'rat'
websearch_to_tsquery([configregconfig , ] querytext)tsqueryproduce tsquery from a web search style querywebsearch_to_tsquery('english', '"fat rat" or rat')'fat' <-> 'rat' | 'rat'
querytree(querytsquery)textget indexable part of a tsqueryquerytree('foo & ! bar'::tsquery)'foo'
setweight(vectortsvector, weight"char")tsvectorassign weight to each element of vectorsetweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')'cat':3A 'fat':2A,4A 'rat':5A
setweight(vectortsvector, weight"char", lexemestext[])tsvectorassign weight to elements of vector that are listed in lexemessetweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}')'cat':3A 'fat':2,4 'rat':5A
strip(tsvector)tsvectorremove positions and weights from tsvectorstrip('fat:2,4 cat:3 rat:5A'::tsvector)'cat' 'fat' 'rat'
to_tsquery([configregconfig , ] querytext)tsquerynormalize words and convert to tsqueryto_tsquery('english', 'The & Fat & Rats')'fat' & 'rat'
to_tsvector([configregconfig , ] documenttext)tsvectorreduce document text to tsvectorto_tsvector('english', 'The Fat Rats')'fat':2 'rat':3
to_tsvector([configregconfig , ] documentjson(b))tsvector reduce each string value in the document to a tsvector, and then concatenate those in document order to produce a single tsvectorto_tsvector('english', '{"a": "The Fat Rats"}'::json)'fat':2 'rat':3
json(b)_to_tsvector([configregconfig, ] documentjson(b), filterjson(b))tsvector reduce each value in the document, specified by filter to a tsvector, and then concatenate those in document order to produce a single tsvector. filter is a jsonb array, that enumerates what kind of elements need to be included into the resulting tsvector. Possible values for filter are "string" (to include all string values), "numeric" (to include all numeric values in the string format), "boolean" (to include all Boolean values in the string format "true"/"false"), "key" (to include all keys) or "all" (to include all above). These values can be combined together to include, e.g., all string and numeric values. json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]')'123':5 'fat':2 'rat':3
ts_delete(vectortsvector, lexemetext)tsvectorremove given lexeme from vectorts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')'cat':3 'rat':5A
ts_delete(vectortsvector, lexemestext[])tsvectorremove any occurrence of lexemes in lexemes from vectorts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])'cat':3
ts_filter(vectortsvector, weights"char"[])tsvectorselect only elements with given weights from vectorts_filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}')'cat':3B 'rat':5A
ts_headline([configregconfig, ] documenttext, querytsquery [, optionstext])textdisplay a query matchts_headline('x y z', 'z'::tsquery)x y <b>z</b>
ts_headline([configregconfig, ] documentjson(b), querytsquery [, optionstext])textdisplay a query matchts_headline('{"a":"x y z"}'::json, 'z'::tsquery){"a":"x y <b>z</b>"}
ts_rank([weightsfloat4[], ] vectortsvector, querytsquery [, normalizationinteger])float4rank document for queryts_rank(textsearch, query)0.818
ts_rank_cd([weightsfloat4[], ] vectortsvector, querytsquery [, normalizationinteger])float4rank document for query using cover densityts_rank_cd('{0.1, 0.2, 0.4, 1.0}', textsearch, query)2.01317
ts_rewrite(querytsquery, targettsquery, substitutetsquery)tsqueryreplace target with substitute within queryts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery)'b' & ( 'foo' | 'bar' )
ts_rewrite(querytsquery, selecttext)tsqueryreplace using targets and substitutes from a SELECT commandSELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')'b' & ( 'foo' | 'bar' )
tsquery_phrase(query1tsquery, query2tsquery)tsquerymake query that searches for query1 followed by query2 (same as <-> operator)tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))'fat' <-> 'cat'
tsquery_phrase(query1tsquery, query2tsquery, distanceinteger)tsquerymake query that searches for query1 followed by query2 at distance distancetsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)'fat' <10> 'cat'
tsvector_to_array(tsvector)text[]convert tsvector to array of lexemestsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector){cat,fat,rat}
tsvector_update_trigger()triggertrigger function for automatic tsvector column updateCREATE TRIGGER ... tsvector_update_trigger(tsvcol, 'pg_catalog.swedish', title, body)
tsvector_update_trigger_column()triggertrigger function for automatic tsvector column updateCREATE TRIGGER ... tsvector_update_trigger_column(tsvcol, configcol, title, body)
unnest(tsvector, OUT lexemetext, OUT positionssmallint[], OUT weightstext)setof recordexpand a tsvector to a set of rowsunnest('fat:2,4 cat:3 rat:5A'::tsvector)(cat,{3},{D}) ...

Note

All the text search functions that accept an optional regconfig argument will use the configuration specified by default_text_search_config when that argument is omitted.

The functions in Table 9.43 are listed separately because they are not usually used in everyday text searching operations. They are helpful for development and debugging of new text search configurations.

Table 9.43. Text Search Debugging Functions

FunctionReturn TypeDescriptionExampleResult
ts_debug([configregconfig, ] documenttext, OUT aliastext, OUT descriptiontext, OUT tokentext, OUT dictionariesregdictionary[], OUT dictionaryregdictionary, OUT lexemestext[])setof recordtest a configurationts_debug('english', 'The Brightest supernovaes')(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ...
ts_lexize(dictregdictionary, tokentext)text[]test a dictionaryts_lexize('english_stem', 'stars'){star}
ts_parse(parser_nametext, documenttext, OUT tokidinteger, OUT tokentext)setof recordtest a parserts_parse('default', 'foo - bar')(1,foo) ...
ts_parse(parser_oidoid, documenttext, OUT tokidinteger, OUT tokentext)setof recordtest a parserts_parse(3722, 'foo - bar')(1,foo) ...
ts_token_type(parser_nametext, OUT tokidinteger, OUT aliastext, OUT descriptiontext)setof recordget token types defined by parserts_token_type('default')(1,asciiword,"Word, all ASCII") ...
ts_token_type(parser_oidoid, OUT tokidinteger, OUT aliastext, OUT descriptiontext)setof recordget token types defined by parserts_token_type(3722)(1,asciiword,"Word, all ASCII") ...
ts_stat(sqlquerytext, [weightstext, ] OUT wordtext, OUT ndocinteger, OUT nentryinteger)setof recordget statistics of a tsvector columnts_stat('SELECT vector from apod')(foo,10,15) ...

close