F.42. pg_tsparser

pg_tsparser is a Postgres Pro extension for text search. This extension modifies the default text parsing strategy for words that include:

  • underscores

  • numbers and letters separated by the hyphen character

In addition to the separate word parts that the default parser returns, pg_tsparser also returns the whole word.

F.42.1. Installation and Setup

pg_tsparser is included in the Postgres Pro distribution. To enable pg_tsparser, once Postgres Pro is installed, create the pg_tsparser extension in each database where you plan to use it:

 CREATE EXTENSION pg_tsparser; 
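To confirm that the extension and its parser are registered in the current database, you can query the system catalogs. This is a minimal sketch: pg_extension and pg_ts_parser are standard catalogs, and the parser name tsparser is assumed from the configuration example in this section.

```sql
-- Check that the extension is installed in the current database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_tsparser';

-- List the registered text search parsers; the parser provided by
-- the extension (assumed to be named tsparser) should be listed
-- next to the built-in default parser.
SELECT prsname
FROM pg_ts_parser
ORDER BY prsname;
```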

Once pg_tsparser is enabled, you can create your own text search configuration. You can combine the pg_tsparser parser with any available dictionary.

For example, you can create the english_ts configuration for the English language as follows:

 CREATE TEXT SEARCH CONFIGURATION english_ts (
     PARSER = tsparser
 );

 COMMENT ON TEXT SEARCH CONFIGURATION english_ts
     IS 'text search configuration for english language';

 ALTER TEXT SEARCH CONFIGURATION english_ts
     ADD MAPPING FOR email, file, float, host, hword_numpart, int,
     numhword, numword, sfloat, uint, url, url_path, version
     WITH simple;

 ALTER TEXT SEARCH CONFIGURATION english_ts
     ADD MAPPING FOR asciiword, asciihword, hword_asciipart,
     word, hword, hword_part
     WITH english_stem;
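To check how the new configuration processes a document, one option is the standard ts_debug function, which reports each token together with its type and the dictionary that handled it. The sketch below assumes that the english_ts configuration has been created as shown above; the sample string is arbitrary, and the output is omitted because it depends on the installed version.

```sql
-- Show each token, its type alias, the dictionary that accepted it,
-- and the resulting lexemes for the english_ts configuration.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('english_ts', 'pg_trgm 123-abc');

-- Optionally make english_ts the default configuration for the
-- current session, so that to_tsvector(text) uses it implicitly.
SET default_text_search_config = 'english_ts';
```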

F.42.2. Examples

The following examples illustrate the difference in search results returned by pg_tsparser and the default parser:

 SELECT to_tsvector('english', 'pg_trgm') as def_parser,
        to_tsvector('english_ts', 'pg_trgm') as new_parser;
    def_parser    |         new_parser
 -----------------+-----------------------------
  'pg':1 'trgm':2 | 'pg':2 'pg_trgm':1 'trgm':3
 (1 row)

 SELECT to_tsvector('english', '123-abc') as def_parser,
        to_tsvector('english_ts', '123-abc') as new_parser;
    def_parser    |         new_parser
 -----------------+-----------------------------
  '123':1 'abc':2 | '123':2 '123-abc':1 'abc':3
 (1 row)

 SELECT to_tsvector('english', 'rel-3.2-A') as def_parser,
        to_tsvector('english_ts', 'rel-3.2-A') as new_parser;
     def_parser    |          new_parser
 ------------------+-------------------------------
  '-3.2':2 'rel':1 | '3.2':3 'rel':2 'rel-3.2-a':1
 (1 row)
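To look at the raw tokens behind these differences, before any dictionary is applied, you can call the standard ts_parse function with each parser name. This is a sketch only: default is the name of the built-in parser, tsparser is the parser name used in the configuration above, and the output is omitted because token type identifiers are specific to each parser.

```sql
-- Tokens produced by the built-in parser.
SELECT * FROM ts_parse('default', 'pg_trgm');

-- Tokens produced by the pg_tsparser parser.
SELECT * FROM ts_parse('tsparser', 'pg_trgm');

-- Resolve numeric token type ids to their aliases for readability.
SELECT t.alias, p.token
FROM ts_parse('tsparser', 'pg_trgm') AS p
JOIN ts_token_type('tsparser') AS t ON t.tokid = p.tokid;
```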

See Also

CREATE TEXT SEARCH CONFIGURATION

ALTER TEXT SEARCH CONFIGURATION

F.42.3. Authors

Postgres Professional, Moscow, Russia
