F.43. unaccent
  
  unaccent is a text search dictionary that removes accents
  (diacritic signs) from lexemes.
  It's a filtering dictionary, which means its output is
  always passed to the next dictionary (if any), unlike the normal
  behavior of dictionaries.  This allows accent-insensitive processing
  for full text search.
 
  The current implementation of unaccent cannot be used as a
  normalizing dictionary for the thesaurus dictionary.
 
  This module is considered "trusted", that is, it can be
  installed by non-superusers who have CREATE privilege
  on the current database.
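
  Because the module is trusted, installing it usually comes down to a
  single statement run in the target database. The following is a minimal
  sketch, assuming the module's files are already present on the server and
  the current role has CREATE privilege on the database; the catalog query
  is only an illustrative way to check the result:

```sql
-- Install the extension in the current database
-- (assumes the role has CREATE privilege on this database).
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Check that a dictionary named "unaccent" now exists.
SELECT dictname FROM pg_ts_dict WHERE dictname = 'unaccent';
```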
 
F.43.1. Configuration
   An unaccent dictionary accepts the following options:
  
- RULES is the base name of the file containing the list of translation rules. This file must be stored in $SHAREDIR/tsearch_data/ (where $SHAREDIR means the PostgreSQL installation's shared-data directory). Its name must end in .rules (which is not to be included in the RULES parameter).

The rules file has the following format:

- Each line represents one translation rule, consisting of a character with accent followed by a character without accent. The first is translated into the second. For example,

      À        A
      Á        A
      Â        A
      Ã        A
      Ä        A
      Å        A
      Æ        AE

  The two characters must be separated by whitespace, and any leading or trailing whitespace on a line is ignored.
- Alternatively, if only one character is given on a line, instances of that character are deleted; this is useful in languages where accents are represented by separate characters.
- Actually, each "character" can be any string not containing whitespace, so unaccent dictionaries could be used for other sorts of substring substitutions besides diacritic removal.
- As with other PostgreSQL text search configuration files, the rules file must be stored in UTF-8 encoding. The data is automatically translated into the current database's encoding when loaded. Any lines containing untranslatable characters are silently ignored, so that rules files can contain rules that are not applicable in the current encoding.

   A more complete example, which is directly useful for most European
   languages, can be found in unaccent.rules, which is installed
   in $SHAREDIR/tsearch_data/ when the unaccent
   module is installed.  This rules file translates characters with accents
   to the same characters without accents, and it also expands ligatures
   into the equivalent series of simple characters (for example, Æ to
   AE).
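
   As an illustration of the format rules described above, a small custom
   rules file might look like the sketch below. The file name
   my_rules.rules and the particular rules chosen are assumptions made for
   this example only. The first two lines map one character to another, the
   next two expand a single character into a longer string, and the last
   line, which contains only one character, causes that character to be
   deleted.

```text
é        e
ü        u
ß        ss
œ        oe
´
```

   Saved as $SHAREDIR/tsearch_data/my_rules.rules in UTF-8 encoding, such a
   file would be selected with RULES='my_rules'.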
  
F.43.2. Usage
   Installing the unaccent extension creates a text
   search template unaccent and a dictionary unaccent
   based on it.  The unaccent dictionary has the default
   parameter setting RULES='unaccent', which makes it immediately
   usable with the standard unaccent.rules file.
   If you wish, you can alter the parameter, for example
  
mydb=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
or create new dictionaries based on the template.
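Creating a separate dictionary from the template leaves the default unaccent dictionary untouched. A minimal sketch, in which the dictionary name my_unaccent is hypothetical and a rules file my_rules.rules is assumed to exist in $SHAREDIR/tsearch_data/:

```sql
-- Hypothetical dictionary name; assumes my_rules.rules is present
-- in $SHAREDIR/tsearch_data/.
CREATE TEXT SEARCH DICTIONARY my_unaccent (
    TEMPLATE = unaccent,
    RULES = 'my_rules'
);
```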
To test the dictionary, you can try:
mydb=# select ts_lexize('unaccent','Hôtel');
 ts_lexize
-----------
 {Hotel}
(1 row)
  
   Here is an example showing how to insert the unaccent
   dictionary into a text search configuration:
  
mydb=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
mydb=# ALTER TEXT SEARCH CONFIGURATION fr
        ALTER MAPPING FOR hword, hword_part, word
        WITH unaccent, french_stem;
mydb=# select to_tsvector('fr','Hôtels de la Mer');
    to_tsvector
-------------------
 'hotel':1 'mer':4
(1 row)
mydb=# select to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');
 ?column?
----------
 t
(1 row)
mydb=# select ts_headline('fr','Hôtel de la Mer',to_tsquery('fr','Hotels'));
      ts_headline
------------------------
 Hôtel de la Mer
(1 row)
  
F.43.3. Functions
   The unaccent() function removes accents (diacritic signs) from
   a given string.  Basically, it's a wrapper around
   unaccent-type dictionaries, but it can be used outside normal
   text search contexts.
  
unaccent([dictionary regdictionary, ] string text) returns text
   If the dictionary argument is
   omitted, the text search dictionary named unaccent and
   appearing in the same schema as the unaccent()
   function itself is used.
  
For example:
SELECT unaccent('unaccent', 'Hôtel');
SELECT unaccent('Hôtel');
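Since unaccent() works outside text search, it can also be applied to both sides of an ordinary comparison to make it accent-insensitive. The following is only a sketch, assuming the default unaccent dictionary with the standard rules file; the inline sample values stand in for a real table column:

```sql
-- Strip accents from both the stored value and the search pattern
-- before comparing them. The VALUES list is illustrative sample data.
SELECT name
FROM (VALUES ('Hôtel de la Mer'),
             ('Hotel California'),
             ('Motel 6')) AS t(name)
WHERE unaccent(name) ILIKE unaccent('hôtel%');
```

With the standard rules, the two rows beginning with "Hôtel" and "Hotel" would be expected to match, while "Motel 6" would not.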