9.5. Binary String Functions and Operators
  This section describes functions and operators for examining and
    manipulating binary strings, that is values of type
  
   bytea
  
  .
    Many of these are equivalent, in purpose and syntax, to the
    text-string functions described in the previous section.
 
SQL defines some string functions that use key words, rather than commas, to separate arguments. Details are in Table 9.11 . PostgreSQL also provides versions of these functions that use the regular function invocation syntax (see Table 9.12 ).
Table 9.11. SQL Binary String Functions and Operators
Additional binary string manipulation functions are available and are listed in Table 9.12 . Some of them are used internally to implement the SQL -standard string functions listed in Table 9.11 .
Table 9.12. Other Binary String Functions
| Function Description Example(s) | 
|---|
| 
        
        
        
        
         Returns the number of bits set in the binary string (also known as " popcount " ). 
         | 
| 
        
        
         Extracts n'th bit from binary string. 
         | 
| 
        
        
         Extracts n'th byte from binary string. 
         | 
| Returns the number of bytes in the binary string. 
         | 
| 
         
        Returns the number of characters in the binary string, assuming
        that it is text in the given
        
          
         | 
| Computes the MD5 hash of the binary string, with the result written in hexadecimal. 
         | 
| 
        
        
         
        Sets
        
         n'th
        
        bit in
        binary string to
        
          
         | 
| 
        
        
         
        Sets
        
         n'th
        
        byte in
        binary string to
        
          
         | 
| Computes the SHA-224 hash of the binary string. 
         | 
| Computes the SHA-256 hash of the binary string. 
         | 
| Computes the SHA-384 hash of the binary string. 
         | 
| Computes the SHA-512 hash of the binary string. 
         | 
| 
        
        
         
        Extracts the substring of
        
          
         | 
  Functions
  
   get_byte
  
  and
  
   set_byte
  
  number the first byte of a binary string as byte 0.
   Functions
  
   get_bit
  
  and
  
   set_bit
  
  number bits from the right within each byte; for example bit 0 is the least
   significant bit of the first byte, and bit 15 is the most significant bit
   of the second byte.
 
  For historical reasons, the function
  
   md5
  
  returns a hex-encoded value of type
  
   text
  
  whereas the SHA-2
   functions return type
  
   bytea
  
  .  Use the functions
  
   
    encode
   
  
  and
  
   
    decode
   
  
  to
   convert between the two.  For example write
  
   encode(sha256('abc'),
   'hex')
  
  to get a hex-encoded text representation,
   or
  
   decode(md5('abc'), 'hex')
  
  to get
   a
  
   bytea
  
  value.
 
  
  
  
  
  Functions for converting strings between different character sets
   (encodings), and for representing arbitrary binary data in textual
   form, are shown in
  
   Table 9.13
  
  .  For these
   functions, an argument or result of type
  
   text
  
  is expressed
   in the database's default encoding, while arguments or results of
   type
  
   bytea
  
  are in an encoding named by another argument.
 
Table 9.13. Text/Binary String Conversion Functions
| Function Description Example(s) | 
|---|
| 
        
        
         
        Converts a binary string representing text in
       encoding
        
          
         | 
| 
        
        
         
        Converts a binary string representing text in
       encoding
        
          
         | 
| 
        
        
         
        Converts a
         
         | 
| 
        
        
         
        Encodes binary data into a textual representation; supported
        
          
         | 
| 
        
        
         
        Decodes binary data from a textual representation; supported
        
          
         | 
  The
  
   encode
  
  and
  
   decode
  
  functions support the following textual formats:
 
- base64 #
- 
    The base64format is that of RFC 2045 Section 6.8 . As per the RFC , encoded lines are broken at 76 characters. However instead of the MIME CRLF end-of-line marker, only a newline is used for end-of-line. Thedecodefunction ignores carriage-return, newline, space, and tab characters. Otherwise, an error is raised whendecodeis supplied invalid base64 data - including when trailing padding is incorrect.
- escape #
- 
    The escapeformat converts zero bytes and bytes with the high bit set into octal escape sequences (\nnn), and it doubles backslashes. Other byte values are represented literally. Thedecodefunction will raise an error if a backslash is not followed by either a second backslash or three octal digits; it accepts other byte values unchanged.
- hex #
- 
    The hexformat represents each 4 bits of data as one hexadecimal digit,0throughf, writing the higher-order digit of each byte first. Theencodefunction outputs thea-fhex digits in lower case. Because the smallest unit of data is 8 bits, there are always an even number of characters returned byencode. Thedecodefunction accepts thea-fcharacters in either upper or lower case. An error is raised whendecodeis given invalid hex data - including when given an odd number of characters.
  See also the aggregate function
  
   string_agg
  
  in
  
   Section 9.21
  
  and the large object functions
   in
  
   Section 33.4
  
  .