Pattern Matching makes it possible to search not only for particular characters, but also for types of characters such as “any digit” or for characters that meet special conditions such as “occurring at the beginning of a line”.
These generalized searches are performed by using “pattern matching codes” within the search string. Each pattern matching code consists of the special character “|” followed by another character, typically a mnemonic letter.
“|” is the “pipe” character, which is Shift-\ on the keyboard. All pattern matching codes begin with this character. Although the mnemonic letter can be in upper or lower case, for purposes of clarity, all examples show these letters in uppercase. Most of the pattern matching codes only have a special meaning in the search string; they have no meaning in the replacement string. If you need “variable” characters in the replacement string, you must use regular expressions.
Examples of search strings using pattern matching
- |D|D
- Search for two consecutive digits
- |!|D|D|D|!|D
- Search for the next two-digit number (two digits with a non-digit on either side)
- |<note
- Search for a line beginning with the word "note"
- |W|>
- Search for whitespace (any number of spaces and tabs) at the end of a line
- t|A|A|An
- Search for any five-letter word beginning in “t” and ending in “n”
- |000
- Search for the “Null” character (value 000)
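For readers more familiar with regular expressions, the example search strings above can be approximated in Python. This is only a rough sketch: the regex on the right of each entry is an assumed equivalent, not the editor's own implementation, and the helper name `find` is hypothetical. Letters are treated as ASCII only, so the “Support non-English characters” option is not modeled, and the negated class used for “|!” also matches newline characters, which may differ from the editor's behavior.

```python
import re

# Assumed regex approximations of the example search strings.
EXAMPLES = {
    "|D|D": r"[0-9][0-9]",                     # two consecutive digits
    "|!|D|D|D|!|D": r"[^0-9][0-9][0-9][^0-9]",  # two digits, non-digit on each side
    "|<note": r"^note",                        # "note" at the beginning of a line
    "|W|>": r"[ \t]+$",                        # spaces/tabs at the end of a line
    "t|A|A|An": r"t[A-Za-z][A-Za-z][A-Za-z]n",  # t + three letters + n
    "|000": r"\x00",                           # the Null character
}


def find(code, text):
    """Return the first text matched by the regex assumed for `code`, or None."""
    match = re.search(EXAMPLES[code], text, re.MULTILINE)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(repr(find("|D|D", "abc 1234")))
    print(repr(find("|W|>", "abc \t\ndef")))
    print(repr(find("t|A|A|An", "the token")))
```

`re.MULTILINE` is used so that `^` and `$` apply to each line, which mirrors how “|<” and “|>” are described as matching line boundaries without consuming the newline.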
Pattern Matching Codes
- |A
- Match any alphabetic letter, upper or lower case. It supports non-English letters, such as “umlauts”, if CONFIG > Search options > Support non-English characters has been enabled
- |B
- Match a blank - a single Space or Tab. See also |W and |X
- |C
- Match any Control Character - a character with an ASCII decimal value of 0 to 31
- |D
- Match any numeric digit - “0” through “9”. This code does not match “.” or “,”
- |F
- Match any alphanumeric character - a letter or a digit
- |G
- Match any graphics character - characters with decimal value greater than 128. It is useful for finding stray graphics (8-bit) characters in a file.
- |Hhh
- Match the character with hexadecimal value ‘hh’. Both digits MUST be present. This code can also be used in the replacement string.
- |I
- Match any word separator - Space, Tab, any control character, or one of the additional configurable word separators defined by Config_String(WORD_SEP).
- |K
- Match any (non-standard) control character other than Tab, Carriage-Return and Line-Feed. It is useful for finding stray control characters in a file. See also |C and |G.
- |L
- Match the “newline” character(s) Carriage-Return and/or Line-Feed depending upon the file type. With Windows/DOS files, the Carriage-Return is optional. Also see |N and Pattern Matching the Newline.
- |M
- Match multiple characters - zero, one or more characters until the string following the |M is satisfied. Since the match may cover many lines, it may match a huge number of characters. Use |* instead to match multiple characters on one line. This code is generally not useful as the first item in a search string. See also |Y and the following sub-topic “Matching Multiple Characters”.
- |N
- Match the “newline” Carriage-Return and/or Line-Feed depending upon the file type. With Windows/DOS files, the Carriage-Return is mandatory. This code can also be used in the replacement string. See also |L and Pattern Matching the Newline.
- |Oooo
- Match the character with octal value ‘ooo’. Three digits MUST be present. This code can also be used in the replacement string.
- |P
- Match any parenthesis - { }, [ ], < > and ( ). (Used internally by GOTO > Matching ( ).)
- |S
- Match any separator - a character which is not a letter, a digit or underscore “_”. Space, Tab and all control characters are separators. Graphics characters (value 128-255) are not separators.
- |T
- Match the ASCII Tab character (hex 09). This code can also be used in the replacement string.
- |U
- Match any uppercase letter. This pattern supports non-English letters, such as “umlauts”, if CONFIG > Search options > Support non-English characters has been enabled.
- |V
- Match any lowercase letter. See description for |U.
- |W
- Match “whitespace” - one or more Spaces and/or Tabs. See also |B and |X
- |X
- Match extended whitespace - one or more Spaces, Tabs, Carriage-Returns and/or Line-Feeds. See also |B and |W.
- |Y
- Match zero, one or more characters until the immediately following character or pattern matching code is satisfied. This code is generally not useful as the first item in a search string. Also see |M, |* and Pattern Matching Multiple Characters.
- |ddd
- Match the character with decimal value ‘ddd’. This code can also be used in the replacement string.
- |000
- Match the "Null" character ( ASCII 0 ).
- |<
- Match the beginning of a line - the following matched characters must occur at the beginning of a line. A search for just |< does not match the End-Of-File.
- |>
- Match the end of a line - the preceding matched characters must occur at the end of a line. With CONFIG > File Handling > File type set to Record mode, it matches the end of a record.
Unlike “|L” and “|N”, “|<” and “|>” do not include the “newline” character(s) in the matched text. This is an important distinction when performing a replacement: with “|L” and “|N” the newline character(s) will be replaced; with “|<” and “|>” they are not replaced. “|>” will match the End-Of-File.
- |*
- Match multiple characters on the same line - zero, one or more characters until the string following the |* is satisfied. However, unlike |M, all matched characters must be on the same line. This code is generally not useful as the first item in a search string. See also “|Y” and Pattern Matching Multiple Characters.
- |?
- Match any single character; this is the simple “wildcard” similar to “?” in filenames.
- |!
- Match any character except the following character or pattern code. Use this code to exclude a certain character or type of character. For example, to search for “exam ” or “examiner” but not “exams”, use “exam|!s”. Think of |!x as “not x”.
- |@(r)
- Use the contents of text register ‘r’ in this position in the search string. This code can also be used in the replacement string.
- |{set}
- Match any one item in the “pattern set”. Learn more about Pattern Matching Sets.
- |[set]
- Match one optional occurrence of any item in the “pattern set”. This code is not meaningful as the first item in a search string. Learn more about Pattern Matching Sets.
- ||
- Match the “|” character. You need a double “||” to search for a single “|” in your text. A double “||” is also needed on the replacement side.
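To illustrate how a search string is built from literal characters and “|” codes, the following Python sketch translates a small subset of the codes above into a regular expression. It is a hedged illustration, not the editor's implementation: the function name `to_regex` and the `SIMPLE` table are hypothetical, the regex equivalents are assumptions limited to ASCII, the decimal form is assumed to require exactly three digits, and codes outside the subset (such as |M, |Y, |!, |@(r) and the pattern sets) are rejected rather than emulated.

```python
import re

# Assumed ASCII-only regex equivalents for a subset of the codes.
SIMPLE = {
    "A": r"[A-Za-z]",      # any letter
    "B": r"[ \t]",         # a single Space or Tab
    "C": r"[\x00-\x1f]",   # control character, decimal 0 to 31
    "D": r"[0-9]",         # digit
    "F": r"[A-Za-z0-9]",   # letter or digit
    "T": r"\t",            # Tab
    "U": r"[A-Z]",         # uppercase letter
    "V": r"[a-z]",         # lowercase letter
    "W": r"[ \t]+",        # one or more Spaces/Tabs
    "X": r"[ \t\r\n]+",    # whitespace including CR/LF
    "<": r"^",             # beginning of a line (use re.MULTILINE)
    ">": r"$",             # end of a line (use re.MULTILINE)
    "|": r"\|",            # a literal "|"
}


def _numeric(search, start, width, base):
    """Read `width` digits at `start` and return an escaped literal character."""
    digits = search[start:start + width]
    if len(digits) != width:
        raise ValueError("numeric code needs %d digits" % width)
    return re.escape(chr(int(digits, base)))


def to_regex(search):
    """Translate a search string using the supported codes into a regex."""
    out = []
    i = 0
    while i < len(search):
        ch = search[i]
        if ch != "|":
            out.append(re.escape(ch))  # ordinary literal character
            i += 1
            continue
        if i + 1 >= len(search):
            raise ValueError("dangling '|' at end of search string")
        code = search[i + 1]
        key = code.upper()  # mnemonic letters may be upper or lower case
        if key == "H":                      # |Hhh, hexadecimal
            out.append(_numeric(search, i + 2, 2, 16))
            i += 4
        elif key == "O":                    # |Oooo, octal
            out.append(_numeric(search, i + 2, 3, 8))
            i += 5
        elif code in "0123456789":          # |ddd, decimal
            out.append(_numeric(search, i + 1, 3, 10))
            i += 4
        elif key in SIMPLE:
            out.append(SIMPLE[key])
            i += 2
        else:
            raise ValueError("unsupported code |" + code)
    return "".join(out)


if __name__ == "__main__":
    pattern = to_regex("t|A|A|An")
    print(pattern, re.search(pattern, "a token").group(0))
```

Rejecting unsupported codes keeps the sketch honest: codes such as |M and |Y depend on what follows them, so a simple one-code-at-a-time table cannot reproduce them faithfully.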