As of MarcEdit 4.5, regular expressions can be used in the Replace
and Edit Subfield functions in the MarcEditor. This allows users to create complex
search and replacement functions. In general, MarcEdit's regular expression
implementation is fairly straightforward. First, MarcEdit uses a replace/with
structure, meaning that the regular expression must be broken into a pattern
and a replacement argument. Second, MarcEdit's implementation differs slightly
from the traditional Unix grep implementation. For example, if there were a field
containing the following data:
aaabbb
And the user wanted the final output to look like:
aaabxxbb
In Unix, one might use the following regular expression:
/ab/\0xx/
Using the Replace function, this same expression would be written as follows:
In the Find Text Textbox: ab
In the Replace With Textbox: \01xx
Check the Use Regular Expression option
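The aaabbb example above can be checked outside MarcEdit to confirm what the pattern and replacement are expected to produce. The following is a minimal sketch in Python, not MarcEdit's own engine. Python's re module writes "the whole match" as \g<0>, where the text above uses \0 (Unix) and \01 (MarcEdit). The variable names are illustrative only.

```python
import re

# Field data and pattern taken from the example above.
field_data = "aaabbb"
find_text = "ab"

# \g<0> is Python's notation for the entire matched text;
# the replacement re-emits the match and appends "xx".
replace_with = r"\g<0>xx"

result = re.sub(find_text, replace_with, field_data)
print(result)
```

The pattern "ab" occurs once in "aaabbb", so a single substitution is made and the rest of the field is left unchanged.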
In MarcEdit, regular expressions should use the format defined below:
Regular Expression Syntax:
Char: Definition
Character Classes
[ ] (square brackets): Identifies a user-defined class of characters, any of which will match: [abc] will match a, b, or c. Only three special metacharacters are recognized within a class definition: the caret (^) for complemented characters, the hyphen (-) for a range of characters, or one of the following \ backslash escape sequences:
Tags/sub-patterns
( ) (parentheses): Parentheses are used to match a tag, or sub-pattern, within the full search pattern, and remember the match. The matched sub-pattern can be retrieved later in the mask, or in a replace operation, with \01 through \99, based upon the left-to-right position of the opening parentheses.
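The character-class and sub-pattern rules described above can be illustrated with a short sketch. It uses Python's re module as a stand-in, so the back-reference notation is an assumption that differs from MarcEdit: Python writes the first remembered sub-pattern as \g<1> (or \1), where MarcEdit uses \01. The sample strings are invented for illustration.

```python
import re

sample = "a1b2c3d4"

# [abc] matches any single character a, b, or c.
in_class = re.findall(r"[abc]", sample)

# A leading caret complements the class: anything except a, b, or c.
not_in_class = re.findall(r"[^abc]", sample)

# A hyphen defines a range: [a-c] is equivalent to [abc].
in_range = re.findall(r"[a-c]", sample)

# Parentheses remember sub-patterns, numbered by the left-to-right
# position of each opening parenthesis. Here group 1 is the run of
# a's and group 2 is the run of b's; the replacement swaps them.
swapped = re.sub(r"(a+)(b+)", r"\g<2>\g<1>", "aaabbb")

print(in_class, not_in_class, in_range, swapped)
```

Swapping the two groups shows that the numbering follows the order of the opening parentheses, not the order in which the groups are used in the replacement.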
Escaped characters