Regular Expressions

What are Regular Expressions?

Regular expressions provide a standard way to define rules that can be applied to text to see if it matches, and optionally to replace that match with other text. These rules are written as plain text (known as a pattern) in which certain characters have special meanings. A regex therefore consists of a match pattern and, optionally, a replace pattern. Match patterns can contain “capture groups” that store parts of the matched text, which the replace pattern can then reinsert, usually in a different order or location.
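To make the idea of capture groups concrete, here is a minimal sketch in Python. Python's re module is used only for illustration: it is not the engine RegexRenamer uses, and it writes group references in the replace pattern as \1 and \2 where .NET writes $1 and $2. The function name swap_parts and the sample text are hypothetical.

```python
import re

# Match pattern: two capture groups separated by " - ".
# Group 1 stores the text before the separator, group 2 the text after it.
MATCH_PATTERN = re.compile(r"^(.+) - (.+)$")

# Replace pattern: reinsert the two captured parts in swapped order.
REPLACE_PATTERN = r"\2 - \1"


def swap_parts(text: str) -> str:
    """Return text with the two captured parts swapped.

    Text that does not match the pattern is returned unchanged.
    """
    return MATCH_PATTERN.sub(REPLACE_PATTERN, text)


print(swap_parts("Artist - Title"))  # Title - Artist
print(swap_parts("no separator"))    # no separator
```

The second call shows that text which does not match the pattern is left untouched.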

Regexes are found in a wide range of areas: in programming languages (to manipulate strings), in text editors (for searching and replacing), in system administration (e.g., for parsing log files), and so on. They are generally used in four different ways:

RegexRenamer uses the last of these: it tests each filename in a directory against the match pattern and, if the filename matches, performs the replacement to obtain the new filename. Combined with capture groups, this quickly becomes a powerful way to rename groups of files.
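The match-then-replace approach to renaming can be sketched as follows. This is an illustrative Python approximation, not RegexRenamer's actual implementation: the function preview_renames and the sample filenames are hypothetical, and the sketch works on a list of names without touching any files on disk.

```python
import re


def preview_renames(filenames, match_pattern, replace_pattern):
    """Map each matching filename to the new name it would receive.

    Filenames that do not match the pattern are skipped.
    """
    regex = re.compile(match_pattern)
    renames = {}
    for name in filenames:
        if regex.search(name):
            renames[name] = regex.sub(replace_pattern, name)
    return renames


files = ["holiday_001.jpg", "holiday_002.jpg", "notes.txt"]

# Capture the number and move it to the front of the name.
# (Python writes the group reference as \1; .NET would use $1.)
result = preview_renames(files, r"^holiday_(\d+)\.jpg$", r"\1_holiday.jpg")

for old_name, new_name in result.items():
    print(old_name, "->", new_name)
```

Here notes.txt does not match the pattern, so it does not appear in the result.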

Learning to use regexes

Knowing how to use regexes is a valuable skill that will be useful in any area that involves managing text. This help file contains a brief guide to help you get started:

These guides, however, focus only on the areas of regexes that you need to know to use RegexRenamer. For example, they don’t cover working with multiline text or non-printable characters. For more information and further learning, there are many in-depth tutorials online, such as the one at Regular-Expressions.info. If you are really serious, the book generally regarded as the ultimate reference is Mastering Regular Expressions by Jeffrey Friedl, published by O’Reilly.

Regex versions

Regular expressions come in several versions, one for each implementation. The exact syntax accepted by one “regex engine” may differ slightly from that of another. Usually the basics remain the same, while advanced features are added or defined differently.

The regular expression engine used in RegexRenamer is the one from .NET 2.0, whose syntax is based on Perl 5. For the full technical specification of the syntax, refer to Regular Expressions Language Elements in the Microsoft .NET Framework General Reference section of the MSDN Library.
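As one small illustration of how engines differ, the sketch below uses Python's re module, which is a different engine from the .NET one used by RegexRenamer. The match pattern is the same in both, but the replace pattern is not: .NET refers to capture groups as $1 and $2, while Python expects \1 and \2 and treats $1 as ordinary text. The sample pattern and text are hypothetical.

```python
import re

pattern = r"(\w+)@(\w+)"
text = "user@host"

# Python-style group references: the captured parts are swapped.
python_style = re.sub(pattern, r"\2@\1", text)

# .NET-style group references: Python's engine does not recognise them,
# so they are inserted literally instead of being replaced by the captures.
dotnet_style = re.sub(pattern, "$2@$1", text)

print(python_style)  # host@user
print(dotnet_style)  # $2@$1
```

When copying a regex from another tool or tutorial into RegexRenamer, it is worth checking details like this against the .NET syntax.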