Regular Expression

Reference

RegexOne - Learn Regular Expressions

Cheat Sheet

Value Property
abc.. Characters
123… Digits
\d Any Digit
/D Any Non-digit character
. Any Character
. Period
[abc] Only a, b or c
[^abc] Not a, b nor c
[a-z] Characters a to z
[0-9] Numbers 0 to 9
\w Any alphanumeric character
\W Any Non-alphanumeric character
{m} m Repetitions
{m,n} m to n Repetitions
* Zero or more repetitions
+ One or more repetitions
? Optional character
\s Any whitespace
\S Any non-whitespace character
^…$ Starts and ends
(…) Capture Group
(a(bc)) Capture Sub-group
(.*) Capture all
abc|def Matches abc or def

Common Mistakes

  1. Escape common character which are used in metacharacters
  2. Use ‘+’ and ‘*’ rather than upper and lower limits if possible

Tutorials

Lesson 1: An Introduction, and the ABCs

Problem

Go ahead and try writing a pattern that matches all three rows, it may be as simple as the common letters on each line.

Task Text
Match abc123xyz
Match define “123”
Match var g = 123;

Solution: abc

Lesson 1.5: The 123s

Problem

Below are a few more lines of text containing digits. Try writing a pattern that matches all the digits in the strings below, and notice how your pattern matches anywhere within the string, not just starting at the first character. We will learn how to control this in a later lesson.

Task Text
Match abc123xyz
Match define “123”
Match var g = 123;

Solution: 123

Lesson 2: The Dot

Problem

Below are a couple strings with varying characters but the same length. Try to write a single pattern that can match the first three strings, but not the last (to be skipped). You may find that you will have to escape the dot metacharacter to match the period in some of the lines.

Task Text
Match cat.
Match 896.
Match ?=+.
Skip abc1

Solution: ….

Lesson 3: Matching specific characters

Problem

Below are a couple lines, where we only want to match the first three strings, but not the last three strings. Notice how we can’t avoid matching the last three strings if we use the dot, but have to specifically define what letters to match using the notation above.

Task Text
Match can
Match man
Match fan
Skip dan
Skip ran
Skip pan

Solution: [cmf]an

Lesson 4: Excluding specific characters

Problem

With the strings below, try writing a pattern that matches only the live animals (hog, dog, but not bog). Notice how most patterns of this type can also be written using the technique from the last lesson as they are really two sides of the same coin. By having both choices, you can decide which one is easier to write and understand when composing your own patterns.

Task Text
Match abc123xyz
Match define “123”
Skip bog

Solution: [^b]og

Lesson 5: Character ranges

Problem

In the exercise below, notice how all the match and skip lines have a pattern, and use the bracket notation to match or skip each character from each line. Be aware that patterns are case sensitive and a-z differs from A-Z in terms of the characters it matches (lower vs upper case).

Task Text
Match Ana
Match Bob
Match Cpc
Skip aax
Skip bby
Skip ccz

Solution: [A-C][n-p][a-c]

Lesson 6: Catching some zzz’s

Problem

In the lines below, the last string with only one z isn’t what we would consider a proper spelling of the slang “wazzup?”. Try writing a pattern that matches only the first two spellings by using the curly brace notation above.

Task Text
Match wazzzzzup
Match wazzzup
Skip wazup

Solution: waz{3,5}up

Lesson 7: Mr. Kleene, Mr. Kleene

Problem

Below are a few simple strings that you can match using both the star and plus metacharacters.

Task Text
Match aaaabcc
Match aabbbbc
Match aacc
Skip a

Solution: aa+b*c+

Lesson 8: Characters optional

Problem

In the strings below, notice how the the plurality of the word “file” depends on the number of files found. Try writing a pattern that uses the optionality metacharacter to match only the lines where one or more files were found.

Task Text
Match 1 file found?
Match 2 files found?
Match 24 files found?
Skip No files found.

Solution: \d+ files? found\?

Lesson 9: All this whitespace

Problem

In the strings below, you’ll find that the content of each line is indented by some whitespace from the index of the line (the number is a part of the text to match). Try writing a pattern that can match each line containing whitespace characters between the number and the content. Notice that the whitespace characters are just like any other character and the special metacharacters like the star and the plus can be used as well.

Task Text
Match 1. abc
Match 2. abc
Match 3. abc
Skip 4.abc

Solution: \d.\s+abc

Lesson 10: Starting and ending

Problem

Try to match each of the strings below using these new special characters.

Task Text
Match Mission: successful
Skip Last Mission: unsuccessful
Skip Next Mission: successful upon capture of target

Solution: ^Mission: successful$

Lesson 11: Match groups

Problem

Go ahead and try to use this to write a regular expression that matches only the filenames (not including extension) of the PDF files below.

Task Text Capture Groups
Capture file_record_transcript.pdf file_record_transcript
Capture file_07241999.pdf file_07241999
Skip testfile_fake.pdf.tmp  

Solution: (file.+).pdf$

Lesson 12: Nested groups

Problem

For the following strings, write an expression that matches and captures both the full date, as well as the year of the date.

Task Text Capture Groups Capture Groups
Capture Jan 1987 Jan 1987 1987
Capture May 1969 May 1969 1969
Capture Aug 2011 Aug 2011 2011

Solution: (\w+ (\d+))

Lesson 13: More group work

Problem

Below are a couple different common display resolutions, try to capture the width and height of each display.

Task Text Capture Groups Capture Groups
Capture 1280x720 1280 720
Capture 1920x1600 1920 1600
Capture 1024x768 1024 768

Solution: (\d+)x(\d+)

Lesson 14: It’s all conditional

Problem

Go ahead and try writing a conditional pattern that matches only the lines with small fuzzy creatures below.

Task Text
Match I love cats
Match I love dogs
Match I love logs
Skip I love cogs

Solution: I love (cats|dogs)

Lesson 15: Other special characters

Task Text
Match The quick brown fox jumps over the lazy dog.
Match There were 614 instances of students getting 90.0% or above.
Match The FCC had to censor the network for saying &$#*@!.

Solution: .*