Regular Expression
Reference
RegexOne - Learn Regular Expressions
Cheat Sheet
Pattern | Meaning |
---|---|
abc… | Characters |
123… | Digits |
\d | Any Digit |
\D | Any Non-digit character |
. | Any Character |
\. | Period |
[abc] | Only a, b or c |
[^abc] | Not a, b nor c |
[a-z] | Characters a to z |
[0-9] | Numbers 0 to 9 |
\w | Any alphanumeric character |
\W | Any Non-alphanumeric character |
{m} | m Repetitions |
{m,n} | m to n Repetitions |
* | Zero or more repetitions |
+ | One or more repetitions |
? | Optional character |
\s | Any whitespace |
\S | Any non-whitespace character |
^…$ | Starts and ends |
(…) | Capture Group |
(a(bc)) | Capture Sub-group |
(.*) | Capture all |
abc|def | Matches abc or def |
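A few of the cheat-sheet entries can be tried out with Python's built-in `re` module. This is only a sketch: the sample strings and the `find_all` helper are made up for illustration, and other regex engines may differ in small details (in Python, for example, `\w` also matches the underscore).

```python
import re

# Each tuple: (cheat-sheet pattern, sample text, matches we expect to find).
# The sample texts are illustrative and not taken from the tutorial.
samples = [
    (r"\d", "a1b22", ["1", "2", "2"]),              # any digit
    (r"\D", "a1b", ["a", "b"]),                     # any non-digit
    (r"[^abc]", "abcd", ["d"]),                     # not a, b nor c
    (r"\w+", "hi there_1!", ["hi", "there_1"]),     # runs of word characters
    (r"a{2,3}", "a aa aaaa", ["aa", "aaa"]),        # 2 to 3 repetitions (greedy)
    (r"\.", "a.b", ["."]),                          # a literal period
]


def find_all(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)


for pattern, text, expected in samples:
    print(f"{pattern!r:12} on {text!r:16} -> {find_all(pattern, text)}")
```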
Common Mistakes
- Escape metacharacters (such as ‘.’, ‘?’ and ‘$’) with a backslash when you want to match them literally
- Use ‘+’ and ‘*’ rather than explicit upper and lower limits ({m,n}) where possible
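The escaping mistake is easy to reproduce. Below is a small Python sketch, assuming a made-up `price` string; `re.escape` is the standard-library helper that adds the backslashes for you when the text to match is not known in advance.

```python
import re

price = "Total: $5.00 (approx.)"  # hypothetical sample text

# Unescaped: "$" is the end-of-string anchor, so nothing can follow it
# and this pattern never matches the literal text "$5.00".
unescaped = re.search(r"$5.00", price)

# Escaped by hand: "\$" and "\." are literal characters.
escaped = re.search(r"\$5\.00", price)

# Escaped automatically for text that should be matched literally.
auto = re.search(re.escape("$5.00"), price)

print(unescaped)          # None
print(escaped.group())    # $5.00
print(auto.group())       # $5.00
```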
Tutorials
Lesson 1: An Introduction, and the ABCs
Problem
Go ahead and try writing a pattern that matches all three rows; it may be as simple as the common letters on each line.
Task | Text |
---|---|
Match | abcdefg |
Match | abcde |
Match | abc |
Solution: abc
Lesson 1.5: The 123s
Problem
Below are a few more lines of text containing digits. Try writing a pattern that matches all the digits in the strings below, and notice how your pattern matches anywhere within the string, not just starting at the first character. We will learn how to control this in a later lesson.
Task | Text |
---|---|
Match | abc123xyz |
Match | define “123” |
Match | var g = 123; |
Solution: 123
Lesson 2: The Dot
Problem
Below are a couple strings with varying characters but the same length. Try to write a single pattern that can match the first three strings, but not the last (to be skipped). You may find that you will have to escape the dot metacharacter to match the period in some of the lines.
Task | Text |
---|---|
Match | cat. |
Match | 896. |
Match | ?=+. |
Skip | abc1 |
Solution: ...\.
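To see why the final dot needs escaping, here is a short Python check. It assumes the intended pattern is three "any character" dots followed by an escaped period, and compares it with a version where the last dot is left unescaped.

```python
import re

lines = ["cat.", "896.", "?=+.", "abc1"]

escaped = re.compile(r"...\.")   # three of anything, then a literal period
unescaped = re.compile(r"....")  # four of anything

escaped_hits = [line for line in lines if escaped.search(line)]
unescaped_hits = [line for line in lines if unescaped.search(line)]

print(escaped_hits)    # ['cat.', '896.', '?=+.']
print(unescaped_hits)  # all four lines, including 'abc1'
```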
Lesson 3: Matching specific characters
Problem
Below are a couple lines, where we only want to match the first three strings, but not the last three strings. Notice how we can’t avoid matching the last three strings if we use the dot, but have to specifically define what letters to match using the notation above.
Task | Text |
---|---|
Match | can |
Match | man |
Match | fan |
Skip | dan |
Skip | ran |
Skip | pan |
Solution: [cmf]an
Lesson 4: Excluding specific characters
Problem
With the strings below, try writing a pattern that matches only the live animals (hog, dog, but not bog). Notice how most patterns of this type can also be written using the technique from the last lesson as they are really two sides of the same coin. By having both choices, you can decide which one is easier to write and understand when composing your own patterns.
Task | Text |
---|---|
Match | hog |
Match | dog |
Skip | bog |
Solution: [^b]og
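The "two sides of the same coin" remark can be checked in Python. This sketch compares the excluding form with an including form, `[hd]og`, which is my own equivalent for these three words and not part of the tutorial's solution. The extra word "fog" is also an assumption, added to show where the two forms diverge.

```python
import re

words = ["hog", "dog", "bog"]

exclude = re.compile(r"[^b]og")  # anything except "b", then "og"
include = re.compile(r"[hd]og")  # only "h" or "d", then "og"

excluded_matches = [w for w in words if exclude.fullmatch(w)]
included_matches = [w for w in words if include.fullmatch(w)]

print(excluded_matches)  # ['hog', 'dog']
print(included_matches)  # ['hog', 'dog']

# On other input the two are not equivalent: the excluding form accepts
# any first character that is not "b".
print(bool(exclude.fullmatch("fog")), bool(include.fullmatch("fog")))  # True False
```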
Lesson 5: Character ranges
Problem
In the exercise below, notice how all the match and skip lines have a pattern, and use the bracket notation to match or skip each character from each line. Be aware that patterns are case sensitive and a-z differs from A-Z in terms of the characters it matches (lower vs upper case).
Task | Text |
---|---|
Match | Ana |
Match | Bob |
Match | Cpc |
Skip | aax |
Skip | bby |
Skip | ccz |
Solution: [A-C][n-p][a-c]
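A quick Python check of the ranges and of the case-sensitivity note. The lowercase string "ana" and the use of the `re.IGNORECASE` flag are my additions for illustration.

```python
import re

pattern = r"[A-C][n-p][a-c]"
lines = ["Ana", "Bob", "Cpc", "aax", "bby", "ccz"]

matched = [line for line in lines if re.search(pattern, line)]
print(matched)  # ['Ana', 'Bob', 'Cpc']

# Ranges are case sensitive by default: "ana" fails because "a" is not in A-C.
case_sensitive = re.search(pattern, "ana")
# With the IGNORECASE flag the same pattern accepts it.
case_insensitive = re.search(pattern, "ana", re.IGNORECASE)
print(case_sensitive is None, case_insensitive is not None)  # True True
```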
Lesson 6: Catching some zzz’s
Problem
In the lines below, the last string with only one z isn’t what we would consider a proper spelling of the slang “wazzup?”. Try writing a pattern that matches only the first two spellings by using the curly brace notation above.
Task | Text |
---|---|
Match | wazzzzzup |
Match | wazzzup |
Skip | wazup |
Solution: waz{3,5}up
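To make the bounds of `{3,5}` visible, this Python sketch builds spellings with 1 to 7 z's and records which counts the pattern accepts. The `spelling` helper is hypothetical and exists only to generate test strings.

```python
import re

pattern = re.compile(r"waz{3,5}up")


def spelling(z_count):
    """Build 'wa' + z_count times 'z' + 'up', e.g. spelling(3) == 'wazzzup'."""
    return "wa" + "z" * z_count + "up"


accepted = [n for n in range(1, 8) if pattern.fullmatch(spelling(n))]
print(accepted)  # [3, 4, 5]
```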
Lesson 7: Mr. Kleene, Mr. Kleene
Problem
Below are a few simple strings that you can match using both the star and plus metacharacters.
Task | Text |
---|---|
Match | aaaabcc |
Match | aabbbbc |
Match | aacc |
Skip | a |
Solution: aa+b*c+
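The difference between the star and the plus shows up when you look at what each part consumed. The sketch below wraps the three parts of the solution in capture groups (my addition, purely for inspection) and prints them with Python's `re` module.

```python
import re

# Same pattern as the solution, with groups added around each part.
pattern = re.compile(r"(aa+)(b*)(c+)")

parts = {}
for line in ["aaaabcc", "aabbbbc", "aacc", "a"]:
    m = pattern.fullmatch(line)
    parts[line] = m.groups() if m else None

for line, groups in parts.items():
    print(line, groups)
# "aacc" still matches: b* is allowed to consume nothing,
# while "a" fails because aa+ needs at least two a's.
```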
Lesson 8: Characters optional
Problem
In the strings below, notice how the plurality of the word “file” depends on the number of files found. Try writing a pattern that uses the optionality metacharacter to match only the lines where one or more files were found.
Task | Text |
---|---|
Match | 1 file found? |
Match | 2 files found? |
Match | 24 files found? |
Skip | No files found. |
Solution: \d+ files? found\?
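In this solution the question mark plays two roles: `s?` makes the "s" optional, while `\?` matches a literal question mark. A short Python check of the four lines:

```python
import re

pattern = re.compile(r"\d+ files? found\?")

lines = [
    "1 file found?",
    "2 files found?",
    "24 files found?",
    "No files found.",
]

matched = [line for line in lines if pattern.search(line)]
print(matched)  # the first three lines only
```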
Lesson 9: All this whitespace
Problem
In the strings below, you’ll find that the content of each line is indented by some whitespace from the index of the line (the number is a part of the text to match). Try writing a pattern that can match each line containing whitespace characters between the number and the content. Notice that the whitespace characters are just like any other character and the special metacharacters like the star and the plus can be used as well.
Task | Text |
---|---|
Match | 1. abc |
Match | 2. abc |
Match | 3. abc |
Skip | 4.abc |
Solution: \d\.\s+abc
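A Python sketch of the whitespace match, assuming the period after the number is escaped so that it only matches a literal dot. The exact spacing in the sample strings (spaces and a tab) is my own, since the indentation of the original lines is not preserved here.

```python
import re

pattern = re.compile(r"\d\.\s+abc")

lines = [
    "1.   abc",        # several spaces
    "2.\tabc",         # a tab also counts as whitespace
    "3.           abc",
    "4.abc",           # no whitespace at all
]

results = [bool(pattern.search(line)) for line in lines]
print(results)  # [True, True, True, False]
```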
Lesson 10: Starting and ending
Problem
Try to match each of the strings below using these new special characters.
Task | Text |
---|---|
Match | Mission: successful |
Skip | Last Mission: unsuccessful |
Skip | Next Mission: successful upon capture of target |
Solution: ^Mission: successful$
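To see what the anchors add, this Python sketch runs the pattern with and without `^` and `$` on the three lines. The unanchored variant is my own comparison and not part of the tutorial.

```python
import re

lines = [
    "Mission: successful",
    "Last Mission: unsuccessful",
    "Next Mission: successful upon capture of target",
]

anchored = re.compile(r"^Mission: successful$")
unanchored = re.compile(r"Mission: successful")

anchored_hits = [bool(anchored.search(line)) for line in lines]
unanchored_hits = [bool(unanchored.search(line)) for line in lines]

print(anchored_hits)    # [True, False, False]
print(unanchored_hits)  # [True, False, True]: the third line slips through
```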
Lesson 11: Match groups
Problem
Go ahead and try to use this to write a regular expression that matches only the filenames (not including extension) of the PDF files below.
Task | Text | Capture Groups |
---|---|---|
Capture | file_record_transcript.pdf | file_record_transcript |
Capture | file_07241999.pdf | file_07241999 |
Skip | testfile_fake.pdf.tmp | |
Solution: ^(file.+)\.pdf$
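In code, the captured filename is read from group 1 of the match. The sketch below assumes the pattern is anchored at the start and has its dot escaped, and wraps it in a hypothetical `pdf_stem` helper.

```python
import re

PDF_NAME = re.compile(r"^(file.+)\.pdf$")


def pdf_stem(filename):
    """Return the name without the .pdf extension, or None if it does not match."""
    m = PDF_NAME.search(filename)
    return m.group(1) if m else None


for name in ["file_record_transcript.pdf", "file_07241999.pdf", "testfile_fake.pdf.tmp"]:
    print(name, "->", pdf_stem(name))
```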
Lesson 12: Nested groups
Problem
For the following strings, write an expression that matches and captures both the full date, as well as the year of the date.
Task | Text | Capture Group 1 | Capture Group 2 |
---|---|---|---|
Capture | Jan 1987 | Jan 1987 | 1987 |
Capture | May 1969 | May 1969 | 1969 |
Capture | Aug 2011 | Aug 2011 | 2011 |
Solution: (\w+ (\d+))
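Groups are numbered by the position of their opening parenthesis, so the outer group is 1 and the nested group is 2. A minimal Python illustration with one of the dates:

```python
import re

m = re.search(r"(\w+ (\d+))", "Jan 1987")

full_date = m.group(1)  # outer group: the whole date
year = m.group(2)       # nested group: just the digits

print(full_date)  # Jan 1987
print(year)       # 1987
```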
Lesson 13: More group work
Problem
Below are a few common display resolutions. Try to capture the width and height of each display.
Task | Text | Capture Group 1 | Capture Group 2 |
---|---|---|---|
Capture | 1280x720 | 1280 | 720 |
Capture | 1920x1600 | 1920 | 1600 |
Capture | 1024x768 | 1024 | 768 |
Solution: (\d+)x(\d+)
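Captured groups are always strings, so a typical next step is converting them to numbers. The `parse_resolution` helper below is a hypothetical sketch of that step in Python.

```python
import re

RESOLUTION = re.compile(r"(\d+)x(\d+)")


def parse_resolution(text):
    """Return (width, height) as integers, or None if no resolution is found."""
    m = RESOLUTION.search(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


print(parse_resolution("1280x720"))   # (1280, 720)
print(parse_resolution("1920x1600"))  # (1920, 1600)
```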
Lesson 14: It’s all conditional
Problem
Go ahead and try writing a conditional pattern that matches only the lines with small fuzzy creatures below.
Task | Text |
---|---|
Match | I love cats |
Match | I love dogs |
Skip | I love logs |
Skip | I love cogs |
Solution: I love (cats|dogs)
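A short Python check of the alternation: only the lines ending in one of the two listed words are accepted, and group 1 tells you which alternative matched.

```python
import re

pattern = re.compile(r"I love (cats|dogs)")

lines = ["I love cats", "I love dogs", "I love logs", "I love cogs"]

# Map each line to the alternative that matched, or None if neither did.
animals = {}
for line in lines:
    m = pattern.search(line)
    animals[line] = m.group(1) if m else None

print(animals)
```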
Lesson 15: Other special characters
Task | Text |
---|---|
Match | The quick brown fox jumps over the lazy dog. |
Match | There were 614 instances of students getting 90.0% or above. |
Match | The FCC had to censor the network for saying &$#*@!. |