
Learn Regular Expressions from Corey

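When I was struggling with regular expressions, my teacher recommended a YouTube video that turned out to be really good, so I summarized its content in my own way.

Video: Regular Expressions (Regex) Tutorial: How to Match Any Pattern of Text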


Meta Characters (Need to be escaped)

.[{()\^$|?*+

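To match the literal text "Finxter.com", escape the dot: Finxter\.com. An unescaped . would match any character in that position.

Searching "Ha HaHa" with a word boundary (\bHa) matches the first two "Ha" only. The last "Ha" follows a word character, so there is no boundary in front of it.

To match phone numbers such as 321-555-4321 and 123.555.1234, use \d\d\d.\d\d\d.\d\d\d\d (any separator) or \d\d\d[-.]\d\d\d[-.]\d\d\d\d (only a dash or a dot as separator).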
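To check the escaping and word boundary notes above, here is a minimal sketch using Python's re module. Python is my own choice here, and the string "FinxterXcom" is a made-up example that shows what the unescaped dot lets through.

```python
import re

# "FinxterXcom" is a hypothetical string, added only to show the difference.
text = "Finxter.com FinxterXcom"

# Unescaped dot: matches any character in that position.
loose = re.findall(r"Finxter.com", text)
# Escaped dot: matches only a literal ".".
strict = re.findall(r"Finxter\.com", text)

print(loose)   # ['Finxter.com', 'FinxterXcom']
print(strict)  # ['Finxter.com']

# \bHa matches "Ha" only where a word starts.
starts = [m.start() for m in re.finditer(r"\bHa", "Ha HaHa")]
print(starts)  # [0, 3]
```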


TEL number example


  1. 321-555-4321

  2. 123.555.1234

  3. 123*555*1234

  4. 800-555-4321

  5. 900-555-4321

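  • \d\d\d.\d\d\d.\d\d\d\d: Get all 1-5, because an unescaped . matches any character.


  • \d\d\d[-.]\d\d\d[-.]\d\d\d\d: Get 1, 2, 4, 5.


  • [89]00[-.]\d\d\d[-.]\d\d\d\d: Get 4, 5.


  • [1-5]: Any one digit from 1 to 5.


  • [a-fA-F]: Any one letter from a to f, lowercase or uppercase.


  • [^a-g]: Any one character NOT in the range a to g.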
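The phone number patterns can be tried against the five samples with a short Python sketch. The helper name matching_items and the test strings for the ranges are my own inventions for illustration; re.fullmatch is used so that each sample must match as a whole.

```python
import re

numbers = [
    "321-555-4321",
    "123.555.1234",
    "123*555*1234",
    "800-555-4321",
    "900-555-4321",
]

patterns = {
    "any separator": r"\d\d\d.\d\d\d.\d\d\d\d",
    "dash or dot": r"\d\d\d[-.]\d\d\d[-.]\d\d\d\d",
    "800/900 only": r"[89]00[-.]\d\d\d[-.]\d\d\d\d",
}


def matching_items(pattern, items):
    """Return the 1-based positions of the items the pattern fully matches."""
    return [i for i, item in enumerate(items, start=1) if re.fullmatch(pattern, item)]


results = {name: matching_items(p, numbers) for name, p in patterns.items()}
for name, hits in results.items():
    print(name, hits)
# any separator [1, 2, 3, 4, 5]
# dash or dot [1, 2, 4, 5]
# 800/900 only [4, 5]

# Character ranges, tried on made-up strings.
one_to_five = re.findall(r"[1-5]", "0123456789")   # ['1', '2', '3', '4', '5']
hex_letters = re.findall(r"[a-fA-F]", "Regex CAFE")  # ['e', 'e', 'C', 'A', 'F', 'E']
not_a_to_g = re.findall(r"[^a-g]", "abcxyz")        # ['x', 'y', 'z']
```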


Word example


  1. cat

  2. mat

  3. pat

  4. bat



  • [^b]at: Get 1-3.
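The negated set can be checked the same way. This is just a small Python sketch of the word example, with a variable name I picked myself.

```python
import re

words = ["cat", "mat", "pat", "bat"]

# [^b] accepts any single character except "b", so "bat" is rejected.
not_bat = [w for w in words if re.fullmatch(r"[^b]at", w)]
print(not_bat)  # ['cat', 'mat', 'pat']
```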


Meta Characters



  • .: Any Character Except New Line


  • \d: Digit (0-9)


  • \D: Not a Digit (0-9)


  • \w: Word Character (a-z, A-Z, 0-9, _)


  • \W: Not a Word Character


  • \s: Whitespace (space, tab, newline)


  • \S: Not Whitespace (space, tab, newline)


  • \b: Word Boundary


  • \B: Not a Word Boundary


  • ^: Beginning of a String


  • $: End of a String


  • []: Matches Characters in brackets


  • [^ ]: Matches Characters NOT in brackets


  • |: Either Or


  • (): Group
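To see several of these meta characters in action, here is a rough Python sketch. All sample strings are made up for illustration, and the behavior shown is that of Python's re module with default flags; other regex flavors may differ in small details.

```python
import re

sample = "abc 123_X\n"  # hypothetical sample text

any_char = re.findall(r".", "a\nb")      # ['a', 'b'] (the newline is skipped)
digits = re.findall(r"\d", sample)       # ['1', '2', '3']
word_chars = re.findall(r"\w", sample)   # letters, digits and "_"
non_word = re.findall(r"\W", sample)     # [' ', '\n']
whitespace = re.findall(r"\s", sample)   # [' ', '\n']

# \b and \B: positions where "Ha" starts a word, or does not.
at_boundary = [m.start() for m in re.finditer(r"\bHa", "Ha HaHa")]      # [0, 3]
not_at_boundary = [m.start() for m in re.finditer(r"\BHa", "Ha HaHa")]  # [5]

# ^ and $ anchor to the beginning and end of the string.
line = "start middle end"
first = re.findall(r"^\w+", line)  # ['start']
last = re.findall(r"\w+$", line)   # ['end']

# | and (): with one group, findall returns what the group captured.
either = re.findall(r"(cat|dog)s", "cats dogs birds")  # ['cat', 'dog']
```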


Quantifiers:



  • *(asterisk) - 0 or more


  • + - 1 or more


  • ? - 0 or 1


  • {3} - Exact Number


  • {3,4} - Range of Numbers (Min, Max)
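With quantifiers, the phone number pattern from earlier can be shortened. The sketch below (Python, my own choice) compares the long and short forms and tries *, + and ? on a few made-up strings.

```python
import re

numbers = [
    "321-555-4321",
    "123.555.1234",
    "123*555*1234",
    "800-555-4321",
    "900-555-4321",
]

verbose = r"\d\d\d[-.]\d\d\d[-.]\d\d\d\d"
compact = r"\d{3}[-.]\d{3}[-.]\d{4}"  # same meaning, written with {n}

verbose_hits = [n for n in numbers if re.fullmatch(verbose, n)]
compact_hits = [n for n in numbers if re.fullmatch(compact, n)]
print(verbose_hits == compact_hits)  # True

# {3,4}: runs of 3 to 4 digits (greedy, so 4 is taken when available).
runs = re.findall(r"\d{3,4}", "12 123 1234 12345")
print(runs)  # ['123', '1234', '1234']

# *, + and ? applied to "b" after a required "a" (hypothetical strings).
candidates = ["a", "ab", "abb"]
zero_or_more = [s for s in candidates if re.fullmatch(r"ab*", s)]  # ['a', 'ab', 'abb']
one_or_more = [s for s in candidates if re.fullmatch(r"ab+", s)]   # ['ab', 'abb']
zero_or_one = [s for s in candidates if re.fullmatch(r"ab?", s)]   # ['a', 'ab']
```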


Name example


  1. Mr. Schafer

  2. Mr Smith

  3. Ms Davis

  4. Mrs. Robinson

  5. Mr. T



  • Mr\.?: Get 1, 2, 4, 5 (in 4, only the "Mr" part of "Mrs." is matched)


  • Mr\.?\s[A-Z]: Get 1, 2, 5


  • Mr\.?\s[A-Z]\w+: Get 1, 2


  • Mr\.?\s[A-Z]\w*: Get 1, 2, 5


  • M(r|s|rs)\.?\s[A-Z]\w*: Get all 1-5
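The results listed for the name patterns can be reproduced with a small Python sketch. It uses re.search, which looks for the pattern anywhere in the string, so that the partial match in "Mrs." also counts. The helper name found_in is hypothetical.

```python
import re

names = ["Mr. Schafer", "Mr Smith", "Ms Davis", "Mrs. Robinson", "Mr. T"]

patterns = [
    r"Mr\.?",
    r"Mr\.?\s[A-Z]",
    r"Mr\.?\s[A-Z]\w+",
    r"Mr\.?\s[A-Z]\w*",
    r"M(r|s|rs)\.?\s[A-Z]\w*",
]


def found_in(pattern, items):
    """Return the 1-based positions of the items that contain a match."""
    return [i for i, item in enumerate(items, start=1) if re.search(pattern, item)]


hits = [found_in(p, names) for p in patterns]
for p, h in zip(patterns, hits):
    print(p, h)
# Mr\.? [1, 2, 4, 5]
# Mr\.?\s[A-Z] [1, 2, 5]
# Mr\.?\s[A-Z]\w+ [1, 2]
# Mr\.?\s[A-Z]\w* [1, 2, 5]
# M(r|s|rs)\.?\s[A-Z]\w* [1, 2, 3, 4, 5]
```

The only difference between the third and fourth patterns is + versus *: "Mr. T" has nothing after the capital letter, so it needs the "0 or more" version.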


Email example


  1. CoreyMSchafer@gmail.com

  2. corey.schafer@university.edu

  3. corey-321-schafer@my-work.net



  • [a-zA-Z]+@[a-zA-Z]+\.com: Get 1


  • [a-zA-Z.]+@[a-zA-Z]+\.(com|edu): Get 1, 2


  • [a-zA-Z0-9.-]+@[a-zA-Z-]+\.(com|edu|net): Get all 1-3


  • [a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+: Get all 1-3
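Here is a minimal Python sketch that runs the four email patterns against the three addresses. It uses re.fullmatch, assuming each address should match as a whole. The helper name matching_items is my own, and none of these patterns is meant as a complete email validator.

```python
import re

emails = [
    "CoreyMSchafer@gmail.com",
    "corey.schafer@university.edu",
    "corey-321-schafer@my-work.net",
]

patterns = [
    r"[a-zA-Z]+@[a-zA-Z]+\.com",
    r"[a-zA-Z.]+@[a-zA-Z]+\.(com|edu)",
    r"[a-zA-Z0-9.-]+@[a-zA-Z-]+\.(com|edu|net)",
    r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+",
]


def matching_items(pattern, items):
    """Return the 1-based positions of the items the pattern fully matches."""
    return [i for i, item in enumerate(items, start=1) if re.fullmatch(pattern, item)]


hits = [matching_items(p, emails) for p in patterns]
print(hits)  # [[1], [1, 2], [1, 2, 3], [1, 2, 3]]
```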


URL example


  1. https://www.google.com

  2. http://coreyms.com

  3. https://youtube.com

  4. https://www.nasa.gov



  • https?://(www\.)?(\w+)(\.\w+): Get all 1-4


  • Replace with Group 1: $1. The result is...

    Group 1: www.

    Group 1:

    Group 1:

    Group 1: www.



  • Replace with Group 2: $2. The result is...

    Group 2: google

    Group 2: coreyms

    Group 2: youtube

    Group 2: nasa



  • Replace with Group 3: $3. The result is...

    Group 3: .com

    Group 3: .com

    Group 3: .com

    Group 3: .gov



  • Replace with $2$3. The result is...

    google.com

    coreyms.com

    youtube.com

    nasa.gov
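The same group replacement can be sketched in Python. Note that Python's re.sub writes back-references as \2 and \3 rather than $2 and $3, and that a group that did not take part in the match is reported as None by group(). Using Python here is my own choice, not something from the notes above.

```python
import re

urls = [
    "https://www.google.com",
    "http://coreyms.com",
    "https://youtube.com",
    "https://www.nasa.gov",
]

pattern = re.compile(r"https?://(www\.)?(\w+)(\.\w+)")

# Group 1 is optional, so it is None when "www." is absent.
group1 = [pattern.fullmatch(u).group(1) for u in urls]
group2 = [pattern.fullmatch(u).group(2) for u in urls]
group3 = [pattern.fullmatch(u).group(3) for u in urls]
print(group1)  # ['www.', None, None, 'www.']
print(group2)  # ['google', 'coreyms', 'youtube', 'nasa']
print(group3)  # ['.com', '.com', '.com', '.gov']

# Replace each URL with group 2 followed by group 3.
domains = [pattern.sub(r"\2\3", u) for u in urls]
print(domains)  # ['google.com', 'coreyms.com', 'youtube.com', 'nasa.gov']
```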