How to Parse fields using RegEx in Alteryx
- How-Tos FAQs
- March 8, 2021

While we are analyzing or extracting data from a database, it is often helpful to exploit regular expressions to parse data from a specific field. In Alteryx, there is a dedicated tool to write and use regular expressions to parse data. It is the RegEx tool. Lear how to Parse fields using RegEx in Alteryx.
RegEx tool is a powerful tool mainly used to parse , match or replace data exploiting regular-expression syntax.
Our starting field should appear as shown below.
We aim to get three different fields from the original one to parse the initial date , the name in the middle, and the VAT number at the end.
RegEx tool
The RegEx tool is one of the most powerful data parsing and text recognition tools.
In Alteryx, it is located in the Parse tab of the Tool Palette
- Connect a RegEx tool directly to your workflow
- In the first section of the configuration, panel select the field to parse.
- In the second section, write the regular expressions to parse data.
Use () to identify the marked groups (fields we want as output) and flag the ‘Case insensitive’ option
- In the third section, select the Parse method
You can also set names, data types, and sizes for any output column.
The syntax used for the marked groups:
- First one – Birth Date
(\d{2}\-\d{2}\-\d{4})
It says we want two digits, then a dash, two other digits, another dash, and four digits at the end.
Use {} after a special character to specify how many should be
Use () to define the marked group
- Second one – Name
(\l+)
It says we want any lowercase character (‘+’ means one or more).
It should not extract correctly what we want, but we have chosen ‘Case insensitive,’ so both upper and lowercase characters are included.
Use () to define the marked group
- Third one – VAT
(\d+)
It says we want any remaining digit (‘+’ means one or more).
Use () to define the marked group.
The ending dataset should appear as shown below
As you can see, the regular expressions used have correctly parsed all the three marked groups we have set: Birth Date, Name, and VAT.
We have seen how to use regular expression syntax to parse data from a specific field in Alteryx. RegEx tool is beneficial not only to parse data but also to check if a string matches a specific expression (match method), to replace a specific string in a field (replace method), to perform more complex data parsing (tokenize method).
Here the most basic syntax for regular expressions are reported:
. = any single character
\w = word character
\l = lowercase character
\u = uppercase character
\d = digit
\n = new line
\s = white space character
+ = one or more
[] = any specified character
() = marked group (i.e., the information you want to extract)
And that’s how to parse fields using RegEx in Alteryx!
Continue reading
Check out: How to impute a value for missing data in Alteryx