kiesler.at
YahooPipesRegex

Difference between revisions

Version 3, 2008-06-27 11:29 Version 4, 2008-06-27 11:50
Lines 9 - 14 Lines 9 - 24
The RegEx module is one of the most powerful modules in Yahoo Pipes. You can do all kinds of data transformations with it. This wiki page gives a short overview.
   
**Please note**: As in the Yahoo Pipes discussions, I put RegEx patterns within square brackets. That way, you can easily distinguish, for example, [] and [ ]. Please omit the square brackets unless noted otherwise.
   
   
  +++ The Checkboxes
   
  What do the four checkboxes next to each RegEx line mean? The answer is taken from the [http://discuss.pipes.yahoo.com/Message_Boards_for_Pipes/threadview?bn=pip-DeveloperHelp&tid=3410&mid=3414 Yahoo Pipes Discussions].
   
  * **g** allow global matches. set=match every occurrence; unset=match only the first occurrence.
  * **i** be case insensitive. set='A' equals 'a'; unset='A' and 'a' are treated differently.
  * **m** treat string as multiple lines. set='^' matches at the start of every line, that is, after each \n and/or \r; unset='^' matches only at the very start of the string.
  * **s** allow '.' to match new lines as well. set='.' matches '\n'; unset='.' does not match '\n'.
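
To try the four checkboxes outside of Pipes, here is a minimal sketch in Python. The helper name `pipes_replace` and its parameters are made up for this illustration, and Python's `re` module is only an approximation of whatever engine Pipes uses (for example, Python's '^' in multi-line mode looks only at \n, not \r).

```python
import re

def pipes_replace(text, pattern, replacement, g=False, i=False, m=False, s=False):
    """Hypothetical helper: apply one rule with Pipes-style checkboxes."""
    flags = 0
    if i:
        flags |= re.IGNORECASE   # 'A' equals 'a'
    if m:
        flags |= re.MULTILINE    # '^' and '$' match at every line
    if s:
        flags |= re.DOTALL       # '.' also matches '\n'
    # In re.sub, count=0 means "replace every match"; count=1 stops after the first.
    return re.sub(pattern, replacement, text, count=0 if g else 1, flags=flags)

print(pipes_replace("banana", "a", "X"))                            # bXnana
print(pipes_replace("banana", "a", "X", g=True))                    # bXnXnX
print(pipes_replace("Aa", "a", "X", g=True, i=True))                # XX
print(repr(pipes_replace("one\ntwo", "^", "> ", g=True, m=True)))   # '> one\n> two'
print(repr(pipes_replace("a\nb", "a.b", "X", s=True)))              # 'X'
```

Each call changes one checkbox at a time, so the effect of each flag can be seen in isolation.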
   
   
+++ Common patterns
Lines 29 - 39 Lines 39 - 53
   
In RegEx, some characters are "reserved". That means they are not matched literally but have a special meaning instead. Examples:
   
* [.] -- one arbitrary character. If the +s flag is set, this includes the new-line character (\n). If the +s flag is unset, the dot does not match the new-line character.
* [\d] -- one digit. (0..9)
* [\n] -- new line, like in C
* [\r] -- carriage return, like in C
* [\s] -- one whitespace character, such as ' ' or a tab (\t)
   
  * [^] -- beginning of string. If the +m flag is set, this matches at the start of every line, where a line starts either at the very start of the string or right after a new line ('\n'). If the +m flag is unset, this matches only at the very start of the string.
   
  * [$] -- end of string. If the +m flag is set, this matches at the end of every line. If the +m flag is unset, this matches only at the very end of the string.
   
* [()] -- groups. You can use the matched groups in the replacement field. For example, replacing [(\d)] with [0$1] adds a leading zero.
* [[]] -- character groups. For example, [123] matches 1, 2 or 3 (here the square brackets are part of the pattern).
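
The reserved characters listed above can be tried out with a short Python sketch. The sample strings and variable names are invented for this illustration, and Python writes group references in the replacement as \1 where Pipes uses $1.

```python
import re

# [\d] -- every single digit in the text
digits = re.findall(r"\d", "a1b22")

# [\s] -- spaces and tabs both count as whitespace
underscored = re.sub(r"\s", "_", "a b\tc")

# [()] -- replace [(\d)] with [0$1]; Python spells $1 as \1
padded = re.sub(r"(\d)", r"0\1", "7")

# [[]] -- the class [123] matches 1, 2 or 3
picked = re.findall(r"[123]", "10234")

# [^] and [$] -- with the m flag they match at every line, without it
# only at the start and end of the whole string
lines_m = re.findall(r"^.+$", "one\ntwo", flags=re.MULTILINE)
lines_plain = re.findall(r"^.+$", "one\ntwo")

print(digits)       # ['1', '2', '2']
print(underscored)  # a_b_c
print(padded)       # 07
print(picked)       # ['1', '2', '3']
print(lines_m)      # ['one', 'two']
print(lines_plain)  # []
```

The last result is empty because, with both m and s unset, the dot cannot cross the new line and '$' does not match in the middle of the string.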
Lines 57 - 62 Lines 71 - 93
   
From a post in the [http://discuss.pipes.yahoo.com/Message_Boards_for_Pipes/threadview?m=te&bn=pip-DeveloperHelp&tid=4654&mid=4654&tof=26&frt=2#4654 Yahoo Pipes Discussion]. Sometimes, one of your fields contains just an image URL. You'd like to replace that URL with an image tag, so it is rendered as an image.
   
* Replace [(.*)] with []
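
The replacement text in the bullet above shows up empty on this page, presumably because the tag itself was swallowed when the page was rendered. As a rough sketch only, assuming the intended replacement is a plain img tag whose src is the captured URL, the same idea looks like this in Python (the function name `url_to_img` and the example URL are made up):

```python
import re

def url_to_img(url):
    """Hypothetical helper: wrap a field that holds only an image URL in an img tag."""
    # The img markup is an assumption; the original replacement text is not visible.
    # (.*) captures the whole field; \1 is Python's spelling of $1.
    # count=1 corresponds to leaving the g checkbox unset.
    return re.sub(r"(.*)", r'<img src="\1">', url, count=1)

print(url_to_img("http://example.com/picture.png"))
# <img src="http://example.com/picture.png">
```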
   
   
  **Prefixing something**
   
  Sometimes, you'd like to add something in front of a field. For example, to add a "Yahoo: " in front of every title, you could
   
  * Replace [(.*)] with [Yahoo: $1]
   
  $1 refers to the text matched by the first group (we have only one group in this example).
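
A minimal Python sketch of the prefix rule, with an invented function name `add_prefix`. Python writes $1 as \1, and count=1 plays the role of leaving the g checkbox unset. That detail matters in Python 3.7 and later: with a global replace, [(.*)] also matches the empty string at the end of the title, so the prefix would be added a second time. Whether Pipes behaves the same way with g set is not something this sketch can confirm.

```python
import re

def add_prefix(title):
    """Hypothetical helper: put "Yahoo: " in front of a title."""
    # (.*) captures the whole title, \1 re-inserts it after the prefix.
    return re.sub(r"(.*)", r"Yahoo: \1", title, count=1)

print(add_prefix("Pipes gets a new module"))
# Yahoo: Pipes gets a new module
```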
   
   
  **Translating dates**
   
  What if you want to change a date in mm/dd/yy format to the ISO equivalent, yyyy-mm-dd? You could use an expression like this one:
   
  * Replace [(\d\d)\/(\d\d)\/(\d\d)] with [20$3-$1-$2]
   
  Here, we have three groups. In the result, I also prefix a "20" as the year was specified only with two digits.
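
The date rule can be checked with a small Python sketch. The function name `us_to_iso` is made up, \1 to \3 are Python's spelling of $1 to $3, and, as in the rule above, every two-digit year is assumed to belong to the 2000s.

```python
import re

def us_to_iso(date_text):
    """Hypothetical helper: rewrite mm/dd/yy as yyyy-mm-dd."""
    # Group 1 = month, group 2 = day, group 3 = two-digit year.
    # The slash needs no escaping in Python, so \/ becomes /.
    return re.sub(r"(\d\d)/(\d\d)/(\d\d)", r"20\3-\1-\2", date_text)

print(us_to_iso("06/27/08"))              # 2008-06-27
print(us_to_iso("Posted on 12/31/07"))    # Posted on 2007-12-31
```

Text around the date is left untouched, because only the part that matches the pattern is replaced.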