
chunwaihome's blog

 
 
 


 
 

Regex Plugin  

2010-03-22 09:06:52 | Category: Powerpro Plugin


  • Introduction to usage:

Any given PowerPro plug-in can offer several different and even unrelated functionalities, called "services". The REGEX plug-in has four services: match, replace, matchg and replaceg. As explained in PowerPro's documentation, any plug-in is called this way:

plugin.service(arguments)

So there are four possible ways to call the REGEX plug-in, one for each service:

  regex.match(string, pattern, replace)

regex.replace(string, pattern, replace, "Output_var")

regex.matchg(string, pattern, replace)

regex.replaceg(string, pattern, replace, "Output_var")

 

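Note that the fourth parameter, "Output_var", can often be omitted. If you only want to test a regular expression, the "match" services may be all you need. If you want to change text with a regular expression, however, you need the "replace" services together with the fourth parameter, Output_var. Output_var names a variable that stores the text produced by the replacement, which is why only the "replace" services require the fourth parameter.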

 

 Here is a very simple but practical example:

string = "Mary had a little lamb"             ;contains the string that you want to analyze
pattern = "lamb"
replace = "dog"
regex.replace(string, pattern, replace, "output")

output = "Mary had a little dog"

Plain English: take the phrase "Mary had a little lamb", take the first occurrence of the word "lamb" and replace it with the word "dog".

Yet another example, this time typing the strings directly instead of using variables:

regex.replace("Mary had 52 lambs", "[0-9]+", "many", "result")

result = "Mary had many lambs"
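
The two examples above behave like a first-occurrence substitution. As a rough illustration only, here is the same idea in Python's re module. Python is a stand-in here, not the plug-in's own engine, so details of pattern syntax may differ.

```python
import re

# Replace only the first occurrence of the pattern (count=1),
# mirroring how the examples above describe the result.
text = "Mary had a little lamb"
output = re.sub("lamb", "dog", text, count=1)
print(output)   # Mary had a little dog

# Same idea with a real pattern: one or more digits.
result = re.sub("[0-9]+", "many", "Mary had 52 lambs", count=1)
print(result)   # Mary had many lambs
```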

 

The return code variable

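The return code stores the value returned by the REGEX plug-in:

return_code = regex.match(string, pattern, replace)

The return code variable is optional. It is only useful when you need to know whether the regex command on the right-hand side of the equals sign worked correctly.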

 

return_code contains the result of the expression's evaluation, but in the form of a return code, not a string. Here is the meaning of each possible code:

0   =   Full Match
1   =   Partial Match
2   =   No Match
8   =   Error: Invalid Pattern
9   =   Error: No Input


Remember that return_code is only referred to as such to make its role clear. Any other variable name could be used for that purpose, like result, myReturn or JustChecking.
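
The table of codes can be modeled in a few lines of Python. This is a sketch under assumptions, not the plug-in's implementation: it assumes "Full Match" means the pattern covers the entire string, "Partial Match" means the pattern is found somewhere inside it, and "No Input" means an empty input string. The function name return_code_for is hypothetical.

```python
import re

def return_code_for(text, pattern):
    """Model the return codes listed above (the interpretation is an assumption)."""
    if text == "":
        return 9                      # assumed meaning of "Error: No Input"
    try:
        compiled = re.compile(pattern)
    except re.error:
        return 8                      # Error: Invalid Pattern
    if compiled.fullmatch(text):
        return 0                      # Full Match: pattern covers the whole string
    if compiled.search(text):
        return 1                      # Partial Match: pattern found inside the string
    return 2                          # No Match

print(return_code_for("2002", "[0-9]{4}"))                        # 0
print(return_code_for("Today is 24/09/2002, 03:28", "[0-9]{4}"))  # 1
print(return_code_for("no digits here", "[0-9]{4}"))              # 2
print(return_code_for("abc", "([0-9]"))                           # 8
print(return_code_for("", "[0-9]{4}"))                            # 9
```

Under this model, any code lower than 2 means that something matched.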

MATCH/MATCHG:

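We use the match and match global services to check whether some text contains any sequence of characters, like words, phrases, numbers or symbols. Keep two things in mind when using these services:

- The return code is 0 or 1 whenever a match is found, so any value lower than 2 means that something matched.

- When a match is found, the "output" variable receives the replacement string (replace), not the matched text itself.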

 

The first point is relevant if you just want to test the presence of some pattern in a string and don't need to know exactly what was matched. In other words, "If (myReturn lt 2)" would mean "If something matches...".

The second point is relevant if you want to use the portion of the string that matches pattern. Perhaps unexpectedly, the REGEX plug-in places the replacement string (replace) in the "output" variable whenever a match is found. For example:

Plain English: take the phrase "Today is 24/09/2002, 03:28", check if it contains the pattern "[0-9]{4}", i.e. a sequence of exactly four digits, and obtain the matched portion so as to verify exactly what four-digit number was found.

string = "Today is 24/09/2002, 03:28"

pattern = "[0-9]{4}"

replace = ""

return_code = regex.match(string, pattern, replace, "output")

or:

return_code = regex.match("Today is 24/09/2002, 03:28", "[0-9]{4}", "", "output")

return_code =>  1 

output =>  

Hmmm... something is not right. Part of the string matches the pattern, so return_code is  1 . But the output is empty. Why? Because the replacement string replace is empty! Let's try again:

string = "Today is 24/09/2002, 03:28"

pattern = "[0-9]{4}"

replace = "XXX"

return_code = regex.match(string, pattern, replace, "output")

or:

return_code = regex.match("Today is 24/09/2002, 03:28", "[0-9]{4}", "XXX", "output")

return_code =>  1 

output => XXX

OK... so replace becomes output. But we're just matching. We're not replacing. Return code is  1  so we have a match, but what is the four-digit number that the plug-in found? How do we obtain the portion of the string that was matched?

There are two ways to achieve that, and both require that we use the match service as if it were a replacement service:

- First, an obvious workaround: we can (group) the whole pattern and get the match with the first backreference:

string = "Today is 24/09/2002, 03:28"

pattern = "([0-9]{4})"

replace = "\1"

return_code = regex.match(string, pattern, replace, "output")

or:

return_code = regex.match("Today is 24/09/2002, 03:28", "([0-9]{4})", "\1", "output")

return_code =>  1 

output => 2002

OK, now we found the four-digit number matched by "([0-9]{4})"! It's "2002"!

- Second, not a workaround but an actual mechanism provided by the REGEX plug-in: whether or not the whole pattern is grouped, the match can always be obtained with the zeroth backreference:

string = "Today is 24/09/2002, 03:28"

pattern = "[0-9]{4}"

replace = "\0"

return_code = regex.match(string, pattern, replace, "output")

or:

return_code = regex.match("Today is 24/09/2002, 03:28", "[0-9]{4}", "\0", "output")

return_code =>  1 

output => 2002

Not very intuitive, but no rocket science either. Mystery solved. That's what documentation is for!
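
The four attempts above can be replayed with a small Python model. It is only a sketch of the behavior described here, under the assumption that the match service expands the replacement string against the first match and returns nothing else. The helper name match_output is hypothetical, and Python writes the whole-match reference as \g<0>, so the model translates \0 first.

```python
import re

def match_output(text, pattern, replace):
    """Expand the replacement template against the first match ("" if no match)."""
    m = re.search(pattern, text)
    if m is None:
        return ""
    # Python spells the zeroth backreference \g<0>; the text above uses \0.
    template = replace.replace("\\0", "\\g<0>")
    return m.expand(template)

text = "Today is 24/09/2002, 03:28"
print(repr(match_output(text, "[0-9]{4}", "")))    # ''
print(match_output(text, "[0-9]{4}", "XXX"))       # XXX
print(match_output(text, "([0-9]{4})", r"\1"))     # 2002
print(match_output(text, "[0-9]{4}", r"\0"))       # 2002
```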

Note that although we use the match service as if it were a replacement service, no actual replacement takes place. We pass a replace argument, but the original string is never changed: the plug-in simply substitutes the matched portion for the backreference inside replace, which is how the matched portion gets extracted. Replacing, by contrast, means changing part or all of the original string and getting back the entire string with the modifications. The match service outputs only what the replace argument produces; the parts of the string that do not match never appear in its output.
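
To make the contrast concrete, here is a side-by-side sketch in Python. It models the two behaviors as described in this post and is not the plug-in's code; the bracketed template is just an illustrative choice.

```python
import re

text = "Today is 24/09/2002, 03:28"
pattern = "[0-9]{4}"
template = "[\\g<0>]"   # Python's spelling of "[\0]"

# match-style: only the expanded replacement comes back.
m = re.search(pattern, text)
matched_only = m.expand(template) if m else ""
print(matched_only)     # [2002]

# replace-style: the whole string comes back, with the first match substituted.
replaced = re.sub(pattern, template, text, count=1)
print(replaced)         # Today is 24/09/[2002], 03:28
```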


But, wait! There is more! We still haven't tried the matchg service. What if we want to find all occurrences of a pattern that may occur more than once? Let's see:

Plain English: take the phrase "Today is 24/09/2002, 03:28", check if it contains the pattern "[0-9]{2}", i.e. a sequence of exactly two digits, and obtain every matched portion so as to verify exactly which two-digit numbers were found.

string = "Today is 24/09/2002, 03:28"

pattern = "[0-9]{2}"

replace = "\0"

return_code = regex.matchg(string, pattern, replace, "output")

or:

return_code = regex.matchg("Today is 24/09/2002, 03:28", "[0-9]{2}", "\0", "output")

return_code =>  1 

output => 240920020328

The REGEX plug-in finds 24, 09, 20, 02, 03 and 28 and displays them all in a sequence, in the order they are found. If you want to separate the matches, just add a space to the replacement string:

string = "Today is 24/09/2002, 03:28"

pattern = "[0-9]{2}"

replace = "\0 "

return_code = regex.matchg(string, pattern, replace, "output")

or:

return_code = regex.matchg("Today is 24/09/2002, 03:28", "[0-9]{2}", "\0 ", "output")

return_code =>  1 

output => 24 09 20 02 03 28
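
The global variant can be modeled the same way: expand the replacement once per match and concatenate the pieces in order. This is a Python sketch of the behavior shown above, with a hypothetical helper name matchg_output; it is not the plug-in's implementation. Note that this model keeps the space added after the last match, and whether the plug-in does the same is not something the examples above can show.

```python
import re

def matchg_output(text, pattern, replace):
    """Expand the replacement for every match and join the results in order."""
    # Python spells the zeroth backreference \g<0>; the text above uses \0.
    template = replace.replace("\\0", "\\g<0>")
    return "".join(m.expand(template) for m in re.finditer(pattern, text))

text = "Today is 24/09/2002, 03:28"
print(matchg_output(text, "[0-9]{2}", r"\0"))         # 240920020328
print(repr(matchg_output(text, "[0-9]{2}", "\\0 ")))  # '24 09 20 02 03 28 '
```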

 




 
