regex - Regular Expression for nested tags (Wikimedia content) -


I haven't used regex in a while, and I'm slightly rusty.

I am trying to extract categories from Wikipedia entries. What I need is each individual string contained in a pattern that starts with two opening brackets and ends with two closing brackets.

This query works most of the time -

  \[\[(?<grade>.*[^\]#])\]\]

But it has problems when there is a comma (',') after the closing brackets.

The unfortunate result is that when the following text is parsed:

  lower = [[Seattle, Washington]], [[United States|USA]] |

the match returns the following for the category:

  Seattle, Washington]], [[United States|USA]

Clearly, the comma is throwing it off and the match runs on into the next set. What is the best way to capture each value between the opening and closing double brackets? What is the problem?

The problem is not the comma. The problem is that .* will match "]], [[" just like anything else: * is greedy, so it matches as much as it possibly can. You can use the non-greedy .*? instead, as has been suggested, or replace .* with [^\]]* so that the greedy match covers anything except a closing bracket, which should also do the trick.
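To make the difference concrete, here is a minimal sketch in Python. It assumes Python's re module rather than whatever regex engine the question uses, and the sample string and variable names are only illustrative, modeled on the text in the question.

```python
import re

# Sample text modeled on the line from the question (illustrative only).
text = "[[Seattle, Washington]], [[United States|USA]] |"

# Greedy: .* runs to the LAST "]]" on the line, swallowing "]], [[" on the way.
greedy = re.findall(r"\[\[(.*)\]\]", text)

# Non-greedy: .*? stops at the FIRST "]]" it can reach.
lazy = re.findall(r"\[\[(.*?)\]\]", text)

# Negated class: [^\]]* can never cross a closing bracket at all.
negated = re.findall(r"\[\[([^\]]*)\]\]", text)

print(greedy)   # ['Seattle, Washington]], [[United States|USA']
print(lazy)     # ['Seattle, Washington', 'United States|USA']
print(negated)  # ['Seattle, Washington', 'United States|USA']
```

The non-greedy and negated-class versions give the same result here. The negated class is usually the stricter choice, since it cannot run across a closing bracket even when the rest of the pattern fails to match.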

Also, these are not "nested" tags. Nested tags would be [[tags [[inside]] tags]], which is probably not what you want, because I don't think that means anything in Wikimedia markup.

