I want to find words starting with a single non-alphanumeric character, say '$', in a string with re.findall.
Matching words: $Python $foo $any_word123
Non-matching words: $$Python foo foo$bar
\b does not work
If the first character were alphanumeric, I could anchor the pattern with \b. But this does not work for a pattern like \$\w+, because \b matches the empty string only between a \w and a \W character (or between a \w and the start or end of the string).
# The line below matches only the last '$baz', which is the one that should not be matched
re.findall(r'\b\$\w+', '$foo $bar x$baz')
The above outputs ['$baz'], but the desired pattern should output ['$foo', '$bar'].
I tried replacing \b by a positive lookbehind with pattern ^|\s, but this does not work because lookbehinds must be fixed-width in Python's re module, and ^ (zero characters) and \s (one character) have different widths.
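For reference, the failure described above can be reproduced with a minimal sketch like the one below. It assumes Python's standard re module; the variable names are only illustrative.

```python
import re

# The positive lookbehind from the question: alternatives of different widths.
pattern = r'(?<=^|\s)\$\w+'

try:
    re.compile(pattern)
    compiled_ok = True
    message = ''
except re.error as exc:
    # ^ is zero-width and \s is one character wide, so re rejects the lookbehind.
    compiled_ok = False
    message = str(exc)

print(compiled_ok, message)
```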
What is the correct way to handle this pattern?
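One way is to use a negative lookbehind with the non-whitespace metacharacter \S. The lookbehind (?<!\S) succeeds both at the start of the string and after whitespace, and it is fixed-width: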
import re

s = '$Python $foo foo$bar baz'
re.findall(r'(?<!\S)\$\w+', s)
# output: ['$Python', '$foo']
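As a rough sketch of how the negative lookbehind could be wrapped for any prefix, the helper below builds the pattern with re.escape. The name find_prefixed_words and its prefix parameter are hypothetical, not part of the answer above.

```python
import re

def find_prefixed_words(text, prefix='$'):
    """Return words that consist of `prefix` followed by word characters
    and that sit at the start of the string or right after whitespace."""
    # (?<!\S) means "not preceded by a non-whitespace character".
    pattern = r'(?<!\S)' + re.escape(prefix) + r'\w+'
    return re.findall(pattern, text)

# Sample built from the matching and non-matching words in the question.
sample = '$Python $foo $any_word123 $$Python foo foo$bar'
print(find_prefixed_words(sample))
```

Here '$$Python' is skipped because the first '$' is not followed by a word character and the second '$' is preceded by a non-whitespace character.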
The following will match a word starting with a single non-alphanumerical character.
re.findall(r'''
    (?:           # start non-capturing group
      ^           # start of string
      |           # or
      \s          # space character
    )             # end non-capturing group
    (             # start capturing group
      [^\w\s]     # character that is not a word or space character
      \w+         # one or more word characters
    )             # end capturing group
    ''', s, re.X)
The same pattern in compact form, where the re.X flag is not needed:
re.findall(r'(?:^|\s)([^\w\s]\w+)', s)
Example: '$a $b a$c $$d' -> ['$a', '$b']
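A small self-contained sketch of this second approach follows. Unlike the '$'-specific pattern, it accepts any single symbol as the first character; the names SYMBOL_WORD and find_symbol_words are made up for illustration.

```python
import re

# Start of string or whitespace, then one symbol, then word characters.
# Only the captured group (symbol + word) is returned by findall.
SYMBOL_WORD = re.compile(r'(?:^|\s)([^\w\s]\w+)')

def find_symbol_words(text):
    """Return words that start with exactly one non-word, non-space character."""
    return SYMBOL_WORD.findall(text)

print(find_symbol_words('$a $b a$c $$d'))
print(find_symbol_words('#x @y z'))
```

Because the leading whitespace is consumed by the non-capturing group rather than checked by a lookbehind, the pattern avoids the fixed-width restriction altogether.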