I have a regex that prints whatever is between parentheses, but I actually need only specific parentheses. For example, given:
car(skoda,audi)
bike(hayabusa)
I get this output: skoda audi hayabusa
To get the cars and bikes in parentheses I used: (r'^(\S+)\((.*)\)$')
But I need only the cars in 'car(...)' specifically. What should I do?
I tried something like: (r'^car(\S+)\((.*)\)$')
I need only skoda,audi and not hayabusa, but with this pattern I get no output at all.
The code I am using:
import collections
import re

class Group:
    def __init__(self):
        self.members = []
        self.text = []

with open('text1.txt') as f:
    groups = collections.defaultdict(Group)
    group_pattern = re.compile(r'^(\S+)\((.*)\)$')  # <= here I am using the pattern
    current_group = None
    for line in f:
        line = line.strip()
        m = group_pattern.match(line)
        if m:  # this is a group definition line
            group_name, group_members = m.groups()
            groups[group_name].members.extend(group_members.split(','))
            current_group = group_name
        else:
            if (current_group is not None) and (len(line) > 0):
                groups[current_group].text.append(line)

for group_name, group in groups.items():
    print("%s(%s)" % (group_name, ','.join(group.members)))
    print('\n'.join(group.text))
What's wrong with your code?
^car(\S+)\((.*)\)$
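The reason your pattern does not match the expected strings:
- You need to change (\S+) to (\S*), because \S+ requires at least one non-whitespace character right after car. In car(skoda,audi) the character right after car is the opening parenthesis, so \S+ has to consume it and the literal \( that follows has nothing left to match. The whole match fails and nothing is captured. \S* can match zero characters, which leaves the ( for \(.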
Finally, your regex would be:
^car(\S*)\((.*)\)$
The string you want is then in group 2 of the match. (With findall, drop the first group, as in ^car\S*\((.*)\)$, so that only the members are returned.)
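To see the difference between the two quantifiers, here is a minimal sketch that runs both patterns against the sample line from the question. The variable names strict and relaxed are only for illustration.

```python
import re

line = "car(skoda,audi)"

# \S+ must consume at least one character, and the only candidate right
# after "car" is "(", which leaves nothing for the literal \( that follows.
strict = re.match(r'^car(\S+)\((.*)\)$', line)

# \S* may match the empty string, so \( can match the opening parenthesis.
relaxed = re.match(r'^car(\S*)\((.*)\)$', line)

print(strict)             # None
print(relaxed.group(1))   # empty string: nothing sits between "car" and "("
print(relaxed.group(2))   # skoda,audi
```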
>>> import re
>>> s = """car(skoda,audi)
... bike(hayabusa)"""
>>> regex = re.compile(r'^car\S*\((.*)\)$', re.M)
>>> m = regex.findall(s)
>>> m
['skoda,audi']
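If you want to keep reading the file line by line as in your loop, the same pattern can be applied per line. This is only a sketch: car_members and CAR_PATTERN are hypothetical names, and it assumes you want just the member names from the car(...) lines and nothing else.

```python
import re

# Hypothetical helper names, not part of the original code.
CAR_PATTERN = re.compile(r'^car\S*\((.*)\)$')

def car_members(lines):
    """Collect the comma-separated members of every car(...) line."""
    members = []
    for line in lines:
        m = CAR_PATTERN.match(line.strip())
        if m:  # only lines that start with "car" and end with "(...)"
            members.extend(m.group(1).split(','))
    return members

sample = ["car(skoda,audi)", "bike(hayabusa)", "some free text"]
print(car_members(sample))  # ['skoda', 'audi']
```

You can pass the open file object instead of the sample list. Note that \S* also lets names such as cars(...) through; if the name must be exactly car, r'^car\((.*)\)$' is the stricter choice.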