Say I have a string s containing letters and two delimiters, 1 and 2. I want to split the string in the following way:
- if a substring t falls between 1 and 2, return t
- otherwise, return each character
So if s = 'ab1cd2efg1hij2k', the expected output is ['a', 'b', 'cd', 'e', 'f', 'g', 'hij', 'k'].
I tried to use regular expressions:
import re
s = 'ab1cd2efg1hij2k'
re.findall( r'(1([a-z]+)2|[a-z])', s )
[('a', ''),
('b', ''),
('1cd2', 'cd'),
('e', ''),
('f', ''),
('g', ''),
('1hij2', 'hij'),
('k', '')]
From there I can do [ x[x[-1]!=''] for x in re.findall( r'(1([a-z]+)2|[a-z])', s ) ] to get my answer, but I still don't understand the output. The documentation says that findall returns a list of tuples if the pattern has more than one group. However, my pattern only contains one group. Any explanation is welcome.
Best Solution
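Your pattern has two groups. The first is the outer group, (1([a-z]+)2|[a-z]), which wraps the whole alternation. The second is the smaller group, ([a-z]+), which is nested inside the first. Because the pattern has more than one group, findall returns one tuple per match, with an empty string for any group that did not take part in that match.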
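To check the group count yourself, a short sketch along these lines may help. It uses only the standard re module. The variable name pattern is just for illustration. Groups are numbered by the position of their opening parenthesis, so the outer group is group 1 and the nested one is group 2.

```python
import re

# Same pattern as in the question, compiled so we can inspect it.
pattern = re.compile(r'(1([a-z]+)2|[a-z])')

# Number of capturing groups in the pattern.
print(pattern.groups)

# A delimited run: both groups take part in the match.
m = pattern.search('1cd2')
print(m.group(1), m.group(2))

# A single letter: only the outer group takes part, so group 2 is None here.
# findall reports such a group as '' instead of None.
m = pattern.search('a')
print(m.group(1), m.group(2))
print(pattern.findall('a1cd2'))
```

The match object reports a group that did not take part in the match as None, while findall substitutes an empty string, which is where the '' entries in the question's output come from.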
Here is a solution that gives you the expected result. Mind you, it is really ugly and there is probably a better way; I just can't figure it out:
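One possible approach, offered as a minimal sketch and not necessarily the code this answer originally had in mind, is to drop the outer group and iterate over match objects with re.finditer. The helper name split_delimited is hypothetical, and the sketch assumes lowercase letters and well-formed, non-nested 1...2 pairs, as in the question's example.

```python
import re

def split_delimited(s):
    """Return delimited runs whole and every other letter individually.

    Assumes lowercase letters and well-formed, non-nested 1...2 pairs.
    """
    # Only one capturing group remains: the text between the delimiters.
    pattern = re.compile(r'1([a-z]+)2|[a-z]')
    result = []
    for m in pattern.finditer(s):
        # group(1) is None when the single-letter alternative matched,
        # so fall back to the whole match in that case.
        result.append(m.group(1) or m.group(0))
    return result

s = 'ab1cd2efg1hij2k'
print(split_delimited(s))
```

Using finditer instead of findall matters here: with a single group, findall would return only that group's text and give an empty string for every single-letter match.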