正则表达式Python /组量词
作者:互联网
我想匹配一个看起来像目录的变量列表,例如:
Same/Same2/Foot/Ankle/Joint/Actuator/Sensor/Temperature/Value=4.123
Same/Same2/Battery/Name=SomeString
Same/Same2/Home/Land/Some/More/Stuff=0.34
“子目录”的长度是可变的,具有上限(高于它的9).
我想将每个子目录分组,除了我在上面命名为“Same”的第一个子目录.
我能想到的最好的是:
^(?:([^/]+)/){4,8}([^/]+)=(.*)
它已经查找了4-8个子目录,但只将最后一个分组.为什么?
使用组量词是否有更好的解决方案?
编辑:解决了.将使用split()代替.
解决方法:
import re
regx = re.compile('(?:(?<=\A)|(?<=/)).+?(?=/|\Z)')
for ss in ('Same/Same2/Foot/Ankle/Joint/Actuator/Sensor/Temperature/Value=4.123',
'Same/Same2/Battery/Name=SomeString',
'Same/Same2/Home/Land/Some/More/Stuff=0.34'):
print ss
print regx.findall(ss)
print
编辑1
现在您已经提供了有关要获取的内容的更多信息(_“Same / Same2 / Battery / Name = SomeString成为SAME2_BATTERY_NAME = SomeString”_)可以提出更好的解决方案:使用正则表达式或使用split(),replace()
import re
from os import sep
sep2 = r'\\' if sep=='\\' else '/'
pat = '^(?:.+?%s)(.+$)' % sep2
print 'pat==%s\n' % pat
ragx = re.compile(pat)
for ss in ('Same\Same2\Foot\Ankle\Joint\Actuator\Sensor\Temperature\Value=4.123',
'Same\Same2\Battery\Name=SomeString',
'Same\Same2\Home\Land\Some\More\Stuff=0.34'):
print ss
print ragx.match(ss).group(1).replace(sep,'_')
print ss.split(sep,1)[1].replace(sep,'_')
print
结果
pat==^(?:.+?\\)(.+$)
Same\Same2\Foot\Ankle\Joint\Actuator\Sensor\Temperature\Value=4.123
Same2_Foot_Ankle_Joint_Actuator_Sensor_Temperature_Value=4.123
Same2_Foot_Ankle_Joint_Actuator_Sensor_Temperature_Value=4.123
Same\Same2\Battery\Name=SomeString
Same2_Battery_Name=SomeString
Same2_Battery_Name=SomeString
Same\Same2\Home\Land\Some\More\Stuff=0.34
Same2_Home_Land_Some_More_Stuff=0.34
Same2_Home_Land_Some_More_Stuff=0.34
编辑2
重新阅读你的评论,我意识到我没有考虑到你想要在’=’符号之前但不在它之后的字符串部分的上部.
因此,这个新代码公开了3个满足此要求的方法.您将选择您喜欢的那个:
import re
from os import sep
sep2 = r'\\' if sep=='\\' else '/'
pot = '^(?:.+?%s)(.+?)=([^=]*$)' % sep2
print 'pot==%s\n' % pot
rogx = re.compile(pot)
pet = '^(?:.+?%s)(.+?(?==[^=]*$))' % sep2
print 'pet==%s\n' % pet
regx = re.compile(pet)
for ss in ('Same\Same2\Foot\Ankle\Joint\Sensor\Value=4.123',
'Same\Same2\Battery\Name=SomeString',
'Same\Same2\Ocean\Atlantic\North=',
'Same\Same2\Maths\Addition\\2+2=4\Simple=ohoh'):
print ss + '\n' + len(ss)*'-'
print 'rogx groups '.rjust(32),rogx.match(ss).groups()
a,b = ss.split(sep,1)[1].rsplit('=',1)
print 'split split '.rjust(32),(a,b)
print 'split split join upper replace %s=%s' % (a.replace(sep,'_').upper(),b)
print 'regx split group '.rjust(32),regx.match(ss.split(sep,1)[1]).group()
print 'regx split sub '.rjust(32),\
regx.sub(lambda x: x.group(1).replace(sep,'_').upper(), ss)
print
结果,在Windows平台上
pot==^(?:.+?\\)(.+?)=([^=]*$)
pet==^(?:.+?\\)(.+?(?==[^=]*$))
Same\Same2\Foot\Ankle\Joint\Sensor\Value=4.123
----------------------------------------------
rogx groups ('Same2\\Foot\\Ankle\\Joint\\Sensor\\Value', '4.123')
split split ('Same2\\Foot\\Ankle\\Joint\\Sensor\\Value', '4.123')
split split join upper replace SAME2_FOOT_ANKLE_JOINT_SENSOR_VALUE=4.123
regx split group Same2\Foot\Ankle\Joint\Sensor\Value
regx split sub SAME2_FOOT_ANKLE_JOINT_SENSOR_VALUE=4.123
Same\Same2\Battery\Name=SomeString
----------------------------------
rogx groups ('Same2\\Battery\\Name', 'SomeString')
split split ('Same2\\Battery\\Name', 'SomeString')
split split join upper replace SAME2_BATTERY_NAME=SomeString
regx split group Same2\Battery\Name
regx split sub SAME2_BATTERY_NAME=SomeString
Same\Same2\Ocean\Atlantic\North=
--------------------------------
rogx groups ('Same2\\Ocean\\Atlantic\\North', '')
split split ('Same2\\Ocean\\Atlantic\\North', '')
split split join upper replace SAME2_OCEAN_ATLANTIC_NORTH=
regx split group Same2\Ocean\Atlantic\North
regx split sub SAME2_OCEAN_ATLANTIC_NORTH=
Same\Same2\Maths\Addition\2+2=4\Simple=ohoh
-------------------------------------------
rogx groups ('Same2\\Maths\\Addition\\2+2=4\\Simple', 'ohoh')
split split ('Same2\\Maths\\Addition\\2+2=4\\Simple', 'ohoh')
split split join upper replace SAME2_MATHS_ADDITION_2+2=4_SIMPLE=ohoh
regx split group Same2\Maths\Addition\2+2=4\Simple
regx split sub SAME2_MATHS_ADDITION_2+2=4_SIMPLE=ohoh
标签:python,regex,regex-group,quantifiers 来源: https://codeday.me/bug/20190826/1734530.html