python - pyparsing: a OneOrMore nested inside another OneOrMore

2019-07-09 13:55:59 Author: Internet

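I am trying out pyparsing for the first time.
My parser does not do what I hoped it would; could someone take a look and tell me what is wrong? I am trying to nest one OneOrMore inside another OneOrMore, which I think should work, but it does not.

Here is the complete code: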

import pyparsing

status = """
    sale number       : 11/7 
    NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
    cross-cu-1       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
    cross-cu-2        918       1    104708K  07:38:19.08  24.02%   run          1d01h
    sale number       : 11/8 
    NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
    cross-cu-3       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
    cross-cu-4        918       1    104708K  07:38:19.08  24.02%   run          1d01h
    """

integer = pyparsing.Word(pyparsing.nums).setParseAction(lambda toks: int(toks[0]))
decimal = pyparsing.Word(pyparsing.nums + ".").setParseAction(lambda toks: float(toks[0]))
wordSuppress = pyparsing.Suppress(pyparsing.Word(pyparsing.alphas))
endOfLine = pyparsing.LineEnd().suppress()
colon = pyparsing.Suppress(":")

saleNumber = pyparsing.Regex("\d{2}\/\d{1}").setResultsName("saleNumber")
lineSuppress = pyparsing.Regex("NAME.*STOP") + endOfLine
saleRow = wordSuppress + wordSuppress + colon + saleNumber + endOfLine

name = pyparsing.Regex("cross-cu-\d").setResultsName("name")
id = integer.setResultsName("id")
pawn = integer.setResultsName("pawn")
price = integer.setResultsName("price") + "K"
time = pyparsing.Regex("\d{2}:\d{2}:\d{2}.\d{2}").setResultsName("time")
c = decimal.setResultsName("c") + "%"
state = pyparsing.Word(pyparsing.alphas).setResultsName("state")
startStop = pyparsing.Word(pyparsing.alphanums).setResultsName("startStop")
row = name + id + pawn + price + time + c + state + startStop + endOfLine

table = pyparsing.OneOrMore(pyparsing.Group(saleRow + lineSuppress.suppress() + (pyparsing.OneOrMore(pyparsing.Group(row) | pyparsing.SkipTo(row).suppress())) ) | pyparsing.SkipTo(saleRow).suppress())

resultDic = [x.asDict() for x in table.parseString(status)]
print resultDic

It only returns [{'saleNumber': '11/7'}].
I was hoping to get a list of dictionaries like this:

[{ {'saleNumber': '11/7'},{ elements in cross-cu-1 line, elements in cross-cu-2 line } },
 { {'saleNumber': '11/8'},{ elements in cross-cu-3 line, elements in cross-cu-4 line } }]

Any help is appreciated!
Please do not suggest other ways of producing this output! I also want to learn pyparsing!

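Solution:

pyparsing may be overkill in this case. Why not simply read the file and then parse the result?

The code would look like this:

Edit: I have updated the code to follow your example more closely.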

from collections import defaultdict

status = """
sale number       : 11/7
NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
cross-cu-1       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
cross-cu-2        918       1    104708K  07:38:19.08  24.02%   run          1d01h
sale number       : 11/8
NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
cross-cu-3       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
cross-cu-4        918       1    104708K  07:38:19.08  24.02%   run          1d01h
"""

sale_number = ''

sales = defaultdict(list)

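for line in status.split('\n'):
    line = line.strip()
    if line.startswith("NAME"):
        # skip the column header
        continue
    elif line.startswith("sale number"):
        # remember which sale the following rows belong to
        sale_number = line.split(':')[1].strip()
    elif not line:
        # skip blank lines (the line has already been stripped)
        continue
    else:
        # you can also use a regular expression here
        sales[sale_number].append(line.split())

for sale in sales:
    # this form prints the same text on Python 2 and Python 3
    print("%s %s" % (sale, sales[sale]))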
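
The comment in the loop mentions that a regular expression could be used instead of line.split(). A minimal sketch of that idea is below. The pattern ROW_RE, the helper parse_status, and the field names (borrowed from the column header and from the result names in the question) are assumptions made for illustration; they are not part of the original answer. It should run on both Python 2.7 and Python 3.

```python
import re
from collections import defaultdict

status = """
sale number       : 11/7
NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
cross-cu-1       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
cross-cu-2        918       1    104708K  07:38:19.08  24.02%   run          1d01h
sale number       : 11/8
NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
cross-cu-3       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
cross-cu-4        918       1    104708K  07:38:19.08  24.02%   run          1d01h
"""

# Assumed row layout: one named group per column of the sample table.
ROW_RE = re.compile(
    r"(?P<name>\S+)\s+"
    r"(?P<id>\d+)\s+"
    r"(?P<pawn>\d+)\s+"
    r"(?P<price>\d+)K\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{2})\s+"
    r"(?P<c>\d+(?:\.\d+)?)%\s+"
    r"(?P<state>[A-Za-z]+)\s+"
    r"(?P<startStop>\w+)$"
)


def parse_status(text):
    """Group data rows under the most recent 'sale number' line."""
    sales = defaultdict(list)
    sale_number = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("sale number"):
            sale_number = line.split(":", 1)[1].strip()
            continue
        match = ROW_RE.match(line)
        if match is None or sale_number is None:
            # blank lines and the NAME header do not match the row pattern
            continue
        row = match.groupdict()
        # convert the numeric columns; the rest stay as strings
        for key in ("id", "pawn", "price"):
            row[key] = int(row[key])
        row["c"] = float(row["c"])
        sales[sale_number].append(row)
    return dict(sales)


sales = parse_status(status)
for number in sorted(sales):
    print("%s %s" % (number, sales[number]))
```

Compared with line.split(), the named groups give each row a dictionary keyed by column, and a line that does not look like a data row is skipped rather than stored.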

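Since the question explicitly asks for a pyparsing approach, here is a hedged sketch of one way to nest the two repetitions. It rests on a reading of the original grammar, not on a confirmed diagnosis: the inner SkipTo(row) alternative appears to let the inner OneOrMore skip past the next "sale number" line, so the first group swallows every row, and the row groups carry no results name, so asDict() does not report them. The names row, sale, rows, and sales below are illustrative choices. The camelCase method names from the question are kept; newer pyparsing releases also offer snake_case equivalents.

```python
import pyparsing as pp

status = """
    sale number       : 11/7
    NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
    cross-cu-1       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
    cross-cu-2        918       1    104708K  07:38:19.08  24.02%   run          1d01h
    sale number       : 11/8
    NAME               ID    PAWN    PRICE    TIME         %C     STATE     START/STOP
    cross-cu-3       1055       1    106284K  07:49:36.19  25.05%   run          1d01h
    cross-cu-4        918       1    104708K  07:38:19.08  24.02%   run          1d01h
    """

integer = pp.Word(pp.nums).setParseAction(lambda toks: int(toks[0]))
decimal = pp.Regex(r"\d+\.\d+").setParseAction(lambda toks: float(toks[0]))

# "sale number : 11/7" keeps only the number itself
sale_row = (pp.Suppress("sale") + pp.Suppress("number") + pp.Suppress(":")
            + pp.Regex(r"\d+/\d+")("saleNumber"))
# the column header line is matched and discarded
header = pp.Suppress(pp.Regex(r"NAME.*STOP"))

# one Group per data row, so each row keeps its own named fields
row = pp.Group(
    pp.Regex(r"cross-cu-\d+")("name")
    + integer("id")
    + integer("pawn")
    + integer("price") + pp.Suppress("K")
    + pp.Regex(r"\d{2}:\d{2}:\d{2}\.\d{2}")("time")
    + decimal("c") + pp.Suppress("%")
    + pp.Word(pp.alphas)("state")
    + pp.Word(pp.alphanums)("startStop")
)

# inner OneOrMore: the rows of one sale, collected under the name "rows";
# there is no SkipTo, so the repetition stops at the next "sale number" line
sale = pp.Group(sale_row + header + pp.Group(pp.OneOrMore(row))("rows"))
# outer OneOrMore: one group per sale
table = pp.OneOrMore(sale)

sales = [
    {"saleNumber": s["saleNumber"], "rows": [r.asDict() for r in s["rows"]]}
    for s in table.parseString(status)
]
for entry in sales:
    print(entry)
```

The sketch does without LineEnd because pyparsing skips whitespace, including newlines, between tokens by default. Each sale ends where the next token is no longer a row, which is what keeps the inner repetition from running into the following sale.
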
Tags: python, parsing, python-2-7, pyparsing
Source: https://codeday.me/bug/20190709/1413730.html