c – 无法使用Boost Spirit语法来使用std :: map <>的已知键
作者:互联网
我似乎正在经历一些带有提升精神的精神障碍,我无法忍受.我有一个相当简单的语法,我需要处理,我想将值放入一个结构,包含一个std :: map<>作为其中一个成员.这些对的关键名称是预先知道的,因此只允许使用这些名称.地图中可以有一对多的按键,按任意顺序通过qi验证每个按键名称.
作为一个例子,语法看起来像这样.
test .|*|<hostname> add|modify|save ( key [value] key [value] ... ) ;
//
test . add ( a1 ex00
a2 ex01
a3 "ex02,ex03,ex04" );
//
test * modify ( m1 ex10
m2 ex11
m3 "ex12,ex13,ex14"
m4 "abc def ghi" );
//
test 10.0.0.1 clear ( c1
c2
c3 );
在这个例子中,“add”的键是a1,a2和a3,同样用于“修改”m1,m2,m3和m4,每个键必须包含一个值.对于“清除”,地图c1,c2和c3的键可能不包含值.另外,假设在这个例子中你最多可以有10个键(a1 … a11,m1 …… m11和c1 … c11),它们的任何组合都可以按任何顺序用于相应的动作.这意味着您不能将已知的密钥cX用于“添加”或mX用于“清除”
结构遵循这种简单的模式
//
struct test
{
std::string host;
std::string action;
std::map<std::string,std::string> option;
}
所以从上面的例子中,我希望结构包含…
// add ...
test.host = .
test.action = add
test.option[0].first = a1
test.option[0].second = ex00
test.option[1].first = a2
test.option[1].second = ex01
test.option[2].first = a3
test.option[2].second = ex02,ex03,ex04
// modify ...
test.host = *
test.action = modify
test.option[0].first = m1
test.option[0].second = ex10
test.option[1].first = m2
test.option[1].second = ex11
test.option[2].first = m3
test.option[2].second = ex12,ex13,ex14
test.option[2].first = m3
test.option[2].second = abc def ghi
// clear ...
test.host = *
test.action = 10.0.0.1
test.option[0].first = c1
test.option[0].second =
test.option[1].first = c2
test.option[1].second =
test.option[2].first = c3
test.option[2].second =
我可以让每个独立的部分独立工作,但我似乎无法让他们一起工作.例如,我有主机和操作没有地图<>.
我已经改编了一个先前发布的Sehe(here)示例试图使其工作(BTW:Sehe有一些很棒的例子,我一直在使用文档).
这是摘录(显然不起作用),但至少显示我要去的地方.
namespace ast {
namespace qi = boost::spirit::qi;
//
using unused = qi::unused_type;
//
using string = std::string;
using strings = std::vector<string>;
using list = strings;
using pair = std::pair<string, string>;
using map = std::map<string, string>;
//
struct test
{
using preference = std::map<string,string>;
string host;
string action;
preference option;
};
}
//
BOOST_FUSION_ADAPT_STRUCT( ast::test,
( std::string, host )
( std::string, action ) )
( ast::test::preference, option ) )
//
namespace grammar
{
//
template <typename It>
struct parser
{
//
struct skip : qi::grammar<It>
{
//
skip() : skip::base_type( text )
{
using namespace qi;
// handle all whitespace (" ", \t, ...)
// along with comment lines/blocks
//
// comment blocks: /* ... */
// // ...
// -- ...
// # ...
text = ascii::space
| ( "#" >> *( char_ - eol ) >> ( eoi | eol ) ) // line comment
| ( "--" >> *( char_ - eol ) >> ( eoi | eol ) ) // ...
| ( "//" >> *( char_ - eol ) >> ( eoi | eol ) ) // ...
| ( "/*" >> *( char_ - "*/" ) >> "*/" ); // block comment
//
BOOST_SPIRIT_DEBUG_NODES( ( text ) )
}
//
qi::rule<It> text;
};
//
struct token
{
//
token()
{
using namespace qi;
// common
string = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
identity = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
real = double_;
integer = int_;
//
value = ( string | identity );
// ip target
any = '*';
local = ( char_('.') | fqdn );
fqdn = +char_("a-zA-Z0-9.\\-" ); // consession
ipv4 = +as_string[ octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] ];
//
target = ( any | local | fqdn | ipv4 );
//
pair = identity >> -( attr( ' ' ) >> value );
map = pair >> *( attr( ' ' ) >> pair );
list = *( value );
//
BOOST_SPIRIT_DEBUG_NODES( ( string )
( identity )
( value )
( real )
( integer )
( any )
( local )
( fqdn )
( ipv4 )
( target )
( pair )
( keyval )
( map )
( list ) )
}
//
qi::rule<It, std::string()> string;
qi::rule<It, std::string()> identity;
qi::rule<It, std::string()> value;
qi::rule<It, double()> real;
qi::rule<It, int()> integer;
qi::uint_parser<unsigned, 10, 1, 3> octet;
qi::rule<It, std::string()> any;
qi::rule<It, std::string()> local;
qi::rule<It, std::string()> fqdn;
qi::rule<It, std::string()> ipv4;
qi::rule<It, std::string()> target;
//
qi::rule<It, ast::map()> map;
qi::rule<It, ast::pair()> pair;
qi::rule<It, ast::pair()> keyval;
qi::rule<It, ast::list()> list;
};
//
struct test : token, qi::grammar<It, ast::test(), skip>
{
//
test() : test::base_type( command_ )
{
using namespace qi;
using namespace qr;
auto kw = qr::distinct( copy( char_( "a-zA-Z0-9_" ) ) );
// not sure how to enforce the "key" names!
key_ = *( '(' >> *value >> ')' );
// tried using token::map ... didn't work ...
//
add_ = ( ( "add" >> attr( ' ' ) ) [ _val = "add" ] );
modify_ = ( ( "modify" >> attr( ' ' ) ) [ _val = "modify" ] );
clear_ = ( ( "clear" >> attr( ' ' ) ) [ _val = "clear" ] );
//
action_ = ( add_ | modify_ | clear_ );
/* *** can't get from A to B here ... not sure what to do *** */
//
command_ = kw[ "test" ]
>> target
>> action_
>> ';';
BOOST_SPIRIT_DEBUG_NODES( ( command_ )
( action_ )
( add_ )
( modify_ )
( clear_ ) )
}
//
private:
//
using token::value;
using token::target;
using token::map;
qi::rule<It, ast::test(), skip> command_;
qi::rule<It, std::string(), skip> action_;
//
qi::rule<It, std::string(), skip> add_;
qi::rule<It, std::string(), skip> modify_;
qi::rule<It, std::string(), skip> clear_;
};
...
};
}
我希望这个问题不是太模糊,如果你需要一个问题的实例,我当然可以提供.非常感谢任何和所有帮助,所以提前谢谢!
解决方法:
笔记:
>这个
add_ = ( ( "add" >> attr( ' ' ) ) [ _val = "add" ] );
modify_ = ( ( "modify" >> attr( ' ' ) ) [ _val = "modify" ] );
clear_ = ( ( "clear" >> attr( ' ' ) ) [ _val = "clear" ] );
你的意思是要求一个空间?或者你真的只是试图强制struct action字段包含一个尾随空格(即将发生的事情).
如果你的意思是后者,我会在解析器之外做到这一点¹.
如果您想要第一个,请使用kw工具:
add_ = kw["add"] [ _val = "add" ];
modify_ = kw["modify"] [ _val = "modify" ];
clear_ = kw["clear"] [ _val = "clear" ];
事实上,你可以简化(再次,¹):
add_ = raw[ kw["add"] ];
modify_ = raw[ kw["modify"] ];
clear_ = raw[ kw["clear"] ];
这也意味着你可以简化到
action_ = raw[ kw[lit("add")|"modify"|"clear"] ];
但是,稍微接近你的问题,你也可以使用symbol parser:
symbols<char> action_sym;
action_sym += "add", "modify", "clear";
//
action_ = raw[ kw[action_sym] ];
Caveat: the symbols needs to be a member so its lifetime extends beyond the constructor.
>如果您打算捕获ipv4地址的输入表示
ipv4 = +as_string[ octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] ];
Side note I’m assuming
+as_string
is a simple mistake and you meantas_string
instead.
简化:
qi::uint_parser<uint8_t, 10, 1, 3> octet;
这避免了范围检查(再次见¹):
ipv4 = as_string[ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];
但是,这将构建地址的4字符二进制字符串表示.如果你想要,那很好.我怀疑它(因为你已经写过std :: array< uint8_t,4>或uint64_t,对吧?).所以如果你想要字符串,再次使用raw []:
ipv4 = raw[ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];
>与1号相同的问题:
pair = identity >> -( attr(' ') >> value );
这一次,问题背叛了制作不应该是令人难忘的;从概念上来说,令牌化是在解析之前,因此我会保持令牌船长.在这种情况下,kw并没有真正做很多好事.相反,我将对,映射和列表(未使用?)移动到解析器中:
pair = kw[identity] >> -value;
map = +pair;
list = *value;
一些例子
最近有一个关于使用符号解析的例子(here),但这个答案更接近你的问题:
> How to provider user with autocomplete suggestions for given boost::spirit grammar?
它远远超出了解析器的范围,因为它在语法中执行各种操作,但它显示的是具有通用的“lookup-ish”规则,可以使用特定的“符号集”进行参数化:请参阅Identifier Lookup section答案:
Identifier Lookup
We store “symbol tables” in
Domain
members_variables
and
_functions
:060011
Then we declare some rules that can do lookups on either of them:
060012
unknown;
The corresponding declarations will be shown shortly.
Variables are pretty simple:
060013
Calls are trickier. If a name is unknown we don’t want to assume it
implies a function unless it’s followed by a'('
character.
However, if an identifier is a known function name, we want even to
imply the(
(this gives the UX the appearance of autocompletion
where when the user typessqrt
, it suggests the next character to be
(
magically).060014
known(phx::ref(_functions)) // known -> imply the parens
| &(identifier >> ‘(‘) >> unknown(phx::ref(_functions))
) >> implied(‘(‘) >> -(expression % ‘,’) >> implied(‘)’);It all builds on
known
,unknown
andmaybe_known
:060015
我认为你可以在这里建设性地使用相同的方法.另外一个噱头可能是验证提供的属性实际上是唯一的.
演示工作
结合上面的所有提示使它编译并“解析”测试命令:
#include <string>
#include <map>
#include <vector>
namespace ast {
//
using string = std::string;
using strings = std::vector<string>;
using list = strings;
using pair = std::pair<string, string>;
using map = std::map<string, string>;
//
struct command {
string host;
string action;
map option;
};
}
#include <boost/fusion/adapted.hpp>
BOOST_FUSION_ADAPT_STRUCT(ast::command, host, action, option)
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/repository/include/qi_distinct.hpp>
namespace grammar
{
namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;
template <typename It>
struct parser
{
struct skip : qi::grammar<It> {
skip() : skip::base_type(text) {
using namespace qi;
// handle all whitespace along with line/block comments
text = ascii::space
| (lit("#")|"--"|"//") >> *(char_ - eol) >> (eoi | eol) // line comment
| "/*" >> *(char_ - "*/") >> "*/"; // block comment
//
BOOST_SPIRIT_DEBUG_NODES((text))
}
private:
qi::rule<It> text;
};
//
struct token {
//
token() {
using namespace qi;
// common
string = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
identity = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
value = string | identity;
// ip target
any = '*';
local = '.' | fqdn;
fqdn = +char_("a-zA-Z0-9.\\-"); // concession
ipv4 = raw [ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];
//
target = any | local | fqdn | ipv4;
//
BOOST_SPIRIT_DEBUG_NODES(
(string) (identity) (value)
(any) (local) (fqdn) (ipv4) (target)
)
}
protected:
//
qi::rule<It, std::string()> string;
qi::rule<It, std::string()> identity;
qi::rule<It, std::string()> value;
qi::uint_parser<uint8_t, 10, 1, 3> octet;
qi::rule<It, std::string()> any;
qi::rule<It, std::string()> local;
qi::rule<It, std::string()> fqdn;
qi::rule<It, std::string()> ipv4;
qi::rule<It, std::string()> target;
};
//
struct test : token, qi::grammar<It, ast::command(), skip> {
//
test() : test::base_type(command_)
{
using namespace qi;
auto kw = qr::distinct( copy( char_( "a-zA-Z0-9_" ) ) );
//
action_sym += "add", "modify", "clear";
action_ = raw[ kw[action_sym] ];
//
command_ = kw["test"]
>> target
>> action_
>> '(' >> map >> ')'
>> ';';
//
pair = kw[identity] >> -value;
map = +pair;
list = *value;
BOOST_SPIRIT_DEBUG_NODES(
(command_) (action_)
(pair) (map) (list)
)
}
private:
using token::target;
using token::identity;
using token::value;
qi::symbols<char> action_sym;
//
qi::rule<It, ast::command(), skip> command_;
qi::rule<It, std::string(), skip> action_;
//
qi::rule<It, ast::map(), skip> map;
qi::rule<It, ast::pair(), skip> pair;
qi::rule<It, ast::list(), skip> list;
};
};
}
#include <fstream>
int main() {
using It = boost::spirit::istream_iterator;
using Parser = grammar::parser<It>;
std::ifstream input("input.txt");
It f(input >> std::noskipws), l;
Parser::skip const s{};
Parser::test const p{};
std::vector<ast::command> data;
bool ok = phrase_parse(f, l, *p, s, data);
if (ok) {
std::cout << "Parsed " << data.size() << " commands\n";
} else {
std::cout << "Parsed failed\n";
}
if (f != l) {
std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
}
}
打印
Parsed 3 commands
让我们限制密钥
就像上面的链接答案一样,让我们传递地图,对规则实际的密钥集来获取它们允许的值:
using KeySet = qi::symbols<char>;
using KeyRef = KeySet const*;
//
KeySet add_keys, modify_keys, clear_keys;
qi::symbols<char, KeyRef> action_sym;
qi::rule<It, ast::pair(KeyRef), skip> pair;
qi::rule<It, ast::map(KeyRef), skip> map;
Note A key feature used is the associated attribute value with a
symbols<>
lookup (in this case we associate aKeyRef
with an action symbol):
//
add_keys += "a1", "a2", "a3", "a4", "a5", "a6";
modify_keys += "m1", "m2", "m3", "m4";
clear_keys += "c1", "c2", "c3", "c4", "c5";
action_sym.add
("add", &add_keys)
("modify", &modify_keys)
("clear", &clear_keys);
现在繁重的开始了.
使用qi :: locals<>和继承的属性
让我们给command_一些本地空间来存储选定的键集:
qi::rule<It, ast::command(), skip, qi::locals<KeyRef> > command_;
现在我们原则上可以使用它(使用_a占位符).但是,有一些细节:
//
qi::_a_type selected;
总是喜欢描述性的名字:) _a和_r1变得很快.事情让人感到困惑.
command_ %= kw["test"]
>> target
>> raw[ kw[action_sym] [ selected = _1 ] ]
>> '(' >> map(selected) >> ')'
>> ';';
Note: the subtlest detail here is
%=
instead of=
to avoid the suppression of automatic attribute propagation when a semantic action is present (yeah, see ¹ again…)
但总而言之,这看起来并不那么糟糕?
//
qi::_r1_type symref;
pair = raw[ kw[lazy(*symref)] ] >> -value;
map = +pair(symref);
现在至少事情就解析了
快好了
//#define BOOST_SPIRIT_DEBUG
#include <string>
#include <map>
#include <vector>
namespace ast {
//
using string = std::string;
using strings = std::vector<string>;
using list = strings;
using pair = std::pair<string, string>;
using map = std::map<string, string>;
//
struct command {
string host;
string action;
map option;
};
}
#include <boost/fusion/adapted.hpp>
BOOST_FUSION_ADAPT_STRUCT(ast::command, host, action, option)
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/repository/include/qi_distinct.hpp>
namespace grammar
{
namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;
template <typename It>
struct parser
{
struct skip : qi::grammar<It> {
skip() : skip::base_type(rule_) {
using namespace qi;
// handle all whitespace along with line/block comments
rule_ = ascii::space
| (lit("#")|"--"|"//") >> *(char_ - eol) >> (eoi | eol) // line comment
| "/*" >> *(char_ - "*/") >> "*/"; // block comment
//
//BOOST_SPIRIT_DEBUG_NODES((skipper))
}
private:
qi::rule<It> rule_;
};
//
struct token {
//
token() {
using namespace qi;
// common
string = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
identity = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
value = string | identity;
// ip target
any = '*';
local = '.' | fqdn;
fqdn = +char_("a-zA-Z0-9.\\-"); // concession
ipv4 = raw [ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];
//
target = any | local | fqdn | ipv4;
//
BOOST_SPIRIT_DEBUG_NODES(
(string) (identity) (value)
(any) (local) (fqdn) (ipv4) (target)
)
}
protected:
//
qi::rule<It, std::string()> string;
qi::rule<It, std::string()> identity;
qi::rule<It, std::string()> value;
qi::uint_parser<uint8_t, 10, 1, 3> octet;
qi::rule<It, std::string()> any;
qi::rule<It, std::string()> local;
qi::rule<It, std::string()> fqdn;
qi::rule<It, std::string()> ipv4;
qi::rule<It, std::string()> target;
};
//
struct test : token, qi::grammar<It, ast::command(), skip> {
//
test() : test::base_type(start_)
{
using namespace qi;
auto kw = qr::distinct( copy( char_( "a-zA-Z0-9_" ) ) );
//
add_keys += "a1", "a2", "a3", "a4", "a5", "a6";
modify_keys += "m1", "m2", "m3", "m4";
clear_keys += "c1", "c2", "c3", "c4", "c5";
action_sym.add
("add", &add_keys)
("modify", &modify_keys)
("clear", &clear_keys);
//
qi::_a_type selected;
command_ %= kw["test"]
>> target
>> raw[ kw[action_sym] [ selected = _1 ] ]
>> '(' >> map(selected) >> ')'
>> ';';
//
qi::_r1_type symref;
pair = raw[ kw[lazy(*symref)] ] >> -value;
map = +pair(symref);
list = *value;
start_ = command_;
BOOST_SPIRIT_DEBUG_NODES(
(start_) (command_)
(pair) (map) (list)
)
}
private:
using token::target;
using token::identity;
using token::value;
using KeySet = qi::symbols<char>;
using KeyRef = KeySet const*;
//
qi::rule<It, ast::command(), skip> start_;
qi::rule<It, ast::command(), skip, qi::locals<KeyRef> > command_;
//
KeySet add_keys, modify_keys, clear_keys;
qi::symbols<char, KeyRef> action_sym;
qi::rule<It, ast::pair(KeyRef), skip> pair;
qi::rule<It, ast::map(KeyRef), skip> map;
qi::rule<It, ast::list(), skip> list;
};
};
}
#include <fstream>
int main() {
using It = boost::spirit::istream_iterator;
using Parser = grammar::parser<It>;
std::ifstream input("input.txt");
It f(input >> std::noskipws), l;
Parser::skip const s{};
Parser::test const p{};
std::vector<ast::command> data;
bool ok = phrase_parse(f, l, *p, s, data);
if (ok) {
std::cout << "Parsed " << data.size() << " commands\n";
} else {
std::cout << "Parsed failed\n";
}
if (f != l) {
std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
}
}
打印
Parsed 3 commands
坚持,不要那么快!这是不对的
是啊.如果你启用调试,你会看到它奇怪地解析事情:
<attributes>[[[1, 0, ., 0, ., 0, ., 1], [c, l, e, a, r], [[[c, 1], [c, 2]], [[c, 3], []]]]]</attributes>
这实际上“仅仅”是语法的问题.如果语法无法看到键和值之间的差异,那么显然c2将被解析为具有键c1的属性值.
由你来消除歧义语法.现在,我将使用否定断言演示修复:我们只接受未知键的值.它有点脏,但对于教学目的可能对你有用:
key = raw[ kw[lazy(*symref)] ];
pair = key(symref) >> -(!key(symref) >> value);
map = +pair(symref);
注意我考虑了可读性的关键规则:
解析
<attributes>[[[1, 0, ., 0, ., 0, ., 1], [c, l, e, a, r], [[[c, 1], []], [[c, 2], []], [[c, 3], []]]]]</attributes>
正是医生所要求的!
¹Boost Spirit: “Semantic actions are evil”?
标签:boost-spirit-qi,c,boost,boost-spirit,boost-spirit-lex 来源: https://codeday.me/bug/20191002/1842129.html