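How do I use Stanford TokensRegex?
Author: Internet
I am trying to use Stanford TokensRegex. However, I get an error on the matcher line (see the comment); it says (). Please help if you can. Here is my code: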
String file = "A store has many branches. A manager may manage at most 2 branches.";
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation(file);
pipeline.annotate(document);
List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
for (CoreMap sentence : sentences) {
    TokenSequencePattern pattern = TokenSequencePattern.compile("[]");
    TokenSequenceMatcher matcher = pattern.getMatcher(sentence); // ERROR HERE!
    while (matcher.find()) {
        JOptionPane.showMessageDialog(rootPane, "It has been found");
    }
}
Solution:
The error comes from pattern.getMatcher(sentence): getMatcher(...) expects a list of tokens, such as a List<CoreLabel>, not a whole sentence CoreMap. Here is what I did:
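import java.util.List;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.tokensregex.TokenSequenceMatcher;
import edu.stanford.nlp.ling.tokensregex.TokenSequencePattern;
import edu.stanford.nlp.util.CoreMap;

// Compile the pattern once, outside the sentence loop
TokenSequencePattern p1 = TokenSequencePattern.compile("A store has");
for (CoreMap sentence : sentences) {
    // **using TokensRegex**
    // Use only this sentence's tokens, so tokens from earlier sentences
    // are not matched (and reported) a second time
    List<CoreLabel> tokens = sentence.get(TokensAnnotation.class);
    TokenSequenceMatcher matcher = p1.getMatcher(tokens);
    while (matcher.find())
        System.out.println("found");
    // **looking for the POS**
    for (CoreLabel token : tokens) {
        // this is the text of the token
        String word = token.get(TextAnnotation.class);
        // this is the POS tag of the token
        String pos = token.get(PartOfSpeechAnnotation.class);
        System.out.println("word is " + word + ", pos is " + pos);
    }
}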
The code above is not optimized. Adjust it to your needs.
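As a library-free illustration of why a sequence matcher is handed a list of tokens, one sentence at a time, the sketch below does plain sliding-window matching over lists of strings. It is not TokensRegex and does not call CoreNLP. The class name TokenWindowDemo, the helper findAll, and the hand-written token lists are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; a stand-in for the idea, not the CoreNLP API.
class TokenWindowDemo {

    // Returns the start index of every position where `pattern` occurs as a
    // contiguous run inside `tokens` (exact, case-sensitive comparison).
    static List<Integer> findAll(List<String> tokens, List<String> pattern) {
        List<Integer> starts = new ArrayList<>();
        if (pattern.isEmpty()) {
            return starts; // an empty pattern matches nothing in this sketch
        }
        for (int i = 0; i + pattern.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + pattern.size()).equals(pattern)) {
                starts.add(i);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // Hand-tokenized stand-ins for the two sentences in the question.
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("A", "store", "has", "many", "branches", "."),
            Arrays.asList("A", "manager", "may", "manage", "at", "most", "2", "branches", "."));
        List<String> pattern = Arrays.asList("A", "store", "has");

        for (List<String> sentenceTokens : sentences) {
            // One token list per sentence, so each sentence is scanned once.
            System.out.println(findAll(sentenceTokens, pattern));
        }
    }
}
```

Running main prints [0] for the first sentence and [] for the second. If the tokens of both sentences were appended to one shared list that is re-scanned on every loop iteration, the match at index 0 would be reported once per sentence instead of once overall.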
Tags: java, regex, stanford-nlp Source: https://codeday.me/bug/20190629/1321227.html