python – NLTK: How do I traverse noun phrases to return a list of strings?
Published: 2020-11-18 10:11:18 Category: Python Source: Internet
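In NLTK, how do I traverse a parsed sentence to return a list of noun phrase strings?

I have two goals:
(1) Create a list of noun phrases instead of printing them with the traverse() method. I currently use StringIO to record the output of the existing traverse() method, which is not an acceptable solution.
(2) Parse the noun phrase string so that: (NP Mic

The NLTK documentation suggests using traverse() to inspect noun phrases, but how do I capture 't' inside this recursive method so that I can build a list of noun phrase strings?

from nltk.tag import pos_tag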
import nltk

def traverse(t):
    try:
        t.label()
    except AttributeError:
        return  # leaf: a (word, tag) tuple has no label()
    else:
        if t.label() == 'NP':
            print(t)  # or do something else
        else:
            for child in t:
                traverse(child)

def nounPhrase(tagged_sent):
    # tagged_sent is a list of tuples: [(Word, PartOfSpeech)]
    # Define several tag patterns
    grammar = r"""
      NP: {<DT|PP\$>?<JJ>*<NN>}   # chunk determiner/possessive, adjectives and noun
          {<NNP>+}                # chunk sequences of proper nouns
          {<NN>+}                 # chunk consecutive nouns
      """
    cp = nltk.RegexpParser(grammar)  # Define parser
    SentenceTree = cp.parse(tagged_sent)
    NounPhrases = traverse(SentenceTree)  # meant to collect noun phrases, but traverse() returns None
    return NounPhrases

sentence = "Michael Jackson likes to eat at McDonalds"
tagged_sent = pos_tag(sentence.split())  # tag sentence for part of speech
NP = nounPhrase(tagged_sent)
print(NP)
Currently, this prints the NP subtrees and then None, since traverse() returns nothing.

Solution:

def extract_np(psent):
    # Yield the words of each NP subtree joined into one string
    for subtree in psent.subtrees():
        if subtree.label() == 'NP':
            yield ' '.join(word for word, tag in subtree.leaves())

# grammar and tagged_sent are as defined in the question
cp = nltk.RegexpParser(grammar)
parsed_sent = cp.parse(tagged_sent)
for npstr in extract_np(parsed_sent):
    print(npstr)
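The recursive traverse() from the question can also be adapted to return a list rather than print, by passing an accumulator through the recursion. The following is only a minimal sketch: `collect_np` is an illustrative name, and `Node` is a hypothetical stand-in for nltk.Tree (it mimics only label(), iteration over children, and leaves()) so that the snippet runs without NLTK installed. The tree and its tags are written by hand for illustration and are not real pos_tag output.

```python
# Hypothetical stand-in for nltk.Tree so the sketch runs without NLTK;
# it mimics only label(), iteration over children, and leaves().
class Node(list):
    def __init__(self, label, children):
        super().__init__(children)
        self._label = label

    def label(self):
        return self._label

    def leaves(self):
        out = []
        for child in self:
            if isinstance(child, Node):
                out.extend(child.leaves())
            else:
                out.append(child)
        return out


def collect_np(t, found=None):
    """Recursively gather the words of every NP subtree, one string per phrase."""
    if found is None:
        found = []
    try:
        label = t.label()
    except AttributeError:
        return found  # a (word, tag) leaf: nothing to descend into
    if label == 'NP':
        found.append(' '.join(word for word, tag in t.leaves()))
    else:
        for child in t:
            collect_np(child, found)
    return found


# Hand-built chunk tree for the example sentence (tags are assumed for illustration).
sentence_tree = Node('S', [
    Node('NP', [('Michael', 'NNP'), ('Jackson', 'NNP')]),
    ('likes', 'VBZ'), ('to', 'TO'), ('eat', 'VB'), ('at', 'IN'),
    Node('NP', [('McDonalds', 'NNP')]),
])

noun_phrases = collect_np(sentence_tree)
print(noun_phrases)  # ['Michael Jackson', 'McDonalds']
```

Because `collect_np` relies only on label(), iteration, and leaves(), it should also accept the tree returned by cp.parse(). With a real nltk.Tree, wrapping the generator from the solution in list(...) gives the same result with less code.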
