Unwanted length mismatch #3

@timvieira

Why are the following parses considered to have different lengths? I'm guessing it has something to do with a punctuation filter.

GOLD=  
    (ROOT (S (CC And) (NP (NP (NNS rents)) (PP (IN on) (NP (NP (NNP Beverly) (NNP Hills)
    (POS ')) (NNP Rodeo) (NNP Drive)))) (ADVP (RB generally)) (VP (VBP do) (RB n't) 
    (VP (VB exceed)  (NP (NP (RB about) ($ $) (CD 125)) (NP (DT a) (JJ square) 
    (NN foot))))) (. .)))
TEST= 
    (ROOT (S (CC And) (NP (NP (NNS rents)) (PP (IN on) (NP (NNP Beverly) (NNP Hills)))) 
    ('' ')  (NP (NNP Rodeo) (NNP Drive)) (ADVP (RB generally)) (VP (VBP do) (RB n't) 
    (VP (VB exceed) (NP (NP (QP (IN about) ($ $) (CD 125))) (NP (DT a) (NN square) 
    (NN foot))))) (. .)))

In this case, I think the token ('' ') is dropped from the TEST parse because it is tagged as a closing quote, while the same apostrophe is kept in the GOLD parse because it has a possessive tag (POS).
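To make the guess concrete, here is a minimal sketch of how a tag-based punctuation filter would produce the mismatch. The tag set `PUNCT_TAGS` and the helpers `leaves` and `filtered_length` are hypothetical names made up for illustration; the evaluator's actual filter list and code may differ.

```python
import re

# Hypothetical list of tags a punctuation filter might drop (an assumption,
# not taken from the evaluator's real configuration).
PUNCT_TAGS = {"''", "``", ",", ".", ":"}

# Matches a preterminal such as "(NNP Beverly)" and captures (tag, word).
LEAF = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")

GOLD = """(ROOT (S (CC And) (NP (NP (NNS rents)) (PP (IN on) (NP (NP (NNP Beverly) (NNP Hills)
(POS ')) (NNP Rodeo) (NNP Drive)))) (ADVP (RB generally)) (VP (VBP do) (RB n't)
(VP (VB exceed)  (NP (NP (RB about) ($ $) (CD 125)) (NP (DT a) (JJ square)
(NN foot))))) (. .)))"""

TEST = """(ROOT (S (CC And) (NP (NP (NNS rents)) (PP (IN on) (NP (NNP Beverly) (NNP Hills))))
('' ')  (NP (NNP Rodeo) (NNP Drive)) (ADVP (RB generally)) (VP (VBP do) (RB n't)
(VP (VB exceed) (NP (NP (QP (IN about) ($ $) (CD 125))) (NP (DT a) (NN square)
(NN foot))))) (. .)))"""


def leaves(tree):
    """Return the (tag, word) pairs of a bracketed parse, left to right."""
    return LEAF.findall(tree)


def filtered_length(tree, drop=PUNCT_TAGS):
    """Count the leaves whose tag is not in the drop set."""
    return sum(1 for tag, _ in leaves(tree) if tag not in drop)


print(len(leaves(GOLD)), filtered_length(GOLD))  # 19 18
print(len(leaves(TEST)), filtered_length(TEST))  # 19 17
```

Both parses have 19 tokens before filtering. Under this assumed filter, GOLD loses only the final period, while TEST loses the period and the apostrophe tagged `''`, so the lengths become 18 and 17. Filtering on the word, or on the gold tags only, would keep the two lengths equal.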
