
Property segmentSeparator not doing its job #8

@ancordovag

Hello:
I'm importing an rs3 file into ANNIS using Pepper. Although I get the right tokens for the queries, the "full text" comes out broken:
... resolutionS/2014/189because we believethat it is the appropriate response of the Council to the current crisis in UkraineGiven that situation at stake here are fundamental principles of international lawsuch as abstaining from the threat or use of for ...

You can see that the segments are all run together, with no separator between them. I have experimented a lot with the properties and tried different combinations. This is my latest one:
<importer name="RSTImporter" formatName="rs3" path="07_rst/tomerge/7138_spch010_chile.rs3">
    <property key="rstImporter.nodeKindName">node_kind</property>
    <property key="rstImporter.nodeTypeName">node_type</property>
    <property key="rstImporter.relationTypeName">name</property>
    <property key="rstImporter.tokenize">yes</property>
    <property key="simpleTokenize">' '</property>
    <property key="rstImporter.segmentSeparator"> </property>
    <property key="pepper.after.reportCorpusGraph">true</property>
</importer>
As I said, this is not a problem for queries. However, I'm trying to use the merge module afterwards, and it reports that it cannot align any text. I suspect this error is the cause, because it is the only difference between this text and the EXMARaLDA text I want to merge it with. Could someone help me, please?
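
For illustration only, here is a minimal Python sketch of the behaviour I would expect from the segment separator. It is not Pepper's implementation. It assumes the usual rs3 layout in which each segment is a `<segment>` element, and the ids, parents, and relation names in the sample are made up; only the segment texts come from the output quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature rs3 document. The ids, parents, and relnames are
# invented for this sketch; only the segment texts come from the issue.
RS3_SAMPLE = """<rst>
  <header/>
  <body>
    <segment id="1" parent="4" relname="span">because we believe</segment>
    <segment id="2" parent="1" relname="elaboration">that it is the appropriate response of the Council to the current crisis in Ukraine</segment>
    <segment id="3" parent="4" relname="span">Given that situation at stake here are fundamental principles of international law</segment>
    <group id="4" type="span"/>
  </body>
</rst>"""


def primary_text(rs3_xml: str, separator: str) -> str:
    """Concatenate all segment texts in document order, joined by `separator`."""
    root = ET.fromstring(rs3_xml)
    segments = [(seg.text or "").strip() for seg in root.iter("segment")]
    return separator.join(segments)


# What I currently get: segments glued together with nothing in between.
joined_without = primary_text(RS3_SAMPLE, "")
# What I expect with a single space as the segment separator.
joined_with = primary_text(RS3_SAMPLE, " ")

print(joined_without)
print(joined_with)
```

The first printed line reproduces the glued-together text (`believethat`, `UkraineGiven`), while the second is the text I would expect the merge module to be able to align with the EXMARaLDA text.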
