Could we imagine a process that reverts the NindIndex to its last coherent state?
Context
For some reason, one of the documents I try to index in Nind systematically triggers an exception.
~:$ indexXmlMultimedia corpus_xml/*.mult
...
Indexing corpus_xml/02148_HORZ-ISBL_lot_A_-_EXHIBIT_F_-_Annex_F3_-_rev0.xml.mult (2148/2554, 84%)
terminate called after throwing an instance of 'latecon::nindex::OutWriteBufferException'
what(): Out write buffer error
Abandon (core dumped)
Since my documents are large (frequently more than 200 pages), indexing takes nearly 6 hours to reach this error at 84% of the corpus.
First of all, I don't yet know what causes this OutWriteBufferException. The .mult file seems to be faulty, because when it is excluded, indexing of the corpus runs to completion.
Sadly, this means restarting the indexing from scratch, because after the OutWriteBufferException the Nind index files are corrupted. Indexing on top of the corrupted Nind files causes a NindPadFileException that doesn't even say in which file the corruption was found!
terminate called after throwing an instance of 'latecon::nindex::NindPadFileException'
what(): Nind Pad error
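The message above carries no file name. As an illustration of what I would expect instead, here is a minimal sketch of an exception hierarchy whose what() includes the offending file. Everything in it is an assumption: the class bodies, the constructor signatures and the `sketch` namespace are hypothetical stand-ins, and the real latecon::nindex classes may be organised differently.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, not the real latecon::nindex

// Hypothetical base class: builds the full message once and hands it to
// std::runtime_error, so the inherited what() already returns it.
class FileException : public std::runtime_error {
public:
    FileException(const std::string& message, const std::string& fileName)
        : std::runtime_error(message + " (file: " + fileName + ")") {
        // An empty name would defeat the purpose of the message.
        assert(!fileName.empty());
    }
};

// Hypothetical derived class: only supplies the fixed part of the message.
class NindPadFileException : public FileException {
public:
    explicit NindPadFileException(const std::string& fileName)
        : FileException("Nind Pad error", fileName) {}
};

}  // namespace sketch
```

With something like this, a call such as `throw sketch::NindPadFileException("example.idx")` (file name invented for the example) would make the uncaught-exception report show which file is affected, without any derived class having to override what() itself.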
Propositions
- How could we ensure that the NindPadFileException tells us in which file the corruption was found?
- The inheritance of NindPadFileException from FileException and std::runtime_error does not seem to override the what() method, which is supposed to return a context message containing the file in which the corruption was found.
- Could we imagine a process that rolls back the indexing of the last document, so that the Nind files return to a safe state?
- This would allow indexing to resume with the next .mult files up to the end of the corpus, instead of restarting from scratch.
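To make the rollback idea concrete, here is a rough sketch of one possible approach: snapshot the index files before each document, restore them if indexing throws, and record each successfully indexed document in a journal so that a later run can skip it. This is not how Nind works today as far as I know. The function names, the ".bak" naming, the journal file and the `indexOne` callback are all hypothetical, and the sketch only helps if the exception is caught instead of reaching std::terminate as in the logs above.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Backup path for an index file: "<file>.bak" (the naming is an assumption).
fs::path backupOf(const fs::path& file) {
    fs::path bak = file;
    bak += ".bak";
    return bak;
}

// Copy every existing index file to its backup before it gets modified.
void snapshot(const std::vector<fs::path>& files) {
    for (const auto& f : files) {
        if (fs::exists(f))
            fs::copy_file(f, backupOf(f), fs::copy_options::overwrite_existing);
        else
            fs::remove(backupOf(f));  // drop a stale backup of a file that does not exist yet
    }
}

// Bring every index file back to the state saved by snapshot().
void restore(const std::vector<fs::path>& files) {
    for (const auto& f : files) {
        if (fs::exists(backupOf(f)))
            fs::copy_file(backupOf(f), f, fs::copy_options::overwrite_existing);
        else
            fs::remove(f);  // the file was created by the failed run
    }
}

// Index one document; on any exception, roll the index files back and rethrow.
// On success, append the document name to the journal.
void indexWithRollback(const std::vector<fs::path>& indexFiles,
                       const fs::path& journal,
                       const std::string& docName,
                       const std::function<void()>& indexOne) {
    assert(!indexFiles.empty());
    snapshot(indexFiles);
    try {
        indexOne();
    } catch (...) {
        restore(indexFiles);
        throw;  // let the caller decide whether to skip the document or stop
    }
    std::ofstream(journal, std::ios::app) << docName << '\n';
}
```

Copying the whole index before every document would be expensive on a corpus of this size, so a real implementation would more likely snapshot every N documents or keep a write-ahead log. The sketch is only meant to show the shape of the behaviour I am asking about.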
Calling for help
@kleag, @jys: could you help here?