Fix bug when extract multiple adjacent words from a string without word boundaries#142

Open
lishukan wants to merge 50 commits into vi3k6i5:master from lishukan:master

Conversation

@lishukan

Dear developers,

flashtext is an excellent string matching tool, and I have used it on many occasions. Recently, however, I found a bug when the input is a string without word boundaries (such as a Chinese sentence): if two keywords to be extracted happen to be adjacent, only the first one is extracted.

So I made a modification: after a keyword is matched, the next iteration starts at the index where the last matched word ended.
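The idea can be illustrated with a minimal, self-contained sketch. This is not flashtext's code and not the diff in this PR; the function name `extract_adjacent` and the sample sentence are hypothetical, chosen only to show a scan that resumes exactly where the previous match ended, so that a keyword starting at that position is not skipped.

```python
def extract_adjacent(text, keywords):
    """Greedy longest-match scan with no word-boundary requirement.

    Illustrative sketch only (hypothetical helper, not flashtext's API).
    """
    # Try longer keywords first so a longer match wins over its own prefix.
    # Empty strings are dropped because they would never advance the index.
    ordered = sorted((k for k in keywords if k), key=len, reverse=True)
    found = []
    i = 0
    while i < len(text):
        match = next((k for k in ordered if text.startswith(k, i)), None)
        if match is None:
            i += 1  # no keyword starts here; advance one character
        else:
            found.append(match)
            i += len(match)  # resume exactly at the end of the match
    return found


# Two adjacent keywords in a sentence without spaces.
print(extract_adjacent("我爱北京天安门", ["北京", "天安门"]))
```

Because the index moves to the end of the match rather than one position past it, the second keyword is found even though nothing separates it from the first.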

I have added a new test case, and all unit tests pass.


vi3k6i5 and others added 30 commits November 10, 2017 20:47
added reference to flashtext paper
  `charactes` | `characters`
  `explaination` | `explanation`
  `matche` | `match`
Fix issue with incomplete keyword at the end of the sentence
Performances improvement for strings manipulations
@abulice

abulice commented Sep 21, 2023 via email

@lishukan
Author

@vi3k6i5 Hello, dear owner. Is this repo still maintained? I see that its code has not been updated for a long time. If it is no longer maintained, I will stop waiting for this MR to be merged.

lishukan and others added 2 commits October 23, 2023 10:26

7 participants