Conversation
mkangia
left a comment
There was a problem hiding this comment.
thanks for taking care of this @adil-uddin
mostly just coding optimizations.
|
|
||
| def get_unique_words(cell): | ||
| """ | ||
| Remove output tags and return unique words. If 'jr://file/' is found in cell return empty list |
There was a problem hiding this comment.
what is the reason for doing this?
| """ | ||
| Checks whether a word contains only english alphabet or not | ||
| """ | ||
| if len(re.findall('[a-zA-z0-9]', word)) == len(word): |
There was a problem hiding this comment.
you can just do a regex match here?
| """ | ||
| source_cell_words = get_unique_words(source_cell) | ||
| current_cell_words = get_unique_words(current_cell) | ||
| shared_words = set(source_cell_words).intersection(set(current_cell_words)) |
There was a problem hiding this comment.
do we need get_unique_words to return a list? we can just have that return a set and hence then not need to convert it to set again?
| current_cell_words = get_unique_words(current_cell) | ||
| shared_words = set(source_cell_words).intersection(set(current_cell_words)) | ||
| if len(shared_words) > 0: | ||
| shared_english_words = [] |
There was a problem hiding this comment.
nit:
common_words -> shared_english_words
distinct_words -> non_shared_words
| if is_english_word(word): | ||
| continue | ||
| else: | ||
| return shared_english_words, True |
There was a problem hiding this comment.
this can be rewritten as
for word in non_shared_words:
if not is_english_word(word):
return shared_english_words, True
|
|
||
| def is_english_word(word): | ||
| """ | ||
| Checks whether a word contains only english alphabet or not |
There was a problem hiding this comment.
nit: is_alphanumeric would be a true reflection of what this method is doing.
| source_cell_words = get_unique_words(source_cell) | ||
| current_cell_words = get_unique_words(current_cell) | ||
| shared_words = set(source_cell_words).intersection(set(current_cell_words)) | ||
| if len(shared_words) > 0: |
There was a problem hiding this comment.
nit: len is redundant here? just if shared_words should suffice?
| if is_english_word(word): | ||
| has_english_words = True | ||
| else: | ||
| has_non_english_words = True |
There was a problem hiding this comment.
if both are true we can break out of the loop?
if has_english_words and has_non_english_words:
break
| curMismatchFillStyle = None | ||
| mismatch_present = False | ||
| lesser_mismatch_present = False | ||
| lang_mismatch_present = False |
There was a problem hiding this comment.
can this be rewritten such that these variables are not declared? it looks like that should be possible to assign curMismatchFillStyle directly
| curMismatchFillStyle = LESSER_MISMATCH_FILL_STYLE_NAME | ||
| curMismatchFillStyle = None | ||
| mismatch_present = False | ||
| lesser_mismatch_present = False |
There was a problem hiding this comment.
similarly here, looks like it should be possible to rewrite this without the new set of variables.

No description provided.