Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #96 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 16 16
Lines 943 943
=========================================
Hits 943 943 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
gmmajal
left a comment
There was a problem hiding this comment.
Can you make the changes specified for the doctrings and further clarify if text_features needs to be modified?
| def predict(self, X) -> csr_array: | ||
| """ | ||
| Predicts binary concept match labels for each input text. | ||
|
|
There was a problem hiding this comment.
I think in the docstrings for the predict() method we can replace "A sparse matrix of shape ..." with a sparse array for the sake of consistency.
| txt_vec = self.text_vectorizer_.transform([inp]) | ||
| else: | ||
| txt_vec = 0 | ||
| txt_feat = self.text_features_.transform([text])[0] |
There was a problem hiding this comment.
Can you explain why over here(line 433) and in line 458, when the transform method is applied we access index 0. I see that it was modified for the text_vectorizer attribute i.e. index 0 is not accessed. Should it be also modified for the text_features attribute or does it have a different data structure?
There was a problem hiding this comment.
I looked at what the transform methods are doing for text_vectorizer and text_features, respectively. The one for text_vectorizer returns a sparse matrix whereas for text_features a numpy array is returned. The data structure does indeed seem to be different.
There was a problem hiding this comment.
The text vectorizer produces a csr_matrix from scikit-learn, so we can’t switch it to a sparray at this point.
|
We can’t safely transition from Specifically, scikit-learn’s PR #31072 (“First steps toward sparray migration pass 2”) is still open, indicating that full adoption isn’t complete yet. |
Good catch! I had a look at the pull request you referenced. The team at |
|
Another thing to note is that we are getting deprecation warnings for two of our unit tests. We are comparing sparse diagonal matrices in both tests. According to the documentation given here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.diags.html#scipy.sparse.diags and https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.diags_array.html#scipy.sparse.diags_array, the automatic conversion to a float will no longer take place starting from |
Replace
spmatrixwithsparrayfor SciPy compatibilityThis pull request updates all references to
scipy.sparse.spmatrixto use the newscipy.sparse.sparrayclass, in line with SciPy's ongoing deprecation ofspmatrix. This change ensures compatibility with recent and future versions of SciPy.Changes Made
spmatrixtype checks and imports withsparray.Related Issue
Fixes stwfsapy#89