Skip to content

[WIP] No subclasses refactor#435

Closed
cgarciae wants to merge 6 commits into
masterfrom
no-subclasses
Closed

[WIP] No subclasses refactor#435
cgarciae wants to merge 6 commits into
masterfrom
no-subclasses

Conversation

@cgarciae
Copy link
Copy Markdown
Contributor

@cgarciae cgarciae commented Nov 12, 2021

The idea of this PR is to:

  1. Remove the classes LuxDataFrame, LuxSeries, etc
  2. Entirely avoid strong modifications to the pandas library such as overriding pandas.DataFrame class (and all other internal references to it) with their Lux counterpart e.g.:
 pandas.DataFrame = LuxDataFrame

Current behaviour can be surprising to the user and it makes supporting new versions harder as you might have to deal with more edge cases because the impact of the modifications is so large. Instead the idea is to some lightweight monkey patching of fiew core methods (which mostly do bookkeeping) and move the rest to out of the DataFrame class to avoid possible conflicts. A lot of discussion and possible iteration is needed. These are the current efforts:

Ongoing changes

  • LuxDataFrame as a class is removed, instead key methods of interest form pandas.DataFrame are patched.
  • The type LuxDataFrame is now a Protocol that is there only for possible validation and type inference purposes.
  • Core methods are patched using a new @patch(cls) decorator, this replaces the method on the class but the original method is still available as ._super_{method_name}, e.g:
@patch(DataFrame)
def _set_axis(self, axis, labels):
    self._super_set_axis(axis, labels)
    ...
  • To reduce the surface area in which Lux impacts Pandas, a single .lux property that includes all lux extension methods is created. Its very similar to the existing .str and .dt functionality already present in pandas, here is an example usage:
df.lux.save_as_html(...)

@cgarciae
Copy link
Copy Markdown
Contributor Author

cgarciae commented Nov 22, 2021

@dorisjlee Current status:

  • I experimented using groupby in "mini_lux" and it propagated the information out of the box 🎉. This means we can probably drop groupby.py.
  • I finished porting frame.py for now but had to remove many references to LuxDataFrame in many of the other modules to avoid circular dependencies. I think that we can solve this later by defining LuxDataFrame in a separate module.

@dorisjlee
Copy link
Copy Markdown
Member

Thanks for the update, that sounds great @cgarciae! Glad to hear that we don't need to patch up GroupbyObjects explicitly to get the propagation to work!

Let me know when you want me to take a look at this patch, happy to go through and play around with this if it would be helpful!

@cgarciae
Copy link
Copy Markdown
Contributor Author

Closed in favor of #437

@cgarciae cgarciae closed this Nov 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants