-
Notifications
You must be signed in to change notification settings - Fork 571
Some tweaks to draft table APIs #11880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Web viewer built successfully.
View image diff on kitdiff. Note: This comment is updated whenever you push a commit. |
abey79
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I obviously get shills up my spine with the multiple inheritance thing, but this is the point of this sandbox, so lgtm
| if url is None: | ||
| tmpdir = tempfile.TemporaryDirectory() | ||
| self.tmpdirs.append(tmpdir) | ||
| url = Path(tmpdir.name).as_uri() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love the utter nastiness we are allowed in here 😅
|
|
||
|
|
||
| class TableEntry(Entry): | ||
| class TableEntry(Entry, datafusion.DataFrame): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as this PR goes, I'm obviously open to experimentation and this is fine. As for API design, I unsurprisingly the opposite of in love with this though 😅 Multiple inheritance is obviously broken in man ways, but it feels particularly nasty when one of the base class is outside of the control of the project. For example, append and update could easily conflict with future datafusion apis.
| class TableEntry(Entry, datafusion.DataFrame): | |
| # TODO(RR-2939): can we do better than multiple inheritance? | |
| class TableEntry(Entry, datafusion.DataFrame): |
| return TableEntry(self._inner.register_table(name, url)) | ||
|
|
||
| def create_table_entry(self, name: str, schema, url: str) -> TableEntry: | ||
| def create_table(self, name: str, schema, url: str | None = None) -> TableEntry: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: we tried to establish the convention that: xyz would return "raw data" (aka dataframe), while xyz_entry returns the python object representation. We couldn't really make it super consistent, but the df/object hybrid obviously doesn't help here.
Related
What
Mostly making sure I understand the shim plumbing and fixing things so I get correct typing in vs code.
I know there are downsides to using multiple inheritance like this, but, I wanted to see what it would look like if TableEntry just acted like a dataframe with a few Entry-related extras.