What happens?
Hi.
Problem
As the result of `spark.sql("<non-select>")`, where `<non-select>` is any SQL statement that is not a SELECT (e.g., USE, INSERT, DROP, CREATE, ...), the `sql()` function correctly returns an empty DataFrame, which matches the behavior of the PySpark API.
However, that object crashes when any of its APIs are used, because the internal `relation` object is None. The same applies when trying to create an empty DataFrame without columns.
Fix
I think the best fix would be to change the underlying C++ `Relation` object in the DuckDB C++ library to support an empty relation without columns. There are also a couple of other fixes, such as allowing the underlying `duckdb.struct_type()` to have no fields. That would make the low-level API more robust and require less patching in the Python layer.
Then the `DuckDBPyConnection::RunQuery` function needs to return an empty relation for non-select statements instead of `nullptr`. All these fixes felt a bit overwhelming, so I won't submit a patch.
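For illustration only, here is roughly the kind of Python-layer patching the C++ fix would make unnecessary: guarding every accessor against a missing relation. `DataFrame` below is a simplified stand-in, not the actual duckdb Spark-API class.

```python
# Hypothetical sketch of the Python-layer workaround the C++ fix would avoid.
# "DataFrame" is a simplified stand-in for the real duckdb Spark-API class.
class DataFrame:
    def __init__(self, relation):
        # Today, relation is None when the statement produced no result set
        # (USE, INSERT, DROP, CREATE, ...).
        self.relation = relation

    @property
    def columns(self):
        # Guard each accessor: a missing relation means zero columns...
        return [] if self.relation is None else list(self.relation.columns)

    def collect(self):
        # ...and zero rows, matching Spark's empty-DataFrame behavior.
        return [] if self.relation is None else self.relation.fetchall()


# With the guards, an empty DataFrame no longer crashes:
empty = DataFrame(None)
assert empty.columns == []
assert empty.collect() == []
```

Repeating this guard in every wrapper method is exactly the patching burden that an empty-capable `Relation` on the C++ side would remove.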
To Reproduce
Testcase; all of this works with Spark.
```python
@pytest.mark.parametrize("mode", ["pandas", "list", "non-select"])
def test_empty_sdf(spark_session_g, mode):
    from pyspark.sql import functions as f
    from pyspark.sql import types as t
    import pandas as pd

    spark = spark_session_g
    if mode == "pandas":
        sdf = spark.createDataFrame(pd.DataFrame(), t.StructType([]))
    elif mode == "list":
        sdf = spark.createDataFrame([], t.StructType([]))
    else:
        curr_db = spark.catalog.currentDatabase()
        sdf = spark.sql(f"USE {curr_db}")  # non-result-set query

    assert sdf.schema == t.StructType([])
    assert sdf.columns == []
    assert sdf.collect() == []
    assert sdf.toPandas().empty
    assert sdf.toArrow().shape == (0, 0)

    sdf.createOrReplaceTempView("my_vv1")
    assert spark.sql("SELECT * from my_vv1").toArrow().shape == (0, 0)

    sdf.show()  # no-op, no crash
    assert sdf.withColumn("col1", f.lit(1)).columns == ["col1"]
    assert sdf.withColumns({"col1": f.lit(1)}).columns == ["col1"]
    assert sdf.drop("noop").columns == []
```
OS:
Any
DuckDB Package Version:
Main branch
Python Version:
3.12
Full Name:
João Eiras
Affiliation:
private
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a source build
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
Did you include all relevant configuration to reproduce the issue?