Merged
5 changes: 3 additions & 2 deletions datafusion/datasource-parquet/src/opener.rs
@@ -277,8 +277,9 @@ impl FileOpener for ParquetOpener {
// - The table schema as defined by the TableProvider.
// This is what the user sees, what they get when they `SELECT * FROM table`, etc.
// - The logical file schema: this is the table schema minus any hive partition columns and projections.
-// This is what the physicalfile schema is coerced to.
-// - The physical file schema: this is the schema as defined by the parquet file. This is what the parquet file actually contains.
+// This is what the physical file schema is coerced to.
+// - The physical file schema: this is the schema that the arrow-rs

Contributor Author

I found the "what the file contains" confusing -- Parquet has different datatypes than Arrow

+// parquet reader will actually produce.
let mut physical_file_schema = Arc::clone(reader_metadata.schema());

// The schema loaded from the file may not be the same as the
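The relationship between the three schemas described in the comment above can be sketched with plain Rust types. This is a minimal illustration only: the `Field` struct, the `field` helper, and `logical_file_schema` are hypothetical stand-ins rather than DataFusion or Arrow APIs, and the column names and types are made up.

```rust
/// Hypothetical stand-in for an Arrow field: a column name plus a type name.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

/// Convenience constructor for the toy `Field` type.
fn field(name: &str, data_type: &str) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
    }
}

/// Derive a logical file schema from a table schema by keeping only the
/// projected column indices and dropping hive partition columns.
/// (Illustrative only; the real code operates on Arrow schemas.)
fn logical_file_schema(
    table_schema: &[Field],
    partition_columns: &[&str],
    projection: &[usize],
) -> Vec<Field> {
    projection
        .iter()
        .filter_map(|&i| table_schema.get(i))
        .filter(|f| !partition_columns.contains(&f.name.as_str()))
        .cloned()
        .collect()
}

fn main() {
    // Table schema: what the user sees, including a hive partition column.
    let table_schema = vec![
        field("id", "Int64"),
        field("name", "Utf8View"),
        field("year", "Int32"), // assumed to be a partition column
    ];
    let logical = logical_file_schema(&table_schema, &["year"], &[0, 1, 2]);

    // Physical file schema: what the reader reports for this (made-up) file.
    let physical = vec![field("id", "Int64"), field("name", "Utf8")];

    // Report every column whose physical type must be coerced to the logical one.
    for (l, p) in logical.iter().zip(&physical) {
        if l.data_type != p.data_type {
            println!("coerce `{}`: {} -> {}", p.name, p.data_type, l.data_type);
        }
    }
}
```

The sketch treats type names as strings purely to show where a mismatch between the physical and logical schemas would be detected; it does not model how the coercion itself is performed.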
17 changes: 12 additions & 5 deletions datafusion/physical-expr/src/projection.rs
@@ -15,6 +15,8 @@
// specific language governing permissions and limitations
// under the License.

+//! [`ProjectionExpr`] and [`ProjectionExprs`] for representing projections.
+
use std::ops::Deref;
use std::sync::Arc;

@@ -35,14 +37,16 @@ use datafusion_physical_expr_common::utils::evaluate_expressions_to_arrays;
use indexmap::IndexMap;
use itertools::Itertools;

-/// A projection expression as used by projection operations.
+/// An expression used by projection operations.
///
/// The expression is evaluated and the result is stored in a column
/// with the name specified by `alias`.
///
/// For example, the SQL expression `a + b AS sum_ab` would be represented
/// as a `ProjectionExpr` where `expr` is the expression `a + b`
/// and `alias` is the string `sum_ab`.
+///
+/// See [`ProjectionExprs`] for a collection of projection expressions.
#[derive(Debug, Clone)]
pub struct ProjectionExpr {
/// The expression that will be evaluated.
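The `a + b AS sum_ab` example from the doc comment above can be modelled with a small self-contained sketch. The `Expr` enum, the simplified `ProjectionExpr`, and the `project` method below are hypothetical stand-ins: the real struct holds an `Arc<dyn PhysicalExpr>` and evaluates against record batches, not a `HashMap` row.

```rust
use std::collections::HashMap;

/// Toy expression tree; a stand-in for `Arc<dyn PhysicalExpr>`.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Evaluate against a single row; returns `None` if a column is missing.
    fn evaluate(&self, row: &HashMap<String, i64>) -> Option<i64> {
        match self {
            Expr::Column(name) => row.get(name).copied(),
            Expr::Add(l, r) => Some(l.evaluate(row)? + r.evaluate(row)?),
        }
    }
}

/// Simplified model of a projection expression: an expression plus the
/// name of the output column that holds its result.
#[derive(Debug, Clone)]
struct ProjectionExpr {
    expr: Expr,
    alias: String,
}

impl ProjectionExpr {
    /// Evaluate the expression and pair the result with the alias.
    fn project(&self, row: &HashMap<String, i64>) -> Option<(String, i64)> {
        Some((self.alias.clone(), self.expr.evaluate(row)?))
    }
}

fn main() {
    // `a + b AS sum_ab`
    let sum_ab = ProjectionExpr {
        expr: Expr::Add(
            Box::new(Expr::Column("a".into())),
            Box::new(Expr::Column("b".into())),
        ),
        alias: "sum_ab".into(),
    };
    let row = HashMap::from([("a".to_string(), 1), ("b".to_string(), 2)]);
    println!("{:?}", sum_ab.project(&row));
}
```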
@@ -107,11 +111,14 @@ impl From<ProjectionExpr> for (Arc<dyn PhysicalExpr>, String) {
}
}

-/// A collection of projection expressions.
+/// A collection of [`ProjectionExpr`] instances, representing a complete
+/// projection operation.
+///
+/// Projection operations are used in query plans to select specific columns or
+/// compute new columns based on existing ones.
 ///
-/// This struct encapsulates multiple `ProjectionExpr` instances,
-/// representing a complete projection operation and provides
-/// methods to manipulate and analyze the projection as a whole.
+/// See [`ProjectionExprs::from_indices`] to select a subset of columns by
+/// indices.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProjectionExprs {
exprs: Vec<ProjectionExpr>,
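Selecting a subset of columns by index, as the new doc comment mentions, might look roughly like the sketch below. The types here are toy models, not the DataFusion definitions: the `from_indices` signature (a slice of column names instead of an Arrow schema), the `column_index` field, and the `apply` method are all assumptions made for illustration.

```rust
/// Toy projection expression: refers to an input column by index and
/// names the output column.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ProjectionExpr {
    column_index: usize,
    alias: String,
}

/// Toy collection of projection expressions forming one projection.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ProjectionExprs {
    exprs: Vec<ProjectionExpr>,
}

impl ProjectionExprs {
    /// Hypothetical analogue of `from_indices`: each selected column keeps
    /// its original name as the alias. Panics on an out-of-range index.
    fn from_indices(indices: &[usize], schema: &[&str]) -> Self {
        let exprs = indices
            .iter()
            .map(|&i| ProjectionExpr {
                column_index: i,
                alias: schema[i].to_string(),
            })
            .collect();
        Self { exprs }
    }

    /// Apply the projection to one row, yielding (output name, value) pairs
    /// in projection order.
    fn apply(&self, row: &[i64]) -> Vec<(String, i64)> {
        self.exprs
            .iter()
            .map(|e| (e.alias.clone(), row[e.column_index]))
            .collect()
    }
}

fn main() {
    let schema = ["a", "b", "c"];
    // Select column `c`, then column `a`, dropping `b`.
    let projection = ProjectionExprs::from_indices(&[2, 0], &schema);
    println!("{:?}", projection.apply(&[10, 20, 30]));
}
```

Note that the output order follows the order of the indices, not the order of the input schema, so a projection can reorder columns as well as drop them.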