Merged
29 commits
0457987
init
aglinxinyuan Mar 31, 2026
5b36a66
Merge branch 'main' into xinyuan-union
aglinxinyuan Mar 31, 2026
cd43d9f
Update frontend/src/app/workspace/service/validation/validation-workf…
aglinxinyuan Mar 31, 2026
e602ae8
fix test
aglinxinyuan Apr 1, 2026
4d2035e
Merge branch 'main' into xinyuan-union
chenlica Apr 2, 2026
4bfa9fb
Merge branch 'main' into xinyuan-union
aglinxinyuan Apr 3, 2026
39ced05
update
aglinxinyuan Apr 4, 2026
d073642
update
aglinxinyuan Apr 5, 2026
1c1f97b
update
aglinxinyuan Apr 6, 2026
9d4a640
update
aglinxinyuan Apr 6, 2026
6116d18
update
aglinxinyuan Apr 6, 2026
8918de0
update
aglinxinyuan Apr 6, 2026
b498313
update
aglinxinyuan Apr 6, 2026
7fcf839
update
aglinxinyuan Apr 6, 2026
09d4b2e
Revert "update"
aglinxinyuan Apr 6, 2026
549ad07
Revert "update"
aglinxinyuan Apr 6, 2026
3d6ad85
update
aglinxinyuan Apr 6, 2026
d63958a
update
aglinxinyuan Apr 9, 2026
cc6d497
update
aglinxinyuan Apr 9, 2026
d2ebecb
update
aglinxinyuan Apr 9, 2026
ae1a0aa
update
aglinxinyuan Apr 9, 2026
96e0e3b
Merge branch 'main' into xinyuan-union
aglinxinyuan Apr 9, 2026
fd29f50
Merge branch 'main' into xinyuan-union
aglinxinyuan Apr 9, 2026
c3cd887
Merge branch 'main' into xinyuan-union
aglinxinyuan Apr 10, 2026
58e1cec
fix fmt
aglinxinyuan Apr 10, 2026
1fbece2
Merge remote-tracking branch 'origin/xinyuan-union' into xinyuan-union
aglinxinyuan Apr 10, 2026
267cf46
fix fmt
aglinxinyuan Apr 10, 2026
43f8a5a
local
aglinxinyuan Apr 10, 2026
9d68e14
update
aglinxinyuan Apr 10, 2026

4 changes: 2 additions & 2 deletions common/config/src/main/resources/storage.conf
@@ -43,10 +43,10 @@ storage {
uri-without-scheme = "localhost:5432/texera_iceberg_catalog"
uri-without-scheme = ${?STORAGE_ICEBERG_CATALOG_POSTGRES_URI_WITHOUT_SCHEME}

-username = "texera"
Contributor

@aglinxinyuan @Xiao-zhen-Liu Are these changes intentional? They are not mentioned in the PR description.

Contributor Author

This change is unintentional, but it has no effect since it's just a placeholder.

Contributor

We should avoid this kind of unintentional change. Whether it has an effect is debatable. For example, other developers may hit a merge conflict when they pull the change and then have to resolve it, so that is also an "effect".

Contributor Author

I totally agree. It's definitely a mistake. Besides this one, all the other changes are intentional.

+username = "postgres"
username = ${?STORAGE_ICEBERG_CATALOG_POSTGRES_USERNAME}

-password = "password"
+password = "postgres"
password = ${?STORAGE_ICEBERG_CATALOG_POSTGRES_PASSWORD}
}
}
@@ -42,12 +42,11 @@ message GlobalPortIdentity{
message InputPort {
PortIdentity id = 1 [(scalapb.field).no_box = true];
string displayName = 2;
-bool allowMultiLinks = 3;
+bool disallowMultiLinks = 3;
repeated PortIdentity dependencies = 4;
}



message OutputPort {
enum OutputMode {
// outputs complete result set snapshot for each update
@@ -25,7 +25,7 @@ import org.apache.texera.amber.core.workflow.PartitionInfo
case class PortDescription(
portID: String,
displayName: String,
-allowMultiInputs: Boolean,
+disallowMultiInputs: Boolean,
isDynamicPort: Boolean,
partitionRequirement: PartitionInfo,
dependencies: List[Int] = List.empty
@@ -39,12 +39,12 @@ class DummyOpDesc extends LogicalOp with PortDescriptor {
InputPort(
PortIdentity(idx),
displayName = portDesc.displayName,
-allowMultiLinks = portDesc.allowMultiInputs,
+disallowMultiLinks = portDesc.disallowMultiInputs,
dependencies = portDesc.dependencies.map(idx => PortIdentity(idx))
)
}
} else {
-List(InputPort(PortIdentity(), allowMultiLinks = true))
Contributor

I think we need to keep the PortIdentity() part. PortIdentity() returns PortIdentity(0).

Contributor Author

It will return the same result.

Contributor

I don't think so. Do you have a test result showing that they are equal?

PortIdentity(0) is not equal to 0 or null.

Contributor Author

Yes. I will provide a test result later. If they are not the same, the test case should fail.

Contributor Author

InputPort() defaults to InputPort(PortIdentity()), which is equivalent to InputPort(PortIdentity(0)).

We use InputPort() consistently elsewhere; the only reason PortIdentity() was previously passed here was to explicitly set the positional argument.
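
The equivalence claimed above can be illustrated with a minimal sketch. It uses simplified, hypothetical stand-ins for `PortIdentity` and `InputPort`; the real classes are generated by ScalaPB from the `.proto` definition, so the default values shown here are assumptions that mirror the comment, not the generated code itself.

```scala
// Hypothetical, simplified stand-ins for the ScalaPB-generated classes.
// Field names follow the diff; the default values are assumptions.
final case class PortIdentity(id: Int = 0)

final case class InputPort(
    id: PortIdentity = PortIdentity(),
    displayName: String = "",
    disallowMultiLinks: Boolean = false,
    dependencies: Seq[PortIdentity] = Seq.empty
)

object DefaultPortDemo {
  // Three spellings of "the default input port".
  val implicitDefault: InputPort = InputPort()
  val explicitNoArg: InputPort = InputPort(PortIdentity())
  val explicitZero: InputPort = InputPort(PortIdentity(0))

  // Case classes compare structurally, so the three values are equal
  // as long as the defaults are the ones assumed above.
  val allEqual: Boolean =
    implicitDefault == explicitNoArg && explicitNoArg == explicitZero

  def main(args: Array[String]): Unit =
    println(s"all equal: $allEqual")
}
```

Under these assumptions the positional `PortIdentity()` argument is redundant, which is why dropping it should not change behavior; a unit test against the generated classes would be the authoritative check.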

image

+List(InputPort())
}
val outputPortInfo = if (outputPorts != null) {
outputPorts.zipWithIndex.map {
@@ -25,6 +25,7 @@ import com.fasterxml.jackson.databind.jsontype.NamedType
import com.fasterxml.jackson.databind.node.{ArrayNode, ObjectNode}
import com.kjetland.jackson.jsonSchema.JsonSchemaConfig.html5EnabledSchema
import com.kjetland.jackson.jsonSchema.{JsonSchemaConfig, JsonSchemaDraft, JsonSchemaGenerator}
+import org.apache.texera.amber.core.workflow.OutputPort.OutputMode
import org.apache.texera.amber.core.workflow.{InputPort, OutputPort}
import org.apache.texera.amber.operator.LogicalOp
import org.apache.texera.amber.operator.source.scan.csv.CSVScanSourceOpDesc
@@ -45,6 +46,21 @@ case class OperatorInfo(
allowPortCustomization: Boolean = false
)

+object OperatorInfo {
+def forVisualization(
+userFriendlyName: String,
+operatorDescription: String,
+operatorGroupName: String
+): OperatorInfo =
+OperatorInfo(
+userFriendlyName,
+operatorDescription,
+operatorGroupName,
+inputPorts = List(InputPort(disallowMultiLinks = true)),
+outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+)
+}
+
case class OperatorMetadata(
operatorType: String,
jsonSchema: JsonNode,
@@ -91,8 +91,7 @@ class SklearnTestingOpDesc extends PythonOperatorDescriptor {
InputPort(
PortIdentity(),
"model",
-dependencies = List(PortIdentity(1)),
-allowMultiLinks = true
+dependencies = List(PortIdentity(1))
),
InputPort(PortIdentity(1), "data")
),
@@ -143,12 +143,12 @@ class JavaUDFOpDesc extends LogicalOp {
InputPort(
PortIdentity(idx),
displayName = portDesc.displayName,
-allowMultiLinks = portDesc.allowMultiInputs,
+disallowMultiLinks = portDesc.disallowMultiInputs,
dependencies = portDesc.dependencies.map(idx => PortIdentity(idx))
)
}
} else {
-List(InputPort(PortIdentity(), allowMultiLinks = true))
+List(InputPort())
}
val outputPortInfo = if (outputPorts != null) {
outputPorts.zipWithIndex.map {
@@ -137,11 +137,10 @@ class DualInputPortsPythonUDFOpDescV2 extends LogicalOp {
"User-defined function operator in Python script",
OperatorGroupConstants.PYTHON_GROUP,
inputPorts = List(
-InputPort(PortIdentity(), displayName = "model", allowMultiLinks = true),
+InputPort(PortIdentity(), displayName = "model"),
InputPort(
PortIdentity(1),
displayName = "tuples",
-allowMultiLinks = true,
dependencies = List(PortIdentity(0))
)
),
@@ -149,12 +149,12 @@ class PythonUDFOpDescV2 extends LogicalOp {
InputPort(
PortIdentity(idx),
displayName = portDesc.displayName,
-allowMultiLinks = portDesc.allowMultiInputs,
+disallowMultiLinks = portDesc.disallowMultiInputs,
dependencies = portDesc.dependencies.map(idx => PortIdentity(idx))
)
}
} else {
-List(InputPort(PortIdentity(), allowMultiLinks = true))
+List(InputPort())
}
val outputPortInfo = if (outputPorts != null) {
outputPorts.zipWithIndex.map {
@@ -141,12 +141,12 @@ class RUDFOpDesc extends LogicalOp {
InputPort(
PortIdentity(idx),
displayName = portDesc.displayName,
-allowMultiLinks = portDesc.allowMultiInputs,
+disallowMultiLinks = portDesc.disallowMultiInputs,
dependencies = portDesc.dependencies.map(idx => PortIdentity(idx))
)
}
} else {
-List(InputPort(PortIdentity(), allowMultiLinks = true))
+List(InputPort())
}
val outputPortInfo = if (outputPorts != null) {
outputPorts.zipWithIndex.map {
@@ -47,7 +47,7 @@ class UnionOpDesc extends LogicalOp {
"Union",
"Unions the output rows from multiple input operators",
OperatorGroupConstants.SET_GROUP,
-inputPorts = List(InputPort(PortIdentity(0), allowMultiLinks = true)),
+inputPorts = List(InputPort()),
outputPorts = List(OutputPort())
)
}
@@ -52,12 +52,10 @@ class DotPlotOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Dot Plot",
"Visualize data using a dot plot",
-OperatorGroupConstants.VISUALIZATION_BASIC_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_BASIC_GROUP
)

def createPlotlyFigure(): PythonTemplateBuilder = {
@@ -70,12 +70,10 @@ class IcicleChartOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Icicle Chart",
"Visualize hierarchical data from root to leaves",
-OperatorGroupConstants.VISUALIZATION_BASIC_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_BASIC_GROUP
)

private def getIcicleAttributesInPython: String =
@@ -48,12 +48,10 @@ class ImageVisualizerOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Image Visualizer",
"visualize image content",
-OperatorGroupConstants.VISUALIZATION_MEDIA_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_MEDIA_GROUP
)

def createBinaryData(): PythonTemplateBuilder = {
@@ -66,12 +66,10 @@ class ScatterMatrixChartOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Scatter Matrix Chart",
"Visualize datasets in a Scatter Matrix",
-OperatorGroupConstants.VISUALIZATION_STATISTICAL_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_STATISTICAL_GROUP
)

def createPlotlyFigure(): PythonTemplateBuilder = {
@@ -86,12 +86,10 @@ class BarChartOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Bar Chart",
"Visualize data in a Bar Chart",
-OperatorGroupConstants.VISUALIZATION_BASIC_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_BASIC_GROUP
)

def manipulateTable(): PythonTemplateBuilder = {
@@ -78,12 +78,10 @@ class BoxViolinPlotOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Box/Violin Plot",
"Visualize data using either a Box Plot or a Violin Plot. Box plots are drawn as a box with a vertical line down the middle which is mean value, and has horizontal lines attached to each side (known as “whiskers”). Violin plots provide more detail by showing a smoothed density curve on each side, and also include a box plot inside for comparison.",
-OperatorGroupConstants.VISUALIZATION_STATISTICAL_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_STATISTICAL_GROUP
)

def manipulateTable(): PythonTemplateBuilder = {
@@ -85,12 +85,10 @@ class BubbleChartOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Bubble Chart",
"a 3D Scatter Plot; Bubbles are graphed using x and y labels, and their sizes determined by a z-value.",
-OperatorGroupConstants.VISUALIZATION_BASIC_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
Contributor

Why do we have those changes on the output port? I thought the PR was only about the input port.

Contributor Author

It's just a refactor. The input port and the output port are still the same. The duplicated port definitions are moved into a shared factory method, OperatorInfo.forVisualization.
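
The refactor described in this reply can be sketched as follows. In the diff it takes the form of a `forVisualization` factory on the `OperatorInfo` companion object. The types below are hypothetical, trimmed-down stand-ins for the real ones, and the `"Basic"` group name is a placeholder rather than the actual value of `OperatorGroupConstants.VISUALIZATION_BASIC_GROUP`. The sketch only shows that calling the factory yields the same value as spelling out both port lists by hand.

```scala
// Hypothetical stand-ins; only the fields relevant to the refactor are modelled.
sealed trait OutputMode
object OutputMode {
  case object SINGLE_SNAPSHOT extends OutputMode
}

final case class InputPort(disallowMultiLinks: Boolean = false)
final case class OutputPort(mode: OutputMode)

final case class OperatorInfo(
    userFriendlyName: String,
    operatorDescription: String,
    operatorGroupName: String,
    inputPorts: List[InputPort],
    outputPorts: List[OutputPort]
)

object OperatorInfo {
  // Shared port layout for visualization operators: one single-link input
  // port and one single-snapshot output port, as in the diff.
  def forVisualization(
      userFriendlyName: String,
      operatorDescription: String,
      operatorGroupName: String
  ): OperatorInfo =
    OperatorInfo(
      userFriendlyName,
      operatorDescription,
      operatorGroupName,
      inputPorts = List(InputPort(disallowMultiLinks = true)),
      outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
    )
}

object FactoryDemo {
  // Placeholder group name (assumption, not the real constant's value).
  val group = "Basic"

  val viaFactory: OperatorInfo =
    OperatorInfo.forVisualization("Dot Plot", "Visualize data using a dot plot", group)

  val explicit: OperatorInfo =
    OperatorInfo(
      "Dot Plot",
      "Visualize data using a dot plot",
      group,
      inputPorts = List(InputPort(disallowMultiLinks = true)),
      outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
    )

  def main(args: Array[String]): Unit =
    println(s"same ports: ${viaFactory == explicit}")
}
```

Note that the factory passes `disallowMultiLinks = true` where the old call sites used a bare `InputPort()`. If the old `allowMultiLinks` flag defaulted to false, as proto3 `bool` fields do, both spellings describe a single-link input port.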

Contributor

Let's separate refactoring from feature changes. Even if you combine them, it would be good to note that in the PR description; otherwise this is an unexpected change.

+OperatorGroupConstants.VISUALIZATION_BASIC_GROUP
)

def manipulateTable(): PythonTemplateBuilder = {
@@ -30,6 +30,7 @@ import org.apache.texera.amber.operator.PythonOperatorDescriptor
import org.apache.texera.amber.operator.metadata.annotations.AutofillAttributeName
import org.apache.texera.amber.operator.metadata.{OperatorGroupConstants, OperatorInfo}

+import java.util
import java.util.{ArrayList, List => JList}
import scala.jdk.CollectionConverters._

@@ -57,7 +58,7 @@ class BulletChartOpDesc extends PythonOperatorDescriptor {
@JsonProperty(value = "steps", required = false)
@JsonSchemaTitle("Steps")
@JsonPropertyDescription("Optional: Each step includes a start and end value e.g., 0, 100.")
-var steps: JList[BulletChartStepDefinition] = new ArrayList[BulletChartStepDefinition]()
+var steps: JList[BulletChartStepDefinition] = new util.ArrayList[BulletChartStepDefinition]()
Contributor

Why this change?

Contributor Author

Done by formatter.

Contributor

This is an unnecessary change. If it is caused by the formatter, then we should fix the formatter. Does our formatter give inconsistent results across different runs?

Contributor Author

I also checked; it's better practice.


override def getOutputSchemas(
inputSchemas: Map[PortIdentity, Schema]
@@ -67,13 +68,11 @@ class BulletChartOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Bullet Chart",
"""Visualize data using a Bullet Chart that shows a primary quantitative bar and delta indicator.
|Optional elements such as qualitative ranges (steps) and a performance threshold are displayed only when provided.""".stripMargin,
-OperatorGroupConstants.VISUALIZATION_FINANCIAL_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_FINANCIAL_GROUP
)

override def generatePythonCode(): String = {
@@ -72,12 +72,10 @@ class CandlestickChartOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Candlestick Chart",
"Visualize data in a Candlestick Chart",
-OperatorGroupConstants.VISUALIZATION_FINANCIAL_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_FINANCIAL_GROUP
)

override def generatePythonCode(): String = {
@@ -70,12 +70,10 @@ class ChoroplethMapOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Choropleth Map",
"Visualize data using a Choropleth Map that uses shades of colors to show differences in properties or quantities between regions",
-OperatorGroupConstants.VISUALIZATION_ADVANCED_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_ADVANCED_GROUP
)

def manipulateTable(): PythonTemplateBuilder = {
@@ -57,12 +57,10 @@ class ContinuousErrorBandsOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Continuous Error Bands",
"Visualize error or uncertainty along a continuous line",
-OperatorGroupConstants.VISUALIZATION_STATISTICAL_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_STATISTICAL_GROUP
)

def createPlotlyFigure(): PythonTemplateBuilder = {
@@ -77,12 +77,10 @@ class ContourPlotOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Contour Plot",
"Displays terrain or gradient variations in a Contour Plot",
-OperatorGroupConstants.VISUALIZATION_SCIENTIFIC_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_SCIENTIFIC_GROUP
)

override def generatePythonCode(): String = {
@@ -64,12 +64,10 @@ class DendrogramOpDesc extends PythonOperatorDescriptor {
}

override def operatorInfo: OperatorInfo =
-OperatorInfo(
+OperatorInfo.forVisualization(
"Dendrogram",
"Visualize data in a Dendrogram",
-OperatorGroupConstants.VISUALIZATION_SCIENTIFIC_GROUP,
-inputPorts = List(InputPort()),
-outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+OperatorGroupConstants.VISUALIZATION_SCIENTIFIC_GROUP
)

private def createDendrogram(): PythonTemplateBuilder = {