Skip to content

bugfix/cast_strings_as_floats#33

Open
fivetran-savage wants to merge 5 commits into
mainfrom
bugfix/cast_strings_as_floats
Open

bugfix/cast_strings_as_floats#33
fivetran-savage wants to merge 5 commits into
mainfrom
bugfix/cast_strings_as_floats

Conversation

@fivetran-savage
Copy link
Copy Markdown

@fivetran-savage fivetran-savage commented May 29, 2026

PR Overview

Package version introduced in this PR:

  • 0.9.2

This PR addresses the following Issue/Feature(s):

Summary of changes:

  • Fixes an issue in stg_workday__worker_history where annual_currency_summary_primary_compensation_basis, annual_currency_summary_total_base_pay, and annual_currency_summary_total_salary_and_allowances could surface as string datatypes by casting as float.

Submission Checklist

  • Alignment meeting with the reviewer (if needed)
    • Timeline and validation requirements discussed
  • Provide validation details:
    • Validation Steps: Check for unintentional effects (e.g., add/run consistency & integrity tests)
    • Testing Instructions: Confirm the change addresses the issue(s)
    • Focus Areas: Complex logic or queries that need extra attention
  • Merge any relevant open PRs into this PR

Changelog

  • Draft changelog for PR
  • Final changelog for release review

@fivetran-savage fivetran-savage self-assigned this Jun 1, 2026
@fivetran-savage fivetran-savage added the docs:ready Triggers the docs generator workflow. label Jun 1, 2026
Copy link
Copy Markdown
Contributor

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-savage Sent some notes back, let me know if you have questions!

{% set string_dtypes = ['char', 'text', 'string'] %}
{% for col in adapter.get_columns_in_relation(ref('stg_workday__worker_base')) %}
{% if col.name.lower() in ['annual_currency_summary_primary_compensation_basis', 'annual_currency_summary_total_base_pay', 'annual_currency_summary_total_salary_and_allowances'] %}
{% if target.type == 'databricks' %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a silent failure when running against databricks here. Variables set inside a {% for %} loop don't propagate out of it in Jinja2. So {% set is_str = true %} inside the {% for stype in string_dtypes %} loop has no effect, and is_str always stays false after the loop ends. This means string-typed compensation columns on Databricks are never cast to float and silently land as strings in the output table.

I tested with this configuration in integration_tests/dbt_project.yml

        annual_currency_summary_primary_compensation_basis: "{{ 'varchar(100)' if target.type in ('redshift', 'postgres') else 'string' }}"
        annual_currency_summary_total_base_pay: "{{ 'varchar(100)' if target.type in ('redshift', 'postgres') else 'string' }}"
        annual_currency_summary_total_salary_and_allowances: "{{ 'varchar(100)' if target.type in ('redshift', 'postgres') else 'string' }}" 

And when running

dbt compile -s stg_workday__worker_history -t databricks --vars '{"employee_history_enabled": true}'

You end up seeing that these fields are not cast to floats and remain as strings.

Image

So the Databricks case isn't quite solved as of yet. I wonder if we can use the Jinja namespace capability, to do the data type check, and then change the value to true that way. Might be worth testing.

Comment thread CHANGELOG.md
[PR #33](https://github.com/fivetran/dbt_workday/pull/33) includes the following update.

## Bug Fix
- Fixes an issue in `stg_workday__worker_history` where `annual_currency_summary_primary_compensation_basis`, `annual_currency_summary_total_base_pay`, and `annual_currency_summary_total_salary_and_allowances` could surface as string datatypes. These columns are now cast to float when the source type is a string.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These fields are also present in the base stg_workday__worker model. Should we apply the macro there as well?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per this comment in the Jira ticket, the decision was to only apply to the worker_history table.

annual_currency_summary_primary_compensation_basis,
annual_currency_summary_total_base_pay,
annual_currency_summary_total_salary_and_allowances,
{% set string_dtypes = ['char', 'text', 'string'] %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think 'text' is a string data type in Databricks. It's harmless if it's there, but might be worth double-checking if it's safe to remove.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs:ready Triggers the docs generator workflow.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants