fix: Fix empty text being translated #211

Open
Alkacid wants to merge 1 commit into Bistutu:main from Alkacid:fix/empty_input
Conversation

@Alkacid Alkacid commented Apr 22, 2026

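Added trimming and empty-text checks on the text to be translated in several places. This fixes a bug where many messages like “请提供翻译文本” (“Please provide the text to translate”) appeared on sites such as Reddit posts and GitHub contributor avatars.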

Before:
[Image: Before 1]

[Image: Before 2]

After:
[Image: After 1]

[Image: After 2]

Summary by Sourcery

Prevent translation logic from processing empty or whitespace-only text content across DOM scanning and translation entry points.

Bug Fixes:

  • Ignore whitespace-only text nodes when collecting DOM nodes for translation to avoid translating placeholder or empty content.
  • Skip translation in bilingual and single-translate flows when the cleaned text content is empty.
  • Short-circuit the translateText API when the input string is empty or only whitespace, returning the original value without calling translation services.

sourcery-ai Bot commented Apr 22, 2026

Reviewer's Guide

This PR hardens the text selection and translation pipeline by stripping both ASCII and full-width whitespace before checks, and short-circuiting when content is effectively empty so that empty/whitespace-only nodes are no longer sent for translation or treated as translatable content.
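
The whitespace check described above can be sketched as a small helper. The names `stripWhitespace` and `isBlank` are hypothetical; according to the guide, the PR inlines the `replace` call at each call site rather than sharing a function. Note that in ECMAScript regular expressions `\s` already matches U+3000 (ideographic space), so the explicit `\u3000` in the character class is redundant but harmless.

```typescript
// Hypothetical helper names; the regex is the one quoted in the reviewer's guide.
const WHITESPACE = /[\s\u3000]/g;

// Remove every whitespace character (ASCII and full-width) from the text.
// A null or undefined textContent is treated as an empty string.
function stripWhitespace(text: string | null | undefined): string {
  return (text ?? "").replace(WHITESPACE, "");
}

// True when nothing but whitespace remains, i.e. there is nothing to translate.
function isBlank(text: string | null | undefined): boolean {
  return stripWhitespace(text) === "";
}
```

Under this sketch, `isBlank(" \n\u3000")` is true and `isBlank(" a ")` is false, so only the second string would be passed on to translation.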

Sequence diagram for the updated text translation pipeline ignoring empty content

sequenceDiagram
    actor User
    participant BrowserExtension as BrowserExtension
    participant DomModule as DomModule_grabAllNode
    participant TransModule as TransModule_trans
    participant TranslateApi as TranslateApi_translateText
    participant TranslationService as External_Translation_Service

    User->>BrowserExtension: Trigger_translation
    BrowserExtension->>DomModule: grabAllNode(rootNode)
    DomModule->>DomModule: Traverse_DOM_nodes
    DomModule->>DomModule: For_each_Text_node
    DomModule->>DomModule: textContent.replace(/[\s\u3000]/g, '')
    alt Text_has_non_empty_content
        DomModule-->>BrowserExtension: Return_filtered_nodes_including_Text_node
    else Text_is_empty_or_whitespace_only
        DomModule-->>BrowserExtension: Skip_Text_node
    end

    loop For_each_filtered_node
        BrowserExtension->>TransModule: singleTranslate(node) / bilingualTranslate(node)
        TransModule->>TransModule: cleanedText = node.textContent.replace(/[\s\u3000]/g, '')
        alt cleanedText_is_empty
            TransModule-->>BrowserExtension: Return_without_translation
        else cleanedText_not_empty
            TransModule->>TransModule: detectlang(cleanedText)
            alt Language_already_target_lang
                TransModule-->>BrowserExtension: Return_without_translation
            else Language_needs_translation
                TransModule->>TranslateApi: translateText(origin, context, options)
                TranslateApi->>TranslateApi: cleanedOrigin = origin.replace(/[\s\u3000]/g, '')
                alt cleanedOrigin_is_empty
                    TranslateApi-->>TransModule: Return_origin_without_calling_service
                else cleanedOrigin_not_empty
                    TranslateApi->>TranslateApi: detectlang(origin_without_whitespace)
                    alt Origin_language_is_target
                        TranslateApi-->>TransModule: Return_origin
                    else Origin_language_differs
                        TranslateApi->>TranslationService: Request_translation
                        TranslationService-->>TranslateApi: Translated_text
                        TranslateApi-->>TransModule: Return_translated_text
                    end
                end
                TransModule-->>BrowserExtension: Apply_translation_to_node
            end
        end
    end

    BrowserExtension-->>User: Display_page_without_placeholder_translations
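
The innermost branch of the sequence diagram, the guard inside `translateText`, might look roughly like the sketch below. This is not the project's implementation: the diagram shows `translateText(origin, context, options)`, whereas this sketch uses a simplified, hypothetical signature in which the language detector and the translation service are passed in as functions so the control flow can be read in isolation.

```typescript
// Hypothetical function types standing in for the real detectlang and service call.
type DetectLang = (text: string) => string;
type CallService = (text: string) => Promise<string>;

async function translateTextSketch(
  origin: string,
  targetLang: string,
  detectLang: DetectLang,
  callService: CallService,
): Promise<string> {
  // Strip ASCII and full-width whitespace before any other check.
  const cleanedOrigin = origin.replace(/[\s\u3000]/g, "");

  // Empty or whitespace-only input: return it unchanged, no service call.
  if (!cleanedOrigin) return origin;

  // Already in the target language: return it unchanged as well.
  if (detectLang(cleanedOrigin) === targetLang) return origin;

  // Only now is the external translation service contacted.
  return callService(origin);
}
```

The ordering matters: the empty check runs before language detection, so the detector never sees an empty string and the service is never asked to translate nothing.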

Flow diagram for whitespace-aware text filtering in DOM scanning

flowchart TD
    A[Start_grabAllNode_with_rootNode] --> B[Create_TreeWalker_for_elements_and_text_nodes]
    B --> C[Get_next_node]
    C --> D{Node_is_null?}
    D -->|Yes| E[End_traversal_and_return_collected_nodes]
    D -->|No| F{Node_instanceof_Text?}

    F -->|Yes| G[Compute_textContent = node.textContent_or_empty_string]
    G --> H[cleanedText = textContent_without_spaces_and_full_width_spaces]
    H --> I{cleanedText.length > 0?}
    I -->|Yes| J[ACCEPT_Text_node]
    I -->|No| K[SKIP_Text_node]
    J --> L[Consider_node_for_translation_pipeline]
    K --> L

    F -->|No| M{Node_instanceof_Element?}
    M -->|No| N[SKIP_node]
    N --> C

    M -->|Yes| O[Inspect_childNodes]
    O --> P[Initialize_hasElement_false]
    P --> Q[Initialize_hasNonEmptyElement_false]
    Q --> R[Initialize_hasText_false]
    R --> S[For_each_child_in_childNodes]

    S --> T{child.nodeType_is_ELEMENT_NODE?}
    T -->|Yes| U[Set_hasElement_true]
    U --> V[childText = child.textContent_or_empty_string]
    V --> W[childText_without_spaces_and_full_width_spaces]
    W --> X{childText.length > 0?}
    X -->|Yes| Y[Set_hasNonEmptyElement_true]
    X -->|No| Z[Keep_hasNonEmptyElement_false]
    Y --> AA[Continue_children_loop]
    Z --> AA

    T -->|No| AB{child.nodeType_is_TEXT_NODE?}
    AB -->|Yes| AC[textContent = child.textContent_or_empty_string]
    AC --> AD[textContent_without_spaces_and_full_width_spaces]
    AD --> AE{textContent.length > 0?}
    AE -->|Yes| AF[Set_hasText_true]
    AE -->|No| AG[Keep_hasText_false]
    AF --> AH[Continue_children_loop]
    AG --> AH

    AB -->|No| AH[Continue_children_loop]

    AH --> AI{More_children?}
    AI -->|Yes| S
    AI -->|No| AJ[Evaluate_hasElement_hasNonEmptyElement_hasText]
    AJ --> AK[Decide_ACCEPT_or_SKIP_Element_node]
    AK --> C
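
The child-inspection loop in the flow diagram can be sketched without a browser by using a minimal stand-in for DOM nodes. The `NodeLike` interface and the `inspectChildren` name are assumptions made for illustration; the real code in `entrypoints/main/dom.ts` works on actual DOM nodes inside a TreeWalker filter. The final accept/skip rule is not spelled out in the diagram, so the sketch stops at computing the three flags.

```typescript
// Minimal structural stand-in for a DOM node (hypothetical, for illustration only).
interface NodeLike {
  nodeType: number; // 1 = ELEMENT_NODE, 3 = TEXT_NODE, as in the DOM
  textContent: string | null;
  childNodes?: NodeLike[];
}

const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Same whitespace-stripping regex as quoted in the reviewer's guide.
const strip = (s: string | null): string => (s ?? "").replace(/[\s\u3000]/g, "");

// Compute the three flags the flow diagram tracks for an element's direct children.
function inspectChildren(el: NodeLike): {
  hasElement: boolean;
  hasNonEmptyElement: boolean;
  hasText: boolean;
} {
  let hasElement = false;
  let hasNonEmptyElement = false;
  let hasText = false;

  for (const child of el.childNodes ?? []) {
    if (child.nodeType === ELEMENT_NODE) {
      hasElement = true;
      // An element child counts as non-empty only if text survives stripping.
      if (strip(child.textContent)) hasNonEmptyElement = true;
    } else if (child.nodeType === TEXT_NODE) {
      // Whitespace-only text nodes no longer set hasText.
      if (strip(child.textContent)) hasText = true;
    }
  }
  return { hasElement, hasNonEmptyElement, hasText };
}
```

For a link that wraps only an image, with indentation whitespace around it (the avatar case from the PR description), this yields `hasElement: true` with both `hasNonEmptyElement` and `hasText` false, which gives the caller enough information to skip the element.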

File-Level Changes

Change: Filter out empty/whitespace-only text nodes during DOM traversal so only meaningful text is collected for translation.
Details:
  • Update the TreeWalker acceptNode logic to inspect Text node content, stripping both regular and full-width spaces before deciding whether to accept or skip the node.
  • Adjust child node inspection to compute trimmed text for element children and text children using a unified whitespace-stripping regex, and set hasNonEmptyElement/hasText only when the stripped content is non-empty.
Files: entrypoints/main/dom.ts

Change: Avoid triggering translation when a node’s text content is empty or whitespace-only in single and bilingual translation flows.
Details:
  • Introduce a cleanedText variable that strips ASCII and full-width spaces from node.textContent and early-return when it is empty before language detection.
  • Reuse cleanedText for detectlang calls to avoid duplicate replace operations and keep the language check consistent with the emptiness check in bilingualTranslate and singleTranslate.
Files: entrypoints/main/trans.ts

Change: Prevent backend translation calls for empty or whitespace-only origin strings in the generic translateText utility.
Details:
  • Normalize the origin string by stripping ASCII and full-width spaces into cleanedOrigin and return the original origin immediately when cleanedOrigin is empty.
  • Keep the existing detectlang-based short-circuit when the detected language matches the target, now after the explicit empty-string guard.
Files: entrypoints/utils/translateApi.ts

Possibly linked issues

  • #: The PR prevents translating empty text, which removes the repeated “请提供翻译文本” shown in the translation interface issue.
  • #(none provided): PR implements whitespace-cleaning and empty-text checks before translation, exactly solving the issue’s requested local empty-text handling.

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high-level feedback:

  • In translateText, you compute cleanedOrigin but still call detectlang(origin.replace(...)); reusing cleanedOrigin here would avoid duplicate work and keep the empty-text semantics consistent.
  • The checks if (!cleanedText || cleanedText.length === 0) / if (!cleanedOrigin || cleanedOrigin.length === 0) are redundant because the replace/|| '' guarantees a string; these conditions can be simplified to a single falsy check.
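
The second point can be illustrated with a short sketch. The function names are hypothetical; the first mirrors the condition quoted in the review and the second is the simplified form. For any value that is already a string, the two agree, because the empty string is the only falsy string.

```typescript
// The condition as quoted in the review: two checks that say the same thing.
function isEmptyVerbose(cleaned: string): boolean {
  return !cleaned || cleaned.length === 0;
}

// Simplified form: a single falsy check is enough for a string.
function isEmptySimple(cleaned: string): boolean {
  return !cleaned;
}

// Both forms agree on every sample, including strings such as "0" that are
// falsy-looking but non-empty and therefore truthy.
const samples = ["", "0", "a", "false", "翻译"];
const allAgree = samples.every((s) => isEmptyVerbose(s) === isEmptySimple(s));
```

Here `allAgree` is true. The equivalence holds only because `replace` and the `|| ''` fallback guarantee a string, as the review comment notes.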