
fix: add missing getContentByUrl import to crawl endpoint #651

Open
joelrb wants to merge 2 commits into bettergovph:main from joelrb:fix/missing-getcontentbyurl-import

joelrb commented Jun 11, 2026

Contributor

Description

Problem
functions/api/crawl.ts calls getContentByUrl(env, targetUrl, crawler) on line 256 to look up cached content when ?force=true is not specified. However, getContentByUrl was missing from the import statement on line 2. At runtime this throws ReferenceError: getContentByUrl is not defined, breaking the entire "fetch from cache" code path. Every non-force crawl request crashes instead of returning cached content.
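
To illustrate the failure mode, here is a minimal sketch rather than the actual crawl.ts code. The ambient `declare` stands in for the missing import: it lets the snippet compile while leaving the name undefined at runtime. The handler shape, the synchronous signature, the return values, and the "default" crawler name are all assumptions made for the example.

```typescript
// Ambient declaration: emits no JavaScript, so the name does not exist at
// runtime. This stands in for the import that was missing from crawl.ts.
declare function getContentByUrl(env: object, targetUrl: string, crawler: string): string | null;

// Hypothetical handler; only the force/cache branching mirrors the description.
function handleCrawl(targetUrl: string, force: boolean): string {
  if (force) {
    return "fresh"; // the force path never touches getContentByUrl
  }
  // The non-force path evaluates the undefined name and throws here.
  const cached = getContentByUrl({}, targetUrl, "default");
  return cached ?? "fresh";
}

// Runs the handler and reports the error name if it throws.
function tryCrawl(targetUrl: string, force: boolean): string {
  try {
    return handleCrawl(targetUrl, force);
  } catch (err) {
    return err instanceof Error ? err.name : "unknown";
  }
}
```

In this sketch `tryCrawl(url, true)` completes normally, while `tryCrawl(url, false)` reports a ReferenceError, which matches the observation that only the cache lookup path was broken.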

Fix Applied
Added getContentByUrl to the named import on line 2 of functions/api/crawl.ts.
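
A minimal sketch of the restored cache-first flow, assuming a simplified handler. In crawl.ts the function comes from the named import; here a local stand-in backed by an in-memory Map replaces it so the example is self-contained. The `Env` shape, the cache key format, the `CrawlResult` type, and the synchronous signature are hypothetical and not taken from the repository.

```typescript
// Hypothetical stand-ins; the real Env and storage layer are not shown in this PR.
type Env = { cache: Map<string, string> };
type CrawlResult = { source: "cache" | "fresh"; content: string };

// Stand-in for the function exported from functions/lib/crawler.ts.
// In crawl.ts this name would be supplied by the import statement instead.
function getContentByUrl(env: Env, targetUrl: string, crawler: string): string | null {
  return env.cache.get(`${crawler}:${targetUrl}`) ?? null;
}

// Cache-first handler: consult the cache unless the caller forces a fresh crawl.
function handleCrawl(env: Env, targetUrl: string, crawler: string, force: boolean): CrawlResult {
  if (!force) {
    const cached = getContentByUrl(env, targetUrl, crawler);
    if (cached !== null) {
      return { source: "cache", content: cached };
    }
  }
  // Placeholder for the real crawl, which is outside the scope of this sketch.
  return { source: "fresh", content: `crawled ${targetUrl}` };
}
```

With the function in scope, a non-force request for a cached URL returns the cached content, and a force request or a cache miss falls through to a fresh crawl.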

Framework Change
No framework changes.

Files Changed

  • functions/api/crawl.ts — Added getContentByUrl to import statement (line 2)

Security Severity: Low

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • CI/CD changes
  • Performance improvements

Related Issues

Fixes #missing-getcontentbyurl-import

Testing

Describe the testing performed for this PR:

  • Unit tests pass
  • Integration tests pass
  • Manual testing performed
  • All linting passes
  • Type checking passes
  • Dead code check passes

Include specific test scenarios or commands used.

Manual test:

  • Verified getContentByUrl is exported from functions/lib/crawler.ts
  • Confirmed the import on crawl.ts line 2 now includes getContentByUrl
  • Confirmed the usage on crawl.ts line 256 matches the imported function name
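
The manual checks above could also be approximated with a small text-level script. This is only a sketch, not part of the PR: the helper names `namedImports` and `missingImports` are hypothetical, and the regular expressions handle plain `import { a, b as c } from "..."` statements only, not every import form TypeScript allows.

```typescript
// Collect the local names bound by `import { ... } from "<modulePath>"` in `source`.
function namedImports(source: string, modulePath: string): string[] {
  const pattern = /import\s*\{([^}]*)\}\s*from\s*["']([^"']+)["']/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    if (match[2] !== modulePath) continue;
    for (const part of match[1].split(",")) {
      // For `a as b`, the local binding is `b`.
      const name = part.trim().split(/\s+as\s+/).pop();
      if (name) names.push(name);
    }
  }
  return names;
}

// Return the given names that are called in `source` but not imported from `modulePath`.
function missingImports(source: string, modulePath: string, names: string[]): string[] {
  const imported = new Set(namedImports(source, modulePath));
  return names.filter((name) => {
    const isCalled = new RegExp(`\\b${name}\\s*\\(`).test(source);
    return isCalled && !imported.has(name);
  });
}
```

Run against the text of crawl.ts with the module path used on line 2 and the name getContentByUrl, an empty result would indicate that the call and the import agree.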

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have tested my changes locally
  • I have updated AGENTS.md if needed (for architectural changes)
  • I have added appropriate labels (Priority, Type, Area)
  • No new security vulnerabilities introduced

Performance & Security

  • Performance impact considered (no impact — one-line import fix)
  • Dependencies audited (no new dependencies)
  • Secrets not exposed
  • Error handling implemented appropriately

Screenshots (if applicable)

N/A

Additional Notes

This bug was introduced by the merged security fix #640: the call to getContentByUrl was correctly added to the crawl endpoint's logic, but the function was omitted from the import statement. This one-line fix restores the "fetch from cache" code path in the crawl endpoint.

DaijobuDes added the bug (Something isn't working) label on Jun 11, 2026
Labels

bug (Something isn't working)

2 participants