Skip to content

Conversation

@jiji-hoon96
Copy link

@jiji-hoon96 jiji-hoon96 commented Oct 13, 2025

Node.js v24 has deprecated url.parse(), causing deprecation warnings. This is related to security issues (hostname spoofing) and will be removed in future versions.

This PR replaces the deprecated url.parse() with the WHATWG URL API (new URL()) as recommended by Node.js, avoiding custom regex implementations for better reliability and maintainability.

Changes

  • Replaced deprecated url.parse() with native WHATWG URL API (new URL())
  • Implemented parseUrlString() wrapper function using native URL constructor
  • Added relative URL handling with base URL (http://localhost) for compatibility
  • Fixed hash fragment handling in fast-path parsing for Node.js 0.8 compatibility
  • Maintained existing behavior with whitespace trimming
  • All existing tests pass (20 tests)
  • ESLint code style compliance

Performance

Benchmark results show performance is maintained or improved compared to the original:

Test Case parseurl (new) url.parse() (legacy) WHATWG URL Performance vs Legacy
Full URL 1,181,856 ops/sec 650,146 ops/sec 1,611,026 ops/sec 1.8x faster
Path + Query 9,291,785 ops/sec 2,558,726 ops/sec 925,643 ops/sec 3.6x faster
Same Request (cached) 20,029,556 ops/sec 3,020,408 ops/sec 998,215 ops/sec 6.6x faster
Simple Path 68,538,878 ops/sec 7,547,222 ops/sec 1,249,746 ops/sec 9.1x faster
Root Path (/) 170,181,125 ops/sec 10,888,984 ops/sec 1,719,375 ops/sec 15.6x faster

@UlisesGascon
Copy link
Member

Hey @jiji-hoon96! Thanks for investing time and effort in helping the project! ❤️

You are right the url.parse() seems deprecated:

var url = require('url')
var parse = url.parse 
console.log(parse('https://expressjs.com/en/4x/api.html#req.protocol'))
(node:4158863) [DEP0169] DeprecationWarning: `url.parse()` behavior is not standardized and prone to errors that have security implications. Use the WHATWG URL API instead. CVEs are not issued for `url.parse()` vulnerabilities.
(Use `node --trace-deprecation ...` to show where the warning was created)

I was wondering if you explored the option to use the URL constructor directly like:

var data = new URL('https://expressjs.com/en/4x/api.html#req.protocol');
console.log(data);

The output is quite similar, but as mentioned here (url.parse supports a few use cases that are not supported with new URL().) not sure about the final impact for us.

// url.parse()

Url {
  protocol: 'https:',
  slashes: true,
  auth: null,
  host: 'expressjs.com',
  port: null,
  hostname: 'expressjs.com',
  hash: '#req.protocol',
  search: null,
  query: null,
  pathname: '/en/4x/api.html',
  path: '/en/4x/api.html',
  href: 'https://expressjs.com/en/4x/api.html#req.protocol'
}
//new URL()
URL {
  href: 'https://expressjs.com/en/4x/api.html#req.protocol',
  origin: 'https://expressjs.com',
  protocol: 'https:',
  username: '',
  password: '',
  host: 'expressjs.com',
  hostname: 'expressjs.com',
  port: '',
  pathname: '/en/4x/api.html',
  search: '',
  searchParams: URLSearchParams {},
  hash: '#req.protocol'
}

The regex you proposed in your parser seems to follow a linear complexity, but I prefer to avoid include custom implementations if we can use built-in utils or existing APIs.

wdyt @blakeembrey?

@jiji-hoon96
Copy link
Author

Thanks for the suggestion! You're right - using the URL constructor is a better approach than a custom implementation. I'll update the code to use new URL() instead. 👍

@bjohansebas
Copy link
Member

ref: nodejs/web-server-frameworks#71

@jiji-hoon96
Copy link
Author

Given the constraints, I believe is the best available option because

  • It removes the deprecation warnings (the main issue)
  • It uses Node.js built-in APIs as you prefer

However, I'm open to any direction you'd prefer. Should we

  • Keep the current WHATWG URL approach?
  • Go back to the custom parser?
  • Wait for a potential TolerantURL API in Node.js core?

Let me know your thoughts! 🙏

@raji-mahmud-a
Copy link

Proposal: Modernizing parseurl by Embracing Native Node.js APIs

Hello everyone,
I've been following the critical discussion around managing URL parsing in this package, particularly the need to address the deprecated url.parse method (as referenced in PR #44).
I'd like to propose a solution that leverages modern Node.js features while preserving the package's existing API contract, making the transition seamless for its extensive user base.
The Core Proposal: Plain Object URL Representation
The key change is to shift the implementation internally to use the native URL class but have parseurl return a plain JavaScript object with properties matching the expected URL structure.
This approach resolves our technical debt while maintaining stability:

  • Eliminates Deprecation Warnings: We stop using the deprecated url.parse method, resolving the primary technical issue.
  • Utilizes Built-in APIs: We lean on Node.js's optimized, modern URL constructor, which is generally preferred over custom parsing logic.
  • Retains Current API Structure: The returned value is explicitly not an instance of url.Url or global.URL, but a simple data object. This ensures consumers relying on direct property access will not break.

Implementation Preview

I have worked on a draft implementation of this change:

Impact Analysis: Backward Compatibility is Key
Given that Express.js accounts for about 90% of this package's weekly downloads, stability is paramount.

This update is designed to be fully backward compatible. Because the output retains the expected API structure as a plain object, existing consumers of the package should not require any modifications, ensuring a smooth transition for the entire ecosystem.

Critical CI/CD Pipeline Fixes

In parallel with the feature update, I've addressed critical maintenance issues in the existing CI/CD workflow (ci.yml) to ensure reliable testing on modern GitHub Actions runners:

Issue Recommended Fix
Critical Output Syntax Change all ::set-output commands to use the required $GITHUB_OUTPUT file format.
Coveralls Action Update to use the stable coverallsapp/github-action@v2 instead of the unstable @master tag.
Token Reference Correct the token reference in the coverage job to use the standard ${{ secrets.GITHUB_TOKEN }}.
Node Setup Simplification Replace the custom NVM/shell logic with the modern and recommended uses: actions/setup-node@v4.

CI/CD Fixes Implementation

I'll like to know what you think.
Thanks.

- Remove legacy Node.js versions (0.8, 0.10, 0.12) and EOL versions
  - Keep only active LTS versions: 16.x, 18.x, 20.x, 22.x
  - Replace deprecated ::set-output with $GITHUB_OUTPUT syntax
- Upgrade actions/setup-node to v4 (replacing custom nvm script
- Upgrade coverallsapp/github-action from @master to @v2
- Remove version-specific dependency overrides no longer needed
@jiji-hoon96
Copy link
Author

jiji-hoon96 commented Nov 24, 2025

@UlisesGascon @bjohansebas

Thank you for the constructive feedback!

I have updated the PR to address the comments and fix the CI failures:

  1. Dropped Legacy Support: Removed Node.js 0.8, 0.10, and 0.12 from the CI matrix as suggested.
  2. Updated CI Workflow: Upgraded GitHub Actions (checkout, setup-node) to v4 and fixed the deprecated set-output syntax.
  3. Fixed Typo

The CI checks are now passing. Ready for review!

@raji-mahmud-a
Copy link

raji-mahmud-a commented Nov 24, 2025

Nice 💜,

To remove the deprecated url.parse() method, I suggest removing all instances of url.Url or new URL() and converting it to a plain object and this makes it considerably faster since there is no object construction

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants