Commit 39d6db5

Jaswanth51, Copilot, fs-eire, wenqinI, and Copilot authored
Sync with Microsoft ONNX Runtime - 03/12/2025 (#867)
* Fix npm audit vulnerabilities in /js directory (microsoft#26632)

### Description

Resolved all security vulnerabilities in JavaScript packages under `/js` by running `npm audit fix`. All updates are non-breaking patch/minor version bumps.

**Fixed vulnerabilities:**

- `/js` root: 1 high severity
  - `glob` 10.4.5 → 10.5.0 (command injection - GHSA-5j98-mcp5-4vw2)
- `/js/react_native`: 7 vulnerabilities (1 high, 3 moderate, 3 low)
  - `image-size` → 1.2.1 (high: DoS via infinite loop - GHSA-m5qc-5hw7-8vg7)
  - `@babel/helpers` 7.25.6 → 7.28.4 (moderate: RegExp complexity - GHSA-968p-4wvh-cqc8)
  - `@babel/runtime` 7.25.6 → 7.28.4 (moderate: RegExp complexity - GHSA-968p-4wvh-cqc8)
  - `js-yaml` → fixed (moderate: prototype pollution - GHSA-mh29-5h37-fv8m)
  - `brace-expansion` 2.0.1 → 2.0.2 (low: ReDoS - GHSA-v6h2-p8h4-qcjw)
  - `on-headers` → fixed (low: header manipulation - GHSA-76c9-3jph-rj3q)

**Files modified:**

- `js/package-lock.json`
- `js/react_native/package-lock.json`

**Result:** All JS packages (`/js`, `/js/common`, `/js/web`, `/js/node`, `/js/react_native`) now report 0 vulnerabilities.

### Motivation and Context

Security maintenance to address dependency vulnerabilities identified by `npm audit`. No breaking changes or code modifications required.

*This pull request was created as a result of the following prompt from Copilot chat.*

<details>
<summary>Original prompt</summary>

> Please create a pull request that runs `npm audit fix` for the JavaScript/TypeScript portion of the repository under the `/js` directory of [microsoft/onnxruntime](https://github.com/microsoft/onnxruntime).
>
> Requirements:
>
> 1. **Scope**
>    - Work only within the `/js` folder and its subpackages (e.g., `js/web`, `js/node`, `js/common`, etc.).
>    - Do not modify files outside `/js`.
>
> 2. **Dependency updates**
>    - Run `npm audit fix` (and, if necessary to fully resolve high/critical issues while staying non-breaking, `npm audit fix --force` on specific subpackages) to address security vulnerabilities.
>    - Prefer minimal, non-breaking version bumps (patch and minor) that satisfy `npm audit` while keeping semver ranges sensible.
>    - If any **major** upgrades are required to clear vulnerabilities, handle them cautiously:
>      - Apply the upgrade only if tests still pass and typings/build setup remain compatible.
>      - If a major bump would require code changes or creates breaking behavior, **do not** apply it; instead, leave a TODO comment in the PR description summarizing which packages remain vulnerable and why.
>
> 3. **Validation**
>    - Run the existing JS-related checks that the repo supports from `/js`, such as:
>      - `npm test` or package-specific test scripts.
>      - Any documented lint/build/test commands for JS packages (e.g., `npm run build`, `npm run lint`) where applicable.
>    - Ensure the updated lockfiles (if present) are consistent, and the project installs cleanly with `npm ci` (or the repo's documented install command) in the `/js` area.
>
> 4. **Files to update**
>    - Update `package.json` and lockfiles under `/js` (e.g., `package-lock.json`, `npm-shrinkwrap.json`, or workspace-specific lock files) to reflect the audited dependency tree.
>    - Do not manually edit `node_modules`; rely on `npm` to manage dependencies and only commit manifest/lockfile changes.
>
> 5. **Repository conventions**
>    - Follow this repo's existing conventions for formatting, commit messages, and JS tooling.
>    - Keep the diff focused on the dependency and lockfile updates plus any absolutely necessary code tweaks to maintain compatibility.
>
> 6. **Pull request description**
>    - In the PR body, include:
>      - A short summary: that `npm audit fix` was run in `/js` to address dependency vulnerabilities.
>      - A bullet list of notable dependency changes (especially any major version bumps), with packages and old/new versions.
>      - A brief testing summary (commands run and their results).
>      - A note about any remaining vulnerabilities that could not be fixed without breaking changes (if applicable), including the affected packages and advisories if available.
>
> The goal is a clean, minimal PR that improves the security posture of the JS packages under `/js` in `microsoft/onnxruntime` without introducing breaking changes.

</details>
---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: fs-eire <[email protected]>

* [webgpu] Optimize InstanceNormalization by removing redundant transpose (microsoft#26626)

### Description

This PR optimizes `InstanceNormalization` by removing a redundant transpose.

Because the `NCHW` implementation of `InstanceNormalization` is more efficient, there is no need to wrap it in `Transpose` nodes so that it runs in `NHWC`. This lets us elide the redundant transposes and improves performance. Testing on Lunar Lake shows about a 60% performance improvement in `InstanceNormalization` operations.

#### `InstanceNormalization` OP benchmark

The input tensor shape: `(1,32,1048576)`
The scale tensor shape: `(32)`
The B tensor shape: `(32)`

| time cost (ms) | baseline | opt  | diff |
| -------------- | -------- | ---- | ---- |
| Lunar Lake     | 82.6     | 34.2 | 58%  |

#### Model benchmark

| time cost (ms)                 | baseline | opt    | diff |
| ------------------------------ | -------- | ------ | ---- |
| sd-turbo-vae-decoder-fp16-demo | 2437.6   | 1835.9 | 25%  |

### Motivation and Context

Please see above

* [webgpu] refactor a few "context" classes (microsoft#26602)

### Description

This PR refactors a few "context" classes to make them clearer and to support new features.

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>

* Bump actions/checkout from 5 to 6 (microsoft#26641)

Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/checkout/releases">actions/checkout's releases</a>.</em></p> <blockquote> <h2>v6.0.0</h2> <h2>What's Changed</h2> <ul> <li>Update README to include Node.js 24 support details and requirements by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li> <li>Persist creds to a separate file by <a href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li> <li>v6-beta by <a href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li> <li>update readme/changelog for v6 by <a href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p> <h2>v6-beta</h2> <h2>What's Changed</h2> <p>Updated persist-credentials to store the credentials under <code>$RUNNER_TEMP</code> instead of directly in the local git config.</p> <p>This requires a minimum Actions Runner version of <a href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a> to access the persisted credentials for <a href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker container action</a> scenarios.</p> <h2>v5.0.1</h2> <h2>What's Changed</h2> <ul> <li>Port v6 cleanup to v5 by <a href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li> </ul> 
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <h2>V6.0.0</h2> <ul> <li>Persist creds to a separate file by <a href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li> <li>Update README to include Node.js 24 support details and requirements by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li> </ul> <h2>V5.0.1</h2> <ul> <li>Port v6 cleanup to v5 by <a href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li> </ul> <h2>V5.0.0</h2> <ul> <li>Update actions checkout to use node 24 by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li> </ul> <h2>V4.3.1</h2> <ul> <li>Port v6 cleanup to v4 by <a href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li> </ul> <h2>V4.3.0</h2> <ul> <li>docs: update README.md by <a href="https://github.com/motss"><code>@​motss</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li> <li>Add internal repos for checking out multiple repositories by <a href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a 
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li> <li>Documentation update - add recommended permissions to Readme by <a href="https://github.com/benwells"><code>@​benwells</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li> <li>Adjust positioning of user email note and permissions heading by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li> <li>Update README.md by <a href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li> <li>Update CODEOWNERS for actions by <a href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li> <li>Update package dependencies by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li> </ul> <h2>v4.2.2</h2> <ul> <li><code>url-helper.ts</code> now leverages well-known environment variables by <a href="https://github.com/jww3"><code>@​jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li> <li>Expand unit test coverage for <code>isGhes</code> by <a href="https://github.com/jww3"><code>@​jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li> </ul> <h2>v4.2.1</h2> <ul> <li>Check out other refs/* by commit if provided, fall back to ref by <a href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li> </ul> <h2>v4.2.0</h2> <ul> <li>Add Ref and Commit outputs by <a 
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li> <li>Dependency updates by <a href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>, <a href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li> </ul> <h2>v4.1.7</h2> <ul> <li>Bump the minor-npm-dependencies group across 1 directory with 4 updates by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li> <li>Bump actions/checkout from 3 to 4 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li> <li>Check out other refs/* by commit by <a href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li> <li>Pin actions/checkout's own workflows to a known, good, stable version. 
by <a href="https://github.com/jww3"><code>@​jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li> </ul> <h2>v4.1.6</h2> <ul> <li>Check platform to set archive extension appropriately by <a href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li> </ul> <h2>v4.1.5</h2> <ul> <li>Update NPM dependencies by <a href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li> <li>Bump github/codeql-action from 2 to 3 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li> <li>Bump actions/setup-node from 1 to 4 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li> <li>Bump actions/upload-artifact from 2 to 4 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/actions/checkout/commit/1af3b93b6815bc44a9784bd300feb67ff0d1eeb3"><code>1af3b93</code></a> update readme/changelog for v6 (<a href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li>
<li><a href="https://github.com/actions/checkout/commit/71cf2267d89c5cb81562390fa70a37fa40b1305e"><code>71cf226</code></a> v6-beta (<a href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li>
<li><a href="https://github.com/actions/checkout/commit/069c6959146423d11cd0184e6accf28f9d45f06e"><code>069c695</code></a> Persist creds to a separate file (<a href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li>
<li><a href="https://github.com/actions/checkout/commit/ff7abcd0c3c05ccf6adc123a8cd1fd4fb30fb493"><code>ff7abcd</code></a> Update README to include Node.js 24 support details and requirements (<a href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li>
<li>See full diff in <a href="https://github.com/actions/checkout/compare/v5...v6">compare view</a></li>
</ul>
</details>
<br />

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add LogEvaluationStart for ReplayGraph (microsoft#26645)

### Description

Add `LogEvaluationStart` for ReplayGraph to match `LogEvaluationStop`.

### Motivation and Context

With the start event logged, ETW can capture the run time correctly.

Co-authored-by: hualxie <[email protected]>

* add LogCompileModel to mark the session usage (microsoft#26646)

### Description

Add `LogCompileModel` to mark the session usage as Compile, because that session will not be used for inference. It could also be used to log compile-model parameters if needed.

### Motivation and Context

We are building a profiling tool for WinML and want to differentiate compile sessions from inference sessions.

I think there are two ways to do it, but I don't know which is better:
microsoft#26646
microsoft#26647

---------

Co-authored-by: hualxie <[email protected]>

* [webgpu] Fix bug introduced by RoE (microsoft#26661)

Fix a bug introduced by microsoft#26563, which accidentally used the wrong condition and produced incorrect results in graph capture mode.

* [QNN-EP] Enable verbose and artifacts saving in onnxruntime_provider_test.exe (microsoft#26396)

### Description

- The change allows users to better debug unit tests by adding the following environment variables:
  - `QNN_DUMP_ONNX`: Dump input onnx model
  - `QNN_DUMP_JSON`: Dump json qnn graph with provider_option `dump_json_qnn_graph`
  - `QNN_DUMP_DLC`: Dump dlc with provider_option `qnn_ir_backend_path`
  - `QNN_VERBOSE`: Use the log level `ORT_LOGGING_LEVEL_VERBOSE`
- Developers can use the environment variables above to save the artifacts of QNN-EP testcases to a directory named with `<TestSuite>_<TestName>`

```
.
├── QnnCPUBackendTests_BatchNorm2D_fp32             # RunQnnModelTest
│   ├── dumped_f32_model.onnx                       # float32 ONNX model
│   ├── QNNExecutionProvider_QNN_XXXX_X_X.dlc
│   └── QNNExecutionProvider_QNN_XXXX_X_X.json
├── QnnHTPBackendTests_BatchNorm_FP16               # TestFp16ModelAccuracy
│   ├── dumped_f16_model.onnx                       # float16 ONNX model
│   ├── dumped_f32_model.onnx                       # float32 ONNX model
│   ├── QNNExecutionProvider_QNN_XXXX_X_X.dlc
│   └── QNNExecutionProvider_QNN_XXXX_X_X.json
└── QnnHTPBackendTests_BatchNorm2D_U8U8S32          # TestQDQModelAccuracy
    ├── dumped_f32_model.onnx                       # float32 ONNX model
    ├── dumped_qdq_model.onnx                       # QDQ ONNX model
    ├── QNNExecutionProvider_QNN_XXXX_X_X.dlc
    └── QNNExecutionProvider_QNN_XXXX_X_X.json

# All artifact files are placed under the current working directory from which the test binary is invoked.
```

### Motivation and Context

- The JSON QNN graph and DLC help backend developers debug performance and accuracy issues.
- By comparing the ONNX model with the JSON QNN graph/DLC, we can locate issues in graph manipulation.

* [webgpu] Use multiplication instead of pow if exponent is 2 (microsoft#26667)

### Description

More accurately compute Pow(2.0) on WebGPU EP.

Reproduction script:

```py
from onnx import helper, TensorProto
import onnxruntime as ort
import numpy as np

# 1. Create the ONNX model
# Define input and output
input_info = helper.make_tensor_value_info('X', TensorProto.FLOAT, [1, 1])
output_info = helper.make_tensor_value_info('Y', TensorProto.FLOAT, [1, 1])

# Create a constant tensor for the exponent (2.0)
exponent_tensor = helper.make_tensor('exponent', TensorProto.FLOAT, [], [2.0])
exponent_node = helper.make_node('Constant', [], ['exponent_out'], value=exponent_tensor)

# Create the Pow node
# Pow takes two inputs: Base (X) and Power (exponent_out)
pow_node = helper.make_node(
    'Pow',
    inputs=['X', 'exponent_out'],
    outputs=['Y'],
    name='PowNode'
)

# Create the graph
graph_def = helper.make_graph(
    [exponent_node, pow_node],
    'test-model',
    [input_info],
    [output_info]
)

# Create the model
model_def = helper.make_model(graph_def, producer_name='onnx-example')
opset = model_def.opset_import[0]
opset.version = 13  # Ensure opset version supports the operations

# 2. Convert model to string (bytes)
model_str = model_def.SerializeToString()

# 3. Prepare input data
np.random.seed(0)
input_data = np.array([[-2e3]], dtype=np.float32)

# 4. Run on CPUExecutionProvider
sess_cpu = ort.InferenceSession(model_str, providers=['CPUExecutionProvider'])
res_cpu = sess_cpu.run(['Y'], {'X': input_data})[0]
print("CPU Result:", res_cpu)

# 5. Run on WebGpuExecutionProvider
sess_webgpu = ort.InferenceSession(model_str, providers=['WebGpuExecutionProvider'])
res_webgpu = sess_webgpu.run(['Y'], {'X': input_data})[0]
print("WebGPU Result:", res_webgpu)

# Compare results
diff = np.abs(res_cpu - res_webgpu)
max_diff = diff.max().item()
assert max_diff < 1e-5, f"Results do not match within tolerance! Max diff: {max_diff}"
print("Results match!")
```

currently produces

```
CPU Result: [[4.e+06]]
WebGPU Result: [[3.999999e+06]]
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[1], line 56
     54 diff = np.abs(res_cpu - res_webgpu)
     55 max_diff = diff.max().item()
---> 56 assert max_diff < 1e-5, f"Results do not match within tolerance! Max diff: {max_diff}"
     57 print("Results match!")

AssertionError: Results do not match within tolerance! Max diff: 1.0
```

but with this PR:

```
CPU Result: [[4.e+06]]
WebGPU Result: [[4.e+06]]
Results match!
```

### Motivation and Context

The imprecise result leads to downstream inaccuracies for certain models, especially those that compute pow(x,2) on larger values.

cc @guschmue

* Avoid creation of temporary protobuf object (microsoft#26681)

### Description

While profiling session creation time for large graphs (large in number of nodes, not size of tensors), we noticed that the creation and subsequent destruction of protobuf objects was the major hotspot. This PR avoids creating the temporary object.

Signed-off-by: Christian Bourjau <[email protected]>

* Use `std::string_view` directly as key to `absl::flat_hash_map::find` (microsoft#26682)

### Description

Use `std::string_view` directly as the key in the `find` method of `flat_hash_map`. This part of the absl documentation may provide further insights: https://abseil.io/docs/cpp/guides/container#heterogeneous-lookup

### Motivation and Context

We noticed this when profiling the session creation of large models (in terms of the number of nodes).

Signed-off-by: Christian Bourjau <[email protected]>

* [webgpu] Convert i32 to u32 in uniforms (microsoft#26676)

In debug mode, the following error is reported: `webgpu_context.cc:257 Run Uniform variable[5] (head_size) data type mismatch in program "SplitPackedQKVWithRotaryEmbeddingAndCopyKV", Expected: u32, Actual: i32`.
Release mode does not report the issue. This change converts the i32 values to u32 to avoid it.

* [webgpu] Fix BatchNormalization ShapeInferenceError for 2D inputs (microsoft#26659)

### Description

Test model (happens with any 2D inputs): [2191__visual_projection_visual_projection.1_BatchNormalization.onnx.zip](https://github.com/user-attachments/files/23758390/2191__visual_projection_visual_projection.1_BatchNormalization.onnx.zip)

Command:

```
python -c "import onnxruntime as ort; ort.InferenceSession('2191__visual_projection_visual_projection.1_BatchNormalization.onnx', providers=['WebGpuExecutionProvider'])"
```

Before (failure):

```
Op (BatchNormalization) [ShapeInferenceError] Tensor must have at least 3 dimensions to convert between channels first and channels last.
```

After (success):

```
(nothing, meaning success)
```

### Motivation and Context

This fixes BatchNormalization on WebGPU, matching the CPU version.

cc @guschmue

* Clear cuda error on unsupported CudaMemPool test (microsoft#26629)

### Description

The CudaMemPool test checks whether the feature is supported in a given environment. We need to clear the CUDA error so that it does not affect subsequent tests.

### Motivation and Context

Potential test failure.

* [QNN-EP] Include detailed error message in the returned status (microsoft#26546)

### Description

The original error message only shows: "Failed to setup QNN input tensors for graph: <graph_name>"

This change adds more detailed error information by logging the failure reason from [SetupTensors](https://github.com/microsoft/onnxruntime/blob/ea55c160a36d658eae61a4c7aeda6cb55dd54dec/onnxruntime/core/providers/qnn/builder/qnn_model.cc#L386), making it easier to debug issues.

### Motivation and Context

Users require detailed error logging for the ORT online context binary generation.

* add support for int32_t in webgpu / slice (microsoft#26693)

fix for microsoft#26690

* [webgpu] Remove `global_id` and `workgroup_id` in gemm_utils.cc (microsoft#26662)

### Description

This patch replaces `global_id` and `workgroup_id` with `logical_global_id` and `logical_workgroup_id`, which are computed from `workgroup_idx` and the dispatch workgroup sizes set in `ProgramBase::SetDispatchGroupSize()`.

### Motivation and Context

We shouldn't use `global_id` or `workgroup_id` directly because the dispatch workgroup sizes may be normalized in `ProgramManager::NormalizeDispatchGroupSize()`.

* [webgpu] Correct definition of large numbers, fixes softmax(max_negative_number) in float32 (microsoft#26670)

### Description

The correct definition of the most negative float32 number is `-3.40282346638528e+38`, according to IEEE 754, but it is being incorrectly registered inline as a truncated version `-3.402823e+38f`.

```py
>>> import numpy as np
>>> np.finfo(np.float32).min
np.float32(-3.4028235e+38)
>>> np.finfo(np.float32).min.item()
-3.4028234663852886e+38
```

For this reason, values less than this threshold were handled incorrectly. While this may seem like a small/irrelevant detail, it's essential in attention masking, where we do in fact use this value, leading to large numerical errors down the line.

Reproduction:

```py
from onnx import helper, TensorProto
import onnxruntime as ort
import numpy as np

# 1. Create the ONNX model
# Define input and output
input_shape = [1, 2]
input_info = helper.make_tensor_value_info('X', TensorProto.FLOAT, input_shape)
output_info = helper.make_tensor_value_info('Y', TensorProto.FLOAT, input_shape)

# Create the Softmax node
# Softmax takes one input: X
softmax_node = helper.make_node(
    'Softmax',
    inputs=['X'],
    outputs=['Y'],
    name='SoftmaxNode',
    axis=-1  # Default axis is -1, usually applied to the last dimension
)

# Create the graph
graph_def = helper.make_graph(
    [softmax_node],
    'test-model',
    [input_info],
    [output_info]
)

# Create the model
model_def = helper.make_model(graph_def, producer_name='onnx-example')
opset = model_def.opset_import[0]
opset.version = 13  # Ensure opset version supports the operations

# 2. Convert model to string (bytes)
model_str = model_def.SerializeToString()

# 3. Prepare input data
np.random.seed(0)
input_data = np.array(
    [[-3.40282346638528e+38, -3.40282346638528e+38]]
    # [[-3.4028234663852886e+38, -3.4028234663852886e+38]]
).astype(np.float32)
print(input_data.tolist())

# 4. Run on CPUExecutionProvider
sess_cpu = ort.InferenceSession(model_str, providers=['CPUExecutionProvider'])
res_cpu = sess_cpu.run(['Y'], {'X': input_data})[0]
print("CPU Result:", res_cpu)

# 5. Run on WebGpuExecutionProvider
sess_webgpu = ort.InferenceSession(model_str, providers=['WebGpuExecutionProvider'])
res_webgpu = sess_webgpu.run(['Y'], {'X': input_data})[0]
print("WebGPU Result:", res_webgpu)

# Compare results
diff = np.abs(res_cpu - res_webgpu)
max_diff = diff.max().item()
print(diff)
print(f"Max diff: {max_diff}")
assert max_diff < 1e-5, f"Results do not match within tolerance! Max diff: {max_diff}"
print("Results match!")
```

Before:

```
[[-3.4028234663852886e+38, -3.4028234663852886e+38]]
CPU Result: [[0.5 0.5]]
WebGPU Result: [[0. 0.]]
[[0.5 0.5]]
Max diff: 0.5
AssertionError: Results do not match within tolerance! Max diff: 0.5
```

After:

```
[[-3.4028234663852886e+38, -3.4028234663852886e+38]]
CPU Result: [[0.5 0.5]]
WebGPU Result: [[0.5 0.5]]
[[0. 0.]]
Max diff: 0.0
Results match!
```

cc @guschmue

* [TRT/TRT RTX EP] Fix bug for missing outputs in the returning ComputeCapability/IndexedSubGraph (microsoft#26444)

### Description

For TRT EP's `GetCapability()`, in some cases `GetSubGraph()` won't add the graph's output to the `ComputeCapability/IndexedSubGraph` returned to ORT.

The issue comes from the following code:

````c++
...
if (node->GetOutputEdgesCount() > node->OutputDefs().size()) {
  ... // execute here
} else {
  ...
  if (graph_output_names.find(output->Name()) != graph_output_names.end()) {
    graph_outputs_to_add[output] = output_order; // missing this
  }
}
````

Update TRT RTX EP as well.

### Motivation and Context

microsoft#25373

* [ROCM] Remove docker, contrib ops, ci scripts related to ROCM EP (microsoft#26697)

### Description

This is a follow-up to microsoft#25181 that removes ROCM EP related files to avoid confusion. Documents will be updated later.
### Motivation and Context microsoft#26692 --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Christian Bourjau <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: fs-eire <[email protected]> Co-authored-by: Wenqin Yang <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: xieofxie <[email protected]> Co-authored-by: hualxie <[email protected]> Co-authored-by: Jiajia Qin <[email protected]> Co-authored-by: qti-hungjuiw <[email protected]> Co-authored-by: Joshua Lochner <[email protected]> Co-authored-by: Christian Bourjau <[email protected]> Co-authored-by: Xiaofei Han <[email protected]> Co-authored-by: Dmitri Smirnov <[email protected]> Co-authored-by: chunghow-qti <[email protected]> Co-authored-by: Guenther Schmuelling <[email protected]> Co-authored-by: Jiawei Shao <[email protected]> Co-authored-by: Chi Lo <[email protected]> Co-authored-by: Tianlei Wu <[email protected]>
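The float32 lower-bound issue from microsoft#26670 in the commit message above can be checked without a GPU. The following is a minimal sketch using only the Python standard library: it rounds both literals to float32 and shows that the truncated one is strictly greater than the true lowest value. The `seeded_max` helper is hypothetical and only illustrates one way such a constant could matter (as the starting value of a max-reduction); the commit message does not show how the WebGPU shader actually uses it.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Lowest finite float32, as printed by np.finfo(np.float32).min.item().
FLOAT32_LOWEST = to_float32(-3.4028234663852886e+38)
# The truncated literal that the commit message says was used before the fix.
TRUNCATED_LOWEST = to_float32(-3.402823e+38)


def seeded_max(values, seed):
    """Max-reduction that starts from `seed` (hypothetical helper)."""
    result = seed
    for v in values:
        result = max(result, v)
    return result


# A fully masked row, as in the reproduction script's input.
masked_row = [FLOAT32_LOWEST, FLOAT32_LOWEST]

# The truncated constant sits above the true lower bound ...
print(TRUNCATED_LOWEST > FLOAT32_LOWEST)
# ... so a reduction seeded with it returns the seed, not the row maximum.
print(seeded_max(masked_row, TRUNCATED_LOWEST) == FLOAT32_LOWEST)
# Seeding with the exact lowest value returns the real row maximum.
print(seeded_max(masked_row, FLOAT32_LOWEST) == FLOAT32_LOWEST)
```

The gap between the two constants is only a couple of float32 steps, but at this magnitude each step is very large in absolute terms, which is consistent with the commit message's point that masked values fell below the threshold.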
1 parent bb738ba commit 39d6db5


161 files changed, +1176 -8816 lines changed


.github/workflows/android.yml

Lines changed: 3 additions & 3 deletions
@@ -27,7 +27,7 @@ jobs:
     runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Ubuntu2204-AMD-CPU"]
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false

@@ -112,7 +112,7 @@ jobs:
   android_nnapi_ep:
     runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Ubuntu2204-AMD-CPU"]
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6

       - name: Use jdk 17
         uses: actions/setup-java@v5
@@ -187,7 +187,7 @@ jobs:
     name: Android CI Pipeline
     runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Ubuntu2204-AMD-CPU"]
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6

       - name: Use jdk 17
         uses: actions/setup-java@v5

.github/workflows/cffconvert.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ jobs:
     runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Ubuntu2204-AMD-CPU"]
     steps:
       - name: Check out a copy of the repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6

       - name: Check whether the citation metadata from CITATION.cff is valid
         uses: citation-file-format/[email protected]

.github/workflows/codeql.yml

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ jobs:

     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6

       # Initializes the CodeQL tools for scanning.
       - name: Initialize CodeQL

.github/workflows/gradle-wrapper-validation.yml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ jobs:
     name: "Validation"
     runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Ubuntu2204-AMD-CPU"]
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6
       - uses: gradle/actions/wrapper-validation@v5
 concurrency:
   group: ${{ github.workflow }}-${{ github.event_name == 'pull_request' && github.ref || github.sha }}

.github/workflows/ios.yml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ jobs:
     runs-on: macos-14
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false
       - uses: microsoft/onnxruntime-github-actions/[email protected]

.github/workflows/lint.yml

Lines changed: 4 additions & 4 deletions
@@ -17,7 +17,7 @@ jobs:
     name: Optional Lint
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6
       - name: misspell # Check spellings as well
         uses: reviewdog/action-misspell@v1
         with:
@@ -42,7 +42,7 @@ jobs:
       contents: read
       security-events: write
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6
       - name: Setup Python
         uses: actions/setup-python@v6
         with:
@@ -87,7 +87,7 @@ jobs:
     name: Optional Lint C++
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6
       - name: Update PATH
         run: |
           echo "$HOME/.local/bin" >> "$GITHUB_PATH"
@@ -116,7 +116,7 @@ jobs:
     name: Lint JavaScript
     runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Ubuntu2204-AMD-CPU"]
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6
       - uses: actions/setup-node@v6
         with:
           node-version: 20

.github/workflows/linux-wasm-ci-build-and-test-workflow.yml

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ jobs:

     steps:
       - name: Checkout code
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: recursive


.github/workflows/linux_cuda_ci.yml

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ jobs:
       packages: read
     steps:
       - name: Checkout code
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6

       - uses: microsoft/onnxruntime-github-actions/[email protected]
         id: build_docker_image_step

.github/workflows/linux_minimal_build.yml

Lines changed: 10 additions & 10 deletions
@@ -28,7 +28,7 @@ jobs:
       packages: write
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false

@@ -65,7 +65,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false
       - uses: actions/setup-node@v6
@@ -122,7 +122,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false
       - uses: actions/setup-node@v6
@@ -156,7 +156,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false
       - uses: actions/setup-node@v6
@@ -188,7 +188,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false
       - uses: actions/setup-node@v6
@@ -222,7 +222,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false
       - uses: actions/setup-node@v6
@@ -286,7 +286,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false

@@ -363,7 +363,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false

@@ -430,7 +430,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false

@@ -505,7 +505,7 @@ jobs:
       id-token: write # If using OIDC for ACR login
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           submodules: false
       - uses: actions/setup-node@v6

.github/workflows/linux_tensorrt_ci.yml

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ jobs:
       packages: read
     steps:
       - name: Checkout code
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6

       # --- Build the Docker image needed for testing ---
       - name: Build Docker Image for Testing