Add documentation for parse_xml and parse_winlog functions #147
Merged
4 tasks
🧹 Preview deployment has been cleaned up.
c1a2b4c to 00e7f7e
e69766f to 9fb7c68
59aeaf1 to 3c81715
mavam added a commit to tenzir/tenzir that referenced this pull request on Jan 6, 2026
## Summary
Introduce two new TQL functions for parsing XML-formatted data:
- **`parse_xml`**: Generic XML parsing with XPath-based element selection, configurable attribute prefixes, namespace handling, and depth limiting
- **`parse_winlog`**: Specialized Windows Event Log XML parsing that handles `System` and `EventData` sections with smart output formatting:
  - Named `<Data Name="...">` elements → record fields
  - Unnamed `<Data>` elements → list output
  - Other elements (like `UserData`) → generic nested structure
Both functions use libexpat for efficient streaming XML parsing and
integrate with Tenzir's multi-series builder for schema inference.
## Examples
```tql
// Parse generic XML with XPath
from {xml: "<catalog><book id=\"1\"><title>Guide</title></book></catalog>"}
result = xml.parse_xml(xpath="//book")
// Parse Windows Event Log
from {xml: "<Event>...</Event>"}
event = xml.parse_winlog()
```
## Test plan
- [x] Integration tests for `parse_xml` (basic, xpath, attr_prefix, namespaces)
- [x] Integration tests for `parse_winlog`
- [x] Microsoft ground truth tests (Events 4624, 4625, 4627, 4648, 4688 from Microsoft Security Auditing docs)
- [x] Build passes with libexpat dependency
## Related
- Documentation: tenzir/docs#147
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Document two new TQL functions for XML parsing:
- parse_xml: Generic XML parsing with XPath support, configurable attribute prefixes, text keys, namespace handling, and force_list for consistent array output
- parse_winlog: Specialized Windows Event Log XML parsing that handles System, EventData, UserData, and RenderingInfo sections

Also add a "Parse Windows Event Log XML" section to the Windows Event Logs integration guide with practical examples for filtering and extracting event data.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Clarify EventData formatting: named Data → record, unnamed → list
- Note UserData is preserved as nested structure
- Replace example with real Microsoft Event 4688 (process creation)

- Use raw strings (r#"..."#) instead of invalid triple quotes
- Change @value to line (read_lines output field)
- Use verbatim tenzir output for example results
- Improve second example to show unnamed Data → list behavior

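The EventData rule from the commits above (named Data → record, unnamed → list) can be illustrated outside of Tenzir. The following is a minimal Python sketch of that mapping, not the `parse_winlog` implementation: the helper name `event_data_to_value` is hypothetical, the sample XML omits the Windows event namespace for brevity, and falling back to a list when only some `<Data>` elements are named is an assumption that the PR does not confirm.

```python
import xml.etree.ElementTree as ET


def event_data_to_value(event_data: ET.Element):
    """Return a record (dict) if every <Data> child is named, else a list."""
    data = [child for child in event_data if child.tag == "Data"]
    if data and all("Name" in d.attrib for d in data):
        # Named <Data Name="..."> elements become record fields.
        return {d.attrib["Name"]: d.text for d in data}
    # Unnamed <Data> elements become list items (assumed fallback for mixed input).
    return [d.text for d in data]


named = ET.fromstring(
    '<EventData><Data Name="SubjectUserName">alice</Data>'
    '<Data Name="NewProcessName">cmd.exe</Data></EventData>'
)
unnamed = ET.fromstring(
    "<EventData><Data>first</Data><Data>second</Data></EventData>"
)

print(event_data_to_value(named))
print(event_data_to_value(unnamed))
```

The sketch leaves all values as strings; type handling is a separate concern.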
Convert See Also sections to use Fn and Integration components. Add parse_xml and parse_winlog to functions overview page.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

Remind to update functions.mdx or operators.mdx when adding, removing, or renaming function/operator reference pages.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

The read_lines operator produces events with a `line` field, not `@value`.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

Add parameter documentation and example showing how to transform attribute-keyed XML elements (like Qualys KEY[@name]) into flat records.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

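To make the attribute-keyed transformation concrete, here is a small Python sketch of the idea: sibling elements that share a tag and carry their key in an attribute are folded into one flat record. The function name `flatten_keyed` and the sample document are hypothetical (the sample is not real Qualys output), and the sketch does not show the `parse_xml` parameter that the commit documents.

```python
import xml.etree.ElementTree as ET


def flatten_keyed(parent: ET.Element, tag: str = "KEY", attr: str = "name") -> dict:
    """Fold <KEY name="k">v</KEY> children of `parent` into {"k": "v"}."""
    record = {}
    for child in parent.findall(tag):
        key = child.get(attr)
        if key is not None:  # skip elements that lack the key attribute
            record[key] = child.text
    return record


# Hypothetical input shaped like attribute-keyed XML.
doc = ET.fromstring(
    '<HOST><KEY name="ip">10.0.0.1</KEY><KEY name="os">Linux</KEY></HOST>'
)
flat = flatten_keyed(doc)
print(flat)
```

If two elements share the same key, the later one overwrites the earlier one in this sketch; a real transformation may need a different policy.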
Using read_lines for XML input splits documents at newlines, producing invalid XML fragments. For multiple events, use read_delimited with </Event> as separator. For single events, use read_all.
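The difference between the two splitting strategies can be reproduced in a few lines of Python. This is only an illustration of the reasoning, not of the Tenzir operators: `split_events` and `is_well_formed` are hypothetical helpers, and re-appending the `</Event>` separator to each chunk is a choice made here so that every chunk is a complete document.

```python
import xml.etree.ElementTree as ET

SEPARATOR = "</Event>"


def split_events(stream: str) -> list[str]:
    """Split a stream at the closing tag and restore it on each chunk."""
    chunks = stream.split(SEPARATOR)
    # The final chunk is whatever follows the last separator, so drop it.
    return [chunk.strip() + SEPARATOR for chunk in chunks[:-1] if chunk.strip()]


def is_well_formed(text: str) -> bool:
    """Report whether `text` parses as a standalone XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


stream = """<Event>
  <System><EventID>4688</EventID></System>
</Event>
<Event>
  <System><EventID>4624</EventID></System>
</Event>
"""

events = split_events(stream)   # one chunk per event
by_line = stream.splitlines()   # what line-based splitting would produce

print(len(events), all(is_well_formed(e) for e in events))
print(is_well_formed(by_line[0]))  # the lone "<Event>" line is a fragment
```

Splitting at the closing tag keeps each event intact even when it spans several lines, which is why line-based reading is unsuitable for pretty-printed XML.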
014b20c to 2e9ec15
Reflect the implementation changes where System fields like EventID, Version, Level, etc. are now integers, TimeCreated is a direct timestamp, and EventData values use type inference.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
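
The typing change described in this commit can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the actual implementation: the set `INT_FIELDS` covers only the three fields named in the commit message, the helpers `parse_system` and `infer` are hypothetical, the inference order (integer, then float, then string) is a guess, and the sample timestamp is simplified to a form that `datetime.fromisoformat` accepts on older Python versions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Only the fields named in the commit message; the real list may be longer.
INT_FIELDS = {"EventID", "Version", "Level"}


def parse_system(system: ET.Element) -> dict:
    """Convert selected System children to integers and TimeCreated to a timestamp."""
    record = {}
    for child in system:
        if child.tag in INT_FIELDS:
            record[child.tag] = int(child.text)
        elif child.tag == "TimeCreated":
            # The timestamp lives in the SystemTime attribute.
            record[child.tag] = datetime.fromisoformat(child.attrib["SystemTime"])
        else:
            record[child.tag] = child.text
    return record


def infer(text):
    """Assumed inference order for EventData values: int, then float, else string."""
    if text is None:
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


system = ET.fromstring(
    "<System><EventID>4688</EventID><Version>2</Version><Level>0</Level>"
    '<TimeCreated SystemTime="2015-11-12T02:24:52+00:00"/>'
    "<Computer>DC01</Computer></System>"
)
parsed = parse_system(system)
print(parsed)
print(infer("4"), infer("0.5"), infer("cmd.exe"))
```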
70f9b7d to fa88de6
🚀 Preview deployed! Visit the preview at: https://tenzir-docs-preview-147.surge.sh
This preview will be updated automatically on new commits.
Summary
- parse_xml function with XPath support, attribute prefix configuration, namespace handling, and examples
- parse_winlog function optimized for Windows Event Log XML format with System, EventData, UserData, and RenderingInfo section handling

Test plan
Related
🤖 Generated with Claude Code