Conversation

@mavam (Member) commented Dec 30, 2025

Summary

  • Add reference documentation for parse_xml function with XPath support, attribute prefix configuration, namespace handling, and examples
  • Add reference documentation for parse_winlog function optimized for Windows Event Log XML format with System, EventData, UserData, and RenderingInfo section handling
  • Add "Parse Windows Event Log XML" section to the Windows Event Logs integration guide with practical filtering and extraction examples

Test plan

  • Preview documentation renders correctly
  • All code examples are syntactically valid TQL
  • Links to related functions work correctly

Related

🤖 Generated with Claude Code

@github-actions github-actions bot added the reference (Reference documentation) and integration (Integration documentation) labels Dec 30, 2025
github-actions bot (Contributor) commented Dec 30, 2025

🧹 Preview deployment has been cleaned up.

@mavam mavam force-pushed the topic/xml-parsing-docs branch from c1a2b4c to 00e7f7e Compare January 4, 2026 15:48
@github-actions github-actions bot added the site (Site infrastructure) label Jan 4, 2026
@mavam mavam force-pushed the topic/xml-parsing-docs branch from e69766f to 9fb7c68 Compare January 5, 2026 12:35
@mavam mavam marked this pull request as ready for review January 6, 2026 17:49
@mavam mavam force-pushed the topic/xml-parsing-docs branch from 59aeaf1 to 3c81715 Compare January 6, 2026 17:56
mavam added a commit to tenzir/tenzir that referenced this pull request Jan 6, 2026
## Summary

Introduce two new TQL functions for parsing XML-formatted data:

- **`parse_xml`**: Generic XML parsing with XPath-based element
selection, configurable attribute prefixes, namespace handling, and
depth limiting
- **`parse_winlog`**: Specialized Windows Event Log XML parsing that
handles `System` and `EventData` sections with smart output formatting:
  - Named `<Data Name="...">` elements → record fields
  - Unnamed `<Data>` elements → list output
  - Other elements (like `UserData`) → generic nested structure

Both functions use libexpat for efficient streaming XML parsing and
integrate with Tenzir's multi-series builder for schema inference.
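The `EventData` mapping described in the list above can be illustrated with a small Python sketch. This is not Tenzir's implementation (which the summary says is built on libexpat); the function name `event_data_to_value`, the sample field values, and the fallback to a list when named and unnamed elements are mixed are all assumptions made for illustration. The sample fragments also omit the XML namespace that real Windows events carry.

```python
import xml.etree.ElementTree as ET


def event_data_to_value(event_data):
    """Map <Data> children to a dict (all named) or a list (otherwise).

    Hypothetical helper: mirrors the rule "named Data -> record fields,
    unnamed Data -> list" from the summary. Mixed input falling back to
    a list is an assumption, not documented behavior.
    """
    data = [child for child in event_data if child.tag == "Data"]
    if data and all("Name" in d.attrib for d in data):
        return {d.attrib["Name"]: d.text for d in data}
    return [d.text for d in data]


# Illustrative fragments, not taken from the test suite.
named = ET.fromstring(
    '<EventData>'
    '<Data Name="SubjectUserName">alice</Data>'
    '<Data Name="NewProcessName">cmd.exe</Data>'
    '</EventData>'
)
unnamed = ET.fromstring(
    "<EventData><Data>first</Data><Data>second</Data></EventData>"
)

named_result = event_data_to_value(named)
unnamed_result = event_data_to_value(unnamed)
print(named_result)
print(unnamed_result)
```

The named case yields one field per `Name` attribute, while the unnamed case keeps the values in document order.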

## Examples

```tql
// Parse generic XML with XPath
from {xml: "<catalog><book id=\"1\"><title>Guide</title></book></catalog>"}
result = xml.parse_xml(xpath="//book")

// Parse Windows Event Log
from {xml: "<Event>...</Event>"}
event = xml.parse_winlog()
```
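For readers who want to see what XPath-based selection with an attribute prefix produces, here is a rough Python analogue of the first example. It is only a sketch: the `@` prefix, the helper names `element_to_record` and `select`, and the output shape are assumptions, and the actual `parse_xml` defaults may differ. Python's `ElementTree` supports only a subset of XPath, so `.//book` stands in for `//book`.

```python
import xml.etree.ElementTree as ET


def element_to_record(elem, attr_prefix="@"):
    """Convert an element to a dict; attribute keys get `attr_prefix`.

    Hypothetical helper. Repeated child tags overwrite each other here,
    which a real parser would need to handle (for example with lists).
    """
    record = {f"{attr_prefix}{k}": v for k, v in elem.attrib.items()}
    for child in elem:
        if len(child) == 0 and not child.attrib:
            # Leaf element without attributes: keep its text.
            record[child.tag] = (child.text or "").strip()
        else:
            record[child.tag] = element_to_record(child, attr_prefix)
    return record


def select(xml_text, xpath, attr_prefix="@"):
    """Return one record per element matched by `xpath`."""
    root = ET.fromstring(xml_text)
    return [element_to_record(e, attr_prefix) for e in root.findall(xpath)]


doc = '<catalog><book id="1"><title>Guide</title></book></catalog>'
records = select(doc, ".//book")
print(records)  # [{'@id': '1', 'title': 'Guide'}]
```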

## Test plan

- [x] Integration tests for `parse_xml` (basic, xpath, attr_prefix,
namespaces)
- [x] Integration tests for `parse_winlog`
- [x] Microsoft ground truth tests (Events 4624, 4625, 4627, 4648, 4688
from Microsoft Security Auditing docs)
- [x] Build passes with libexpat dependency

## Related

- Documentation: tenzir/docs#147

🤖 Generated with [Claude Code](https://claude.com/claude-code)
mavam and others added 11 commits January 7, 2026 07:42
Document two new TQL functions for XML parsing:

- parse_xml: Generic XML parsing with XPath support, configurable
  attribute prefixes, text keys, namespace handling, and force_list
  for consistent array output
- parse_winlog: Specialized Windows Event Log XML parsing that
  handles System, EventData, UserData, and RenderingInfo sections

Also add a "Parse Windows Event Log XML" section to the Windows
Event Logs integration guide with practical examples for filtering
and extracting event data.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Clarify EventData formatting: named Data → record, unnamed → list
- Note UserData is preserved as nested structure
- Replace example with real Microsoft Event 4688 (process creation)
- Use raw strings (r#"..."#) instead of invalid triple quotes
- Change @value to line (read_lines output field)
- Use verbatim tenzir output for example results
- Improve second example to show unnamed Data → list behavior

Convert See Also sections to use Fn and Integration components.
Add parse_xml and parse_winlog to functions overview page.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

Remind to update functions.mdx or operators.mdx when adding,
removing, or renaming function/operator reference pages.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

The read_lines operator produces events with a `line` field,
not `@value`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

Add parameter documentation and example showing how to transform
attribute-keyed XML elements (like Qualys KEY[@name]) into flat records.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
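
The transformation of attribute-keyed elements into a flat record, mentioned in the commit message above, can be pictured with the following Python sketch. The commit does not name the parameter involved, so nothing here reflects the actual `parse_xml` option; `flatten_keyed`, the `HOST` wrapper element, and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten_keyed(parent, tag="KEY", key_attr="name"):
    """Turn <KEY name="k">v</KEY> children into a flat {k: v} dict.

    Hypothetical helper illustrating the idea only. Children without
    the key attribute are skipped.
    """
    return {
        child.attrib[key_attr]: (child.text or "")
        for child in parent.findall(tag)
        if key_attr in child.attrib
    }


# Made-up sample in the style of attribute-keyed XML.
host = ET.fromstring(
    '<HOST><KEY name="ip">10.0.0.1</KEY><KEY name="os">Linux</KEY></HOST>'
)
flat = flatten_keyed(host)
print(flat)  # {'ip': '10.0.0.1', 'os': 'Linux'}
```
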
Using read_lines for XML input splits documents at newlines, producing
invalid XML fragments. For multiple events, use read_delimited with
</Event> as separator. For single events, use read_all.
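
The following Python sketch shows why line-based splitting fails for multi-line XML while splitting on the closing tag works. It is an analogy, not TQL: the sample stream and the helper `parses_as_event` are made up, and re-appending the separator is something this sketch does by hand, which says nothing about how `read_delimited` treats the separator.

```python
import xml.etree.ElementTree as ET

# Two pretty-printed events in one stream (illustrative data).
stream = (
    "<Event>\n  <System><EventID>4624</EventID></System>\n</Event>\n"
    "<Event>\n  <System><EventID>4688</EventID></System>\n</Event>\n"
)


def parses_as_event(text):
    """Return True only if `text` is well-formed XML rooted at <Event>."""
    try:
        return ET.fromstring(text).tag == "Event"
    except ET.ParseError:
        return False


# Splitting at newlines: some lines may be well-formed on their own,
# but none of them is a complete event.
lines = [line for line in stream.splitlines() if line.strip()]
complete_from_lines = [line for line in lines if parses_as_event(line)]

# Splitting at the closing tag and restoring it yields whole events.
separator = "</Event>"
chunks = [c.strip() + separator for c in stream.split(separator) if c.strip()]
event_ids = [ET.fromstring(c).findtext("System/EventID") for c in chunks]

print(len(complete_from_lines))  # 0
print(event_ids)  # ['4624', '4688']
```
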
@mavam mavam force-pushed the topic/xml-parsing-docs branch from 014b20c to 2e9ec15 Compare January 7, 2026 06:42
Reflect the implementation changes where System fields like EventID,
Version, Level, etc. are now integers, TimeCreated is a direct timestamp,
and EventData values use type inference.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@mavam mavam force-pushed the topic/xml-parsing-docs branch from 70f9b7d to fa88de6 Compare January 7, 2026 09:24
@mavam mavam merged commit 8221bbf into main Jan 7, 2026
5 checks passed
@mavam mavam deleted the topic/xml-parsing-docs branch January 7, 2026 11:03
github-actions bot (Contributor) commented Jan 7, 2026

🚀 Preview deployed!

Visit the preview at: https://tenzir-docs-preview-147.surge.sh

This preview will be updated automatically on new commits.
