Commit 3980ed8
XMLProcessor: Support namespaces (#126)
## Add Namespace Support to XMLProcessor
This PR upgrades `XMLProcessor` to fully support [XML Namespaces
1.1](https://www.w3.org/TR/xml-names11/).
Tags and attributes are now consistently interpreted according to their
declared namespaces, fixing compatibility with WordPress WXR files and
EPUB metadata.
New method signatures:
```php
public function next_tag( $query_or_namespace = null, $local_name_maybe = null );
public function get_tag_local_name();
public function get_tag_namespace();
public function get_attribute( $namespace, $local_name );
public function get_attribute_names_with_prefix( $full_namespace_prefix, $local_name_prefix );
public function set_attribute( $namespace, $local_name, $value );
```
Usage comparison:
```php
// Before
$processor->next_tag( 'wp:content' );
$processor->get_attribute( 'wp:post-type' );
// After
$processor->next_tag( 'http://wordpress.org/export/1.2/', 'content' );
// or
$processor->next_tag( [ 'http://wordpress.org/export/1.2/', 'content' ] );
$processor->next_tag( [ '*', 'content' ] );
$processor->get_attribute( 'http://wordpress.org/export/1.2/', 'post-type' );
```
## Rationale
The old parser treated tag and attribute names as opaque strings
(`wp:postmeta`, `wp:tag`, etc.), ignoring that these were syntactic
sugar for `{namespace}local-name`. This made it impossible to reliably
parse WXR files, which may use different namespace URIs for the same
`wp:` prefix.
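To make the problem concrete, here is a minimal sketch of how a qualified name expands to a `{namespace}local-name` pair. The helper `expand_qualified_name()` and the binding arrays are hypothetical and exist only for this illustration; they are not part of `XMLProcessor`, and the namespace URIs are sample values.

```php
<?php
/**
 * Hypothetical helper, not part of XMLProcessor: resolves a qualified
 * element name against the prefix bindings that are currently in scope.
 *
 * @param string $qualified_name Name as written in the document, e.g. "wp:post_type".
 * @param array  $bindings       Map of prefix => namespace URI ('' holds the default namespace).
 * @return array Two-element array: [ namespace URI, local name ].
 */
function expand_qualified_name( string $qualified_name, array $bindings ): array {
	$parts = explode( ':', $qualified_name, 2 );
	if ( 1 === count( $parts ) ) {
		// Unprefixed element names fall back to the default namespace, if any.
		return array( $bindings[''] ?? '', $parts[0] );
	}
	list( $prefix, $local_name ) = $parts;
	if ( ! array_key_exists( $prefix, $bindings ) ) {
		throw new InvalidArgumentException( "Unbound namespace prefix: {$prefix}" );
	}
	return array( $bindings[ $prefix ], $local_name );
}

// The same prefix bound to two different namespace URIs (sample values).
$wxr_1_1 = array( 'wp' => 'http://wordpress.org/export/1.1/' );
$wxr_1_2 = array( 'wp' => 'http://wordpress.org/export/1.2/' );
// A different prefix bound to the second namespace.
$renamed = array( 'export' => 'http://wordpress.org/export/1.2/' );

$a = expand_qualified_name( 'wp:post_type', $wxr_1_1 );
$b = expand_qualified_name( 'wp:post_type', $wxr_1_2 );
$c = expand_qualified_name( 'export:post_type', $renamed );

// $a and $b look identical as strings but name different elements,
// while $b and $c are written differently but name the same element.
var_dump( $a === $b, $b === $c );
```

Comparing the raw strings `wp:post_type` and `export:post_type` gets both cases wrong, which is why the new API asks for the namespace and the local name separately.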
## Implementation Details
* `$stack_of_open_elements` tracks the hierarchy of `XMLElement` frames
and the namespaces they define and remove.
* `set_attribute($ns, $attr, $value)` and `get_attribute($ns, $attr)`
accept the full namespace string as their first argument to force the
developer to take it into consideration.
* `next_tag()` and `matches_breadcrumbs()` accept two-tuples
`{$namespace, $local_tag_name}` in addition to string-based tag names,
which are still accepted. `*` wildcards are supported, too.
* `get_breadcrumbs()` returns an array of two-tuples `{$namespace,
$local_tag_name}`, e.g. `[['', 'root'], ['http://wp.org/export/1.2/',
'post']]`.
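The points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `XMLProcessor` internals: the class `OpenElementStack` and the function `breadcrumbs_match()` are hypothetical names, and matching the query against the innermost open elements is an assumption about how `matches_breadcrumbs()` behaves.

```php
<?php
/**
 * Hypothetical sketch of a stack of open elements in which every frame
 * carries the namespace bindings that are in scope for that element.
 */
final class OpenElementStack {
	/** @var array[] Frames of [ 'name' => [ namespace, local name ], 'bindings' => prefix => URI ]. */
	private $frames = array();

	/**
	 * @param string $qualified_name Name as written, e.g. "wp:post_id".
	 * @param array  $declarations   xmlns declarations on this tag (prefix => URI).
	 *                               An empty URI removes the binding.
	 */
	public function push( string $qualified_name, array $declarations = array() ): void {
		// Inherit the parent's bindings, then apply this element's declarations.
		$depth    = count( $this->frames );
		$bindings = $depth > 0 ? $this->frames[ $depth - 1 ]['bindings'] : array();
		foreach ( $declarations as $prefix => $uri ) {
			if ( '' === $uri ) {
				unset( $bindings[ $prefix ] );
			} else {
				$bindings[ $prefix ] = $uri;
			}
		}

		$parts = explode( ':', $qualified_name, 2 );
		if ( 2 === count( $parts ) ) {
			if ( ! isset( $bindings[ $parts[0] ] ) ) {
				throw new InvalidArgumentException( "Unbound namespace prefix: {$parts[0]}" );
			}
			$name = array( $bindings[ $parts[0] ], $parts[1] );
		} else {
			$name = array( $bindings[''] ?? '', $parts[0] );
		}
		$this->frames[] = array( 'name' => $name, 'bindings' => $bindings );
	}

	/** Closing an element discards its frame and, with it, its declarations. */
	public function pop(): void {
		array_pop( $this->frames );
	}

	/** @return array[] List of [ namespace, local name ] pairs, outermost first. */
	public function get_breadcrumbs(): array {
		return array_column( $this->frames, 'name' );
	}
}

/**
 * Hypothetical matcher: compares a query of two-tuples against the
 * innermost breadcrumbs. '*' matches any namespace or any local name.
 */
function breadcrumbs_match( array $breadcrumbs, array $query ): bool {
	$query = array_values( $query );
	if ( count( $query ) > count( $breadcrumbs ) ) {
		return false;
	}
	$offset = count( $breadcrumbs ) - count( $query );
	foreach ( $query as $i => $expected ) {
		list( $namespace, $local_name ) = $breadcrumbs[ $offset + $i ];
		if ( '*' !== $expected[0] && $expected[0] !== $namespace ) {
			return false;
		}
		if ( '*' !== $expected[1] && $expected[1] !== $local_name ) {
			return false;
		}
	}
	return true;
}

$stack = new OpenElementStack();
$stack->push( 'rss', array( 'wp' => 'http://wordpress.org/export/1.2/' ) );
$stack->push( 'channel' );
$stack->push( 'item' );
$stack->push( 'wp:post_id' );
$crumbs = $stack->get_breadcrumbs();
```

Storing a full copy of the bindings in each frame keeps the sketch short; popping a frame automatically restores whatever was in scope for the parent.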
## Testing instructions
Confirm that the CI tests pass, aside from the flaky network-related
ones.

1 parent 6222e29 commit 3980ed8
File tree
10 files changed, +2267 -713 lines changed
- components
- DataLiberation
- DataFormatConsumer
- EntityReader
- EntityWriter
- Tests
- wxr
- XML
- Tests