Skip to content

Commit 15b1de3

Browse files
vinitkumarampcode-comsourcery-ai[bot]
authored
Fix/review cleanup (#281)
* fix: replace mutable default argument in dicttoxml() xml_namespaces had a mutable default dict which is a classic Python footgun where mutations persist across calls. Changed to None with an 'or {}' guard inside the function body, matching the pattern already used in dicttoxml_fast.py. Amp-Thread-ID: https://ampcode.com/threads/T-019d6bf2-ce86-763d-8c62-08118f358cbe Co-authored-by: Amp <amp@ampcode.com> * fix: remove unused Dict and List imports from conftest.py These typing imports were dead code since the file already uses the built-in dict[...] and list[...] syntax everywhere. Amp-Thread-ID: https://ampcode.com/threads/T-019d6bf2-ce86-763d-8c62-08118f358cbe Co-authored-by: Amp <amp@ampcode.com> * fix: remove dead setUp/tearDown stubs from TestJson2xml These were unittest-style no-op methods that serve no purpose in pytest classes. Amp-Thread-ID: https://ampcode.com/threads/T-019d6bf2-ce86-763d-8c62-08118f358cbe Co-authored-by: Amp <amp@ampcode.com> * fix: hoist SystemRandom() to module level in dicttoxml Avoids creating a new SystemRandom instance on every make_id() call. The module-level _SAFE_RANDOM is reused across all invocations. Amp-Thread-ID: https://ampcode.com/threads/T-019d6bf2-ce86-763d-8c62-08118f358cbe Co-authored-by: Amp <amp@ampcode.com> * chore: stop tracking coverage data * Update json2xml/dicttoxml.py Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com> * Align Rust list wrapper semantics with Python * fix: warnings --------- Co-authored-by: Amp <amp@ampcode.com> Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
1 parent 33dc6ba commit 15b1de3

9 files changed

Lines changed: 140 additions & 21 deletions

File tree

.coverage

-52 KB
Binary file not shown.

json2xml/dicttoxml.py

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@
1212
from defusedxml.minidom import parseString
1313

1414
# Create a safe random number generator
15+
_SAFE_RANDOM = SystemRandom()
1516

1617
# Set up logging
1718
LOG = logging.getLogger("dicttoxml")
@@ -29,8 +30,7 @@ def make_id(element: str, start: int = 100000, end: int = 999999) -> str:
2930
Returns:
3031
str: The generated ID.
3132
"""
32-
safe_random = SystemRandom()
33-
return f"{element}_{safe_random.randint(start, end)}"
33+
return f"{element}_{_SAFE_RANDOM.randint(start, end)}"
3434

3535

3636
def get_unique_id(element: str) -> str:
@@ -641,7 +641,7 @@ def dicttoxml(
641641
item_wrap: bool = True,
642642
item_func: Callable[[str], str] = default_item_func,
643643
cdata: bool = False,
644-
xml_namespaces: dict[str, Any] = {},
644+
xml_namespaces: dict[str, Any] | None = None,
645645
list_headers: bool = False,
646646
xpath_format: bool = False,
647647
) -> bytes:
@@ -797,6 +797,8 @@ def dicttoxml(
797797

798798
output = []
799799
namespace_str = ""
800+
if xml_namespaces is None:
801+
xml_namespaces = {}
800802
for prefix in xml_namespaces:
801803
if prefix == 'xsi':
802804
for schema_att in xml_namespaces[prefix]:

lat.md/behavior.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,8 @@ Default output includes an XML declaration, wraps content in `all`, pretty print
1414

1515
[[json2xml/json2xml.py#Json2xml#to_xml]] calls [[json2xml/dicttoxml.py#dicttoxml]] with the configured wrapper, root, `attr_type`, `item_wrap`, `cdata`, and `list_headers` options. When `item_wrap=False`, list values repeat the parent tag instead of creating `<item>` children. When `pretty=False`, the library returns the serializer bytes directly.
1616

17+
The Rust fast path in [[rust/src/lib.rs#write_dict_contents]] and [[rust/src/lib.rs#write_list_contents]] mirrors those Python list-wrapper rules. `list_headers=True` suppresses the outer list container and repeats the parent tag only for nested dict items, while primitive items still use the same scalar tags that Python emits.
18+
1719
## XPath 3.1 format
1820

1921
XPath mode swaps the project-specific XML shape for the W3C `json-to-xml` mapping with typed element names and the XPath functions namespace.
@@ -24,4 +26,4 @@ When `xpath_format=True`, [[json2xml/dicttoxml.py#dicttoxml]] delegates payload
2426

2527
Pretty printing acts as a validation step, because the formatter reparses the generated XML before returning it.
2628

27-
[[json2xml/json2xml.py#Json2xml#to_xml]] uses `defusedxml.minidom.parseString` before `toprettyxml`. If the generated bytes are not well-formed XML, the converter raises `InvalidDataError` instead of returning broken pretty output.
29+
[[json2xml/json2xml.py#Json2xml#to_xml]] uses `defusedxml.minidom.parseString` before `toprettyxml`. If the generated bytes are not well-formed XML, the converter raises `InvalidDataError` instead of returning broken pretty output.

lat.md/tests.md

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,4 +28,16 @@ XPath mode should emit the W3C XPath functions namespace and typed child element
2828

2929
### Item-wrap false repeats parent tag
3030

31-
Disabling item wrapping should repeat the parent element name for primitive list items instead of producing nested `<item>` tags.
31+
Disabling item wrapping should repeat the parent element name for primitive list items instead of producing nested `<item>` tags.
32+
33+
### Default xml namespaces stay empty
34+
35+
Calling `dicttoxml` without `xml_namespaces` should preserve the legacy root output and avoid adding namespace declarations or `xsi:` attributes implicitly.
36+
37+
### Explicit xml namespaces emit schema attributes
38+
39+
Supplying namespace prefixes and an `xsi` mapping should emit the expected `xmlns:*` declarations plus supported schema attributes without mutating the caller input.
40+
41+
### Xml namespace inputs are not mutated across calls
42+
43+
Reusing one `xml_namespaces` mapping across multiple `dicttoxml` calls should return identical XML each time so namespace declarations never accumulate on the shared dict.

rust/src/lib.rs

Lines changed: 30 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -340,7 +340,14 @@ fn write_dict_contents(
340340

341341
// Lists in dicts get special wrapping treatment
342342
if let Ok(list) = val.cast::<PyList>() {
343-
if cfg.item_wrap {
343+
let first_is_scalar = list
344+
.get_item(0)
345+
.ok()
346+
.map(|item| is_python_scalar(&item))
347+
.unwrap_or(false);
348+
let wrap_list_container = (cfg.item_wrap || !first_is_scalar) && !cfg.list_headers;
349+
350+
if wrap_list_container {
344351
write_open_tag(out, &xml_key, name_attr, type_attr(cfg, "list"));
345352
write_list_contents(py, out, list, &xml_key, cfg)?;
346353
write_close_tag(out, &xml_key);
@@ -354,6 +361,18 @@ fn write_dict_contents(
354361
Ok(())
355362
}
356363

364+
/// Return true when a Python object is treated as a primitive scalar by the
365+
/// pure-Python serializer for list-wrapper decisions.
366+
#[cfg(feature = "python")]
367+
#[inline]
368+
fn is_python_scalar(obj: &Bound<'_, PyAny>) -> bool {
369+
obj.is_none()
370+
|| obj.is_instance_of::<PyBool>()
371+
|| obj.is_instance_of::<PyInt>()
372+
|| obj.is_instance_of::<PyFloat>()
373+
|| obj.is_instance_of::<PyString>()
374+
}
375+
357376
/// Write all items of a list into the buffer.
358377
#[cfg(feature = "python")]
359378
fn write_list_contents(
@@ -363,7 +382,8 @@ fn write_list_contents(
363382
parent: &str,
364383
cfg: &ConvertConfig,
365384
) -> PyResult<()> {
366-
let tag_name = if cfg.list_headers {
385+
let scalar_tag_name = if cfg.item_wrap { "item" } else { parent };
386+
let dict_tag_name = if cfg.list_headers {
367387
parent
368388
} else if cfg.item_wrap {
369389
"item"
@@ -375,14 +395,19 @@ fn write_list_contents(
375395
// Dicts inside lists have special wrapping logic
376396
if let Ok(dict) = item.cast::<PyDict>() {
377397
if cfg.item_wrap || cfg.list_headers {
378-
write_open_tag(out, tag_name, None, type_attr(cfg, "dict"));
398+
let dict_type_attr = if cfg.list_headers {
399+
None
400+
} else {
401+
type_attr(cfg, "dict")
402+
};
403+
write_open_tag(out, dict_tag_name, None, dict_type_attr);
379404
write_dict_contents(py, out, dict, cfg)?;
380-
write_close_tag(out, tag_name);
405+
write_close_tag(out, dict_tag_name);
381406
} else {
382407
write_dict_contents(py, out, dict, cfg)?;
383408
}
384409
} else {
385-
write_value(py, out, &item, tag_name, None, cfg, true)?;
410+
write_value(py, out, &item, scalar_tag_name, None, cfg, true)?;
386411
}
387412
}
388413
Ok(())

tests/conftest.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33

44
import json
55
from pathlib import Path
6-
from typing import TYPE_CHECKING, Any, Dict, List
6+
from typing import TYPE_CHECKING, Any
77

88
import pytest
99

tests/test_dict2xml.py

Lines changed: 86 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -500,6 +500,92 @@ def test_dicttoxml_with_xml_namespaces(self) -> None:
500500
result = dicttoxml.dicttoxml(data, xml_namespaces=namespaces)
501501
assert b'xmlns="http://example.com"' in result
502502

503+
# @lat: [[tests#Conversion behavior#Default xml namespaces stay empty]]
504+
def test_dicttoxml_without_xml_namespaces_keeps_previous_output(self) -> None:
505+
"""Test dicttoxml without xml_namespaces keeps the default XML shape."""
506+
data = {"bike": "blue"}
507+
result = dicttoxml.dicttoxml(data, attr_type=False)
508+
assert (
509+
b'<?xml version="1.0" encoding="UTF-8" ?>'
510+
b"<root><bike>blue</bike></root>" == result
511+
)
512+
assert b"xmlns" not in result
513+
assert b"xsi:" not in result
514+
515+
# @lat: [[tests#Conversion behavior#Explicit xml namespaces emit schema attributes]]
516+
def test_dicttoxml_with_explicit_xml_namespaces_emits_schema_attributes(self) -> None:
517+
"""Test dicttoxml emits explicit namespace declarations and XSI schema attributes."""
518+
data = {"bike": "blue"}
519+
namespaces = {
520+
"veh": "https://example.com/vehicle",
521+
"xsi": {
522+
"schemaInstance": "http://www.w3.org/2001/XMLSchema-instance",
523+
"schemaLocation": "https://example.com/vehicle vehicle.xsd",
524+
"noNamespaceSchemaLocation": "vehicle-no-namespace.xsd",
525+
},
526+
}
527+
namespaces_before = {
528+
prefix: value.copy() if isinstance(value, dict) else value
529+
for prefix, value in namespaces.items()
530+
}
531+
532+
result = dicttoxml.dicttoxml(
533+
data,
534+
custom_root="vehicle",
535+
attr_type=False,
536+
xml_namespaces=namespaces,
537+
)
538+
539+
assert (
540+
b'<?xml version="1.0" encoding="UTF-8" ?>'
541+
b'<vehicle xmlns:veh="https://example.com/vehicle" '
542+
b'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
543+
b'xsi:schemaLocation="https://example.com/vehicle vehicle.xsd">'
544+
b"<bike>blue</bike>"
545+
b"</vehicle>" == result
546+
)
547+
assert b"xsi:noNamespaceSchemaLocation" not in result
548+
assert namespaces == namespaces_before
549+
550+
# @lat: [[tests#Conversion behavior#Xml namespace inputs are not mutated across calls]]
551+
def test_dicttoxml_reuses_xml_namespaces_without_mutating_input(self) -> None:
552+
"""Test reusing xml_namespaces across calls does not mutate or accumulate state."""
553+
data = {"bike": "blue"}
554+
namespaces = {
555+
"veh": "https://example.com/vehicle",
556+
"xsi": {
557+
"schemaInstance": "http://www.w3.org/2001/XMLSchema-instance",
558+
"schemaLocation": "https://example.com/vehicle vehicle.xsd",
559+
},
560+
}
561+
namespaces_before = {
562+
prefix: value.copy() if isinstance(value, dict) else value
563+
for prefix, value in namespaces.items()
564+
}
565+
566+
first = dicttoxml.dicttoxml(
567+
data,
568+
custom_root="vehicle",
569+
attr_type=False,
570+
xml_namespaces=namespaces,
571+
)
572+
second = dicttoxml.dicttoxml(
573+
data,
574+
custom_root="vehicle",
575+
attr_type=False,
576+
xml_namespaces=namespaces,
577+
)
578+
579+
assert first == second
580+
assert first.count(b'xmlns:veh="https://example.com/vehicle"') == 1
581+
assert first.count(
582+
b'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
583+
) == 1
584+
assert first.count(
585+
b'xsi:schemaLocation="https://example.com/vehicle vehicle.xsd"'
586+
) == 1
587+
assert namespaces == namespaces_before
588+
503589
def test_datetime_conversion(self) -> None:
504590
"""Test datetime conversion."""
505591
data = {"key": datetime.datetime(2023, 2, 15, 12, 30, 45)}

tests/test_json2xml.py

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -21,12 +21,6 @@
2121
class TestJson2xml:
2222
"""Tests for `json2xml` package."""
2323

24-
def setUp(self) -> None:
25-
"""Set up test fixtures, if any."""
26-
27-
def tearDown(self) -> None:
28-
"""Tear down test fixtures, if any."""
29-
3024
def test_read_from_json(self) -> None:
3125
"""Test something."""
3226
data = readfromjson("examples/bigexample.json")

tests/test_rust_dicttoxml.py

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -252,7 +252,8 @@ def test_item_wrap_false(self):
252252
def test_list_headers(self):
253253
data = {"colors": ["red", "green"]}
254254
result = rust_dicttoxml(data, list_headers=True)
255-
assert b"<colors" in result
255+
assert b"<item type=\"str\">red</item>" in result
256+
assert b"<item type=\"str\">green</item>" in result
256257

257258

258259
class TestRustVsPythonCompatibility:
@@ -334,21 +335,18 @@ def test_item_wrap_false_matches(self):
334335
rust, python = self.compare_outputs(data, item_wrap=False)
335336
assert rust == python
336337

337-
@pytest.mark.xfail(reason="Rust list_headers implementation differs from Python - uses different wrapping semantics")
338338
def test_list_headers_true_matches(self):
339339
"""Test that list_headers=True produces matching output."""
340340
data = {"items": ["one", "two", "three"]}
341341
rust, python = self.compare_outputs(data, list_headers=True)
342342
assert rust == python
343343

344-
@pytest.mark.xfail(reason="Rust item_wrap=False with nested dicts differs from Python - known limitation")
345344
def test_item_wrap_false_with_nested_dict_matches(self):
346345
"""Test item_wrap=False with nested dicts in list."""
347346
data = {"users": [{"name": "Alice"}, {"name": "Bob"}]}
348347
rust, python = self.compare_outputs(data, item_wrap=False)
349348
assert rust == python
350349

351-
@pytest.mark.xfail(reason="Rust list_headers with nested structures differs from Python - known limitation")
352350
def test_list_headers_with_nested_matches(self):
353351
"""Test list_headers=True with nested structures."""
354352
data = {"products": [{"id": 1, "name": "Widget"}, {"id": 2, "name": "Gadget"}]}

0 commit comments

Comments
 (0)