MARC::File::XML can generate invalid XML records #3

Open
minusdavid opened this issue Jul 26, 2024 · 0 comments


The record() method creates the XML as a string rather than using XML::LibXML objects, and it passes the $field->data directly into controlfield and datafield strings.

As a result, characters that are not allowed in XML 1.0, such as STX (U+0002) or US (U+001F), can pass from the MARC::Field object into the XML string unchanged.
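To illustrate the failure mode, here is a small language-neutral sketch in Python rather than Perl. `naive_subfield` is a hypothetical stand-in for any code that builds XML by string interpolation; it is not the module's actual code, and the assumption that markup characters get escaped while control characters do not is mine.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def naive_subfield(code, data):
    # Hypothetical helper: builds the element as a plain string.
    # escape() handles &, < and >, but leaves control characters untouched.
    return '<subfield code="%s">%s</subfield>' % (code, escape(data))


def is_well_formed(xml_text):
    # A conforming XML 1.0 parser rejects characters outside the Char range.
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


clean = naive_subfield("a", "Title & subtitle")
dirty = naive_subfield("a", "Title\x1f with a stray US character")

print(is_well_formed(clean))
print(is_well_formed(dirty))
```

The second string looks fine to the code that produced it, but a parser on the receiving end refuses it, so the problem only shows up downstream.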

--

I'm not 100% sure of the best solution.

Using a regex like /[^\x{0009}\x{000A}\x{000D}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/, which matches any character outside the XML 1.0 Char range, we could raise an exception/die, but that would be a behaviour change.

We could silently or noisily erase the invalid characters, but that would be a bit like hiding the problem.

I suppose having a configurable option for both could be good. What do you think?
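For discussion, here is roughly what a configurable option could look like, again sketched in Python rather than as a patch to the module. The function name `sanitize_xml_text`, the `on_invalid` parameter, and the mode names `die`, `warn` and `strip` are all made up for illustration; only the character class comes from the regex above.

```python
import re
import warnings

# Same ranges as the regex above: anything outside the XML 1.0 Char production.
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def sanitize_xml_text(data, on_invalid="die"):
    """Check field data before it is placed into an XML string.

    on_invalid (hypothetical option):
      "die"   - raise on the first invalid character
      "warn"  - remove invalid characters and emit a warning
      "strip" - remove invalid characters silently
    """
    if on_invalid not in ("die", "warn", "strip"):
        raise ValueError("unknown on_invalid mode: %r" % on_invalid)

    match = INVALID_XML_CHARS.search(data)
    if match is None:
        return data  # nothing to do, data is safe for XML 1.0

    if on_invalid == "die":
        raise ValueError(
            "invalid XML character U+%04X at offset %d"
            % (ord(match.group()), match.start())
        )

    cleaned, count = INVALID_XML_CHARS.subn("", data)
    if on_invalid == "warn":
        warnings.warn("removed %d invalid XML character(s)" % count)
    return cleaned
```

Something like `sanitize_xml_text(data, on_invalid="strip")` would keep existing callers working, while `die` would surface bad data early. Which mode should be the default is the open question, since making `die` the default is the behaviour change mentioned above.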
