XML document import not working for text elements #340

Open
perklason opened this issue Aug 22, 2017 · 4 comments

perklason commented Aug 22, 2017

Hi, when I try to import XML that contains elements with text content like this one (I am using jackalope-doctrine-dbal in Symfony):

<title>HarryPotter</title>

I get the following error:

[PHPCR\InvalidSerializedDataException]
Unexpected element in stream: #text="HarryPotter"

Exception trace:
 () at /var/www/html/std1/vendor/jackalope/jackalope/src/Jackalope/ImportExport/ImportExport.php:631

If I call $xml->isValid() just before that line, it returns true. So what is wrong? Is this a bug?

This is the XML document I am trying to import for testing:

<bookstore>
  <book category="children">
    <title>HarryPotter</title>
    <author>J K. Rowling</author>   
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>    
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore> 
Member

dbu commented Aug 23, 2017

Are you trying to import this as "system view" or as "document view"?

Author

perklason commented Aug 23, 2017

It is automatically imported as document view (it ends up in the function importDocumentView).

It seems something goes wrong somewhere in this part of the code:

// get current depth to detect self-closing tag. unfortunately, we do
// not get an END_ELEMENT for self-closing tags but read() just jumps
// to the next element, even moving up in the tree,
$depth = $xml->depth;
if (! $hasAttributes) {
    // we where on an empty element, thus not inside its attributes. change depth to 1 deeper
    // thanks XMLReader, great work :-(
    $depth++;
}
$xml->read(); // move out of current node to next

// TODO: what about significant whitespace? maybe the read above should not even skip significant empty whitespace...

// while we are on element and at same depth, these are children of the current node
while (XMLReader::ELEMENT == $xml->nodeType && $xml->depth == $depth) {
    self::importDocumentView($node, $ns, $xml, $uuidBehavior, $namespaceMap);
}

if (XMLReader::END_ELEMENT != $xml->nodeType && $xml->depth != $depth - 1) {
    throw new InvalidSerializedDataException('Unexpected element in stream: '.$xml->name.'="'.$xml->value.'"');
}
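
To make it easier to see what the reader hands to that check, here is a small standalone sketch that only uses PHP's built-in XMLReader and lists the nodes it visits. The helper names nodeTypeName and traceNodes are made up for illustration and are not part of Jackalope.

```php
<?php
// Hypothetical helper: readable names for the XMLReader node types of interest.
function nodeTypeName(int $type): string
{
    $names = [
        XMLReader::ELEMENT => 'ELEMENT',
        XMLReader::TEXT => 'TEXT',
        XMLReader::END_ELEMENT => 'END_ELEMENT',
        XMLReader::WHITESPACE => 'WHITESPACE',
        XMLReader::SIGNIFICANT_WHITESPACE => 'SIGNIFICANT_WHITESPACE',
    ];

    return $names[$type] ?? 'OTHER(' . $type . ')';
}

// Hypothetical helper: walk the whole document and record type, name and depth
// of every node that read() stops on.
function traceNodes(string $xmlString): array
{
    $xml = new XMLReader();
    $xml->XML($xmlString);

    $trace = [];
    while ($xml->read()) {
        $trace[] = sprintf('%s %s depth=%d', nodeTypeName($xml->nodeType), $xml->name, $xml->depth);
    }
    $xml->close();

    return $trace;
}

foreach (traceNodes('<book><title>HarryPotter</title></book>') as $line) {
    echo $line, PHP_EOL;
}
```

After the `title` element, the next node is a TEXT node named `#text`, one level deeper than `title`. It is neither an ELEMENT nor an END_ELEMENT, so the while loop skips it and the final check throws, which matches the `#text="HarryPotter"` in the error message.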

Member

dbu commented Aug 23, 2017

It looks like we never implemented handling of text nodes in the document view parser. According to https://docs.adobe.com/docs/en/spec/jcr/2.0/11_Import.html we should create a node jcr:xmltext with a property jcr:xmlcharacters.

Do you want to implement this? I think we can simply add a check after the while loop over ELEMENT nodes and before the END_ELEMENT check, and create the text child node there. We would need to keep track of the node created in self::importDocumentView so that the text node is attached in the right place.

A first step would be to modify the test in https://github.com/phpcr/phpcr-api-tests/ so that it contains string content, see it fail, and then make it work.
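
As a rough illustration of the idea, the sketch below walks a document with XMLReader and turns text content into a child named jcr:xmltext with a jcr:xmlcharacters property. It builds plain nested arrays instead of real PHPCR nodes so that it runs on its own. The functions importElement and importString are hypothetical and do not reflect the actual structure of Jackalope's ImportExport class. Whitespace, comments and namespaces are ignored here.

```php
<?php
// Hypothetical stand-in for the importer: returns nested arrays, not PHPCR nodes.
// Expects $xml to be positioned on an ELEMENT. On return the reader is on the
// matching END_ELEMENT, or still on the element itself if it is self-closing.
function importElement(XMLReader $xml): array
{
    $node = ['name' => $xml->name, 'properties' => [], 'children' => []];

    // Must be read before moving onto the attributes.
    $isEmpty = $xml->isEmptyElement;

    // In document view, XML attributes become properties of the node.
    if ($xml->hasAttributes) {
        while ($xml->moveToNextAttribute()) {
            $node['properties'][$xml->name] = $xml->value;
        }
        $xml->moveToElement();
    }

    if ($isEmpty) {
        return $node;
    }

    while ($xml->read()) {
        switch ($xml->nodeType) {
            case XMLReader::ELEMENT:
                $node['children'][] = importElement($xml);
                break;
            case XMLReader::TEXT:
            case XMLReader::CDATA:
                // Text content becomes a jcr:xmltext child with jcr:xmlcharacters.
                $node['children'][] = [
                    'name' => 'jcr:xmltext',
                    'properties' => ['jcr:xmlcharacters' => $xml->value],
                    'children' => [],
                ];
                break;
            case XMLReader::END_ELEMENT:
                return $node;
            // Whitespace, comments and other node types are skipped in this sketch.
        }
    }

    return $node;
}

// Hypothetical entry point: parse a string and import its root element.
function importString(string $xmlString): array
{
    $xml = new XMLReader();
    $xml->XML($xmlString);

    // Advance to the root element.
    while ($xml->read() && $xml->nodeType !== XMLReader::ELEMENT) {
    }

    $tree = importElement($xml);
    $xml->close();

    return $tree;
}

$tree = importString('<book category="children"><title>HarryPotter</title><empty/></book>');
echo $tree['children'][0]['children'][0]['properties']['jcr:xmlcharacters'], PHP_EOL;
```

In the real importer the arrays would be replaced by nodes created on the parent, and the new check would sit between the existing while loop and the END_ELEMENT test as suggested above.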

Author

perklason commented

It would be very nice if this could be implemented. I would help out if I could, but this is a bit beyond me, unfortunately.

dbu changed the title from "XML not working for text elements" to "XML document import not working for text elements" on Aug 23, 2017