How to parse a tag which can have multiple names in a single property #160

Open
sdipendra opened this issue Jun 23, 2023 · 4 comments

@sdipendra

sdipendra commented Jun 23, 2023

How can I parse a tag that can appear under more than one name?

For example, take a tag named "link:Stat":

Some of my XML documents use the fully qualified name: <link:Stat></link:Stat>
Others use just <Stat></Stat>, without the namespace prefix.

No document mixes both formats.

I want both forms to map to the same single property: val stat: Stat

How can I achieve this? Thanks!

@pdvrieze
Owner

There are three approaches. The first is currently broken (I've fixed it): adding a custom handler for unknown content to the policy. The second is to put a filter on the parser that simply remaps tags. The third, for your case, is to override the mechanism by which the policy maps Kotlin types to tag names. This is global, but it lets you use the same serializer with a different policy to parse either format.
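
To illustrate the parser-filter idea in isolation, here is a minimal sketch built on the JDK's StAX `StreamReaderDelegate` rather than on xmlutil's own reader types. The class name `NamespaceDefaultingReader` and the helper `startTagNames` are hypothetical, and wiring such a filter into xmlutil is not shown; the sketch only demonstrates that a delegating reader can report an un-namespaced `<Stat>` as if it were `<link:Stat>`.

```kotlin
import java.io.StringReader
import javax.xml.namespace.QName
import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLStreamReader
import javax.xml.stream.util.StreamReaderDelegate

// Hypothetical filter (not part of xmlutil): elements without a namespace whose
// local name is listed in `localNames` are reported as belonging to `targetNamespace`.
class NamespaceDefaultingReader(
    delegate: XMLStreamReader,
    private val targetNamespace: String,
    private val localNames: Set<String>,
) : StreamReaderDelegate(delegate) {

    // Only start and end tags carry a name; all other events pass through untouched.
    private fun needsRemap(): Boolean =
        (isStartElement || isEndElement) &&
            super.getNamespaceURI().isNullOrEmpty() &&
            super.getLocalName() in localNames

    override fun getNamespaceURI(): String? =
        if (needsRemap()) targetNamespace else super.getNamespaceURI()

    override fun getName(): QName =
        if (needsRemap()) QName(targetNamespace, super.getLocalName()) else super.getName()
}

// Collects the names of all start tags as seen through the filter.
fun startTagNames(xml: String): List<QName> {
    val raw = XMLInputFactory.newInstance().createXMLStreamReader(StringReader(xml))
    val reader = NamespaceDefaultingReader(raw, "http://www.kodepad.com/xml/link", setOf("Stat"))
    val names = mutableListOf<QName>()
    while (reader.hasNext()) {
        reader.next()
        if (reader.isStartElement) names += reader.name
    }
    return names
}

fun main() {
    val doc = "<equipment:device xmlns:equipment=\"http://www.kodepad.com/xml/equipment\">" +
        "<Stat>WORKING</Stat></equipment:device>"
    println(startTagNames(doc))
}
```

The filter leaves already-qualified elements alone, so the same reader could in principle be used for both document styles.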

@sdipendra
Author

sdipendra commented Jun 23, 2023

For the third approach, I'm trying to override the policy's behaviour, but I can't identify the method I should override.

I've created a failing test setup for this. If you can point me to the policy method that I should override, that would be great.

In the current setup, the first test case (with prefix) passes and the second test case (without prefix) fails.

package com.kodepad.xml

import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import nl.adaptivity.xmlutil.ExperimentalXmlUtilApi
import nl.adaptivity.xmlutil.serialization.DefaultXmlSerializationPolicy
import nl.adaptivity.xmlutil.serialization.XML
import nl.adaptivity.xmlutil.serialization.XmlElement
import nl.adaptivity.xmlutil.serialization.XmlSerialName
import nl.adaptivity.xmlutil.serialization.XmlSerializationPolicy
import nl.adaptivity.xmlutil.serialization.XmlValue
import org.junit.jupiter.api.Test
import org.slf4j.LoggerFactory
import kotlin.test.assertEquals

@OptIn(ExperimentalXmlUtilApi::class)
internal class XMLUtilFailingTest {
    @Serializable
    @XmlSerialName(
        namespace = "http://www.kodepad.com/xml/equipment",
        prefix = "equipment",
        value = "device",
    )
    data class Device(
        @XmlElement(value = true) val stat: Stat?,
    )

    @Serializable
    @XmlSerialName(
        namespace = "http://www.kodepad.com/xml/link",
        prefix = "link",
        value = "Stat",
    )
    data class Stat(
        @XmlValue val value: String,
    )

    class XmlSerializationPolicyProxy(xmlSerializationPolicy: XmlSerializationPolicy) :
        XmlSerializationPolicy by xmlSerializationPolicy {
        // todo: Override method to map "Stat" to "link:Stat"
    }

    companion object {
        private val log = LoggerFactory.getLogger(this::class.java.declaringClass.name)

        private val expectedValue = Device(Stat("WORKING"))
    }

    private val xml = XML {
        this.policy = XmlSerializationPolicyProxy(
            DefaultXmlSerializationPolicy(
                false, encodeDefault = XmlSerializationPolicy.XmlEncodeDefault.NEVER
            )
        )
    }

    @Test
    fun `parse xml with prefix`() {
        val xmlString =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + "<equipment:device xmlns:equipment=\"http://www.kodepad.com/xml/equipment\"\n" + "                  xmlns:link=\"http://www.kodepad.com/xml/link\">\n" + "    <link:Stat>WORKING</link:Stat>\n" + "</equipment:device>\n"

        val device = xml.decodeFromString<Device>(xmlString)
        log.info("device: $device")

        assertEquals(expectedValue, device)
    }

    @Test
    fun `parse xml without prefix`() {
        val xmlString =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + "<equipment:device xmlns:equipment=\"http://www.kodepad.com/xml/equipment\"\n" + "                  xmlns:link=\"http://www.kodepad.com/xml/link\">\n" + "    <Stat>WORKING</Stat>\n" + "</equipment:device>\n"

        val device = xml.decodeFromString<Device>(xmlString)
        log.info("device: $device")

        assertEquals(expectedValue, device)
    }
}

Included dependencies:

plugins {
    kotlin("jvm") version "1.8.20"
    kotlin("plugin.serialization") version "1.8.20"
}

dependencies {
    // Serialization
    implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.5.0")
    implementation("io.github.pdvrieze.xmlutil:core:0.86.0")
    implementation("io.github.pdvrieze.xmlutil:serialization:0.86.0")
}

@pdvrieze
Owner

Unfortunately, there is a bug in this handling (now fixed in dev). The method to override is handleUnknownContentRecovering. To see how this works, look at this test:

fun testDeserializeRecoveringWithParser() {
    val xml = XML {
        policy = object : DefaultXmlSerializationPolicy(true) {
            @ExperimentalXmlUtilApi
            override fun handleUnknownContentRecovering(
                input: XmlReader,
                inputKind: InputKind,
                descriptor: XmlDescriptor,
                name: QName?,
                candidates: Collection<Any>
            ): List<XML.ParsedData<*>> {
                XmlSerializationPolicy.recoverNullNamespaceUse(inputKind, descriptor, name)?.let { return it }
                return super.handleUnknownContentRecovering(input, inputKind, descriptor, name, candidates)
            }
        }
    }
    val input = "<Container><Stat value=\"foo\"/></Container>"
    val parsed = xml.decodeFromString<Container>(input)
    assertEquals(Container(Stat("foo")), parsed)
}

and:

/**
 * Helper function that allows more flexibility on null namespace use. If either the found
 * name has the null namespace, or the candidate has null namespace, this will map (for the
 * correct child).
 */
@ExperimentalXmlUtilApi
public fun recoverNullNamespaceUse(inputKind: InputKind, descriptor: XmlDescriptor, name: QName?): List<XML.ParsedData<*>>? {
    if (name != null) {
        if (name.namespaceURI == "") {
            for (idx in 0 until descriptor.elementsCount) {
                val candidate = descriptor.getElementDescriptor(idx)
                if (inputKind.mapsTo(candidate.effectiveOutputKind) &&
                    candidate.tagName.localPart == name.getLocalPart()) {
                    return listOf(XML.ParsedData(idx, Unit, true))
                }
            }
        } else {
            for (idx in 0 until descriptor.elementsCount) {
                val candidate = descriptor.getElementDescriptor(idx)
                if (inputKind.mapsTo(candidate.effectiveOutputKind) &&
                    candidate.tagName.isEquivalent(QName(name.localPart))) {
                    return listOf(XML.ParsedData(idx, Unit, true))
                }
            }
        }
    }
    return null
}

But please note that this is broken in master: the helper function is new, but more significantly, recovery for elements is broken (it fails to read the end tag).
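
The matching rule inside recoverNullNamespaceUse can be modelled without the library. The sketch below is a simplified stand-in, not the library's code: `recoverIndex` is a hypothetical name, candidates are plain `javax.xml.namespace.QName` values instead of descriptors, the `inputKind.mapsTo(...)` check is left out, and it assumes that `isEquivalent` compares namespace URI and local part while ignoring the prefix.

```kotlin
import javax.xml.namespace.QName

// Simplified model of the helper's two loops: returns the index of the first
// candidate tag name that the found name may be mapped to, or null if none fits.
// The real helper additionally checks that the input kind matches the candidate.
fun recoverIndex(found: QName?, candidates: List<QName>): Int? {
    if (found == null) return null
    val idx = if (found.namespaceURI.isEmpty()) {
        // Found name has no namespace: accept any candidate with the same local part.
        candidates.indexOfFirst { it.localPart == found.localPart }
    } else {
        // Found name is namespaced: accept only candidates declared without a namespace.
        candidates.indexOfFirst { it.namespaceURI.isEmpty() && it.localPart == found.localPart }
    }
    return idx.takeIf { it >= 0 }
}

fun main() {
    val candidates = listOf(QName("http://www.kodepad.com/xml/link", "Stat", "link"))
    // <Stat> without a namespace maps to the first (and only) candidate.
    println(recoverIndex(QName("Stat"), candidates))
    // A namespaced name with a different local part is not recovered.
    println(recoverIndex(QName("http://www.kodepad.com/xml/link", "Other"), candidates))
}
```

Under these assumptions the recovery is deliberately narrow: it only bridges the case where exactly one side lacks a namespace, and it never maps between two different non-empty namespaces.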

@sdipendra
Author

Checked on dev. This works for my use case. Thank you.

One suggestion, though: instead of a method specific to null-namespace handling, wouldn't it be better to have a method that can map a parsed QName to some other QName? That would cover the null-namespace case and many other use cases as well.
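
As a rough illustration of what such a hook could look like, here is a sketch using only `javax.xml.namespace.QName`. Nothing here exists in xmlutil: `NameMapper` and `defaultingMapper` are hypothetical names invented for this example.

```kotlin
import javax.xml.namespace.QName

// Hypothetical hook: maps a parsed name to the name it should be treated as.
fun interface NameMapper {
    fun map(parsed: QName): QName
}

// Example mapper: un-namespaced names listed in `defaults` (local name -> namespace URI)
// are moved into the given namespace; every other name is returned unchanged.
fun defaultingMapper(defaults: Map<String, String>) = NameMapper { parsed ->
    val ns = defaults[parsed.localPart]
    if (parsed.namespaceURI.isEmpty() && ns != null) QName(ns, parsed.localPart) else parsed
}

fun main() {
    val mapper = defaultingMapper(mapOf("Stat" to "http://www.kodepad.com/xml/link"))
    println(mapper.map(QName("Stat")))              // remapped into the link namespace
    println(mapper.map(QName("urn:other", "Stat"))) // already namespaced, left as-is
}
```

A single-method interface like this would let callers express the null-namespace case as one mapper among many, for example renaming legacy tags or merging versioned namespaces.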
