XML template is more for green field devices #139

Open
relu91 opened this issue Dec 3, 2021 · 14 comments
Labels
data mapping discussions on data mapping concepts Has Use Case Potential The use case can be extracted and explained Selected for Use Case xml

Comments

@relu91
Member

relu91 commented Dec 3, 2021

Reviewing the XML Binding template document, I noticed that it tries to describe a way to serialize JSON data to XML rather than a way to describe existing XML documents with our data schema. Given that I can't really find any modern API that returns XML data, I think we should be more open and not force a particular "serialization" pattern, even for XML.

Taking for example this XML payload:

<SaveContactResponse xmlns="http://schemas.datacontract.org/2004/07/SmashFly.WebServices.ContactManagerService.v2">
    <ContactId>2147483647</ContactId>
    <Errors>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 1</string>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 2</string>
    </Errors>
    <HasErrors>true</HasErrors>
</SaveContactResponse>

I think we can correctly map to this JSON object:

{
  "SaveContactResponse": {
    "ContactId": 2147483647,
    "Errors": [ { "string": "Error 1" }, { "string": "Error 2" } ],
    "HasErrors": true
  }
}

Consequently, knowing this mapping, we can define the correct data schema:

{
  "type": "object",
  "properties": {
    "SaveContactResponse": { "type": "object" /* etc. */ }
  }
}

I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?
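
To make the idea of "a clear algorithm" concrete, here is a minimal Python sketch of one possible XML-to-JSON mapping that reproduces the object shown above. It is not taken from the binding template: the function names, the type-guessing rule (booleans and integers only) and the choice to drop namespaces and attributes are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def coerce(text):
    """Guess a JSON scalar from element text (assumption: booleans and integers only)."""
    text = (text or "").strip()
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return text


def element_to_value(elem):
    """Map an element to a scalar, an object, or an array of one-key objects."""
    children = list(elem)
    if not children:
        return coerce(elem.text)
    names = [local_name(child.tag) for child in children]
    if len(set(names)) < len(names):
        # Repeated child names cannot all be object keys, so keep them as an array.
        return [{local_name(c.tag): element_to_value(c)} for c in children]
    return {local_name(c.tag): element_to_value(c) for c in children}


def xml_to_json(xml_text):
    """Wrap the converted root element in an object keyed by the root's name."""
    root = ET.fromstring(xml_text)
    return {local_name(root.tag): element_to_value(root)}


PAYLOAD = """<SaveContactResponse xmlns="http://schemas.datacontract.org/2004/07/SmashFly.WebServices.ContactManagerService.v2">
    <ContactId>2147483647</ContactId>
    <Errors>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 1</string>
        <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays">Error 2</string>
    </Errors>
    <HasErrors>true</HasErrors>
</SaveContactResponse>"""

result = xml_to_json(PAYLOAD)
```

Note that this sketch silently discards namespaces and attributes, so it only covers payloads where that information does not matter.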

@danielpeintner
Contributor

danielpeintner commented Dec 6, 2021

I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?

The current algorithm is essentially inspired by EXI for JSON.

It maps JSON to XML.

What you are describing is mapping XML to JSON.
I think this is not possible in a generic way since XML is more powerful, e.g., XML has the power to

  • have attributes and character data on a simple-type element: <ContactId hasFoo="true">2147483647</ContactId>
  • well-defined sequences of elements ...
  • use xsi:type casting
  • use nillable in an instance to indicate that there is no content..
  • ...

Hence, I don't think this will ever work consistently..

With respect to describing an XML-to-JSON conversion, I am not sure how best to move on from here.

@egekorkan
Contributor

In my own defense I have just copied from @takuki's initial contribution and worked with XML only in configuration files and not in APIs, so I do not have a good opinion.

@danielpeintner is it possible to say that we can always define a loose schema, one that would validate more XML payloads than intended?

By the way, I had asked the JSON Schema community about documented support for such cases and the answer is no. There is a possible clean way with https://json-schema.org/draft/2020-12/json-schema-validation.html#rfc.section.8.5 where we can clearly say that a schema is only for this xml payload.

@relu91
Member Author

relu91 commented Dec 6, 2021

I think that giving a clear algorithm to map an XML document to a JSON document is better than the current specification, maybe we can even find something already defined. Opinions?

The current algorithm is essentially inspired by EXI for JSON.

It maps JSON to XML.

What you are describing is mapping XML to JSON. I think this is not possible in a generic way since XML is more powerful, e.g., XML has the power to

  • have attributes and character data on a simple-type element: <ContactId hasFoo="true">2147483647</ContactId>
  • well-defined sequences of elements ...
  • use xsi:type casting
  • use nillable in an instance to indicate that there is no content..
  • ...

Hence, I don't think this will ever work consistently..

With respect to describing an XML-to-JSON conversion, I am not sure how best to move on from here.

I see, what if we describe a default algorithm that converts attributes/well-defined seq/xsi:type/nillable etc. and then configure it using XML protocol binding vocabulary terms? For example:

  • if a node has one or more attributes, it is converted into an object like the following:
{
  "hasFoo": true,
  "node_value": 2146 // real value
}
  • you can disable this behavior either at the form level with xml:no-attributes: true or at the servient level. 🤔

Just some additional options....
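
A rough Python sketch of this configurable default, for a leaf element only. The `keep_attributes` parameter is a hypothetical stand-in for the proposed `xml:no-attributes` term, and the scalar-guessing helper is an assumption, not something the binding template defines.

```python
import xml.etree.ElementTree as ET


def scalar(text):
    """Guess a JSON scalar from text (assumption: booleans and integers only)."""
    text = (text or "").strip()
    if text in ("true", "false"):
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return text


def convert_leaf(elem, keep_attributes=True):
    """Convert a leaf element; attributes turn the value into an object."""
    value = scalar(elem.text)
    if keep_attributes and elem.attrib:
        obj = {name: scalar(raw) for name, raw in elem.attrib.items()}
        # "node_value" is the key used in the example above.
        obj["node_value"] = value
        return obj
    # Either there are no attributes or the caller asked to ignore them.
    return value


node = ET.fromstring('<ContactId hasFoo="true">2147483647</ContactId>')
with_attributes = convert_leaf(node)
without_attributes = convert_leaf(node, keep_attributes=False)
```

With the switch off, the consumer sees a plain integer and the attribute is lost, which is the trade-off the option would have to document.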

By the way, I had asked the JSON Schema community about documented support for such cases and the answer is no. There is a possible clean way with https://json-schema.org/draft/2020-12/json-schema-validation.html#rfc.section.8.5 where we can clearly say that a schema is only for this xml payload.

yeah but this is really almost like no validation at all... it just says put everything inside a string and treat it as XML.

@relu91
Member Author

relu91 commented Dec 7, 2021

Found also this document: https://www.w3.org/2011/10/integration-workshop/s/ExperienceswithJSONandXMLTransformations.v08.pdf

It defines "friendly XML". Maybe we can have a rule to map only "friendly XML" and fall back to a string if it is unfriendly. Just throwing ideas on the table...

@danielpeintner
Contributor

Many people nowadays run into the need to support JSON and XML at the same time. Even XQuery, which started as a query language for XML, supports JSON nowadays.

Having said that, I don't think it is possible to properly support XML validation based on JSON Schema. Let me throw in another proposal, since I believe proper XML validation needs XML Schema: what if we use real XML Schema?

XSD example for a person

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <xs:any minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element> 

Would it be feasible to represent the XSD in JSON schema... such as

{
  "title": "Person",
  "type": "string",
  "format": "xml",
  "format-constraints": "<xs:element name=\"person\"> ..."
}

The keys format and format-constraints are just there to highlight what I mean. We can use whatever terms we want. We should just make sure that JSON validators can handle them properly. In the worst case, a validator should accept any string (or, respectively, any XML).
Other validators that support proper XML validation can use the information given in format-constraints to validate.

Moreover, this allows supporting other formats in the future with the same principle.

What do you think?
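
One way to read this proposal is sketched below in Python, under several assumptions: `format` and `format-constraints` are the placeholder keys from the example above, `validate` is a hypothetical function, and `xsd_validator` is a hypothetical callback that a consumer with real XML Schema support would plug in. The standard library cannot validate against XSD, so without the callback the sketch only checks that the value is a well-formed XML string.

```python
import xml.etree.ElementTree as ET


def validate(schema, value, xsd_validator=None):
    """Validate a string payload, using XSD only when a validator is supplied."""
    if schema.get("format") != "xml":
        # Not an XML-typed schema: only the basic string check applies here.
        return schema.get("type") != "string" or isinstance(value, str)
    if not isinstance(value, str):
        return False
    try:
        ET.fromstring(value)  # well-formedness check only
    except ET.ParseError:
        return False
    constraints = schema.get("format-constraints")
    if xsd_validator is not None and constraints is not None:
        # Consumers with XML Schema support can do the strict check.
        return bool(xsd_validator(constraints, value))
    # Worst case from the proposal: accept any (well-formed) XML string.
    return True


person_schema = {
    "title": "Person",
    "type": "string",
    "format": "xml",
    "format-constraints": "<xs:element name=\"person\"> ...",
}

loose = validate(person_schema, "<person><firstname>Ada</firstname></person>")
broken = validate(person_schema, "<person>")
```

The callback keeps the XSD dependency optional, which matches the idea that plain JSON validators should still be able to process the schema.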

@relu91
Member Author

relu91 commented Dec 13, 2021

I think it is an option. One downside that I see is that from the application point of view we are just talking about strings. It would be better to still maintain the ability to properly describe a formal data type. Basically, it works well for validating the payload but not to "transform"/"map"/"convert" to a consistent DataSchema value.

@egekorkan
Contributor

From the call of 15.12:

  • @danielpeintner : Any JSON can be mapped to XML but not the other way around: EXI4JSON is the proof of this being possible.
  • @relu91 : data schema should be abstract and the data schema should be available in the application level, i.e. saying string in data schema and giving the exact schema in XML case.
  • @egekorkan : we can use the contentMediaSchema in the forms and extend the affordance level schema with an XML schema like @danielpeintner's example above.
  • @mjkoster : we should be able to describe these payloads in data schema. (This is a thesis that needs evaluation). We should be careful when extending the data schema which would imply more work for the consumer.

Decision:
@danielpeintner will provide XML, XML schema examples where these corner cases exist. These edge cases should not require the expansion of the data schema. We will document how implementors should handle these cases but we need some examples and further iteration on this.

@danielpeintner
Contributor

Let me give you some examples where I think it is difficult to represent XML as JSON.

Note1: I am not saying these are good examples. I just want to show that I think XML is more expressive.
Note2: I tried to use https://www.freeformatter.com/xml-to-json-converter.html to experiment a bit and some results are a bit surprising to me.

In the end I don't know whether the examples I give below are good practice... I don't think so.. in most of the cases.

Nillable and Type-Casts

XML allows type-casting a type to a subtype. For example, an xsd:decimal can be typed as xsd:integer or xsd:unsignedInt in the instance document.
Type-casts are also possible with complexTypes.

Moreover, any element can be marked as nil in an XML instance (xsi:nil="true") if the schema contains nillable="true" for this element.

Simple Content with attributes

XSD allows specifying an element with a simple value, such as simpleValue typed as integer, that still has an attribute. This would need to be mapped to an object in JSON Schema.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="simpleValue">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:integer">
          <xs:attribute name="id" use="required" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>

-->

<simpleValue id="dd">12</simpleValue>

The online tool converts it to

{
   "@id": "dd",
   "#text": "12"
}

Which is surprising to me.

Conflicting names (attributes vs element)

value is used once as an attribute name and once as an element name, and the two can have different types.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="conflictingNames">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:integer" />
      </xs:sequence>
      <xs:attribute name="value" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

-->

<conflictingNames value="XX">
    <value>12</value>
</conflictingNames>
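
One possible way out of this clash is to reuse the "@" prefix seen in the online tool's output above, so that attribute names and element names live in separate key spaces. The Python sketch below assumes that convention; `to_object` is a hypothetical helper, values are left as strings, and repeated child elements are not handled.

```python
import xml.etree.ElementTree as ET


def to_object(elem):
    """Map attributes to '@name' keys and child elements to plain keys."""
    obj = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        # A repeated child name would overwrite the earlier entry here.
        obj[child.tag] = (child.text or "").strip()
    return obj


doc = ET.fromstring(
    '<conflictingNames value="XX"><value>12</value></conflictingNames>'
)
converted = to_object(doc)
```

A data schema for the result could then declare `@value` as a string and `value` separately, at the cost of exposing the prefix convention to the application.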

Sequence of elements

In complex types that use xsd:sequence, the order of elements matters. In JSON this is not the case. Anyhow, I don't think this is a big deal.
What I am not sure about is the case where the same element name is declared twice. I am not sure whether this is actually an issue.

  <xs:element name="orderWithDifferentElements">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:integer" />
        <xs:element name="value" type="xs:integer" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

any and anyAttribute

XSD allows to specify any element or anyAttribute to appear. I think this is covered by JSON schema and "additionalProperties": true

What is not covered is the feature that any/anyAttribute can be limited to a given namespace.

@egekorkan
Contributor

I had to learn a bit of XML Schema to follow, so I am not sure if my answers are really correct.

  • Nillable: This is possible with JSON Schema if I say anyOf and give type:null for one of the options
  • Typecasting for simple types: First of all, there are fewer baked-in types, but an integer validates against a number schema. If you mean that a number like 12.0 can be cast to 12 during validation, I am not sure whether the standard itself provides for that.
  • Typecasting for complex types: Not sure what the example can be here
  • Simple value with attribute: This was not part of some of the beginner tutorials but of course I see the uses. I am also surprised by the conversion but I think that this can have different results in JSON Schema since attribute is simply not supported in JSON. However, the same knowledge can be represented and we might have to prescribe how it should be done.
  • Conflicting names: Maybe depending on the result of solving the attribute description, this can be fine.
  • Sequence of elements: I think this is a big deal since nothing in JSON Schema can constrain the order of object keys since there is no need to do so in JSON. A custom keyword with a JSON Schema vocabulary would be possible...
  • Sequence of elements, part 2: Well this is annoying since the key's value can have a different type based on its location. Even if we add a custom keyword to ensure order in JSON Schema objects, we cannot have two keys with the same name (or if we do, most JSON parsers take the second one)
  • any and anyAttribute: I think I did not understand this, can you elaborate?
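
The duplicate-key point from "Sequence of elements, part 2" can be checked quickly with Python's standard `json` module, which keeps only the last value for a repeated name by default. The array form in the last line is just one assumed alternative for keeping both values and their order; it is not something the thread agreed on.

```python
import json

# Default behaviour: the second "value" silently replaces the first.
duplicate = json.loads('{"value": 1, "value": 2}')

# object_pairs_hook exposes every pair in document order instead.
as_pairs = json.loads('{"value": 1, "value": 2}', object_pairs_hook=list)

# An array of one-key objects keeps both values and their order.
ordered = json.loads('[{"value": 1}, {"value": 2}]')
```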

My assessment of the situation: In X percent of the cases we can have a JSON Schema representation. To me that X feels like 80% but then it is my opinion with not much experience with XML-based APIs.

@danielpeintner
Contributor

Nillable: This is possible with JSON Schema if I say anyOf and give type:null for one of the options

In XSD you can use that for any type. With JSON Schema you would need to wrap every type in anyOf, I think.
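
A small Python sketch of what that wrapping could look like when done mechanically. Both helper names are hypothetical, and only the top-level properties of an object schema are handled.

```python
def make_nillable(schema):
    """Allow null in addition to whatever the given schema accepts."""
    return {"anyOf": [schema, {"type": "null"}]}


def make_properties_nillable(object_schema):
    """Return a copy of an object schema with every property made nillable."""
    wrapped = dict(object_schema)
    wrapped["properties"] = {
        name: make_nillable(sub)
        for name, sub in object_schema.get("properties", {}).items()
    }
    return wrapped


person = {
    "type": "object",
    "properties": {
        "firstname": {"type": "string"},
        "age": {"type": "integer"},
    },
}
nillable_person = make_properties_nillable(person)
```

This shows the verbosity cost: every property gains an extra `anyOf` layer, whereas XSD expresses the same thing with one `nillable="true"` attribute per element.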

Typecasting for simple types: First of all, there are fewer baked-in types, but an integer validates against a number schema. If you mean that a number like 12.0 can be cast to 12 during validation, I am not sure whether the standard itself provides for that.

XSD has a type hierarchy (see https://www.w3.org/TR/xmlschema-2/#built-in-datatypes), I think JSON schema may have one for integer & number but not the rest.

Typecasting for complex types: Not sure what the example can be here

An example can be any hierarchy again

[Image: type hierarchy example]

One can say I expect Person and at runtime I can say it is a Student.

Simple value with attribute: This was not part of some of the beginner tutorials but of course I see the uses. I am also surprised by the conversion but I think that this can have different results in JSON Schema since attribute is simply not supported in JSON. However, the same knowledge can be represented and we might have to prescribe how it should be done.

Yes, probably.

Conflicting names: Maybe depending on the result of solving the attribute description, this can be fine.

Mhh, since JSON has no attributes this is tricky ... maybe

Sequence of elements: I think this is a big deal since nothing in JSON Schema can constrain the order of object keys since there is no need to do so in JSON. A custom keyword with a JSON Schema vocabulary would be possible...

Agree

Sequence of elements, part 2: Well this is annoying since the key's value can have a different type based on its location. Even if we add a custom keyword to ensure order in JSON Schema objects, we cannot have two keys with the same name (or if we do, most JSON parsers take the second one)

I know.

any and anyAttribute: I think I did not understand this, can you elaborate?

Essentially in XSD one can say I expect some attribute (or element). As said similar to "additionalProperties" in JSON Schema. The difference is that I can say the any MUST be of a certain namespace... e.g., the attribute must be in the context "https://w3id.org/saref#" only. Having said that, other attributes are not allowed.
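
To illustrate the namespace restriction, here is a minimal Python sketch of the check a consumer would have to perform itself, since "additionalProperties" has no notion of namespaces. The function name and the sample `sensor` documents are assumptions; the sketch relies on ElementTree storing namespaced attribute names as `{namespace}name`.

```python
import xml.etree.ElementTree as ET

SAREF = "https://w3id.org/saref#"


def attributes_in_namespace(elem, namespace):
    """True if every attribute on the element belongs to the given namespace."""
    prefix = "{" + namespace + "}"
    # xmlns declarations are not listed in elem.attrib, only real attributes.
    return all(name.startswith(prefix) for name in elem.attrib)


allowed = ET.fromstring(
    '<sensor xmlns:s="https://w3id.org/saref#" s:unit="C"/>'
)
rejected = ET.fromstring(
    '<sensor xmlns:s="https://w3id.org/saref#" s:unit="C" other="x"/>'
)

allowed_ok = attributes_in_namespace(allowed, SAREF)
rejected_ok = attributes_in_namespace(rejected, SAREF)
```

An element without any attributes also passes this check, because `all()` over an empty collection is true.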

My assessment of the situation: In X percent of the cases we can have a JSON Schema representation. To me that X feels like 80% but then it is my opinion with not much experience with XML-based APIs.

I think the percentage is even higher.. but not sure either.

@egekorkan egekorkan added the xml label Jun 8, 2022
@egekorkan
Contributor

Call from 02.11:

  • https://www.w3.org/XML/EXI/docs/json/exi-for-json.html can help with the conversion to XML on the wire. We can use it for XML generation by referring to it.
  • We can describe most XML payloads with JSON Schema, but there are cases where XML (Schema) is more powerful. A generic subset can be defined in which we underspecify the payloads.
  • In both cases (generation and description), we can list the corner cases.
  • One easy solution would be to describe the generation of json from XML and then we describe the JSON payload with JSON Schema. So there is never an XML Schema validation. This would be the same for other payload formats (text, CBOR, etc.). The PR Introduce draft for text binding template #140 can be referred to.
  • @danielpeintner will do a first proposal
  • @sebastiankb: Taking an example of devices exchanging XML data would be beneficial. ISO/IEC 15118, OCPP and OPC UA (one serialization option) are standards that use it. People interested in those standards would be interested in this.

@danielpeintner
Contributor

Some initial findings / thoughts in a Gist

@egekorkan egekorkan mentioned this issue Feb 1, 2023
9 tasks
@danielpeintner
Contributor

Personally I still see XML used in APIs nowadays (even though I don't have a good use-case to share). Anyhow I must admit that in most of the cases JSON is used.

Does this mean allowing to describe XML payloads is no longer a valid use case? I don't think so...

FYI: RAML (the API modelling language) allows for both formats, JSON and XML.

@danielpeintner danielpeintner added the Has Use Case Potential The use case can be extracted and explained label Jan 29, 2024
@egekorkan egekorkan added data mapping discussions on data mapping concepts Selected for Use Case labels Feb 5, 2024
@egekorkan
Contributor

Given that we have data mapping in our charter, I am adding selected label.

3 participants