Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ordered output/file writing of JSONObjects (may be repeated, nevertheless please read carefully ...) #822

Open
openworld42 opened this issue Oct 27, 2023 · 13 comments

Comments

@openworld42
Copy link

openworld42 commented Oct 27, 2023

A complete summary of the main issues of the comments can be found here.

In one of my repositories I am going to save the simulation state (lots of data) in a JSON file, which is shorter and can be easier changed/debugged than XML (which I used a lot in past projects). To do this in a realistic way, there is a need of ordering (similar to XML).

First of all: I had a deep dive into the code, made a fork (for my own); and yes, I read the FAQ and other files, especially "Will ordering ever be allowed ...", and I know the (IMHO unfortunate) spec "unordered" ;-)

In any case, I will use JSON-java. It suits the needs close to perfect, pretty printing, easy usage, functionality, light weight, etc.
Therefore, I will add ordering in the fork.

My suggestion: I can contribute the ordering to JSON-java - as far as I can understand not breaking the interfaces.
Due to the fact, that this a lot more work than doing it for my repo needs, I want to discuss the item and get possibly an ok for it (Assuming my code will please so ;-)

As far as I can see, there are several possibilities to add ordering not breaking the interfaces - if this is not rejected in general for some reasons:

  1. Add a class member keyList (ArrayList) to JSONObject - which is null by default, not a big memory consumption. In a Context class (or a static variable/setter within JSONObject: let say setOrdering(true) before the user takes action) the keyList gets a small ArrayList. If the keyList or flag exists/is set, all put(), accumulate(), populate()-dependent ops will add the key(s) to that keyList - this preserves ordering, remove ops have to take care of keyList. All get(), keys(), keySet(), entrySet(), ... ops will return ordered items, cause the keys will come in an ordered way or Set from keyList. All map-dependent ops will still work in an unordered way.
    AFAIK, no interface is broken, all existing code works like in the past, old output/files/Strings are still the same - e.g. a previous JSONObject.setOrdered(true) makes things ordered. Some work for me ;-)

  2. Extend JSONObject to let say JSONObjectOrdered, add a member keyList (ArrayList) and override most of the above methods, using the keyList to retrieve any key, entrySet etc. Possibly some private methods have to be copied (e.g. private wrap() - not a smart solution).
    In that case, other classes may undergo a lot of changes like: isInstanceOf JSONObjectOrdered.
    Any ordered attempt can be made by using JSONObjectOrdered instead of JSONObject to generate the JSON tree.
    Default behaviour does not change, no interface is broken, but it looks a little bit unaesthetic. Anyone who wants to write ordered uses JSONObjectOrdered. Not my preferred solution for a public framework, but I may use this for my own repo if a contribution is not within the projects meaning. (Far less work for me).

  3. Any of your suggestions, better than above - please tell.

My 2 cents for the public, please respond.
(You may also contact me on the mail address found at the Github user openworld42)

@openworld42
Copy link
Author

openworld42 commented Oct 28, 2023

Some hours, some coding and the unit tests later: I implemented suggestion 1) including javadoc on the JSON-java fork https://github.com/openworld42/JSON-java on branch master:

To demonstrate the ordered usage: before using JSONObject a call of

JSONObject.setOrdered(true);

will set all direct or indirect JSONObject creation to be ordered (in the same order as the adding of JSON key/value pairs/objects/JSONArray/... etc).

A simple default package class Main.java (will be deleted later) demonstrates the ordered usage. In addition, I checked all the references and methods of JSONObject and the other classes. If setOrdered(true) is not called, the behaviour is still the same as before.

There are only a few things that have still built-in ordering: e.g. JSONObject(Map m) or JSONObject(Object bean) where the ordering is based on reflection.

Please check it out and tell me your opinions.

@rudrajyotib
Copy link
Contributor

@openworld42 - can you please elaborate your use case. If it is that the generated JSON file be written with keys in sorted manner, or if it is that you want to read the keys using list API and expect a sorted list, it would be better to implement the same either in toString or the getNames/keys/keySet methods. If the need is to read in sorted manner, it would be safer to not touch the storage mechanism.

@johnjaylward
Copy link
Contributor

johnjaylward commented Oct 30, 2023

I don't hate the implementation, but I think it could be a little more flexible overall. Have you tested this parsing a nested JSON object?

{
  "key1":"some string",
  "key2":{
    "key3":"child object string value",
    "key4":123,
    "key5":{
      "a":"a",
      "b":"b",
      "c":"c"
    },
    "key6":true
  },
  "key7":654,
  "key8":false,
  "key9":true
}

I feel like any changes here would need to also be reflected in the JSONTokener class to ensure full tokenization maintains order.

@openworld42
Copy link
Author

openworld42 commented Oct 30, 2023

@rudrajyotib

@openworld42 - can you please elaborate your use case. If it is that the generated JSON file be written with keys in sorted manner, or if it is that you want to read the keys using list API and expect a sorted list, it would be better to implement the same either in toString or the getNames/keys/keySet methods. If the need is to read in sorted manner, it would be safer to not touch the storage mechanism.


I was talking about sorted output (IMHO the FAQ "ordering" was talking about that too). If one uses JSON-java only to pass messages between server/client or similar, ordering is not important - the algorithm will find all keys/values, regardless of ordering it or not, due to key access.

But if one writes it to a file/string/something that is displayed or debugged/developed: you never get it in the same ordering (due to Map access inside) as you created it, cause of hashed access.
This makes it difficult to look at or change at larger JSON structures, and users looking at those files will think something like "who the hell made the mess of this file?". Yes, I know, it's the spec.
What I have done with this mod: any output is in the same order as the JSON structure/tree was created if one calls JSONObject.setOrdered(true) before starting creation of the structure. This does not change the structure itself nor touches it the storage mechanism: it's still the Map.
The mechanism of ordering it has been described above: if one wants it to be ordered, a list of the keys will be created/maintained in the same order of adding JSON objects. Thus, finding the sub-elements/keys
comes ordered(e.g. returning the ordered keys, if you look at the code). Nothing else is changed, if one doesn't call JSONObject.setOrdered(true) the behaviour is still the same: it also passed the unit tests.

Last but not least, the cost (nothing is in vain ;-) : if not using ordering but it has been built in: one null reference to an ArrayList (member) for each JSONObject (usually 8 bytes). Performance: for most key accesses one "if (isOrdered) ... else "as before".
If it is ordered - I already implemented the code for myself: an ArrayList for each JSONObject. Each key within the JSONObject is added at the end of that list, e.g. for a put(). Performance: as above "if (isOrdered) ... else ; Key set ops: LinkedHashSet instead of map.keySet(); remove() - the drawback: linear search and remove, move the sublist part behind the key one backwards within the list, usually at the cost of size() / 2.

To make it short: I did NOT touch the storage algorithms, I just maintain a list inside to return the keys in an ordered way, if so. JSONObject is the only modified class.

My personal use case: maintain lots of structured data in a text file (for obvious reasons ordered) for an application I am writing on. I already did it in a (personal) copy of JSON-java, and it works already. I was just asking if there is a will to add it for the public. As I understand the FAQ, there are more people out there with the same need (has two items within the FAQ), other JSON frameworks have more complex ways to order it. Code of my JSONObject mod can be compared to the original JSON-java JSONObject to see the differences - and feel free to ask me anything about the mod.

@openworld42
Copy link
Author

@johnjaylward

  1. Have you tested this parsing a nested JSON object?

Yes, (of course ;-)

I put some nesting in the fork in Main.java, and commented out test1() for test2(): please look at the end of this comment, output has been included. My own application has much deeper structures, and works too. In my above comment (@rudrajyotib) I explained it a little bit more.

  1. ... changes here would need to also be reflected in the JSONTokener ...

No: JSONTokener - besides parsing and String ops - only creates JSONObject and JSONArray. The JSONObject creation leads to a for() ... put(); therefore, the order (of an ordered) JSONObject will be the order the parsing JSONTokener comes along - as far as I can see the same order as within the tokenized/parsed string/InputStream.
I only maintain the keys in the order they are put() to the JSONObjects by JSONTokener.

To be more precise: there are parts of code in JSONObject, where an intended order cannot be detected (e.g. new JSONObject(map), no way to get another order than the hashed map.Keys() or keySet(); new JSONObject(bean) is another example). But if one creates a JSONObject from a map, ordering is not expected (usually, 😄). One item we can/should discuss and is not build now within my mod: new JSONObject(jsonObject) should take the ordering of the embedded object, if set.
new JSONObject(jsonObject, String ... names) supports ordering in the mod, due to the "names" iteration.

--- code nested objects, ordered output using JSONObject.setOrdered(true) below (unordered output at last - the "messy" one)

    public void test2() {
    	JSONObject obj1 = new JSONObject()
                .put("key1", "abc")
                .put("key2", 1)
                .put("key3", "def");
        JSONObject obj2 = new JSONObject()
                .put("key21", "ghi")
                .put("key22", 2)
                .put("key23", "jlm");
        JSONObject obj3 = new JSONObject()
                .put("key31", "nop")
                .put("key32", 3)
                .put("key33", obj1);
        List jsonObjects = new ArrayList(
                Arrays.asList(obj1, obj2, obj3)
        );
        JSONObject obj4 = new JSONObject()
                .put("key41", "qrt")
                .put("key42", 2.0)
                .put("key43", obj3)
        	.put("key44", jsonObjects)
        	.put("key45", new JSONArray(jsonObjects));
        JSONObject test = new JSONObject()
        	.put("obj1", obj1)
        	.put("obj2", obj2)
        	.put("obj3", obj3)
        	.put("wholeTree of obj4", obj4);
        System.out.println(test.toString(2));
}

--- ordered output, indentation of 2
{
  "obj1": {
    "key1": "abc",
    "key2": 1,
    "key3": "def"
  },
  "obj2": {
    "key21": "ghi",
    "key22": 2,
    "key23": "jlm"
  },
  "obj3": {
    "key31": "nop",
    "key32": 3,
    "key33": {
      "key1": "abc",
      "key2": 1,
      "key3": "def"
    }
  },
  "wholeTree of obj4": {
    "key41": "qrt",
    "key42": 2,
    "key43": {
      "key31": "nop",
      "key32": 3,
      "key33": {
        "key1": "abc",
        "key2": 1,
        "key3": "def"
      }
    },
    "key44": [
      {
        "key1": "abc",
        "key2": 1,
        "key3": "def"
      },
      {
        "key21": "ghi",
        "key22": 2,
        "key23": "jlm"
      },
      {
        "key31": "nop",
        "key32": 3,
        "key33": {
          "key1": "abc",
          "key2": 1,
          "key3": "def"
        }
      }
    ],
    "key45": [
      {
        "key1": "abc",
        "key2": 1,
        "key3": "def"
      },
      {
        "key21": "ghi",
        "key22": 2,
        "key23": "jlm"
      },
      {
        "key31": "nop",
        "key32": 3,
        "key33": {
          "key1": "abc",
          "key2": 1,
          "key3": "def"
        }
      }
    ]
  }
}

--- unordered, output as the origin, indentation 2
{
  "obj2": {
    "key23": "jlm",
    "key22": 2,
    "key21": "ghi"
  },
  "obj1": {
    "key1": "abc",
    "key2": 1,
    "key3": "def"
  },
  "wholeTree of obj4": {
    "key45": [
      {
        "key1": "abc",
        "key2": 1,
        "key3": "def"
      },
      {
        "key23": "jlm",
        "key22": 2,
        "key21": "ghi"
      },
      {
        "key33": {
          "key1": "abc",
          "key2": 1,
          "key3": "def"
        },
        "key32": 3,
        "key31": "nop"
      }
    ],
    "key44": [
      {
        "key1": "abc",
        "key2": 1,
        "key3": "def"
      },
      {
        "key23": "jlm",
        "key22": 2,
        "key21": "ghi"
      },
      {
        "key33": {
          "key1": "abc",
          "key2": 1,
          "key3": "def"
        },
        "key32": 3,
        "key31": "nop"
      }
    ],
    "key43": {
      "key33": {
        "key1": "abc",
        "key2": 1,
        "key3": "def"
      },
      "key32": 3,
      "key31": "nop"
    },
    "key42": 2,
    "key41": "qrt"
  },
  "obj3": {
    "key33": {
      "key1": "abc",
      "key2": 1,
      "key3": "def"
    },
    "key32": 3,
    "key31": "nop"
  }
}

@rudrajyotib
Copy link
Contributor

@openworld42 - I have kept some comments around the proposed implementation in your forked commits.
@johnjaylward @stleary - I think we can discuss this feature, because this can actually be useful for many, especially to compare the JSONs in serialized form, or using JSON data as list/sequence.

@openworld42
Copy link
Author

@rudrajyotib
I annotated your comments, please read.

Note: I also merged the latest changes of JSON-java into the fork to stay up-to-date.

@openworld42
Copy link
Author

openworld42 commented Nov 4, 2023

Summing up of the present suggestions of an ordered JSONObject output

Note: I came up with the suggestion to supply a feature to get the output of a JSONObject ordered (precisely: the order of insertion into a JSONObject), and user @rudrajyotib and I commented code on a (friendly) fork I made to demonstrate the behaviour. The comments can be found here.

Up to now, there are the following suggestions (in the order of arrival, some additional arguments can be found above):

1. No change for the current behaviour if ordering is not used: a static method JSONObject.setOrdered(boolean), called before any JSONObject usage, if ordering is desired. If the flag is true, JSONObject maintains internally an ArrayList where the order of insertion (keys only) is stored. Output, keySet() and similar operations would then have the order of insertion. Code of a preliminary draft implementation can be found in the fork.

  • Pros
    Does not change the behaviour or interface of current applications, unless one uses JSONObject.setOrdered(true).
    No changes to other classes.

  • Cons
    Somewhat "unaesthetic" approach (the cost of not changing the behaviour or interface). Future changes to the code needs the awareness of the contributor to think about all key-dependent operations - even if using only methods that already exist and are safe to ordering. Around 11 internal if (isOrdered) ... else <code as before> requests.. The risk of maintaining two different storage places for the keys for future coding (the map, the ArrayList keyList): divergence.
    There are situations were ordering is not supported: imagine new JSONObject(hashMap), ordering for such an object is always hash-dependent.

2. Change the class member privat final map (now HashMap<String, Object>()) to LinkedHashMap<String, Object>.

  • Pros
    The easy approach: No changes to other classes, only a few changes to the code. Almost no risks.

  • Cons
    Breaking the "behaviour" interface (not any Java interface). From the next release on a user will get a different output - the order of insertion, like in approach 1., instead of unordered but consistent (hashing) behaviour. If someone out there relies on that behaviour (IMHO not a good design practice, but imaginable), the users application may not work properly any longer as before. (e.g. imagine JSON -> XML, XML parsing applications, using schemas, XSLT, and similar).
    May be the community does not care about it ("If one uses a newer version of JSON-java, output dependent Applications need to be adapted").
    A minor point is both memory and performance of LinkedHashMap over HashMap - no more important nowadays.

3. Create a new class like JSONObjectOrdered extends JSONObject which maintains a LinkedHashMap<String, Object> instead of the current HashMap<String, Object>.

  • Pros
    Just a new class, consistent behaviour to previous versions, no interface broken.
    (Note: member map of JSONObject needs protected access (instead of private final) or a protected setter, or additional constructors to set the map to LinkedHashMap<String, Object>.
    (IMHO for a framework like JSON-java it is a always a good idea to have at least the important members protected (or use protected setters/constructors etc) to enable users to extend framework classes.)

  • Cons
    Needs changes within other classes to decide if JSONObject or JSONObjectOrdered should be created internally (e.g. see JSONTokener#nextValue()). This has some analogy to suggestion number 1.) (e.g. a new constructor JSONTokener(..., boolean isOrdered) or a setter to make it ordered or both), but it will affect other classes - with a fast looking over the code: Cookie, CookieList, HTTP, JSONArray, JSONML, and others - all of them use new JSONObject(). Several class constructors need changes or something similar to distinguish between JSONObject or JSONObjectOrdered.
    Another possible solution: a new class JSONObjectFactory creates JSONObject or JSONObjectOrdered, at least for internal use. Ordering or not ordered can be set for JSONObjectFactory.

4. Do nothing, let the users (like me) do their work according to their needs: to copy/fork the code and modify it .

  • Pros: nothing to do, no risk. Cons: missing functionality (ordered and "un-messy" output).

These are my 2 cents, let me remark:

@rudrajyotib asked for this summary. I politely ask the community and @stleary for a decision to integrate it or not. No other suggestions were brought up. I (openworld42) personally would vote for suggestion number 2., if the users are not unhappy with a changed (ordered) output and behaviour, otherwise for suggestion number 1.

If the decision is yes and there is a majority for one of the suggestions, I am willing to do the work and open a pull request to contribute the code - if there is nobody else out there who wants to do this.
But: it needs a yes and a decision, I don't want to do it in vain.

(You may also contact me on the mail address found at the Github user @openworld42)

@openworld42
Copy link
Author

@jpuerto
Thanks for looking over it.
My Remarks: this is true, but only for suggestion 2. (I stated this within the Cons of 2.).

Suggestion 1. and 3. do not have this drawback. Therefore I commented at the end of the summary, (my vote): " ... I vote for ... number 2., if the users are not unhappy with a changed (ordered) output and behaviour, otherwise for suggestion number 1."
Any comments from others? (Not too many up to now. 😃 )

@mmariuzzo
Copy link

Hi @openworld42,

I've forked in the past the library to use a TreeMap instead of an HashMap to have sorted "toString()" output to help human checks and replicable/easy-comparable strings.

Obvously a TreeMap works only in my scenarios.

I love you proposal and my vote goes to "number 2"

@openworld42
Copy link
Author

Hi @mmariuzzo

thanks for voting and an answer to my proposal.

It seems that there is not a lot of interest on this item - I thought more people would have the need of what you call "help human checks and replicable/easy-comparable strings".

So, after about 2 1/2 months of waiting for votes and answers, a "No changes at this time" label after my offer to contribute, and an outstanding decision: I will wait on that before taking any actions or investing effort in a PR, and invest my time otherwise.

@stleary
Copy link
Owner

stleary commented Jan 6, 2024

@openworld42 Thanks for submitting this issue. I think you have made a good case for ordering. However, the RFC states that objects are unordered, and as a reference app, JSON-Java tries to follow the spec. You have the right idea to fork the repo and make changes that are suitable for your application.

@daswass
Copy link

daswass commented May 11, 2024

Could be simpler than forking if you dont mind a little reflection. Something like the following:

public class OrderedJSONObject extends JSONObject {

    public OrderedJSONObject() {
        super();
        createOrdered();
    }

    public OrderedJSONObject(String jsonString) {
        this();
        fromString(jsonString);
    }

    /**
     * Use reflection to override the underlying data structure
     */
    private void createOrdered() {
        try {
            Field mapField = JSONObject.class.getDeclaredField("map");
            mapField.setAccessible(true);
            mapField.set(this, new LinkedHashMap<>());
        } catch (NoSuchFieldException | IllegalAccessException e) {
            throw new RuntimeException("Unable to preserve JSONObject ordering. Failed to override map.");
        }
    }

    private void fromString(String source) {
        JSONTokener x = new JSONTokener(source);
        if (x.nextClean() != '{') {
            throw x.syntaxError("A JSONObject text must begin with '{'");
        } else {
            while (true) {
                char c = x.nextClean();
                switch (c) {
                    case '\u0000':
                        throw x.syntaxError("A JSONObject text must end with '}'");
                    case '}':
                        return;
                    default:
                        x.back();
                        String key = x.nextValue().toString();
                        c = x.nextClean();
                        if (c != ':') {
                            throw x.syntaxError("Expected a ':' after a key");
                        }

                        this.putOnce(key, x.nextValue());
                        switch (x.nextClean()) {
                            case ',':
                            case ';':
                                if (x.nextClean() == '}') {
                                    return;
                                }

                                x.back();
                                break;
                            case '}':
                                return;
                            default:
                                throw x.syntaxError("Expected a ',' or '}'");
                        }
                }
            }
        }
    }

    @Override
    public String toString(int indentSpaces) {
        return super.toString(indentSpaces);
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants