0

I have json mapping under properties key as below in a file Customer.json,

{ "customer": { "properties": { "customerId": { "type": "string", "index": "not_analyzed" }, "name": { "type": "string", "index": "not_analyzed" } } } } 

Which I want to convert into following with key and displayName duplicated and type from above mapping,

field(key: 'customerId', displayName: 'customerId', type: 'String') field(key: 'name', displayName: 'name', type: 'String') 

I hit and tried bash + python as below assuming it fetched customer key first and loops inside properties ,

$ cat Customer.json | python -c 'import sys; import simplejson as json; \ print "\n".join( [i["properties"] for i in json.loads( sys.stdin.read() )["customer"]] )' Traceback (most recent call last): File "<string>", line 2, in <module> TypeError: string indices must be integers, not str 

I'm open to other solutions as well.

    3 Answers 3

    1

    Parsing such structured data is best done using a dedicated parser, just as you are doing. However, in this particular case, it is simple enough that you could do:

    $ grep -B 1 '"type":' Customer.json | tr $'"' $"'" | sed 's/[:,]//g' | awk '{print "field(key: "$1",\tdisplayName: "$1",\t type: "$NF")"}' RS="--" 

    Which returns:

    field(key: 'customerId', displayName: 'customerId', type: 'string') field(key: 'name', displayName: 'name', type: 'string') 
      1

      The error seems clear to me, the variable "i" is a string as the for loop iterates over the value of "customer" dict/mapping. That value is a dict/mapping itself and iterating over it will get you the succesion of the list of keys (i.e. ["properties"].

      cat Customer.json | python -c 'import sys; import simplejson as json; \ print "\n".join( [i for i in json.loads( sys.stdin.read() )["customer"]["properties"] ] )' 

      will give you

       customerid name 

      and the following should get you closer to your goal:

      cat Customer.json | python -c 'import sys; import simplejson as json; \ print "\n".join( ["{} {}".format(k, v) for k,v in json.loads( sys.stdin.read() )["customer"]["properties"].iteritems() ] )' 

      which gives:

      customerId {'index': 'not_analyzed', 'type': 'string'} name {'index': 'not_analyzed', 'type': 'string'} 

      From there I would recommend you actually make the python in a script. You will have to decide on how to get from string to String and do the additional formatting. A multiliner is always more easy to debug (case proven by your Question) and maintainable and gives more meaningful (line numbers) error messages.

        0

        With jq:

        jq -r '.customer.properties | to_entries[] | "field(key: \u0027\(.key)\u0027, displayName: \u0027\(.key)\u0027, type: \u0027\(.value.type)\u0027)"' Customer.json 

        This turns the properties values into a list objects with keys key and value. Each key will have the value of the properties keys (i.e., customerId and name), and each value will have the value that is the properties sub-object.

        Given one such entry, the key value and the type value of the value entry are extracted and inserted into the text string template. Each \u0027 that you see in the string is a single-quote character.

        The output of this, given your example input file, would be

        field(key: 'customerId', displayName: 'customerId', type: 'string') field(key: 'name', displayName: 'name', type: 'string') 

          You must log in to answer this question.

          Start asking to get answers

          Find the answer to your question by asking.

          Ask question

          Explore related questions

          See similar questions with these tags.