9

I'm looking at solutions to convert JSON to a CSV. It seems most of the solutions expect the JSON to be a single object rather than an array of objects.

All the solutions I've tried from here seem to break with my input, which comes from curling this site.

How can I convert the JSON into a CSV with jq or another tool when the input is an array instead of an object.

[ { "id": "4", "link": "https://pressbooks.online.ucf.edu/amnatgov/", "metadata": { "@context": "http://schema.org", "@type": "Book", "name": "American Government", "inLanguage": "en", "copyrightYear": "2016", "disambiguatingDescription": "The content of this textbook has been developed and arranged to provide a logical progression from the fundamental principles of institutional design at the founding, to avenues of political participation, to thorough coverage of the political structures that constitute American government. The book builds upon what students have already learned and emphasizes connections between topics as well as between theory and applications. The goal of each section is to enable students not just to recognize concepts, but to work with them in ways that will be useful in later courses, future careers, and as engaged citizens. ", "image": "https://pressbooks.online.ucf.edu/app/uploads/sites/4/2020/01/American-Government.png", "isBasedOn": "https://ucf-dev.pb.unizin.org/pos2041", "author": [ { "@type": "Person", "name": "OpenStax" } ], "datePublished": "2016-01-06", "copyrightHolder": { "@type": "Organization", "name": "cnxamgov" }, "license": { "@type": "CreativeWork", "url": "https://creativecommons.org/licenses/by/4.0/", "name": "CC BY (Attribution)" } }, "_links": { "api": [ { "href": "https://pressbooks.online.ucf.edu/amnatgov/wp-json/" } ], "metadata": [ { "href": "https://pressbooks.online.ucf.edu/amnatgov/wp-json/pressbooks/v2/metadata" } ], "self": [ { "href": "https://pressbooks.online.ucf.edu/wp-json/pressbooks/v2/books/4" } ] } } ] 

Desired Format:

id, link, context, type, name, inLanguage, image, author_type, author_name, license_type, license_url, license_name 
7
  • What is your expected output?
    – jesse_b
    CommentedMar 1, 2020 at 15:22
  • @jesse_b a CSV with each value with metadata flattened (so that name, copyright, ect, are all csv rows).CommentedMar 1, 2020 at 15:41
  • 2
    @PhilipKirkbride, considering your multi-leveled input structure, it would be better if you posted the expected csv output - adding it to your questionCommentedMar 1, 2020 at 16:06
  • 1
    @PhilipKirkbride "author" is an array. How would you like to handle multiple authors?
    – Kusalananda
    CommentedMar 1, 2020 at 17:59
  • 1
    @PhilipKirkbride See update to my answer.
    – Kusalananda
    CommentedMar 1, 2020 at 20:35

2 Answers 2

17

The issue is not really that the JSON that you show is an array, but that each element of the array (of which you only have one) is a fairly complex structure. It is straight forward to extract the relevant data from each array entry into a shorter flat array and convert that into CSV with @csv in jq:

jq -r '.[] | [ .id, .link, .metadata."@context", .metadata."@type", .metadata.name, .metadata.inLanguage, .metadata.image, .metadata.author[0]."@type", .metadata.author[0].name, .metadata.license."@type", .metadata.license.url, .metadata.license.name ] | @csv' file.json 

... but notice how I'm forced to decide that we're only ever interested in the first author (the .metadata.author sub-structure is an array).

The output:

"4","https://pressbooks.online.ucf.edu/amnatgov/","http://schema.org","Book","American Government","en","https://pressbooks.online.ucf.edu/app/uploads/sites/4/2020/01/American-Government.png","Person","OpenStax","CreativeWork","https://creativecommons.org/licenses/by/4.0/","CC BY (Attribution)" 

To create author name strings that are concatenations of all author names (and similarly for author types), with ; as delimiter, you could instead of .metadata.author[0].name in the above use [.metadata.author[].name]|join(";") (and [.metadata.author[]."@type"]|join(";") for the type), so that your command becomes

jq -r '.[] | [ .id, .link, .metadata."@context", .metadata."@type", .metadata.name, .metadata.inLanguage, .metadata.image, ( [ .metadata.author[]."@type" ] | join(";") ), ( [ .metadata.author[].name ] | join(";") ), .metadata.license."@type", .metadata.license.url, .metadata.license.name ] | @csv' file.json 
2
  • This is most elegant pure jq solution I've seen for generating CSV in bash, thanks a lot!
    – Majky
    CommentedDec 9, 2020 at 12:14
  • @Majky Note that this has nothing whatsoever to do with bash or any other shell. :-) But thanks.
    – Kusalananda
    CommentedJan 23, 2021 at 17:38
11

Using Miller (https://github.com/johnkerl/miller) you can "flat" the JSON, running

mlr --j2c cat input.json >output.csv 
+----+---------------------------------------------+-------------------+----------------+---------------------+---------------------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------+---------------------------------------+-------------------------+------------------------+------------------------+--------------------------------+-------------------------------+------------------------+----------------------------------------------+-----------------------+-----------------------------------------------------+---------------------------------------------------------------------------+-----------------------------------------------------------------+ | id | link | metadata:@context | metadata:@type | metadata:name | metadata:inLanguage | metadata:copyrightYear | metadata:disambiguatingDescription | metadata:image | metadata:isBasedOn | metadata:author:0:@type | metadata:author:0:name | metadata:datePublished | metadata:copyrightHolder:@type | metadata:copyrightHolder:name | metadata:license:@type | metadata:license:url | metadata:license:name | _links:api:0:href | _links:metadata:0:href | _links:self:0:href | +----+---------------------------------------------+-------------------+----------------+---------------------+---------------------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------+---------------------------------------+-------------------------+------------------------+------------------------+--------------------------------+-------------------------------+------------------------+----------------------------------------------+-----------------------+-----------------------------------------------------+---------------------------------------------------------------------------+-----------------------------------------------------------------+ | 4 | https://pressbooks.online.ucf.edu/amnatgov/ | http://schema.org | Book | American Government | en | 2016 | The content of this textbook has been developed and arranged to provide a logical progression from the fundamental principles of institutional design at the founding, to avenues of political participation, to thorough coverage of the political structures that constitute American government. The book builds upon what students have already learned and emphasizes connections between topics as well as between theory and applications. The goal of each section is to enable students not just to recognize concepts, but to work with them in ways that will be useful in later courses, future careers, and as engaged citizens. | https://pressbooks.online.ucf.edu/app/uploads/sites/4/2020/01/American-Government.png | https://ucf-dev.pb.unizin.org/pos2041 | Person | OpenStax | 2016-01-06 | Organization | cnxamgov | CreativeWork | https://creativecommons.org/licenses/by/4.0/ | CC BY (Attribution) | https://pressbooks.online.ucf.edu/amnatgov/wp-json/ | https://pressbooks.online.ucf.edu/amnatgov/wp-json/pressbooks/v2/metadata | https://pressbooks.online.ucf.edu/wp-json/pressbooks/v2/books/4 | +----+---------------------------------------------+-------------------+----------------+---------------------+---------------------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------+---------------------------------------+-------------------------+------------------------+------------------------+--------------------------------+-------------------------------+------------------------+----------------------------------------------+-----------------------+-----------------------------------------------------+---------------------------------------------------------------------------+-----------------------------------------------------------------+ 

and then extract the fields you want and renaming them using

mlr -I --csv cut -f "id","link","metadata:@context","metadata:@type","metadata:name","metadata:inLanguage","metadata:image","metadata:author:0:@type","metadata:author:0:name","metadata:license:@type","metadata:license:url","metadata:license:name" \ then label id,link,context,type,name,inLanguage,image,author_type,author_name,license_type,license_url,license_name output.csv 

The output will be

+----+---------------------------------------------+-------------------+------+---------------------+------------+---------------------------------------------------------------------------------------+-------------+-------------+--------------+----------------------------------------------+---------------------+ | id | link | context | type | name | inLanguage | image | author_type | author_name | license_type | license_url | license_name | +----+---------------------------------------------+-------------------+------+---------------------+------------+---------------------------------------------------------------------------------------+-------------+-------------+--------------+----------------------------------------------+---------------------+ | 4 | https://pressbooks.online.ucf.edu/amnatgov/ | http://schema.org | Book | American Government | en | https://pressbooks.online.ucf.edu/app/uploads/sites/4/2020/01/American-Government.png | Person | OpenStax | CreativeWork | https://creativecommons.org/licenses/by/4.0/ | CC BY (Attribution) | +----+---------------------------------------------+-------------------+------+---------------------+------------+---------------------------------------------------------------------------------------+-------------+-------------+--------------+----------------------------------------------+---------------------+ 

    You must log in to answer this question.

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.