1

I've got an array that contains duplicate items, e.g.

THE_LIST=( "'item1' 'data1 data2'" "'item1' 'data2 data3'" "'item2' 'data4'" ) 

Based on the above, I want to create an associative array that would assign itemN as key and dataN as value.

My code iterates over the list, and assigns key => value like this (the additional function is shortened, as it performs some additional jobs on the list):

function get_items(){
    KEY=$1
    VALUES=()
    shift $2
    for VALUE in "$@"; do
        VALUES[${#VALUES[@]}]="$VALUE"
    done
}

declare -A THE_LIST
for ((LISTID=0; LISTID<${#THE_LIST[@]}; LISTID++)); do
    eval "LISTED_ITEM=(${THE_LIST[$LISTID]})"
    get_items "${LISTED_ITEM[@]}"
    THE_LIST=([$KEY]="${VALUES[@]}")
done

When I print the array, I get something like:

item1: data1 data2
item1: data2 data3
item2: data4

but instead, I want to get:

item1: data1 data2 data3
item2: data4

I cannot find a way to merge the duplicate keys and also remove the duplicate values for each key.

What would be the approach here?

UPDATE

The actual code is:

THE_LIST=(
    "'item1' 'data1 data2'"
    "'item1' 'data2 data3'"
    "'item2' 'data4'"
)

function get_backup_locations () {
    B_HOST="$2"
    B_DIRS=()
    B_DIR=()
    shift 2
    for B_ITEM in "$@"; do
        case "$B_ITEM" in
            -*) B_FLAGS[${#B_FLAGS[@]}]="$B_ITEM" ;;
            *)  B_DIRS[${#B_DIRS[@]}]="$B_ITEM" ;;
        esac
    done
    for ((B_IDX=0; B_IDX<${#B_DIRS[@]}; B_IDX++)); do
        B_DIR=${B_DIRS[$B_IDX]}
        ...do stuff here...
    done
}

function get_items () {
    for ((LOCIDY=0; LOCIDY<${#LOCATIONS[@]}; LOCIDY++)); do
        eval "LOCATION=(${LOCATIONS[$LOCIDY]})"
        get_backup_locations "${LOCATION[@]}"
        THE_LIST=([$B_HOST]="${B_DIR[@]}")
    done | sort | uniq
}

When I print the array with:

for i in "${!THE_LIST[@]}"; do
    echo "$i : ${THE_LIST[$i]}"
done

I get:

item1: data1 data2
item1: data2 data3
item2: data4
  • Your code as given won't work at all - THE_LIST is already a normal array, so you can't redeclare it as an associative array, and even if you could, you're overwriting it each time in the loop with THE_LIST=([$KEY]="${VALUES[@]}").
    – muru
    Commented Jun 13, 2019 at 7:34
  • @muru, so, by what you're saying, I cannot convert an array into associative array, or just not this way?
    – Bart
    Commented Jun 13, 2019 at 7:41
  • I'm saying that the code has no relation to the output that you say you're getting.
    – muru
    Commented Jun 13, 2019 at 7:42
  • This is not helping your question, but have you taken a look at python? Complex stuff like this is often easy as hell in python.
    – Panki
    Commented Jun 13, 2019 at 7:52
  • @Panki, yes, Python or perl might be better approach here, however, I'm adding additional feature to an existing bash script, thus the pain.. it's simply too large to rewrite the whole thing in time. if I don't find a way, I may just as well use another language for the task.
    – Bart
    Commented Jun 13, 2019 at 7:54
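
The two problems raised in the first comment can be reproduced in isolation. The following is only a minimal sketch; the names plain, map_a, map_b and convert_status are made up for illustration and do not appear in the question's script.

```shell
#!/usr/bin/env bash
# Illustrative names only (plain, map_a, map_b, convert_status);
# none of them come from the question's script.

plain=( "a" "b" )

# Bash refuses to turn an existing indexed array into an associative one,
# so this declare returns a non-zero status.
if ! declare -A plain 2>/dev/null; then
    convert_status="refused"
fi

declare -A map_a map_b

# Compound assignment replaces the whole array on every pass.
for k in item1 item2; do
    map_a=( [$k]="x" )
done

# Element assignment adds or updates one key and keeps the others.
for k in item1 item2; do
    map_b[$k]="x"
done

echo "conversion: ${convert_status:-allowed}"
echo "map_a has ${#map_a[@]} key(s), map_b has ${#map_b[@]} key(s)"
```

After the loops, map_a should hold only the last key while map_b holds both, which is why the result needs to be collected in a separately named associative array using per-element assignment.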

2 Answers

1

If the keys and values are guaranteed to be purely alphanumerical, something like this might work:

declare -A output

make_list() {
    local IFS=" "
    declare -A keys                    # variables declared in a function are local by default
    for i in "${THE_LIST[@]}"
    do
        i=${i//\'/}                    # since everything is alphanumeric, the quotes are useless
        declare -a keyvals=($i)        # split the entry, filename expansion isn't a problem
        key="${keyvals[0]}"            # get the first value as the key
        keys["$key"]=1                 # and save it in keys
        for val in "${keyvals[@]:1}"
        do                             # for each value
            declare -A "$key[$val]=1"  # use it as the index to an array.
        done                           # Duplicates just get reset.
    done

    for key in "${!keys[@]}"
    do                                 # for each key
        declare -n arr="$key"          # get the corresponding array
        output["$key"]="${!arr[*]}"    # and the keys from that array, deduplicated
    done
}

make_list
declare -p output                      # print the output to check

With the example input, I get this output:

declare -A output=([item1]="data3 data2 data1" [item2]="data4" ) 

The data items are out of order, but deduplicated.


Might be best to use Python with the csv module instead.

  • that does the job after some adjustments as bash version on target machine doesn't support namerefs declaration. thanks for pointers!
    – Bart
    Commented Jun 13, 2019 at 9:36
  • @Bart I'm curious: how did you fix that?
    – muru
    Commented Jun 13, 2019 at 10:28
  • I was too fast being cheerful. this indeed works on a newer bash, but what I thought was a workaround, didn't work out well in the end. I'll most probably rewrite the script, worst case, put an RFC to update bash on a server ;)
    – Bart
    Commented Jun 13, 2019 at 10:46
  • @Bart are you on Bash 4.2, or something older?
    – muru
    Commented Jun 13, 2019 at 10:54
  • that's bash 4.2.46
    – Bart
    Commented Jun 13, 2019 at 11:17
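
For the Bash 4.2 situation discussed in the comments above, one possible nameref-free variant is sketched below. It is not the answer's method: it swaps the per-key arrays and declare -n for a second associative array (called seen here) that records which key/value pairs have already been stored. The names merged, seen, fields and pair are assumptions made for this example, and the sketch relies on the same premise as the answer, namely that keys and values contain no whitespace or special characters. It uses only associative arrays, += and array slicing, which should not require Bash 4.3, but it has not been checked against 4.2.46 specifically.

```shell
#!/usr/bin/env bash
# Sketch only: assumes keys and values are plain alphanumeric words.
THE_LIST=( "'item1' 'data1 data2'" "'item1' 'data2 data3'" "'item2' 'data4'" )

declare -A merged   # key -> space-separated unique values (hypothetical name)
declare -A seen     # "key value" -> 1, used only for deduplication

for entry in "${THE_LIST[@]}"; do
    entry=${entry//\'/}               # drop the single quotes
    read -r -a fields <<< "$entry"    # split on whitespace, no globbing
    key=${fields[0]}
    for val in "${fields[@]:1}"; do
        pair="$key $val"
        if [[ -z ${seen[$pair]} ]]; then
            seen[$pair]=1
            # Prefix a space only when the key already has a value.
            merged[$key]+="${merged[$key]:+ }$val"
        fi
    done
done

for key in "${!merged[@]}"; do
    echo "$key: ${merged[$key]}"
done
```

Because values are appended in the order they are first encountered, this variant keeps the original value order; the order in which the keys are printed is still unspecified.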
1

If there is no whitespace in any of the values, this solution might work. Use awk associative arrays to build up declare -A commands.

#!/bin/bash

THE_LIST=(
    "'item1' 'data1 data2'"
    "'item1' 'data2 data3'"
    "'item2' 'data4'"
)

eval "$(\
for i in "${THE_LIST[@]}"; do
    row=($(eval echo $i))
    echo "${row[@]}"
done |
    awk '{
        for (i=2; i<=NF; i++)
            if (seen[$1] !~ $i) {
                seen[$1]=seen[$1]$i" "
            }
    }
    END {
        for (s in seen)
            print "declare -A new_list["s"]=\""seen[s]
    }' |
    sed 's/[[:space:]]*$/"/'
)"

for i in "${!new_list[@]}"; do
    echo "$i: ${new_list[$i]}"
done

This prints:

item2: data4
item1: data1 data2 data3

The order of the values is preserved, but the keys are reordered. I couldn't figure out how to trim the trailing whitespace of an array entry in awk so I just used sed to replace it with a quote, but it's already a total hack to begin with.
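
On the trailing-whitespace point, awk's sub() accepts an array element as its target, so the trimming can happen inside the END block. The sketch below is one possible variation rather than a drop-in fix: it trims with sub(), replaces the eval with a while read loop fed by tab-separated lines, and deduplicates with an exact (key, value) lookup in a helper array named dup, since the !~ test treats the value as a regular expression and would also match substrings (data1 against data10, for example). The name dup and the tab-separated output format are assumptions of this example, and it still presumes values without whitespace.

```shell
#!/usr/bin/env bash
# Sketch only: assumes keys and values contain no whitespace or tabs.
THE_LIST=( "'item1' 'data1 data2'" "'item1' 'data2 data3'" "'item2' 'data4'" )

declare -A new_list

# awk prints one "key<TAB>values" line per key; read splits on the tab.
while IFS=$'\t' read -r key vals; do
    new_list[$key]=$vals
done < <(
    printf '%s\n' "${THE_LIST[@]}" | tr -d "'" |
    awk '{
        for (i = 2; i <= NF; i++)
            if (!(($1, $i) in dup)) {     # exact match on (key, value)
                dup[$1, $i] = 1
                seen[$1] = seen[$1] $i " "
            }
    }
    END {
        for (s in seen) {
            sub(/ +$/, "", seen[s])       # trim trailing spaces in place
            print s "\t" seen[s]
        }
    }'
)

for i in "${!new_list[@]}"; do
    echo "$i: ${new_list[$i]}"
done
```

As in the original, the value order is preserved and the key order depends on how awk and Bash iterate their arrays.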
