I made an associative array as follows. To give a few details, the keys refer to specific files because I will be using this array in the context of a larger script (where the directory containing the files will be a getopts argument).

    declare -A BAMREADS
    echo "BAMREADS array is initialized"
    BAMREADS[../data/file1.bam]=33285268
    BAMREADS[../data/file2.bam]=28777698
    BAMREADS[../data/file3.bam]=22388955
    echo ${BAMREADS[@]}
    # Output: 22388955 33285268 28777698
    echo ${!BAMREADS[@]}
    # Output: ../data/file1.bam ../data/file2.bam ../data/file3.bam

So far, this array seems to behave as I expect. Now, I want to build another associative array based on this array. To be specific: my second array will have the same keys as my first one but I want to divide the values by a variable called $MIN.

I am not sure which of the following strategies is best and I can't seem to make either work.

Strategy 1: copy the array and modify the array?

    MIN=33285268
    declare -A BRAMFRACS
    echo "BAMFRACS array is initialized"
    BAMFRACS=("${BAMREADS[@]}")
    echo ${BAMFRACS[@]}
    # Output: 22388955 33285268 28777698
    echo ${!BAMFRACS[@]}
    # Output: 0 1 2

This is not what I want for the keys. Even if it works, I would then need to perform the operation I mentioned on all the values.

Strategy 2: build the second array while looping through the first.

    MIN=33285268
    declare -A BRAMFRACS
    echo "BAMFRACS array is initialized"
    for i in $(ls $BAMFILES/*bam)
    do
        echo $i
        echo ${BAMREADS[$i]}
        BAMFRACS[$i] = ${BAMREADS[$i]}
    done
    echo ${BAMFRACS[@]}
    echo ${!BAMFRACS[@]}

When I run this, I get the following error which I am unsure how to solve:

    ../data/file1.bam
    33285268
    script.bash: line 108: BAMFRACS[../data/file1.bam]: No such file or directory
    ../data/file2.bam
    28777698
    script.bash: line 108: BAMFRACS[../data/file2.bam]: No such file or directory
    ../data/file3.bam
    22388955
    script.bash: line 108: BAMFRACS[../data/file3.bam]: No such file or directory

Thanks

    4 Answers

    To answer the more general question about copying associative arrays.

    The bash maintainers made the unfortunate decision to copy the ksh93 API rather than the zsh one when they introduced their own associative arrays in 4.0.

    ksh93/bash do support setting an associative array as a whole, but it's with the:

    hash=([k1]=v1 [k2]=v2) 

    syntax. While with zsh, it's

    hash=(k1 v1 k2 v2) 

    (support for the ([k]=v...) ksh93 syntax was also added later on for compatibility).

    What that means though is that with ksh93 and bash, it's very tricky to create a hash that way from an arbitrary list of keys and values.

    With the zsh syntax, you just need to pass the list as alternating keys and values. For instance, to copy one associative array into another:

    h2=("${(@kv)h1}") 

    Or from a CSV with two columns:

    IFS=$'\n,'; h=($(<file.csv)) 

    Or from arrays of key and values:

    h=("${(@)keys:^values}") 

    With the ksh93/bash syntax, while there's "${!h[@]}" and "${h[@]}" to expand to the list of keys and values (like "${(@k)h}" and "${(@v)h}" in zsh), there's no operator to expand to both keys and values in the [key]=value syntax expected by h=(...) (the "${(@kv)h}" in zsh).

    A trick you can use in those shells to copy associative arrays though (other than copying elements in a loop), is to use the output of typeset -p.

    For instance, the equivalent of zsh's h2=("${(@kv)h1}") to copy h1 into h2 could be done in ksh93 or bash with:

    h1_definition=$(typeset -p h1) && eval "typeset -A h2=${h1_definition#*=}" 

    Which with bash you can shorten to:

    h1_definition=$(typeset -p h1) && typeset -A h2="${h1_definition#*=}" 

    (As in ksh93, typeset -A h=value is normally short for typeset -A h=([0]=value) in bash. However, if value starts with ( and ends with ), bash interprets the content as a compound associative assignment, as if it had been passed to eval, even if the ( is quoted or is the result of some expansion.)
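As a minimal sketch of the typeset -p trick in bash, using the eval form (the array contents below are made up for illustration):

```shell
#!/usr/bin/env bash
# Hypothetical sample data; any keys and values would do.
typeset -A h1=([k1]=v1 [k2]="value with spaces")

# typeset -p prints a declaration of h1 that the shell can read back in.
h1_definition=$(typeset -p h1)

# Drop everything up to and including the first "=", then let eval
# re-parse the remaining "([key]=value ...)" text as the value of h2.
eval "typeset -A h2=${h1_definition#*=}"

# Show the copy; it should list the same keys and values as h1.
typeset -p h2
```

Note that typeset inside a function creates a local variable, so this sketch is meant to run at the top level of a script.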

    In the end, it's about as easy to use the loop instead:

    for k in "${!h1[@]}"; do h2[$k]=${h1[$k]}; done 
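Spelled out as a small script, the loop might look like this. The sample keys are assumptions chosen to show that paths and keys containing spaces survive the copy:

```shell
#!/usr/bin/env bash
# Hypothetical source array with a path-like key and a key containing spaces.
declare -A h1=([../data/file1.bam]=33285268 ["key with spaces"]=42)

# The destination must be declared associative before it is filled.
declare -A h2

# Iterate over the keys (not the values) and copy element by element.
for k in "${!h1[@]}"; do
    h2[$k]=${h1[$k]}
done

# Print each copied pair on its own line.
for k in "${!h2[@]}"; do
    printf '%s => %s\n' "$k" "${h2[$k]}"
done
```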
    • *which with bash (≥4.4) you can shorten to typeset -A h2="${h1_definition#*=}". I was confused why this wasn't working for me on CentOS 7 (Bash 4.2.x) so I tried 3.x through 5.x by pulling images from Docker, and with earlier Bashes you'll either get an error or just a literal string assigned to [0] instead (but no error).
      – Kevin E
      Commented Mar 14, 2021 at 5:18
    • Any particular reason for using typeset instead of declare? Judging from help typeset, it seems declare is the "real" command.
      Commented Sep 10, 2021 at 14:52
    • 1
    • @MestreLion, typeset is the name ksh chose nearly 40 years ago and is supported by all Bourne-like shells that have variable types including bash, so that's the one I'm used to. bash decided to call it declare for some reason, but has always supported typeset as an alias and will likely support it forever, like other shells. There have been more shells recently adding declare as an alias for typeset for compatibility with bash, but it will take more than that to change my habits :-)
      Commented Sep 10, 2021 at 14:57

    Build the new array from the old:

        MIN=33285268
        declare -A BRAMFRACS
        for key in "${!BAMREADS[@]}"; do
            BRAMFRACS[$key]=$(( ${BAMREADS[$key]} / MIN ))
        done
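One caveat: bash's $(( ... )) arithmetic is integer-only, so with the question's numbers the division above yields 1 for file1.bam and 0 for the other two. If fractional results are wanted (an assumption; the question does not say so explicitly), a sketch along these lines hands the division to awk. The array is named BAMFRACS here, and the four-decimal format is an arbitrary choice:

```shell
#!/usr/bin/env bash
MIN=33285268

# Values taken from the question.
declare -A BAMREADS=(
    [../data/file1.bam]=33285268
    [../data/file2.bam]=28777698
    [../data/file3.bam]=22388955
)
declare -A BAMFRACS

for key in "${!BAMREADS[@]}"; do
    # awk does floating-point division; LC_ALL=C keeps "." as the decimal point.
    BAMFRACS[$key]=$(LC_ALL=C awk -v n="${BAMREADS[$key]}" -v d="$MIN" \
        'BEGIN { printf "%.4f", n / d }')
done

# Print each pathname with its fraction, one pair per line.
for key in "${!BAMFRACS[@]}"; do
    printf '%s\t%s\n' "$key" "${BAMFRACS[$key]}"
done
```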

    Comments on your code:

    • Your first suggested code does not work because it copies only the values from the associative array to the new array. The values automatically get the keys 0, 1 and 2, but the original keys are not copied. You need to copy the array key by key as I have shown above. This way you assign the wanted value to the correct key.

    • Your second suggested code contains a syntax error in that it has spaces around = in an assignment. This is where the errors that you see come from. variable = value is interpreted as "the command variable executed with the operands = and value".

    • If you wish to iterate over a set of pathnames, don't use ls. Instead just do for pathname in "$BAMFILES"/*bam; do.

    • Quote your variable expansions.

    • Consider using printf instead of echo to output variable data.
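Putting those comments together, the question's second strategy could be rewritten roughly as follows. This is a sketch only: the scratch directory and empty files are hypothetical stand-ins so that the script runs on its own, and the array is named BAMFRACS as in the question's echo lines.

```shell
#!/usr/bin/env bash
# Hypothetical setup: a scratch directory with three empty files
# standing in for the real BAM files.
BAMFILES=$(mktemp -d)
touch "$BAMFILES"/file{1,2,3}.bam

declare -A BAMREADS BAMFRACS
BAMREADS[$BAMFILES/file1.bam]=33285268
BAMREADS[$BAMFILES/file2.bam]=28777698
BAMREADS[$BAMFILES/file3.bam]=22388955

# Glob instead of parsing ls, quote the expansions,
# and leave no spaces around "=" in the assignment.
for pathname in "$BAMFILES"/*bam; do
    printf '%s\t%s\n' "$pathname" "${BAMREADS[$pathname]}"
    BAMFRACS[$pathname]=${BAMREADS[$pathname]}
done

# Print the keys of the new array, one per line.
printf 'key: %s\n' "${!BAMFRACS[@]}"

# Clean up the scratch directory.
rm -r "$BAMFILES"
```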


      The following bash code assigns the values from associative array AA2 (which may be unset) into another associative array AA1 (which must be declared with -A).

          LIST="$(declare -p AA2 2>/dev/null)"
          [[ "$LIST" ]] && AA1+=${LIST#*=}
      • declare -p echoes the value of a variable as a declare statement which can be directly passed into the interpreter without any word splitting problems.
      • ${LIST#*=} removes the equals sign and everything before it.
      • The hiding of an error (2>/dev/null) during declare and the non-zero length test ([[ "$LIST" ]]) allow AA2 to be unset.
      • Unexpected results (not errors) will happen if AA1 or AA2 are not associative arrays.

        this should do it (can also add additional key-value):

            declare -A origDict=( [keya]=value_a [keyb]=value_b [keyc]=value_c )
            declare -a newDict=( echo ${origDict[*]} [keynew]=new_value )
        • Mostly works, but you lose the original keys, as newDict is declared as just an ordinary (numeric-indexed) array in this case. If you run declare -p newDict afterward, you will see: declare -a newDict='([0]="new_value" [1]="value_c" [2]="value_b" [3]="value_a")'. Thanks for your answer, though!
          – Kevin E
          Commented Mar 14, 2021 at 0:25
        • If newDict is an ordinary array instead of an associative one, then this fails to answer the original question.
          Commented Sep 10, 2021 at 14:54
