
I have a variable called DISK_INFO with the following contents:

diskid HGST HUSMH8010BSS204 serial no no [0] Slot00
diskid HGST HUH728080AL4204 serial no no [0] Slot02
diskid HGST HUH728080AL4204 serial no no [0] Slot03
diskid HGST HUH728080AL4204 serial no no [0] Slot04
diskid HGST HUH728080AL4204 serial no no [0] Slot05
diskid HGST HUH728080AL4204 serial no no [0] Slot06
diskid HGST HUH728080AL4204 serial no no [0] Slot07
diskid HGST HUH728080AL4204 serial no no [0] Slot08
diskid HGST HUH728080AL4204 serial no no [0] Slot09
diskid HGST HUH728080AL4204 serial no no [0] Slot10
diskid HGST HUH728080AL4204 serial no no [0] Slot11
diskid HGST HUH728080AL4204 serial no no [0] Slot12
diskid HGST HUH728080AL4204 serial no no [0] Slot13
diskid HGST HUH728080AL4204 serial no no [0] Slot14
diskid HGST HUH728080AL4204 serial no no [0] Slot15
diskid HGST HUH728080AL4204 serial no no [0] Slot16
diskid HGST HUH728080AL4204 serial no no [0] Slot17
diskid HGST HUH728080AL4204 serial no no [0] Slot18
diskid HGST HUH728080AL4204 serial no no [0] Slot19
diskid HGST HUH728080AL4204 serial no no [0] Slot20
diskid HGST HUH728080AL4204 serial no no [0] Slot21
diskid HGST HUH728080AL4204 serial no no [0] Slot22
diskid HGST HUH728080AL4204 serial no no [0] Slot23
diskid HGST HUH728080AL4204 serial no no [1] Slot00
diskid HGST HUH728080AL4204 serial no no [1] Slot01
diskid HGST HUH728080AL4204 serial no no [1] Slot02
diskid HGST HUH728080AL4204 serial no no [1] Slot03
diskid HGST HUH728080AL4204 serial no no [1] Slot04
diskid HGST HUH728080AL4204 serial no no [1] Slot05
diskid HGST HUH728080AL4204 serial no no [1] Slot06
diskid HGST HUH728080AL4204 serial no no [1] Slot07
diskid HGST HUH728080AL4204 serial no no [1] Slot08
diskid HGST HUH728080AL4204 serial no no [1] Slot09
diskid HGST HUH728080AL4204 serial no no [1] Slot10
diskid HGST HUH728080AL4204 serial no no [1] Slot11
c2t0d0 Kingston DataTraveler 2.0 - - - -

When a disk fails, it is removed from this list. In this example, the disk in enclosure 0, Slot01 has failed.

Assuming enclosure 0 will always have 24 disks 00-23 and enclosure 1 will always have 12 disks 00-11, how can I efficiently and accurately determine the missing disk(s)?

I currently have the following but I'm sure this can be done in a single awk command:

enclosure0=($(awk '$7 ~ "[0]"{print $8}' <<<"$DISK_INFO" | sort -n))
enclosure1=($(awk '$7 ~ "[1]"{print $8}' <<<"$DISK_INFO" | sort -n))

for n in {00..23}; do
    grep -q "$n" <<<"${enclosure0[@]}" || missing+=("Enclosure 0 - Slot$n")
done

for n in {00..11}; do
    grep -q "$n" <<<"${enclosure1[@]}" || missing+=("Enclosure 1 - Slot$n")
done

    3 Answers

    4

    Without awk, for each enclosure:

    { printf '[0] Slot%s\n' {00..23} ; grep -Eo '\[0\] Slot..' disks ; } | sort | uniq -u 

    In slow-mo:

    • printf '[0] Slot%s\n' {00..23} generates the list of all possible disks
    • grep -Eo '\[0\] Slot..' disks extracts the existing disks
    • {..} concatenates the output of the two commands
    • sort | uniq -u extracts the lines that appear only once
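To run the same pipeline against the `DISK_INFO` variable instead of a file named `disks`, and for either enclosure, it can be wrapped in a small function. This is only a sketch: the function name `missing_slots` and the three-line sample data are assumptions made for illustration, and `seq` stands in for brace expansion because `{00..$last}` does not expand variables.

```bash
# Hypothetical helper: list the slots of one enclosure that are absent from $DISK_INFO.
# $1 = enclosure number, $2 = highest slot number (23 for slots 00-23)
missing_slots() {
    enc=$1
    last=$2
    {
        # Every slot that should exist, zero-padded to two digits
        printf "[$enc] Slot%02d\n" $(seq 0 "$last")
        # Every slot actually present in the variable
        printf '%s\n' "$DISK_INFO" | grep -Eo "\[$enc\] Slot[0-9]{2}"
    } | sort | uniq -u
}

# Illustrative sample data (not the full listing from the question)
DISK_INFO='diskid HGST HUH728080AL4204 serial no no [0] Slot00
diskid HGST HUH728080AL4204 serial no no [0] Slot02
diskid HGST HUH728080AL4204 serial no no [1] Slot00'

missing_slots 0 2   # prints: [0] Slot01
missing_slots 1 1   # prints: [1] Slot01
```

With the real data, the calls would be `missing_slots 0 23` and `missing_slots 1 11`. Note that `uniq -u` reports any line seen exactly once, so a disk that is present but not in the expected range would also be listed.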

    You can replace the printf and grep steps with suitable functions, or replace the printf part with a similar grep on another file that holds the expected list of disks.
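The reference-file variant might look like the sketch below. Everything named here is an assumption for illustration: the helper names `write_expected` and `missing_from_reference`, and the idea of passing the reference file path as an argument. It also swaps `sort | uniq -u` for `comm -13`, which prints only the lines found in the reference file but not in the current listing, so unexpected extra disks are not reported.

```bash
# Hypothetical helper: write one "[enclosure] SlotNN" line per expected slot to file $1.
write_expected() {
    {
        printf '[0] Slot%02d\n' $(seq 0 23)
        printf '[1] Slot%02d\n' $(seq 0 11)
    } | sort > "$1"
}

# Hypothetical helper: print slots listed in reference file $1 but absent from $DISK_INFO.
missing_from_reference() {
    printf '%s\n' "$DISK_INFO" |
        grep -Eo '\[[0-9]+\] Slot[0-9]+' |
        sort -u |
        comm -13 - "$1"    # keep only lines unique to the reference file
}
```

A typical use would be to run `write_expected /path/to/reference` once while all disks are healthy, then call `missing_from_reference /path/to/reference` whenever `DISK_INFO` is refreshed. Both inputs to `comm` must be sorted the same way, which is why both sides go through `sort`.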

    1
    • Excellent approach, uses the GNU philosophy to enable extensibility and customization.
      – Spidey
      Commented May 11, 2018 at 4:43
    1

    Since you know in advance which items need to exist, create a list and tick them off as you see them.

    awk '
        BEGIN {
            for (i = 0; i < 24; i++) missing[0][sprintf("%02d", i)] = 1;
            for (i = 0; i < 12; i++) missing[1][sprintf("%02d", i)] = 1;
        }
        $7 ~ /^\[[0-9]+\]$/ && $8 ~ /^Slot[0-9]+$/ {
            gsub(/[^0-9]/, "", $7);
            sub(/^[^0-9]+/, "", $8);
            delete missing[$7][$8];
        }
        END {
            for (enclosure in missing) {
                for (slot in missing[enclosure]) {
                    printf "Missing enclosure %d Slot%s\n", enclosure, slot;
                }
            }
        }
    '
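The `missing[0]["00"]` syntax is an array of arrays, a GNU awk extension (gawk 4.0 and later), and the command as shown is not given any input. The sketch below is one possible portable variant, assuming the data lives in `$DISK_INFO`: it uses the standard `missing[enc, slot]` composite subscript instead, pipes the variable into awk, and sorts the result because `for (key in array)` visits keys in an unspecified order. The function name `find_missing` is hypothetical.

```bash
# Hypothetical wrapper: report every expected slot that does not appear in $DISK_INFO.
find_missing() {
    printf '%s\n' "$DISK_INFO" | awk '
        BEGIN {
            # Expected slots, keyed as enclosure SUBSEP slot
            for (i = 0; i < 24; i++) missing[0, sprintf("%02d", i)] = 1
            for (i = 0; i < 12; i++) missing[1, sprintf("%02d", i)] = 1
        }
        $7 ~ /^\[[0-9]+\]$/ && $8 ~ /^Slot[0-9]+$/ {
            enc = $7;  gsub(/[^0-9]/, "", enc)    # "[0]"    becomes "0"
            slot = $8; sub(/^Slot/, "", slot)     # "Slot02" becomes "02"
            delete missing[enc, slot]             # tick this slot off
        }
        END {
            for (key in missing) {
                split(key, part, SUBSEP)
                printf "Missing enclosure %d Slot%s\n", part[1], part[2]
            }
        }
    ' | sort
}
```

To collect the result into the `missing` array from the question, bash 4 or later offers `mapfile -t missing < <(find_missing)`.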
    0
      0
      perl -sle '
          my(@e, @AoA) = qw/ 24 12 /;
          $AoA[$1][$2]++ while /\[([01])]\h+(?:(?!\d)\S)+0*(\d+)$/mg;
          for my $enc ( 0 .. $#e ) {
              for my $m_slot ( grep { ! defined $AoA[$enc][$_] } 0 .. $e[$enc]-1 ) {
                  print "in enclosure $enc - Slot$m_slot is missing.";
              }
          }
      ' -- -_="$DISK_INFO";

      Explanation:

      • Initialize the array @e, which holds the number of slots in each enclosure.
      • The disk info is passed on the command line: with the -s switch, -_="$DISK_INFO" initializes $_ to the contents of $DISK_INFO.
      • The while loop progressively scans $_ and captures the numbers in the '[..]' and 'Slot...' positions. Each match updates the array of arrays @AoA, which can be viewed as a matrix.
      • Once all the data has been ingested, it is processed in two nested for loops.
      • The outer for loop iterates over the enclosures, of which there are two in this case.
      • The inner for loop computes the indices of the current enclosure's elements that are undefined, in other words the slots that were never encountered while collecting data in the while loop.
