I have a directory of text files containing data like the sample below. I read each file in a loop and put each line into an array. I'm not sure this is the right approach, so if you know a better way, let me know.

    Attack On Titan S03E20 720p WEB x264-URANiME[eztv] 6/17/2019 394 MB 114 37
    Attack On Titan S003E020 WEB x264-URANiME Yesterday 172 MB 76 3
    Attack On Titan S03E18 1080p WEB x264-URANiME 6/5/2019 1 GB 46 3
    Attack On Titan S003E017 720p WEB x264-URANiME[eztv] 5/27/2019 555 MB 41 10
    Attack On Titan s02E20 WEB x264-URANiME[eztv] 6/17/2019 171 MB 40 7
    Attack On Titan S03e18 WEB x264-URANiME 6/3/2019 200 MB 23 3
    Attack On Titan S03E16 720p WEB x264-URANiME[eztv] 5/20/2019 522 MB 23 3
    Attack On Titan s03e19 WEB x264-URANiME Today 196 MB 20 0
    Attack On Titan S03E14 720p WEB x264-URANiME[eztv] 5/6/2019 545 MB 19 2

The elements of each line are separated by tabs, in this order: torrent name, added time, size (with an MB/GB suffix), seeds, and leech.

For example, for the first line of the sample data, the elements would be:

    torrent name: Attack On Titan S03E20 720p WEB x264-URANiME[eztv]
    season number: 3
    episode number: 20
    added time: 6/17/2019
    size: 394 MB
    seed: 114
    leech: 37

Note: the size unit varies (sometimes MB, sometimes GB), but I need all sizes in MB. How should I do that?
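One possible way to normalise the sizes, sketched in bash. The helper name `to_mb` is hypothetical, and the sketch assumes that sizes always look like `394 MB` or `1 GB` and that 1 GB counts as 1024 MB (use 1000 instead if the source site counts in decimal units).

```bash
#!/usr/bin/env bash
# to_mb: hypothetical helper that converts a size such as "394 MB" or "1 GB"
# to a number of megabytes. Assumption: 1 GB = 1024 MB.
to_mb() {
    local value unit
    # Split "394 MB" into the number and the unit.
    read -r value unit <<< "$1"
    case "${unit^^}" in
        # awk does the arithmetic so fractional sizes such as "1.5 GB" work.
        GB) awk -v v="$value" 'BEGIN { printf "%d\n", v * 1024 }' ;;
        MB) printf '%s\n' "$value" ;;
        *)  printf 'unknown unit: %s\n' "$unit" >&2; return 1 ;;
    esac
}

to_mb "394 MB"
to_mb "1 GB"
```

The `${unit^^}` upper-casing needs bash 4 or later; on an older bash, matching `[Gg][Bb]` and `[Mm][Bb]` in the `case` patterns would do the same job.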

So I loop over the array and pull out each part I need for the next steps:

    for array_list in "${TORRENT_FILE_ARRAY[@]}"
    do
        TORRENT_NAME=`echo "$array_list" | awk '{print $1}' FS="\t"`
        SEASON_NUMBER=`echo "$array_list" | awk '{print $1}' FS="\t" | awk '{print $4}' FS=" " | awk 'BEGIN {IGNORECASE = 1} {print $1}' FS="E" | sed "s/[Ss]//g" | sed 's/^0*//'`
        EPISODE_NUMBER=`echo "$array_list" | awk '{print $1}' FS="\t" | awk '{print $4}' FS=" " | awk 'BEGIN {IGNORECASE = 1} {print $2}' FS="E" | sed "s/[Ee]//g" | sed 's/^0*//'`
        FILE_SIZE=`echo "$array_list" | awk '{print $3}' FS="\t"`
        # Field 4 is seeds and field 5 is leech, per the field order described above.
        SEED_NUBMBER=`echo "$array_list" | awk '{print $4}' FS="\t"`
        LEECH_NUMBER=`echo "$array_list" | awk '{print $5}' FS="\t"`
        # echo $TORRENT_NAME
        # echo $FILE_SIZE
        # echo $LEECH_NUMBER
        # echo $SEED_NUBMBER
        # echo "SEASON_NUMBER:" $SEASON_NUMBER
        # echo "EPISODE_NUMBER:" $EPISODE_NUMBER
    done

After extracting each variable, I want to add it to an array for the corresponding element, something like:

    TORRENT_NAME[$x]=$extracted_TORRENT_NAME
    FILE_SIZE[$x]=$extracted_FILE_SIZE
    LEECH_NUMBER[$x]=$extracted_LEECH_NUMBER
    SEED_NUBMBER[$x]=$extracted_SEED_NUBMBER
    SEASON_NUMBER[$x]=$extracted_SEASON_NUMBER
    EPISODE_NUMBER[$x]=$extracted_EPISODE_NUMBER

I want to fill these arrays in a loop, but I don't know how to extract the data correctly or which tool would do it more efficiently.
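For illustration, the extraction described in the question could probably be done with bash builtins alone, with no `awk` or `sed` call per line. This is only a sketch under some assumptions: the two sample records stand in for the real `TORRENT_FILE_ARRAY`, the fields are tab-separated in the order name, added time, size, seeds, leech, and the array names follow the question except that the seed array is spelled `SEED_NUMBER` here.

```bash
#!/usr/bin/env bash
# Sample records standing in for the real TORRENT_FILE_ARRAY (tab-separated).
TORRENT_FILE_ARRAY=(
    $'Attack On Titan S03E20 720p WEB x264-URANiME[eztv]\t6/17/2019\t394 MB\t114\t37'
    $'Attack On Titan S003E017 720p WEB x264-URANiME[eztv]\t5/27/2019\t555 MB\t41\t10'
)

TORRENT_NAME=() ADDED_TIME=() FILE_SIZE=() SEED_NUMBER=() LEECH_NUMBER=()
SEASON_NUMBER=() EPISODE_NUMBER=()

# Matches S03E20, s03e19, S003E020 and so on as a whole space-delimited word.
pattern='(^| )[Ss]([0-9]+)[Ee]([0-9]+)( |$)'

for line in "${TORRENT_FILE_ARRAY[@]}"; do
    # Split the record on tabs into its five fields.
    IFS=$'\t' read -r name added size seeds leech <<< "$line"
    if [[ $name =~ $pattern ]]; then
        # 10# forces base 10 so leading zeros are dropped, not read as octal.
        season=$((10#${BASH_REMATCH[2]}))
        episode=$((10#${BASH_REMATCH[3]}))
    else
        season='' episode=''
    fi
    # Append to the parallel arrays; index i describes the same record in each.
    TORRENT_NAME+=("$name")
    ADDED_TIME+=("$added")
    FILE_SIZE+=("$size")
    SEED_NUMBER+=("$seeds")
    LEECH_NUMBER+=("$leech")
    SEASON_NUMBER+=("$season")
    EPISODE_NUMBER+=("$episode")
done

for i in "${!TORRENT_NAME[@]}"; do
    printf 'season %s, episode %s, %s, seeds %s, leech %s\n' \
        "${SEASON_NUMBER[i]}" "${EPISODE_NUMBER[i]}" "${FILE_SIZE[i]}" \
        "${SEED_NUMBER[i]}" "${LEECH_NUMBER[i]}"
done
```

Using `+=( ... )` to append avoids having to maintain a separate index variable such as `$x`.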

Also, this all has to live in a script: I'm not allowed to create any files other than script files, or to change the source files.

  • It'd be a lot easier to visualize the fields in your data if you separated them with commas (or some other visible character) instead of tabs for the purposes of your question. Whatever help/answer you get, you can trivially replace commas with tabs to apply it to your real data, or just convert tabs to commas in your real data before running the script on it.
    – Ed Morton
    Commented Nov 26, 2019 at 19:38
  • @EdMorton thanks, I made some clarifications; if it isn't clear or you have any questions, let me know!
    – amkyp
    Commented Nov 26, 2019 at 21:07
  • I still just don't understand it at all (given how much time I'm willing to invest in trying to!) but maybe someone else will. Good luck!
    – Ed Morton
    Commented Nov 26, 2019 at 21:59
  • Maybe this is a case where less is more? For example, do you need the directory explanation to solve your problem? The more concise you can be, the more likely you are to get an answer, IMHO.
    Commented Nov 26, 2019 at 22:38
  • @guillermochamorro I need to implement conditions that check the other lines for the same episode, so I can append names to the destination file sorted according to those conditions!
    – amkyp
    Commented Nov 26, 2019 at 23:05

1 Answer

I'm sorry, I don't understand the rest of your question, but here's how to start separating the fields you really want from your input:

    $ cat file
    Attack On Titan S03E20 720p WEB x264-URANiME[eztv] 6/17/2019 394 MB 114 37

    $ cat tst.awk
    BEGIN { FS=OFS="\t" }
    {
        name = $1
        sub(/ [^ ]+$/,"",name)
        sub(/.* [Ss]/,"",$1)
        sub(/[Ee]/,OFS,$1)
        $2 = $1
        $1 = name
        print
    }

    $ awk -f tst.awk file
    Attack On Titan 03 20 6/17/2019 394 MB 114 37

Replacing tabs with commas in the input/output for visibility:

    $ tr $'\t' ',' < file
    Attack On Titan S03E20,720p WEB x264-URANiME[eztv],6/17/2019,394 MB,114,37

    $ awk -f tst.awk file | tr $'\t' ','
    Attack On Titan,03,20,6/17/2019,394 MB,114,37

Then pipe the output of the awk script to a shell loop, so you call awk once at the start instead of calling it multiple times for each input line:

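    awk -f tst.awk file |
    while IFS=$'\t' read -r torrent_name season_number episode_number added_time file_size seed_number leech_number; do
        # The awk output has seven fields: name, season, episode, added time, size, seeds, leech.
        # Do whatever you need to do with creating directories and files here.
        :
    done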

I'm assuming that a shell loop is appropriate for whatever it is you're trying to do but idk.
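One caveat if the goal is to fill arrays inside that loop: in bash, each part of a pipeline normally runs in a subshell, so variables set in a `while` loop that sits at the end of a pipe are gone once the loop finishes. A sketch of one way around that is to feed the loop through process substitution instead. The sample input and the array names `names`, `seasons` and `episodes` are made up for illustration, and the awk program is an inline copy of tst.awk so that no extra file is needed.

```bash
#!/usr/bin/env bash
# Inline copy of the tst.awk program shown earlier in this answer.
awk_prog='
BEGIN { FS = OFS = "\t" }
{
    name = $1
    sub(/ [^ ]+$/, "", name)
    sub(/.* [Ss]/, "", $1)
    sub(/[Ee]/, OFS, $1)
    $2 = $1
    $1 = name
    print
}'

# Hypothetical sample input in the six-field layout used in this answer.
input=$'Attack On Titan S03E20\t720p WEB x264-URANiME[eztv]\t6/17/2019\t394 MB\t114\t37\n'

names=() seasons=() episodes=()
# done < <(...) keeps the loop in the current shell, so the arrays survive it.
# The trailing _ swallows the remaining fields (date, size, seeds, leech).
while IFS=$'\t' read -r name season episode _; do
    names+=("$name")
    seasons+=("$season")
    episodes+=("$episode")
done < <(printf '%s' "$input" | awk "$awk_prog")

echo "${#names[@]} record(s); first: ${names[0]} season ${seasons[0]} episode ${episodes[0]}"
```

With the real data, the `printf` would be replaced by whatever produces the lines, for example `printf '%s\n' "${TORRENT_FILE_ARRAY[@]}"`, assuming those lines use the same six-field layout.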

  • Could you elaborate on the structure of the awk script? I can't figure it out, and it isn't producing the elements I was asking for. There is also a list of duplicate lines in the output!
    – amkyp
    Commented Nov 27, 2019 at 21:48
  • There's nothing in the script I posted that would produce duplicate lines in the output, so you must not be running the script I posted. Add print "###" ORS $0 ORS name ORS $1 ORS $2 or similar after every line so you can see what's changing at each step and it should be extremely obvious what it's doing.
    – Ed Morton
    Commented Nov 27, 2019 at 21:57
  • I actually changed the concept of the question because I figured out what to do for the next step. For the first step, though, I'm still looking for a better solution than my lines of awk :) Also, thanks for your comments and answer!
    – amkyp
    Commented Nov 27, 2019 at 22:26
  • By now everyone's tried to understand your original, given up and moved on, so changing it now probably won't help you get answers. If you decide to ask a new question then see my original comment, unix.stackexchange.com/questions/554242/…
    – Ed Morton
    Commented Nov 27, 2019 at 22:30
  • I just added echo "$array_list" | awk 'BEGIN { FS=OFS="\t" } { name = $1; sub(/ [^ ]+$/,"",name); sub(/.* [Ss]/,"",$1); sub(/[Ee]/,OFS,$1); $2 = $1; $1 = name; print }' inside the array loop, and after running the script I saw everything twice, although the name should run up to just before the date!
    – amkyp
    Commented Nov 27, 2019 at 22:35
