PCAP script.

I have a server with 32 processors and need this script to make use of them. tshark uses only one processor, so I need to run multiple instances of tshark at the same time. The loop in the script below currently processes one pcap at a time, which is very slow. I need to run up to 15 tshark instances at once, but never more than that, until the loop reaches the end of the file list.

Essentially, the script finds the pcap files that fall within a given time range, lists them in a text file, filters each one with tshark, and then merges the results with mergecap.

In this example there are 5 pcap files to filter.

  1. full_cap_1589
  2. full_cap_1590
  3. full_cap_1591
  4. full_cap_1592
  5. full_cap_1593

    #!/bin/bash
    # Test script to parse pcap files
    #DATE=`date |awk '{print $2}'`
    set -x

    echo "Start Time - Month/Day TIME example: 07/19 08:00"
    read -e date1
    echo "End Time - Month/Day TIME example 07/19 08:35"
    read -e date2
    echo "What IP address to filter on?"
    read -e ip
    echo "$ip"

    # List the captures in the time window; cut strips the directory prefix.
    find /mnt/pcap/captures/ -type f -newermt "$date1" ! -newermt "$date2" | cut -c20-40 > /home/username/loading_dock/load.txt

    # Filter one capture at a time (this is the slow part).
    #for full_caps in "${FIND[@]}"
    for i in `cat /home/username/loading_dock/load.txt`
    do
        tshark -r /mnt/pcap/captures/$i -Y "ip.addr == $ip" -w /home/username/loading_dock/$i.pcap
    done

    # Merge the filtered files: a single -w names the output, the inputs follow.
    mergecap -w /home/username/test1.pcap /home/username/loading_dock/*.pcap
    rm -rf /home/username/loading_dock/*.pcap
    rm -f /home/username/loading_dock/load.txt
    exit 0

    1 Answer

    Can you use GNU Parallel?

    parallel -j15 tshark -r /mnt/pcap/captures/{} \'"-Y ip.addr == $ip"\' -w /home/username/loading_dock/{}.pcap :::: /home/username/loading_dock/load.txt 
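To show where that one-liner would sit in the question's script, here is a minimal sketch rather than a definitive implementation. It wraps the same `parallel` call in a function that takes the IP address as its argument. The function name `filter_pcaps` is hypothetical; the paths, the `-j15` limit, and the quoting are taken from the question and the command above.

```shell
#!/bin/bash
# Sketch only: filter_pcaps is an illustrative name, not part of any tool.
# It replaces the question's "for i in `cat load.txt`" loop.

filter_pcaps() {
    local ip=$1
    if [ -z "$ip" ]; then
        echo "usage: filter_pcaps IP" >&2
        return 1
    fi

    # -j15  : run at most 15 tshark jobs at the same time.
    # {}    : replaced by each line read from the list file.
    # ::::  : read the job arguments from the file that follows.
    # The escaped single quotes reach parallel literally, so the shell that
    # parallel starts for each job sees the display filter as one argument.
    parallel -j15 tshark -r /mnt/pcap/captures/{} \'"-Y ip.addr == $ip"\' \
        -w /home/username/loading_dock/{}.pcap \
        :::: /home/username/loading_dock/load.txt
}
```

In the script, `filter_pcaps "$ip"` would take the place of the `for ... done` loop, with the `mergecap` step left after it. Adding `--dry-run` to the `parallel` call prints the generated tshark commands instead of running them, which is a cheap way to check the quoting first.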

    GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.

    If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

    [Figure: simple scheduling]

    GNU Parallel instead spawns a new process when one finishes - keeping the CPUs active and thus saving time:

    [Figure: GNU Parallel scheduling]
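The difference between the two schedules can be worked through with a small calculation. The sketch below is a simplified model on 2 CPUs, not GNU Parallel's actual code: the function names and the job durations are made up for illustration. `static_makespan` splits the job list into two fixed halves up front, while `dynamic_makespan` hands the next job to whichever CPU becomes free first.

```shell
#!/bin/sh
# Simplified model, illustrative names and numbers only.
# Each argument is the duration of one job; both functions print the
# total time until the last job finishes on 2 CPUs.

# Fixed split: the first half of the jobs goes to CPU a, the rest to CPU b.
static_makespan() {
    half=$(( $# / 2 )); a=0; b=0; i=0
    for d in "$@"; do
        if [ "$i" -lt "$half" ]; then a=$((a + d)); else b=$((b + d)); fi
        i=$((i + 1))
    done
    if [ "$a" -gt "$b" ]; then echo "$a"; else echo "$b"; fi
}

# Refill on completion: the CPU with less accumulated work is free first,
# so it receives the next job in the list.
dynamic_makespan() {
    a=0; b=0
    for d in "$@"; do
        if [ "$a" -le "$b" ]; then a=$((a + d)); else b=$((b + d)); fi
    done
    if [ "$a" -gt "$b" ]; then echo "$a"; else echo "$b"; fi
}
```

With the durations `5 5 1 1`, `static_makespan 5 5 1 1` prints 10 because one CPU gets both long jobs, whereas `dynamic_makespan 5 5 1 1` prints 6. Uneven pcap sizes would produce the same kind of gap.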

    Installation

    For security reasons you should install GNU Parallel with your package manager, but if GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:

    (wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash 

    For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

    Learn more

    See more examples: http://www.gnu.org/software/parallel/man.html

    Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

    Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

    Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

    • Yes, that worked just fine. Thank you!!!
      – backspin
      Commented Jul 20, 2017 at 12:44
