I have this script, which reads an XML file, converts it to CSV, and at the end loads the result into SQLite:
#!/bin/bash
rm -f -r rshost rscname rsctype ttstamp tservice tformat trdata trdata2
cat $1 | grep Telegram | sed -e 's/"/ /g' | awk '{ print $9 }' | cut -c27-30 > trdata
cat $1 | grep Telegram | sed -e 's/"/ /g' | awk '{ print $3 }'| cut -c1-19 > ttstamp
a=`cat $1 | grep RecordStart | head -1 | sed -e 's/"/ /g'| awk '{ print $15 }'`
b=`cat $1 | grep RecordStart | head -1 | sed -e 's/"/ /g' | awk '{print $12 }' | sed -e 's/;/ /g' | sed -e 's/=/ /g' | awk '{print $4 }'`
c=`cat $1 | grep RecordStart | head -1 | sed -e 's/"/ /g' | awk '{print $9,$10 }'`
touch rsctype rshost rscname
kk=`wc -l trdata | awk '{ print $1 }'`
for i in `seq 1 $kk`
do
    echo $a >> rsctype
    echo $b >> rshost
    echo $c >> rscname
done
cat $1 | grep Telegram | sed -e 's/"/ /g' | awk '{ print $5 }' > tservice
cat $1 | grep Telegram | sed -e 's/"/ /g' | awk '{ print $7 }' > tformat
cat $1 | grep Telegram | sed -e 's/"/ /g' | awk '{ print $9 }' | cut -c41-44 > trdata2
cat $1 | grep Telegram | sed -e 's/"/ /g' | awk '{ print $9 }' | cut -c31-34 > grhost
awk -Wposix '{printf("%d\n","0x" $1)}' trdata > trdata3
awk -Wposix '{printf("%d\n","0x" $1)}' trdata2 > trdata4
sed -i "s%^0%0/%g" grhost
cat grhost | cut -c1-3 > grhost2
sed -i "s%.\{4\}%/%g" grhost
pr -mts, grhost2 grhost > grhostfinal
sed -i "s/,//g" grhostfinal
cat grhostfinal | cut -c1-4 > grhostfinal1
cat grhostfinal | cut -c5 > grhostfinal2
awk -Wposix '{printf("%d\n","0x" $1)}' grhostfinal2 > grhostfinal3
pr -mts, grhostfinal1 grhostfinal3 > grhostfinal4
sed -i "s/,//g" grhostfinal4
pr -mts, rshost ttstamp rsctype tservice tformat trdata4 trdata3 rscname grhostfinal4 > conjunto.csv
sed -i "s|^|,|g" conjunto.csv
sqlite3 test2.sqlite "select fecha from testxml4;" > data.csv
cat data.csv | sort | uniq > data2.csv
for k in `cat data2.csv`
do
    grep "$k" conjunto.csv >> quitar
done
diff quitar conjunto.csv | grep ">" | sed 's/^> //g' > diferencia.csv
echo `sqlite3 test2.sqlite < testxml`
python csv2sqlite.py diferencia.csv test2.sqlite testxml4
rm -f -r rshost rscname rsctype ttstamp tservice tformat trdata trdata2 trdata3 trdata4 grhost2 grhost grhostfinal3 grhostfinal1 grhostfinal2 grhostfinal grhostfinal4 a b c data.csv conjunto.csv data2.csv quitar
The XML looks like this (the data is private, so the attribute values are left empty):
<CommunicationLog xmlns="http://knx.org/xml/telegrams/01">
  <RecordStart Timestamp="" Mode="" Host="" ConnectionName="" ConnectionOptions="" ConnectorType="" MediumType="" />
  <Telegram Timestamp="" Service="" FrameFormat="" RawData="" />
  <Telegram Timestamp="" Service="" FrameFormat="" RawData="" />
  <RecordStop Timestamp="" />
  <RecordStart Timestamp="" Mode="" Host="" ConnectionName="" ConnectionOptions="" ConnectorType="" MediumType="" />
  <Telegram Timestamp="" Service="" FrameFormat="" RawData="" />
  <Telegram Timestamp="" Service="" FrameFormat="" RawData="" />
  <RecordStop Timestamp="" />
</CommunicationLog>
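For comparison, here is a minimal sketch of reading the same attributes by name in a single pass with Python's standard `xml.etree.ElementTree`, instead of re-reading the file once per column. It is only an illustration under assumptions: the function name `telegram_rows`, the column order, and the sample attribute values are hypothetical, and it pairs each `Telegram` with the `RecordStart` that precedes it, whereas the script above reuses the first `RecordStart` for every row. The hex decoding of the `RawData` slices is left out.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the sample document.
NS = "{http://knx.org/xml/telegrams/01}"


def telegram_rows(xml_text):
    """Return one tuple per <Telegram>, paired with the preceding <RecordStart>."""
    rows = []
    start = {}  # attributes of the most recent <RecordStart>
    for elem in ET.fromstring(xml_text):  # children in document order
        if elem.tag == NS + "RecordStart":
            start = elem.attrib
        elif elem.tag == NS + "Telegram":
            rows.append((
                start.get("Host", ""),
                elem.get("Timestamp", "")[:19],  # same slice as cut -c1-19
                start.get("ConnectorType", ""),
                elem.get("Service", ""),
                elem.get("FrameFormat", ""),
                elem.get("RawData", ""),
                start.get("ConnectionName", ""),
            ))
    return rows


# Made-up values, only to exercise the function; the real data is private.
sample = """<CommunicationLog xmlns="http://knx.org/xml/telegrams/01">
  <RecordStart Timestamp="t0" Host="h1" ConnectionName="c1" ConnectorType="usb" />
  <Telegram Timestamp="2020-01-01T00:00:00.123Z" Service="svc" FrameFormat="ff" RawData="2900" />
  <RecordStop Timestamp="t1" />
</CommunicationLog>"""

for row in telegram_rows(sample):
    print(",".join(row))
```

Looking attributes up by name avoids depending on whitespace-separated field positions, which shift whenever an attribute value contains a space.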
Once the data has been analyzed, I write it to a CSV file and load that into SQLite with the Python program csv2sqlite.py:
python csv2sqlite.py CSVFILE.csv DB.sqlite TABLESQLITE
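The step that filters out rows already present in the database (the `grep` loop plus `diff` in the script) could perhaps be done in one pass with Python's standard `sqlite3` module. The sketch below is only an assumption about how that might look: the table name `testxml4` and the column `fecha` come from the script, but the two-column schema, the `rawdata` column, and the function name `insert_new_rows` are hypothetical, and it matches on the exact `fecha` value rather than on a substring as `grep` does.

```python
import sqlite3


def insert_new_rows(conn, rows):
    """Insert rows whose fecha is not yet in testxml4; return how many were added."""
    # Load the existing timestamps once instead of grepping once per timestamp.
    existing = {r[0] for r in conn.execute("SELECT fecha FROM testxml4")}
    fresh = [row for row in rows if row[0] not in existing]
    conn.executemany("INSERT INTO testxml4 (fecha, rawdata) VALUES (?, ?)", fresh)
    conn.commit()
    return len(fresh)


conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real CREATE TABLE statement is not shown here.
conn.execute("CREATE TABLE testxml4 (fecha TEXT, rawdata TEXT)")
conn.execute("INSERT INTO testxml4 VALUES ('2020-01-01T00:00:00', 'aa')")

added = insert_new_rows(conn, [
    ("2020-01-01T00:00:00", "aa"),  # already stored, skipped
    ("2020-01-01T00:00:01", "bb"),  # new, inserted
])
print(added)
```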
My question is: how can I make this script faster and more efficient? It takes a long time to analyze all the data.
CREATE TABLE statement you used and the format of the attributes. An explanation of what each file should contain would also be nice, as currently I'm getting RawData= for tservice and Service= for ttstamp, which is super confusing.