I am working on a CSV data sheet and want to parse and filter the data in it. While writing the code, I found a similar question in an SO post whose author has almost the same hardware data as mine (HPE hardware), although some of my data and columns are different.
I would like to know how this code can be written in a better and more performant way. Any help will be much appreciated.
Sample Data:
Status    Server          Server Name          Bay #  Model                 Processor                                  Proc. Count  Memory  Serial Number  State      Power State  iLO FW            Firmware                Appliance Name
Critical  enc2010, bay 1  tdm2066.example.com  1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz  2            262144  2M272101N9     Unmanaged  On           2.53 May 03 2017  I36 v2.40 (02/17/2017)  OV C7000 enclosures 1
OK        enc1011, bay 1  tdm1068.example.com  1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz  2            262144  2M272101P6     Monitored  On           2.55 Aug 16 2017  I36 v2.74 (07/21/2019)  OV C7000 enclosures 1
OK        enc1012, bay 1  tdm1083.example.com  1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz  2            262144  2M272101NX     Monitored  On           2.61 Jul 27 2018  I36 v2.60 (05/21/2018)  OV C7000 enclosures 1
OK        ENC2004, bay 1  tdm2033.example.com  1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz  2            524288  2M262602L2     Monitored  On           2.55 Aug 16 2017  I36 v2.52 (10/25/2017)  OV C7000 enclosures 1
OK        ENC2006, bay 1  vds2009              1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz  2            524288  2M263604ZZ     Monitored  On           2.40 Dec 02 2015  I36 v2.20 (05/05/2016)  OV C7000 enclosures 1
OK        ENC2011, bay 1  tdm2081.example.com  1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz  2            524288  2M2708027Z     Monitored  On           2.55 Aug 16 2017  I36 v2.52 (10/25/2017)  OV C7000 enclosures 1
OK        ENC1003, bay 1  tdm1024.example.com  1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz  2            524288  2M262602KW     Monitored  On           2.73 Feb 11 2020  I36 v2.52 (10/25/2017)  OV C7000 enclosures 1
OK        ENC1006, bay 1  vds1009              1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz  2            524288  2M262505V5     Monitored  On           2.40 Dec 02 2015  I36 v2.00 (12/28/2015)  OV C7000 enclosures 1
OK        ENC1007, bay 1  vds1023              1      ProLiant BL460c Gen9  Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz  2            524288  2M264800TR     Monitored  On           2.50 Sep 23 2016  I36 v2.30 (09/12/2016)  OV C7000 enclosures 1
How the DataFrame looks:
    Server Name         Appliance Name    Bay  Enclosure
0       tdm2066  OV C7000 enclosures 1  bay 1    ENC2010
1       tdm1068  OV C7000 enclosures 1  bay 1    ENC1011
2       tdm1083  OV C7000 enclosures 1  bay 1    ENC1012
3       tdm2033  OV C7000 enclosures 1  bay 1    ENC2004
4       vds2009  OV C7000 enclosures 1  bay 1    ENC2006
5       tdm2081  OV C7000 enclosures 1  bay 1    ENC2011
6       tdm1024  OV C7000 enclosures 1  bay 1    ENC1003
7       vds1009  OV C7000 enclosures 1  bay 1    ENC1006
8       vds1023  OV C7000 enclosures 1  bay 1    ENC1007
9       vds0003  OV C7000 enclosures 1  bay 1    ENT0003
10      tdm7123  OV C7000 enclosures 1  bay 1    ENC7003
--------------- stripped lines ----------------------
Code:
import pandas as pd

df = pd.read_csv("testcreate.csv", sep="\t")
df = df[['Server', 'Server Name', 'Bay #', 'Appliance Name']]
# 'Server' looks like "ENC2004, bay 1": enclosure before the comma, bay after it.
df['Bay'] = df['Server'].str.split(',').str[1].str.lower()
df['Enclosure'] = df['Server'].str.split(',').str[0].str.upper()
# Keep only the host part of the server name.
df['Server Name'] = df['Server Name'].str.split('.').str[0]
df = df.drop(['Server', 'Bay #'], axis=1)
df = df[df['Appliance Name'].str.contains('C7000')]
# Build one column per enclosure, indexed by bay.
df = pd.concat(
    [g.set_index('Bay')['Server Name'].rename(f'{n}') for n, g in df.groupby('Enclosure')],
    axis=1, sort=False)
The above code works for me and produces the result below, but I am looking to learn the right way of writing it from the experts.
Sample of the returned result (this works):
       ENC1002  ENC1003  ENC1005
bay 1  tdm1012  tdm1024  vds1001
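For comparison, here is a minimal sketch of one possible restructuring, not a definitive answer. It splits the 'Server' column once with `expand=True` and reshapes with `pivot` instead of `groupby` plus `concat`. The names `SAMPLE` and `enclosure_table` are made up for illustration, the inline data is three rows trimmed from the sample above, and the sketch assumes each enclosure/bay pair occurs only once. Unlike the code above, it also strips the space left after the comma.

```python
import io

import pandas as pd

# Three rows trimmed from the sample data; only the columns the script uses.
SAMPLE = (
    "Server\tServer Name\tAppliance Name\n"
    "enc2010, bay 1\ttdm2066.example.com\tOV C7000 enclosures 1\n"
    "ENC2004, bay 1\ttdm2033.example.com\tOV C7000 enclosures 1\n"
    "ENC2006, bay 1\tvds2009\tOV C7000 enclosures 1\n"
)


def enclosure_table(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per bay and one column per enclosure (hypothetical helper)."""
    # Filter first so the string operations run on fewer rows.
    df = df[df["Appliance Name"].str.contains("C7000", na=False)]
    # Split "ENC2004, bay 1" once into two columns instead of splitting twice.
    parts = df["Server"].str.split(",", n=1, expand=True)
    tidy = pd.DataFrame(
        {
            "Enclosure": parts[0].str.strip().str.upper(),
            "Bay": parts[1].str.strip().str.lower(),
            "Server Name": df["Server Name"].str.split(".").str[0],
        }
    )
    # Assumption: (Enclosure, Bay) pairs are unique; pivot raises ValueError otherwise.
    return tidy.pivot(index="Bay", columns="Enclosure", values="Server Name")


df = pd.read_csv(io.StringIO(SAMPLE), sep="\t")
result = enclosure_table(df)
print(result)
```

Note that `pivot` returns the enclosure columns in sorted order. Whether this is actually faster than the `groupby` version would need to be measured on the real file.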