Loading text into an array when fields are in random order

Question

This script reads records from a CSV file of customer names and addresses. The first record of the file is the header (field names); the fields could be in any order, and sometimes have different variations on the field names.

I import the header into a one-dimensional array of field names, and use this loop to assign a variable to each column number:

    ' Read the first line from the file.
    TextLine = MyFile.ReadLine
    ' Split the line into an array
    arrCustomers = Split(TextLine, ",")
    For i = 0 To UBound(arrCustomers, 1)
        Select Case LCase(arrCustomers(i))
            Case "customer name", "custname", "name"
                CustName = i
            Case "customer number", "custno", "custnum"
                CustNo = i
            Case "address", "street"
                Address = i
            Case "city"
                City = i
            Case "country"
                Country = i
            Case "state"
                State = i
            Case Else
                ' list of unused fields
                s = s & ", " & arrCustomers(i)
        End Select
    Next
    MsgBox "These fields are unused: " & s

For instance, if "Customer Name" is at index 3 of the header array (Split returns a zero-based array, so that is the fourth column), then CustName = 3 and arrCustomers(CustName) is the Customer Name field of the current record.

This loop processes each record:

    Do While MyFile.AtEndOfStream <> True
        ' Read the next line from the file.
        TextLine = MyFile.ReadLine
        ' Split the line into an array
        arrCustomers = Split(TextLine, ",")
        ' Process the current record
        MsgBox "Our customer " & arrCustomers(CustName) & " lives in " & arrCustomers(City) & ", " & arrCustomers(State)
    Loop

My main problem is that the Case statements in the first loop must include every possible value for the field names. If the address field is called "Address1" or something else I didn't foresee, the Address variable is never assigned and acts as 0, so arrCustomers(Address) returns whatever is in the first column.

I suspect that there's a well-known algorithm somewhere for loading an array when the fields could be different each time – one that's better than my method. If not, maybe there's a way to map the "unused fields" on the fly?

carlossierra · Accepted Answer · 2016-06-08 06:00:10Z

A more compact and extensible solution would be to use pattern matching: take the parts that your field-name variants have in common and use them to map the inconsistent headers in your CSV files.

But I don't think this will work in your case, because your field names don't all share common patterns. For instance, address and street are headers that should both map to the Address field, but they share no usable pattern. And where there is a common pattern, such as cust, it is ambiguous: it could belong to either CustName or CustNo.

So in this particular case I think you will definitely need to provide the full list of possible field names for each field. You can, however, do this more compactly with a VBScript Dictionary (Scripting.Dictionary), which lets you check in one line whether a given field name (key) exists using dict.Exists(key), and look up the field it maps to using dict.Item(key). You still need to add every known pairing of header name and field to the dictionary, but after doing that once, you can use the dictionary to resolve all your headers efficiently.
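As a rough sketch of the dictionary approach, something like the following could replace the Select Case loop. The names aliases, col, headerLine and unused are made up for illustration, and the hard-coded header line stands in for MyFile.ReadLine. It uses two dictionaries: one mapping every known header spelling to a canonical field name, and one mapping each canonical field name to its column index in the current file.

```vbscript
Option Explicit

Dim aliases, col, headerLine, headers, i, key, unused

' Hypothetical alias table: each known header spelling -> canonical field name.
Set aliases = CreateObject("Scripting.Dictionary")
' Case-insensitive keys; CompareMode must be set while the dictionary is empty.
aliases.CompareMode = vbTextCompare

aliases.Add "customer name", "CustName"
aliases.Add "custname", "CustName"
aliases.Add "name", "CustName"
aliases.Add "customer number", "CustNo"
aliases.Add "custno", "CustNo"
aliases.Add "custnum", "CustNo"
aliases.Add "address", "Address"
aliases.Add "street", "Address"
aliases.Add "city", "City"
aliases.Add "state", "State"
aliases.Add "country", "Country"

' Sample header line (assumption); in the real script this is MyFile.ReadLine.
headerLine = "CustNo,Street,Name,City,State,Zip"
headers = Split(headerLine, ",")

' col maps canonical field name -> column index for this particular file.
Set col = CreateObject("Scripting.Dictionary")
unused = ""

For i = 0 To UBound(headers)
    key = Trim(headers(i))
    If aliases.Exists(key) Then
        col.Item(aliases.Item(key)) = i
    Else
        ' Collect headers that no alias covers.
        If Len(unused) > 0 Then unused = unused & ", "
        unused = unused & key
    End If
Next

If Len(unused) > 0 Then MsgBox "These fields are unused: " & unused

' A missing field can now be detected instead of silently reading column 0.
If Not col.Exists("Country") Then
    MsgBox "No column was recognised as Country"
End If
```

In the record loop, a value would then be read as arrCustomers(col.Item("City")) after checking col.Exists("City"). Supporting a new header variant such as "Address1" means adding one aliases.Add line rather than editing the loop.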
