I wrote a program to scan files, extract text, build a data set, and export it to a CSV. My initial program was ugly, but it worked. I have since refactored it, but the latest version seems to run a lot slower than the previous one.
A brief rundown of the program:
`GetFiles()`
- The program gets a folder to scan
- It then sets a csv file to write the data to

`ReadFiles`
- `foreach` loop over each file in the folder:
  - creates a `StreamReader`
  - extracts each key/value pair in the file and adds it to a list
  - closes the `StreamReader`

`AddToTable`
- Creates a `DataTable`
- `foreach` loop over each key/value pair in the list:
  - if a key does not exist in the `DataTable` columns, add it as a column
  - build each row, based on the key/value

`SaveFiles`
- Creates a `StreamWriter`
- builds the csv based on the information in the `DataTable`
```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;

namespace FlirAdept_Plugin
{
    class FLIR_Datatable
    {
        public void GetFiles()
        {
            FolderBrowserDialog fbd = new FolderBrowserDialog();
            if (fbd.ShowDialog() == DialogResult.OK)
            {
                string filesToScan = fbd.SelectedPath;
                SaveFileDialog sfd = new SaveFileDialog();
                sfd.Filter = "CSV Files (*.csv)|*.csv|All Files (*.*)|*.*";
                sfd.FilterIndex = 1;
                sfd.RestoreDirectory = true;
                if (sfd.ShowDialog() == DialogResult.OK)
                {
                    Stream fileCheck = null;
                    if ((fileCheck = sfd.OpenFile()) != null)
                    {
                        fileCheck.Close();
                        string csvFile = sfd.FileName.ToString();
                        if (!csvFile.EndsWith(".csv"))
                            csvFile += ".csv";
                        List<Dictionary<string, string>> dictionary = ReadFiles(filesToScan);
                        DataTable table = AddToTable(dictionary);
                        SaveFiles(table, csvFile);
                    }
                }
            }
        }

        List<Dictionary<string, string>> ReadFiles(string filesToScan)
        {
            string[] files = Directory.GetFiles(filesToScan);
            List<string> errorFiles = new List<string>();
            Dictionary<string, string> record = new Dictionary<string, string>();
            List<Dictionary<string, string>> recordList = new List<Dictionary<string, string>>();
            foreach (string file in files)
            {
                string fileName = Path.GetFileName(file);
                string findText = ".0 index";
                string match = @"(\.\d index)|(\.\d\.label)|(\.\d\.value)";
                StreamReader reader = new StreamReader(file);
                string header = null;
                string data = null;
                List<string> fileData = new List<string>();
                record = new Dictionary<string, string>();
                if (file.Contains(".jpg"))
                {
                    reader = new StreamReader(file);
                    string line = null;
                    record.Add("Filename", fileName);
                    while ((line = reader.ReadLine()) != null)
                    {
                        if (line.Contains(findText))
                        {
                            break;
                        }
                    }
                    try
                    {
                        // Look for ".n" where 'n' is a digit
                        Match m = Regex.Match(line, match, RegexOptions.IgnoreCase);
                        // Read file, and split text to identify "Label" and "Value"
                        while (m.Success && line != null)
                        {
                            var result = Regex.Replace(line, match, $"{Environment.NewLine}$&");
                            var lines = result.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
                            foreach (string s in lines)
                            {
                                // Adds only the "metadata" lines to the List
                                if ((Regex.Match(s, match)).Success)
                                    fileData.Add(s);
                            }
                            line = reader.ReadLine();
                        }
                    }
                    catch (Exception e)
                    {
                        // If a file cannot be read, it is added to a list
                        // These files do not contain metadata
                        errorFiles.Add(fileName);
                    }
                    // read "Label" and compare to header
                    foreach (string s in fileData)
                    {
                        int start;
                        int end;
                        if (s.Contains(".label"))
                        {
                            start = s.IndexOf('"');
                            end = s.LastIndexOf('"');
                            if (start >= 0 && end > start)
                            {
                                header = s.Substring(start + 1, end - start - 1);
                                continue;
                            }
                        }
                        else if (s.Contains(".value"))
                        {
                            start = s.IndexOf('"');
                            end = s.LastIndexOf('"');
                            if (start >= 0 && end > start)
                            {
                                data = s.Substring(start + 1, end - start - 1).Replace(",", ".");
                                record.Add(header, "," + data);
                            }
                        }
                    }
                }
                recordList.Add(record);
                reader.Close();
            }
            return recordList;
        }

        DataTable AddToTable(List<Dictionary<string, string>> dataList)
        {
            DataTable table = new DataTable();
            DataColumn column;
            DataRow row;
            foreach (var item in dataList)
            {
                row = table.NewRow();
                foreach (var record in item)
                {
                    try
                    {
                        if (!table.Columns.Contains(record.Key))
                        {
                            column = new DataColumn();
                            column.ColumnName = record.Key.ToString();
                            column.DefaultValue = "";
                            table.Columns.Add(column);
                        }
                        row[record.Key.ToString()] = record.Value.ToString();
                    }
                    catch (Exception e)
                    {
                        MessageBox.Show(e.Message);
                    }
                }
                table.Rows.Add(row);
            }
            return table;
        }

        void SaveFiles(DataTable table, string csvFile)
        {
            StreamWriter writer = new StreamWriter(csvFile);
            string headerRow = "";
            string dataRow = "";
            foreach (DataColumn col in table.Columns)
            {
                headerRow += col + ",";
            }
            writer.WriteLine(headerRow.TrimEnd(','));
            foreach (DataRow row in table.Rows)
            {
                dataRow = "";
                foreach (string s in row.ItemArray)
                {
                    dataRow += s;
                }
                writer.WriteLine(dataRow);
            }
            writer.Close();
        }
    }
}
```
I don't know for certain, but I suspect it has something to do with the disposal of the `StreamReader` in my `ReadFiles` method. I do know the slowdown happens somewhere in that loop.
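For reference, the disposal pattern I believe I should be using (not what my code above currently does) is a `using` block, which disposes the reader deterministically even if an exception is thrown. A minimal sketch with a made-up temp file, just to show the pattern:

```csharp
using System;
using System.IO;

class UsingReaderSketch
{
    static void Main()
    {
        // Hypothetical sample file, only here to make the sketch runnable
        string path = Path.Combine(Path.GetTempPath(), "sample.txt");
        File.WriteAllText(path, "line one\nline two\n");

        int lineCount = 0;
        // The using block guarantees Dispose() is called when the block exits,
        // unlike the manual reader.Close() at the end of my loop
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lineCount++;
            }
        }

        Console.WriteLine(lineCount); // prints 2
        File.Delete(path);
    }
}
```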
What is causing the program to slow down, and how can I fix it?