3
\$\begingroup\$

Recently I had an interview question about extracting records that matched certain criteria from a 2D List of strings. The premise was a CSV file was parsed into a List<List<string>>, and I need to filter them based on a record type (only keep "SALE" and "REFUND") and sort them by the date and time found in the record.

The Main function, I was unable to edit besides adding more records to, simply called the function I was to write with the list of sample records, then wrote the records returned by my function to standard output.

/* Each record has 4 elements. 1- date in day month year format 2- time in hour : minute format with a 24 hour clock 3- record type 4- dollar value */ public class Program { public static void Main() { var record1 = new List<string>() { "30-01-2012", "14:50", "SALE", "10" }; var record2 = new List<string>() { "30-01-2015", "14:50", "REFUND", "10" }; var record3 = new List<string>() { "23-01-2012", "14:50", "CANCEL", "0" }; var record4 = new List<string>() { "30-01-2012", "14:51", "SALE", "10" }; var records = new List<List<string>>() { record1, record2, record3, record4 }; var results = RecordExtractor.Extract(records); foreach(var result in results) { Console.WriteLine(String.Join(",", result)); } } } 

I was able to come up with a solution for this challenge, but am unsure how efficient it is for larger sets of data. In specific, I'm not sure if the string comparison or the ordering by date could be done in a faster way. The function I created is as follows:

public static class RecordExtractor { private static readonly string _sale = "SALE"; private static readonly string _refund = "REFUND"; public static List<List<string>> Extract(List<List<string>> records) { var filteredRecords = new List<List<string>> (); foreach(var record in records) { var recordType = record[2]; if(string.Equals(recordType, _sale, StringComparison.InvariantCultureIgnoreCase) || string.Equals(recordType, _refund, StringComparison.InvariantCultureIgnoreCase)) { filteredRecords.Add(record); } } var orderedOutput = filteredRecords.OrderBy(x => { DateTime.TryParseExact($"{x[0]} {x[1]}", "dd-MM-yyyy HH:mm", System.Globalization.CultureInfo.InvariantCulture, System.Globalization.DateTimeStyles.None, out var dt); return dt; }); return orderedOutput.ToList(); } } 
\$\endgroup\$
3
  • 3
    \$\begingroup\$"The premise was a CSV file was parsed into a List<List<string>>" -- and at that point you politely decline and leave the interview, because if this is the kind of nonsense they present to you at an interview, I can only imagine the horrors present in their codebase.\$\endgroup\$
    – BCdotWEB
    CommentedOct 21, 2023 at 12:02
  • \$\begingroup\$I heartily agree with comment by @BCdotWEB. I find it disturbingly odd that they pass you a List<List<string>>. If I were to gauge a developer's skills, I would pass them the CSV file and let them read from it. Whether they read the file in one big gulp or read in sections would tell me a lot. Also, I would hope to see some IEnumerable usage, or even IList rather than just List.\$\endgroup\$CommentedOct 23, 2023 at 15:35
  • \$\begingroup\$@RickDavin I agree that the formatting of the code is not the greatest. If I was to build something from scratch for reading CSV files, I would use classes and probably stream the file. The CSVHelper library could also be useful. Unfortunately, these types of coding challenges are pretty common on hackerank and other technical assessments.\$\endgroup\$CommentedOct 23, 2023 at 23:00

3 Answers 3

2
\$\begingroup\$

Some note I think the interview would be looking for

Error Handling
What if the inside list count is not equal to 4? Do you throw an exception, if so what one, or do you just ignore the record? I think in an interview best case here is to ask what the business requirements are for that situation, since I believe interviews should be a two-way street and not just them evaluating you. If leave it up to you I would just pick one. Either exception or ignore them and filter them out.

What if you can't parse the date? The code now will make the date Jan 1, 0001 if date can't be parsed. Is that the correct thing to do? Again, only something the interviewer could answer.  

Magic strings/numbers
Its good magic string are contained in _sale & _refund. One of the minor discussion points would be why a static readonly string instead of a const string? Should case be ignored when comparing the Sale and Rebate?

The magic numbers are in the code are the position inside the list. Bonus points for creating constants for those.

private const int DatePos = 0; private const int TimePos = 1; private const int TypePos = 2; private const int ValuePos = 3; 

Moving these string and numbers makes long term maintenance easier for if the positions or formats change.
 

Immutability
Another good point is the code doesn't mutate the List coming in.

Efficiency
This goes a bit into code people's personal preference. Creating a list of filteredRecords to fill with records then take that list to do Linq Orderby then into a new list doesn't seem to flow. Why create the list instead of using the Linq Where since the code is using OrderBy and ToList? The other option is to create the filteredRecords and loop, as it currently does, then use the List.Sort method to sort instead of Linq QrderBy then return the filteredRecords. I don't think one is inherently better than the other but to do both comes off as a bit odd in the code.

\$\endgroup\$
    1
    \$\begingroup\$

    I would build something along the lines of this.

    Make a class that represents the record.

    enum RecordType { REFUND, SALE, CANCEL, } class MyRecord { public DateTime Timestamp { get; set; } public RecordType RecordType { get; set; } public int NoIdea { get; set; } public override string ToString() { return $"{Timestamp} {RecordType} {NoIdea}"; } } 

    Make a parser that converts the CSV to this record:

    class RecordCSVParser { public bool TryParseRow(List<string> csvRow, out MyRecord result) { result = new MyRecord(); if (csvRow.Count < 3) return false; if (!DateTime.TryParseExact($"{csvRow[0]} {csvRow[1]}", "dd-MM-yyyy HH:mm", System.Globalization.CultureInfo.InvariantCulture, System.Globalization.DateTimeStyles.None, out var timestamp)) return false; result.Timestamp = timestamp; if (!Enum.TryParse<RecordType>(csvRow[2], out RecordType recordType)) return false; result.RecordType = recordType; if (!int.TryParse(csvRow[3], out int noIdea)) return false; result.NoIdea = noIdea; return true; } public IEnumerable<MyRecord> ParseCSV(List<List<string>> csv) { foreach (var row in csv) { if (TryParseRow(row, out MyRecord record)) yield return record; else Console.WriteLine($"Error while parsing row ..."); } } } 

    Then do the filtering and ordering using linq:

     static void Main() { ... RecordCSVParser parser = new RecordCSVParser(); var parsedRecords = parser.ParseCSV(records); var filteredRecords = parsedRecords.Where(x => x.RecordType == RecordType.SALE || x.RecordType == RecordType.REFUND); var orderedRecords = filteredRecords.OrderBy(x => x.Timestamp).ToList(); foreach (var record in orderedRecords) { Console.WriteLine(String.Join(",", record)); } } 

    This might not be the absolute fastest solution. I don't know it that was the requirement. But this is something that is readable and scalable. For example, you could go overboard and make an interface for parsing the data. Then you can also implement an JSON or XML converter.

    \$\endgroup\$
    1
    • 2
      \$\begingroup\$Hm, this isn't really a review of the existing code - it's an alternate implementation.\$\endgroup\$
      – ChrisWue
      CommentedOct 24, 2023 at 4:08
    1
    \$\begingroup\$

    Main

    Collection population

    • Your records collection's population could be written in a more concise way
    var records = new List<List<string>>() { new() { "30-01-2012", "14:50", "SALE", "10" }, new() { "30-01-2015", "14:50", "REFUND", "10" }, new() { "23-01-2012", "14:50", "CANCEL", "0" }, new() { "30-01-2012", "14:51", "SALE", "10" } }; 

    Result printing

    • Since the results is a List<List<string>> you can use the ForEach LINQ extension as well
    RecordExtractor.Extract(records) .ForEach(result => Console.WriteLine(string.Join(",", result))); 
    • If this is the only place where you use the results, it might make sense to return a more suitable datatype from the Extract method
      • for example List<string>

    RecordExtractor

    const vs static readonly

    • Both of your private fields could be declared as const
    private const string Sale = "SALE"; private const string Refund = "REFUND"; 
    • You could also extract several other constants:
    private const string DateTimeFormat = "dd-MM-yyyy HH:mm"; private const StringComparison Comparison = StringComparison.InvariantCultureIgnoreCase; 

    Mixing imperative and declarative code

    • You have done the filtering in imperative fashion while the ordering is implemented in a declarative way
      • I would suggest to pick one and stick with that if possible
    var result = from record in records let recordType = record[2] let dateTimeRaw = DateTime.ParseExact($"{record[0]} {record[1]}", DateTimeFormat, null) where string.Equals(Sale, recordType, Comparison) || string.Equals(Refund, recordType, Comparison) orderby dateTimeRaw select record; return result.ToList(); 
    • I've replaced your TryParseExact call to ParseExact because you did not use the returned value of the TryParseExact
      • and also your code lacks error handling in general

    For the sake of completeness here is the refactored version

    Main

    var records = new List<List<string>>() { new() { "30-01-2012", "14:50", "SALE", "10" }, new() { "30-01-2015", "14:50", "REFUND", "10" }, new() { "23-01-2012", "14:50", "CANCEL", "0" }, new() { "30-01-2012", "14:51", "SALE", "10" } }; RecordExtractor.Extract(records) .ForEach(result => Console.WriteLine(string.Join(",", result))); 

    RecordExtractor

    private const string Sale = "SALE"; private const string Refund = "REFUND"; private const string DateTimeFormat = "dd-MM-yyyy HH:mm"; private const StringComparison Comparison = StringComparison.InvariantCultureIgnoreCase; public static List<List<string>> Extract(List<List<string>> records) { var result = from record in records let recordType = record[2] let dateTimeRaw = DateTime.ParseExact($"{record[0]} {record[1]}", DateTimeFormat, null) where string.Equals(Sale, recordType, Comparison) || string.Equals(Refund, recordType, Comparison) orderby dateTimeRaw select record; return result.ToList(); } 

    UPDATE #1

    I forgot to explicitly list the following refactor ideas:

    • Please prefer string.Join over String.Join
      • string always refers to the built-in System.String type
      • String can be anything depending on the usings
    using My; public class Program { public static void Main() { String s = new String(); s.Hi(); } } namespace My { public class String { public void Hi() { System.Console.WriteLine("Hi there!"); } } } 
    • Please prefer to pass the expected value as the first parameter to the string.Equals method
      • Similarly to the assertions where the expected is the first parameter and the actual is the second one
      • It also helps future refactor. For example if you decide to use Equals instance method instead of the string.Equals static
        • then it is less error-prone since you call the Equals method on an object which is never null
    string actual = null; const string Pattern = "..."; Pattern.Equals(actual); // returns false actual.Equals(Pattern); // throws NullReferenceException 
    \$\endgroup\$

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.