C#'s String.Split dates from the .NET 2.0 era, before LINQ made lazy sequences commonplace, so it always builds the complete array up front. The task is to split a string on a (single) separator. With String.Split that looks like this:
    string[] split = myString.Split(new string[] { separator }, StringSplitOptions.None);
That's not bad in itself, but if you want to apply more operations to that string[] (and you probably do), you'll need to loop over the whole array again, effectively iterating over the string's contents twice. Using the coroutine-like behaviour of the lazy yield keyword, you can (maybe) do more than one operation while iterating over the string only once.
    public static IEnumerable<string> LazySplit(this string stringToSplit, string separator)
    {
        if (stringToSplit == null) throw new ArgumentNullException("stringToSplit");
        if (separator == null) throw new ArgumentNullException("separator");

        var lastIndex = 0;
        var index = -1;
        do
        {
            index = stringToSplit.IndexOf(separator, lastIndex);
            if (index < 0 && lastIndex != stringToSplit.Length)
            {
                yield return stringToSplit.Substring(lastIndex);
                yield break;
            }
            else if (index >= lastIndex)
            {
                yield return stringToSplit.Substring(lastIndex, index - lastIndex);
            }
            lastIndex = index + separator.Length;
        } while (index > 0);
    }
While this does not have the "remove empty entries" option, using myString.LazySplit(separator).Where(str => !String.IsNullOrWhiteSpace(str)) should do the job with an O(n) operation, or am I wrong here?
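One way to probe the single-pass assumption is to count how many items the source iterator actually produces when a Where filter is chained onto it. The following is only a minimal sketch: Tokens, Produced and LazinessProbe are hypothetical names, and Tokens is a fixed stand-in for LazySplit rather than the splitter itself. It shows that the filter streams items one at a time; it does not prove anything about the cost of IndexOf or Substring.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazinessProbe
{
    // Number of tokens the iterator has produced so far.
    public static int Produced;

    // Hypothetical stand-in for LazySplit: yields three fixed tokens lazily.
    public static IEnumerable<string> Tokens()
    {
        foreach (var token in new[] { "", "ab", "cd" })
        {
            Produced++;
            yield return token;
        }
    }

    // Applies the same Where filter as in the question and stops at the first match.
    public static string FirstNonEmpty()
    {
        Produced = 0;
        return Tokens().Where(s => !string.IsNullOrWhiteSpace(s)).First();
    }
}
```

Calling LazinessProbe.FirstNonEmpty() returns "ab" and leaves Produced at 2: the third token is never generated, because Where pulls items from the iterator only as its consumer asks for them.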
I'm not sure about the time complexity when using coroutines, but for the functionality I've written some unit tests to be sure it's working:
    [TestMethod]
    public void LazyStringSplit()
    {
        var str = "ab;cd;;";
        var resp = str.LazySplit(";");
        var expected = new[] { "ab", "cd", "" };
        var result = resp.ToArray();
        CollectionAssert.AreEqual(expected, result);
    }

    [TestMethod]
    public void LazyStringSplitEmptyString()
    {
        var str = "";
        var resp = str.LazySplit(";");
        var expected = new string[0];
        var result = resp.ToArray();
        CollectionAssert.AreEqual(expected, result);
    }

    [TestMethod]
    public void LazyStringSplitWithoutEmpty()
    {
        var str = "ab;cd;;";
        var resp = str.LazySplit(";").Where(s => !string.IsNullOrWhiteSpace(s));
        var expected = new[] { "ab", "cd" };
        var result = resp.ToArray();
        CollectionAssert.AreEqual(expected, result);
    }

    [TestMethod]
    public void LazyStringSplitNoSplit()
    {
        var str = "ab;cd;;";
        var resp = str.LazySplit(" ");
        var expected = new[] { "ab;cd;;" };
        var result = resp.ToArray();
        CollectionAssert.AreEqual(expected, result);
    }
";abc".LazySplit(";")
returns only a single empty string; the "abc" part is lost.

So the loop condition should change from > 0 to >= 0, right?
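
The leading-separator case raised in the comments can be sketched as follows. This is a minimal sketch, not a reviewed implementation: LazySplitFixed and LazySplitSketch are hypothetical names, the body is the question's LazySplit with only the loop condition changed, and the empty-separator guard is an added assumption (with index >= 0 as the condition, an empty separator would never advance lastIndex).

```csharp
using System;
using System.Collections.Generic;

public static class LazySplitSketch
{
    // Hypothetical name; same logic as LazySplit except where commented.
    public static IEnumerable<string> LazySplitFixed(this string stringToSplit, string separator)
    {
        if (stringToSplit == null) throw new ArgumentNullException("stringToSplit");
        if (separator == null) throw new ArgumentNullException("separator");
        // Assumption, not in the original: reject "" because IndexOf("", i) returns i,
        // so the loop below would never move past the same position.
        if (separator.Length == 0) throw new ArgumentException("Separator must not be empty.", "separator");

        var lastIndex = 0;
        int index;
        do
        {
            index = stringToSplit.IndexOf(separator, lastIndex);
            if (index < 0 && lastIndex != stringToSplit.Length)
            {
                // No further separator: the rest of the string is the last piece.
                yield return stringToSplit.Substring(lastIndex);
                yield break;
            }
            else if (index >= lastIndex)
            {
                yield return stringToSplit.Substring(lastIndex, index - lastIndex);
            }
            lastIndex = index + separator.Length;
        } while (index >= 0); // was: index > 0, which stopped after a separator at position 0
    }
}
```

With this change, ";abc".LazySplitFixed(";") yields "" followed by "abc", while "ab;cd;;".LazySplitFixed(";") still yields "ab", "cd" and "", so the existing tests keep their expected values. As in the original, the argument checks run only once enumeration starts, because the whole method is an iterator.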