I was recently presented with the following problem...
You are given a collection of singly-linked lists (SLLs). Return true if any of them share a common node (or “intersect”), or false otherwise.
Additional notes
- Please don’t use recursion.
- Assume the SLLs might be very large in length (in the millions).
- Stop traversing and return immediately if you detect a common node.
- If a cycle is detected, please throw an error.
- Aim to be as efficient as possible from a time complexity perspective.
- Implement this function with this signature:
DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool
Input:
- The first lines of the input will describe the singly-linked-lists in a directed acyclic graph (DAG) format. The graph description language is a similar idea to the GraphViz graph description language, see: https://en.wikipedia.org/wiki/DOT_(graph_description_language).
- Each node is described as a string token, which is a unique identifier for the node. So “a->b” means a DAG with node “a” directionally connected to node “b”.
- As we are describing singly-linked-lists, take it as a given that the out degree of every node is either zero or one.
- After each of the edges has been described, then each subsequent line will include set of SLL starting nodes to traverse from.
Output:
- For each SLL print 'True' or 'False' based on the return value of your function
- true prints
True
- false prints
False
- On throwing an error print
Error Thrown!
So I must admit, not being particularly familiar with linked lists or graph data structures, I struggled with this exercise, however I did manage to complete a [probably quite naive] solution (coded in C#):
public static IEnumerable<string> Main(string[] lines) { var graph = new Dictionary<string, Node>(); foreach (var line in lines) { if (line.Contains(',')) { string returnValue; try { returnValue = LinkedListIntersection(line.Split(',').Select(v => v.Trim()), graph).ToString(); } catch (InvalidOperationException ex) when (ex.Message == "Cycle detected.") { returnValue = "Error Thrown!"; } yield return returnValue; } else if (line.Contains("->")) { var splitStr = line.Split("->", StringSplitOptions.None); var current = splitStr[0]; var next = splitStr[1]; // Check to see if the parent node already exists if (!graph.TryGetValue(next, out var nextNode)) { // If it doesn't, add it in so we can reference the object nextNode = new Node(next, null); graph.Add(next, nextNode); } // Check to see if the child node already exists if (graph.TryGetValue(current, out var currentNode)) { // If it does, update the existing object currentNode.Next = nextNode; } else { graph.Add(current, new Node(current, nextNode)); } } } } private static bool LinkedListIntersection(IEnumerable<string> nodeGroup, IDictionary<string, Node> graph) { var allTraversedNodes = new HashSet<string>(); foreach (var value in nodeGroup) { var currentTraversedNodes = new HashSet<string>(); if (!graph.TryGetValue(value, out var node)) { continue; } do { if (allTraversedNodes.Contains(node.Value)) { return true; } // Don't follow cycles if (currentTraversedNodes.Contains(node.Next?.Value)) { throw new InvalidOperationException($"Cycle detected."); } allTraversedNodes.Add(node.Value); currentTraversedNodes.Add(node.Value); node = node.Next; } while (node != null); } return false; } class Node { public string Value { get; } public Node Next { get; set; } public Node(string value, Node next) { Value = value; Next = next; } }
As you can see from the code, I'm parsing the lines of data, generating Node
objects from the DAG values and storing them in a Dictionary<string, Node>
. When it comes time to check for intersections, it's as simple as performing a lookup operation on the dictionary to locate the initial Node
, traversing the linked list while using a set to record each Node
traversed while checking for any intersections. Pretty simple eh?
Some test cases, from which I've generated some unit tests, were also provided to prove the functionality, which for my solution all passed.
[Fact] public void Test1() { var lines = new[] { "a->b", "r->s", "b->c", "x->c", "q->r", "y->x", "w->z", "a, q, w", "a, c, r", "y, z, a, r", "a, w" }; var expectedResults = new[] { "False", "True", "True", "False" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); } [Fact] public void Test2() { var lines = new[] { "A->B", "G->B", "B->C", "C->D", "D->E", "H->J", "J->F", "A, G, E", "H, A" }; var expectedResults = new[] { "True", "False" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); } [Fact] public void Test3() { var lines = new[] { "ABC->BCD", "BCD->CDE", "DEF->EFG", "EFG->BCD", "123->456", "ABC, CDE", "123, DEF", "ABC, DEF" }; var expectedResults = new[] { "True", "False", "True" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); } [Fact] public void Test4() { var lines = new[] { "X->Y", "Y->X", "A->B", "B->C", "X, A" }; var expectedResults = new[] { "Error Thrown!" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); }
The first thing I want to understand, for my solution, is there a more suitable data structure to store and query a collection of singly-linked lists than the Dictionary<string, Node>
structure used? I wasn't particularly happy about duplicating the Value
property in both the Node
object and as the dictionary Key
- it feels like there should be better way to do this. What are my options?
The second question, and this is the big one, are there better solutions for this problem from a time complexity perspective?
I have been musing theoretical alternate approaches, the best of which involved pre-processing and caching all intersections during the graph building process, perhaps somehow keeping track of the heads of all the different linked lists and then traversing them once all the data had been loaded to generate some kind of adjacency matrix (a 2d array with a dictionary index lookup?) which records the intersections, and then performing a lookup on those intersections with every unique permutation of node group nodes. This (in theory at least) should be more efficient in terms of query performance, however I have absolutely no idea what that would look like or how to achieve it in practice. Does anybody out there have any better ideas?
Some tests were also provided…
isn't originated by you, please put it into a block quote.\$\endgroup\$Test4
) being provided. As for the implementation, I applied the rationale that even if a cycle exists it doesn't necessarily invalidate all of the data in that set, as there could be lists within the collection that don't cycle and would produce a validtrue
/false
output when queried.\$\endgroup\$Dictionary<string, string>
could suffice but just doesn't feel right. You can skip a dictionary and go withHashSet<Node>
but yourNode
class would need to implementIEquatable<Node>
and overrideGetHashCode()
.\$\endgroup\$