3
\$\begingroup\$

I was recently presented with the following problem...

You are given a collection of singly-linked lists (SLLs). Return true if any of them share a common node (or “intersect”), or false otherwise.

Additional notes

  • Please don’t use recursion.
  • Assume the SLLs might be very large in length (in the millions).
  • Stop traversing and return immediately if you detect a common node.
  • If a cycle is detected, please throw an error.
  • Aim to be as efficient as possible from a time complexity perspective.
  • Implement this function with this signature: DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool

Input:

  • The first lines of the input will describe the singly-linked-lists in a directed acyclic graph (DAG) format. The graph description language is a similar idea to the GraphViz graph description language, see: https://en.wikipedia.org/wiki/DOT_(graph_description_language).
  • Each node is described as a string token, which is a unique identifier for the node. So “a->b” means a DAG with node “a” directionally connected to node “b”.
  • As we are describing singly-linked-lists, take it as a given that the out degree of every node is either zero or one.
  • After each of the edges has been described, then each subsequent line will include set of SLL starting nodes to traverse from.

Output:

  • For each SLL print 'True' or 'False' based on the return value of your function
  • true prints True
  • false prints False
  • On throwing an error print Error Thrown!

So I must admit, not being particularly familiar with linked lists or graph data structures, I struggled with this exercise, however I did manage to complete a [probably quite naive] solution (coded in C#):

public static IEnumerable<string> Main(string[] lines) { var graph = new Dictionary<string, Node>(); foreach (var line in lines) { if (line.Contains(',')) { string returnValue; try { returnValue = LinkedListIntersection(line.Split(',').Select(v => v.Trim()), graph).ToString(); } catch (InvalidOperationException ex) when (ex.Message == "Cycle detected.") { returnValue = "Error Thrown!"; } yield return returnValue; } else if (line.Contains("->")) { var splitStr = line.Split("->", StringSplitOptions.None); var current = splitStr[0]; var next = splitStr[1]; // Check to see if the parent node already exists if (!graph.TryGetValue(next, out var nextNode)) { // If it doesn't, add it in so we can reference the object nextNode = new Node(next, null); graph.Add(next, nextNode); } // Check to see if the child node already exists if (graph.TryGetValue(current, out var currentNode)) { // If it does, update the existing object currentNode.Next = nextNode; } else { graph.Add(current, new Node(current, nextNode)); } } } } private static bool LinkedListIntersection(IEnumerable<string> nodeGroup, IDictionary<string, Node> graph) { var allTraversedNodes = new HashSet<string>(); foreach (var value in nodeGroup) { var currentTraversedNodes = new HashSet<string>(); if (!graph.TryGetValue(value, out var node)) { continue; } do { if (allTraversedNodes.Contains(node.Value)) { return true; } // Don't follow cycles if (currentTraversedNodes.Contains(node.Next?.Value)) { throw new InvalidOperationException($"Cycle detected."); } allTraversedNodes.Add(node.Value); currentTraversedNodes.Add(node.Value); node = node.Next; } while (node != null); } return false; } class Node { public string Value { get; } public Node Next { get; set; } public Node(string value, Node next) { Value = value; Next = next; } } 

As you can see from the code, I'm parsing the lines of data, generating Node objects from the DAG values and storing them in a Dictionary<string, Node>. When it comes time to check for intersections, it's as simple as performing a lookup operation on the dictionary to locate the initial Node, traversing the linked list while using a set to record each Node traversed while checking for any intersections. Pretty simple eh?

Some test cases, from which I've generated some unit tests, were also provided to prove the functionality, which for my solution all passed.

[Fact] public void Test1() { var lines = new[] { "a->b", "r->s", "b->c", "x->c", "q->r", "y->x", "w->z", "a, q, w", "a, c, r", "y, z, a, r", "a, w" }; var expectedResults = new[] { "False", "True", "True", "False" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); } [Fact] public void Test2() { var lines = new[] { "A->B", "G->B", "B->C", "C->D", "D->E", "H->J", "J->F", "A, G, E", "H, A" }; var expectedResults = new[] { "True", "False" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); } [Fact] public void Test3() { var lines = new[] { "ABC->BCD", "BCD->CDE", "DEF->EFG", "EFG->BCD", "123->456", "ABC, CDE", "123, DEF", "ABC, DEF" }; var expectedResults = new[] { "True", "False", "True" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); } [Fact] public void Test4() { var lines = new[] { "X->Y", "Y->X", "A->B", "B->C", "X, A" }; var expectedResults = new[] { "Error Thrown!" }; var results = Main(lines).ToArray(); results.Should().Equal(expectedResults); } 

The first thing I want to understand, for my solution, is there a more suitable data structure to store and query a collection of singly-linked lists than the Dictionary<string, Node> structure used? I wasn't particularly happy about duplicating the Value property in both the Node object and as the dictionary Key - it feels like there should be better way to do this. What are my options?

The second question, and this is the big one, are there better solutions for this problem from a time complexity perspective?

I have been musing theoretical alternate approaches, the best of which involved pre-processing and caching all intersections during the graph building process, perhaps somehow keeping track of the heads of all the different linked lists and then traversing them once all the data had been loaded to generate some kind of adjacency matrix (a 2d array with a dictionary index lookup?) which records the intersections, and then performing a lookup on those intersections with every unique permutation of node group nodes. This (in theory at least) should be more efficient in terms of query performance, however I have absolutely no idea what that would look like or how to achieve it in practice. Does anybody out there have any better ideas?

\$\endgroup\$
5
  • \$\begingroup\$I found the video less than clear about cycles: if c→a was added as suggested, does that mean, for the group y, z, a, r, to not keep cycling through cab, but go on to z and detect an intersection on a? Is throwing an error agreeable on preprocessing, as the last thing in processing queries, or just as the output for queries hitting a cycle?\$\endgroup\$
    – greybeard
    CommentedFeb 24, 2023 at 1:33
  • 1
    \$\begingroup\$If the code presented below Some tests were also provided… isn't originated by you, please put it into a block quote.\$\endgroup\$
    – greybeard
    CommentedFeb 24, 2023 at 10:12
  • \$\begingroup\$@greybeard yes, the video and the written instructions do seem to contradict one another. I chose to go with the written instructions on the basis of a clear test case (Test4) being provided. As for the implementation, I applied the rationale that even if a cycle exists it doesn't necessarily invalidate all of the data in that set, as there could be lists within the collection that don't cycle and would produce a valid true/false output when queried.\$\endgroup\$
    – James Law
    CommentedFeb 24, 2023 at 13:04
  • \$\begingroup\$On one hand, a simple Dictionary<string, string> could suffice but just doesn't feel right. You can skip a dictionary and go with HashSet<Node> but your Node class would need to implement IEquatable<Node> and override GetHashCode().\$\endgroup\$CommentedFeb 25, 2023 at 18:20
  • \$\begingroup\$Revisited the video - after all it does seem explicit and consistent about cycles: report intersection as soon as encountered, cycle where encountered: so in that situation, I would like you to just stop traversing … just say "it's a cycle" … immediately move on to the next element in the traversal group. For better or worse, cycle handling can't be finished between final edge and first query. Test 4 could read "B, A, X" expecting "True" and "B, X, A" expecting "Error Thrown!"/"it's a cycle" and "True".\$\endgroup\$
    – greybeard
    CommentedFeb 26, 2023 at 9:57

2 Answers 2

2
\$\begingroup\$

Don't write, never commit/publish undocumented/uncommented code.

While not in the specification of input and output, the Additional notes are explicit about implementation of DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool I don't see an equivalent of.

I don't see a need to store lists, singly linked or otherwise.

Disregarding cycles, "lists intersect" when belonging to the same connected component, disjoint sets of nodes.

Needing access to successor or representative id, just map id to id.
Wanting to detect cycles seems to limit union-find algorithms applicable.

\$\endgroup\$
9
  • \$\begingroup\$"Don't write, never commit/publish undocumented/uncommented code." - I'm not sure what you mean by this, could you explain please?\$\endgroup\$
    – James Law
    CommentedFeb 24, 2023 at 19:04
  • \$\begingroup\$DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool (bearing in mind is pseudocode), is implemented as private static bool LinkedListIntersection(IEnumerable<string> nodeGroup, IDictionary<string, Node> graph) - the boilerplate code provided in the video doesn't match the written description. Attention to detail is not the forte of the creator of this exercise.\$\endgroup\$
    – James Law
    CommentedFeb 24, 2023 at 19:07
  • \$\begingroup\$"I don't see a need to store lists, singly linked or otherwise." - ok, how would you store the data?\$\endgroup\$
    – James Law
    CommentedFeb 24, 2023 at 19:08
  • \$\begingroup\$"Disregarding cycles, "lists intersect" when belonging to the same connected component, disjoint sets of nodes." - how would you go about implementing what you've described in this scenario? Could you provide a code example please?\$\endgroup\$
    – James Law
    CommentedFeb 24, 2023 at 19:23
  • \$\begingroup\$"Wanting to detect cycles seems to limit union-find algorithms applicable." - I don't think it's so much that we want to detect the cycles, but surely we have to be able to handle them somehow? Could you suggest an alternative?\$\endgroup\$
    – James Law
    CommentedFeb 24, 2023 at 19:24
2
\$\begingroup\$

Firstly, the problem statement calls for DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool, but that's not present in this code.

That's a problem in an interview situation - I'd be wanting to see that you can meet requirements that other code has on yours.


It seems to me that the auxiliary storage in allTraversedNodes and currentTraversedNodes could grow very large. I don't believe we need all that storage.

Instead, use the knowledge that if any node is common to two lists, they share the same tail node. So we just map each head node to the corresponding tail (this is a form of path compression), then test whether those tails are all distinct.

In pseudocode, we get something like

  • tails = map(find_tail, heads)
  • intersects = count(set(tails)) < count(set(heads))

Ensure that find_tail() throws a suitable exception if it finds a cycle (using a standard hare-and-tortoise method), and take appropriate action if we catch that.

\$\endgroup\$
4
  • \$\begingroup\$just one node per list - much less storage refers to any single query?\$\endgroup\$
    – greybeard
    CommentedFeb 24, 2023 at 10:30
  • \$\begingroup\$Yes (I hadn't spotted that the problem statement has multiple queries - that forces down the temporary-modification version of this answer). The duration of that storage is just for that query, of course - we reclaim it for the next query.\$\endgroup\$CommentedFeb 24, 2023 at 11:24
  • 1
    \$\begingroup\$just map each head node to the corresponding tail is what the path compression of some of the find algorithms union-find data structures are named for results in.\$\endgroup\$
    – greybeard
    CommentedFeb 25, 2023 at 6:42
  • 1
    \$\begingroup\$"Firstly, the problem statement calls for DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool, but that's not present in this code." - yes it is, implemented as private static bool LinkedListIntersection(IEnumerable<string> nodeGroup, IDictionary<string, Node> graph), derived from the boilerplate code provided by the creator of the solution as can be seen in the video at 3:07 - youtu.be/zCezJ8QkUL4?t=187. I'm really sorry about the quality of the question that's been put to me but sadly I have no control over that.\$\endgroup\$
    – James Law
    CommentedFeb 25, 2023 at 12:25

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.