Those comments are very difficult to read.
j = list(set(j)) #someone warned that the problematic test case gives the same edge many times, so this is to remove duplicates
Format this properly:
# The problematic test case gives the same edge many times, # so this is to remove duplicates j = list(set(j))
Then remove fluff:
# Remove duplicates. Helps some test cases. j = list(set(j))
In this case just remove the first comment, because list(set(...))
is already known to mean deduplicate. And remove the second since it's implied by the fact you've written the code.
j = list(set(j))
Another pair is
a = [] # for input purposes b = [] # for output purposes
Don't do this. Just call them
descriptive_name_1 = [] descriptive_name_2 = []
so you don't need the comment in the first place.
Then there's q = ...
. So I wrack my brain thinking about what word contains q
. But it turns out you just meant to write num_test_cases
. What does that have to do with the letter q
?
Worse than that is when you outright lie. What would you expect the variable node
to contain? A node, right? Nope - it contains the total number of nodes.
And you write
node = 0
so we expect the total number of nodes to be 0
at that point, right? Nope, you immediately change your mind and write
node, edge = list(map(int, input().strip().split()))
But OK, fine. At least we know what nodes
contains, then: multiple node
s, so a list (or other collection) of lengths of nodes.
Nope, it actually means exactly the same thing as node
.
A similar comment goes for edge
, which should be num_edges
.
Then there's adj
, which you label "adjacency matrix". Why not just call it adjacency_matrix
then? But it's not even a dense matrix like that would imply, it's actually a list of per-node adjacencies, so call it adjacencies
.
Then there's starting
and start
, which are the same thing named differently. Call them start_node
or at least be consistent.
Going back to b
, you have
b = []
but again this is a lie, because actually
b = shortest(start_node, num_nodes, adjacencies)
So just write that.
distances = shortest(start_node, num_nodes, adjacencies)
For a
, initialize it as close to usage as possible and call it edges
.
If you write
for j in range(num_edges):
you're misleading me into thinking you're going to use the index. When you're not going to, write
for _ in range(num_edges):
And what's up with that, followed by
for j in edges:
followed by
for j in adjacencies:
Is j
just your general "this is a thing" variable? Most people use x
, elem
and item
for that. Instead, write
for _ in range(num_edges): for edge in edges: for adjacent in adjacencies:
The middle can be
for start, end in edges:
The comment
#edges are 6
Isn't coherent. Try
# Edges are length 6
Instead, though, don't put this information in shortest
- use 1
and multiply out when using the results.
Don't call variables temp
. It's a poor name.
You've double-indented the part where you update the neighbours.
Note that
num_nodes, num_edges = list(map(int, input().strip().split()))
is just
num_nodes, num_edges = map(int, input().strip().split())
Why do you do this?
for i in adjacencies[current-1]: neigh = i
Just write
for neighbour in adjacencies[current-1]:
And the content can be
for neighbour in adjacencies[current-1]: distance[neighbour-1] = min(distance[neighbour-1], distance[current-1] + 1)
Do you not think this whole -1
thing is getting a bit silly? When reading, just decrement the edge values and starting edge correctly:
edges = [] for _ in range(num_edges): start, end = map(int, input().strip().split()) edges.append((start - 1, end - 1)) start_node = int(input()) - 1
while
loops and if
statements don't need their condition parenthesized, and operators should be spaced properly:
while (len(unique)>0): # before while len(unique) > 0: # after
Then this is better as just while unique
.
Don't put spaces before the parentheses in function calls or indexing.
minDist
should be min_dist
. You can also start with it set to num_nodes
, and the same for distance
's initialization.
unique
is a horrible name, and has nothing to do with the variable's purpose. Try unvisited
instead. It can be initialized as
unvisited = [i for i, adj in enumerate(adjacencies) if adj]
distance
, unsurprisingly, does not mean distance
but instead distances
. Fix that.
This code:
min_dist = num_nodes for i in unvisited: if distances[i] <= min_dist: min_dist = distances[i] distIndex = i current = distIndex
is just
min_dist = num_nodes for i in unvisited: if distances[i] <= min_dist: min_dist = distances[i] current = i
and current
should be called closest
. It is simplifiable to
closest = min(unvisited, key=lambda x: distances[x])
This is faster if you write
closest = min(unvisited, key=distances.__getitem__)
but the difference isn't vital.
One last thing with shortest
- rename it to shortest_distances
.
Put the rest of the code in a main
function.
If you write
def read_pair(decrement): x, _, y = input().partition(" ") return int(x) - decrement, int(y) - decrement
then you can initialize num_nodes
, num_edges
and edges
with
num_nodes, num_edges = read_pair(0) edges = [read_pair(1) for i in range(num_edges)]
Note my use of partition
instead of split
as IMO it's a better description of the operation here.
Then
for adjacent in adjacencies: adjacent = list(set(adjacent))
doesn't actually work! adjacent =
only affects the local scope! Instead, you want
adjacencies = [list(set(adj)) for adj in adjacencies]
which is even better as
adjacencies = [set() for _ in range(num_nodes)] for start, end in edges: adjacencies[start].add(end) adjacencies[end].add(start)
as you never have the large intermediates then, and there's no real point converting back from a set.
This stuff:
distances.remove(0) for i in range(len(distances)): if distances[i] == num_nodes: distances[i] = -1 else: distances[i] *= 6 print(" ".join(str(e) for e in distances))
is just
print(*(-1 if i == num_nodes else i * 6 for i in distances if i))
Nice, that seems to have made it fast enough, but more important the code is more readable.
We can improve speed further by avoiding input
, using sys.stdin
directly to get buffering.
def main(lines): def read_pair(decrement): x, _, y = next(lines).partition(" ") return int(x) - decrement, int(y) - decrement ... main(sys.stdin)
Note that this change is sufficient on its own to beat the time limit: you can apply it to your original code to pass the test. Note also that you shouldn't mix input
with next
; at least on Python 2 this throws an error saying
ValueError: Mixing iteration and read methods would lose data
Though this error is gone in Python 3, that doesn't leave me feeling good about it.
I'd finish off with
from __future__ import print_function
to make it Python 2 compatible.
from __future__ import print_function import sys def shortest_distances(start_node, num_nodes, adjacencies): distances = [num_nodes] * num_nodes unvisited = [i for i, adj in enumerate(adjacencies) if adj] distances[start_node] = 0 while unvisited: closest = min(unvisited, key=distances.__getitem__) unvisited.remove(closest) # Update the distances of each neighbour for neighbour in adjacencies[closest]: distances[neighbour] = min(distances[neighbour], distances[closest] + 1) return distances def main(lines): def read_pair(decrement): x, _, y = next(lines).partition(" ") return int(x) - decrement, int(y) - decrement num_test_cases = int(next(lines)) for i in range(num_test_cases): num_nodes, num_edges = read_pair(0) edges = [read_pair(1) for i in range(num_edges)] start_node = int(next(lines)) - 1 adjacencies = [set() for _ in range(num_nodes)] for start, end in edges: adjacencies[start].add(end) adjacencies[end].add(start) distances = shortest_distances(start_node, num_nodes, adjacencies) print(*(-1 if i == num_nodes else i * 6 for i in distances if i)) main(sys.stdin)
NameError: name 'node' is not defined
.\$\endgroup\$node, edge
line is not valid syntax elsewhere. When I change it to read to a list first and then assign node and edge, the next error is for mixing spaces and tabs (it didn't matter in the HackerRank terminal again, sorry). I have to go right now, but I will try to clean it as soon as I get back. 2. The first link to HackerRank has a Sample Input, Sample Output, and Explanation on the page. The second link is just the input of the problematic test case.\$\endgroup\$node
andedge
first (not what I originally thought), and I used a text editor to change the indentation to be consistent with spaces. Please try again.\$\endgroup\$adj
is length 0 on the first iteration.\$\endgroup\$adj =
and thenode, edge =
lines.\$\endgroup\$