
I am wondering whether it is possible to write an asynchronous forward function in a subclass of nn.Module. When I came across the architecture in the attached image, I felt that training would be faster if the forward pass could run through each branch in parallel.

[Image: network architecture with multiple parallel branches]
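For illustration, here is a minimal sketch of one way to express branch-level parallelism explicitly with torch.jit.fork / torch.jit.wait (the module and layer sizes are made up; note that the branches only actually run concurrently when the module is compiled with torch.jit.script):

```python
import torch
import torch.nn as nn

class TwoBranch(nn.Module):
    """Hypothetical two-branch block, just to illustrate the pattern."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.branch_a = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.branch_b = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())

    def forward(self, x):
        # fork returns a Future immediately; wait blocks until its result is ready.
        # True inter-op parallelism only happens when the module is scripted.
        fut_a = torch.jit.fork(self.branch_a, x)
        fut_b = torch.jit.fork(self.branch_b, x)
        return torch.jit.wait(fut_a) + torch.jit.wait(fut_b)

model = torch.jit.script(TwoBranch())
out = model(torch.randn(8, 64))
```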

Another scenario I have come across is when the input has shape (B, C, V, T), where B is the batch size, C the number of node features, V the number of nodes, and T the number of time stamps. So there is a graph for each time stamp of each instance in the batch. Now there is a module which computes the adjacency matrix of a graph from its node features. It becomes compute-intensive if we reshape (B, C, V, T) -> (B*T, C, V) and then treat B*T as the new batch dimension, where T ≈ 250 and B = 32.
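For context, this is the reshape I mean, as a minimal sketch; the adjacency module (here a placeholder called compute_adj) is assumed to take a (batch, C, V) tensor, and the concrete feature/node counts are made up:

```python
import torch

def fold_time_into_batch(x: torch.Tensor) -> torch.Tensor:
    """(B, C, V, T) -> (B*T, C, V): one graph per (instance, time stamp)."""
    B, C, V, T = x.shape
    # Move T next to B before flattening so each graph slice stays contiguous.
    x = x.permute(0, 3, 1, 2).contiguous()   # (B, T, C, V)
    return x.view(B * T, C, V)

x = torch.randn(32, 16, 25, 250)   # B=32, C=16 features, V=25 nodes, T=250 (C and V made up)
graphs = fold_time_into_batch(x)   # (8000, 16, 25)
# adj = compute_adj(graphs)        # hypothetical module producing (8000, V, V)
```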

Is there some workaround so that we can evaluate the adjacency matrix efficiently? This problem is not limited to computing the adjacency matrix with my custom module: to use any graph layer from PyG, we also need to merge B and T.

I'm not talking about training on multiple GPUs, but about making these calls asynchronously rather than sequentially.

Please answer both questions. Thanks in advance.


1 Answer


By default, CUDA computations (i.e. kernels) are asynchronous (see the CUDA docs and the PyTorch docs). Therefore, what you suggested is already happening under the hood.
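A quick way to see this is the sketch below (assuming a CUDA-capable GPU is available): kernel launches return control to Python almost immediately, and the real compute time only becomes visible once you synchronize.

```python
import time
import torch

assert torch.cuda.is_available()
x = torch.randn(4096, 4096, device="cuda")

torch.cuda.synchronize()
t0 = time.time()
for _ in range(100):
    y = x @ x                      # kernels are queued, not awaited
t_launch = time.time() - t0        # small: Python moved on before the GPU finished

torch.cuda.synchronize()           # block until all queued kernels complete
t_total = time.time() - t0
print(f"launch time: {t_launch:.4f}s, total time: {t_total:.4f}s")
```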

• Thanks, I didn't know about that. The way I write the forward function made me feel that things happen synchronously, since I never write async anywhere. I will go through the resources you provided. – Commented Mar 27, 2024 at 12:48
