
Questions tagged [deep-learning]

For questions related to deep learning, which refers to a subset of machine learning methods based on artificial neural networks (ANNs) with multiple hidden layers. The adjective deep thus refers to the number of layers of the ANNs. The expression deep learning was apparently introduced (although not in the context of machine learning or ANNs) in 1986 by Rina Dechter in the paper "Learning while searching in constraint-satisfaction-problems".

0 votes · 0 answers · 11 views

Convolutional Kernels in CNN learning to find different patterns

Suppose we have an input image of dimensions $w \times h$ and the first hidden layer has dimension $(w-1) \times (h-1) \times 3$. We have $3$ separate $3 \times 3$ kernels with no padding. I ...
asked by Stan
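For reference, the spatial size of a convolution's output follows a standard formula; a minimal sketch (the helper name `conv2d_output_shape` is illustrative, not from the question). Note that a $3 \times 3$ kernel with no padding yields a $(w-2) \times (h-2)$ map, which may be relevant to the dimensions in the question:

```python
def conv2d_output_shape(w, h, kernel=3, padding=0, stride=1, n_kernels=3):
    """Spatial size of a 2-D convolution's output feature map."""
    out_w = (w + 2 * padding - kernel) // stride + 1
    out_h = (h + 2 * padding - kernel) // stride + 1
    return out_w, out_h, n_kernels  # each kernel produces one output channel

print(conv2d_output_shape(32, 32))  # (30, 30, 3): 3x3 kernel, no padding
```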
0 votes · 0 answers · 15 views

Why do my DNN convergence graphs behave differently on linear vs. dB scales?

I'm working on a deep neural network (DNN) and using the Adam optimizer to train it by learning parameters through backpropagation. My goal is to minimize the objective function. I’ve plotted the ...
asked by Alee
0 votes · 0 answers · 10 views

Are there other ways than negative log-likelihood or KL divergence to construct a loss function?

I've read that the two common ways to express a loss function in ML problems are to start either from the likelihood, then use the negative log-likelihood to find a good expression of the loss, or to ...
asked by Tristan Beruard
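The likelihood-to-loss route mentioned in this question can be made concrete with a standard example: for a fixed-variance Gaussian likelihood, the negative log-likelihood reduces, up to constants, to the mean squared error. A small NumPy sketch with illustrative values:

```python
import numpy as np

# Illustrative values only.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])

sigma = 1.0
# Gaussian NLL = 0.5 * sum((y - yhat)^2 / sigma^2) + 0.5 * n * log(2*pi*sigma^2)
nll = 0.5 * np.sum((y_true - y_pred) ** 2 / sigma**2) \
    + 0.5 * len(y_true) * np.log(2 * np.pi * sigma**2)
sse_part = 0.5 * np.sum((y_true - y_pred) ** 2)

# The data-dependent part of the NLL is exactly half the sum of squared errors;
# the remainder is a constant that does not affect the argmin.
print(np.isclose(nll - sse_part, 0.5 * 3 * np.log(2 * np.pi)))  # True
```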
0 votes · 0 answers · 17 views

Intuition behind Load-Balancing Loss in the paper OUTRAGEOUSLY LARGE NEURAL NETWORKS: THE SPARSELY-GATED MIXTURE-OF-EXPERTS LAYER

I'm trying to implement the paper "OUTRAGEOUSLY LARGE NEURAL NETWORKS: THE SPARSELY-GATED MIXTURE-OF-EXPERTS LAYER", but I got stuck while implementing the load-balancing loss. Could someone ...
asked by qmzp
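The intuition behind the importance (load-balancing) loss in that paper is to penalize the squared coefficient of variation of per-expert gate mass, pushing the gating network to spread traffic across experts. A simplified NumPy sketch of the importance term (the paper additionally defines a second "load" term with a smooth estimator, omitted here):

```python
import numpy as np

def importance_loss(gates, w_importance=0.01):
    """Squared coefficient of variation of per-expert importance,
    scaled by w_importance. gates: (batch, n_experts) gating weights."""
    importance = gates.sum(axis=0)             # total gate mass per expert
    cv = importance.std() / importance.mean()  # coefficient of variation
    return w_importance * cv ** 2

# Perfectly balanced gates give zero loss; imbalance increases it.
balanced = np.full((4, 2), 0.5)
skewed = np.array([[0.9, 0.1]] * 4)
print(importance_loss(balanced))     # 0.0
print(importance_loss(skewed) > 0)   # True
```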
1 vote · 0 answers · 30 views

Applying the RTD task to a model trained with MLM leads to a decrease in performance as training progresses

We are developing a new LLM based on the CodeBERT architecture. As part of this effort, we initially trained our model using the Masked Language Modeling (MLM) objective with the HuggingFace API. To ...
asked by One Bad Student
0 votes · 0 answers · 38 views

How to Improve Levenshtein Distance in CNN-BiLSTM Morse Decoder?

Problem Context: I'm building a Morse code audio decoder using CNN-BiLSTM with CTC loss. My current 4-layer model achieves Levenshtein distance ≈0.6, but attempts to improve performance by adding a ...
asked by alexander
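For context, the metric in this question is the classic edit distance; a minimal dynamic-programming reference implementation (decoders are often scored with the normalized variant, the distance divided by target length):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with insert/delete/substitute, each costing 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return prev[-1]

print(levenshtein("MORSE", "MORZE"))  # 1 (one substitution)
```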
1 vote · 0 answers · 35 views

How to make a variational autoencoder work on a time-frequency matrix?

I want to use a complex-valued variational autoencoder for unsupervised blind source separation. As an input to the network, I am giving the time-frequency matrix of the spectrogram instead of the ...
asked by ananya
0 votes · 0 answers · 19 views

CLIPSeg: no change in performance metrics with a better convolutional decoder

I am training CLIPSeg on the Oxford IIIT pet dataset for semantic segmentation (3 classes: background, cat, dog). In short, what I do is I stick a decoder on the CLIP encoder. The encoder outputs: ...
asked by Stan
3 votes · 2 answers · 44 views

Required background for thorough understanding of Causal ML research papers?

I'm interested in pursuing research in the intersection of causal inference and machine learning, particularly on causal discovery and causal representation learning. Through my exploration so far, I ...
asked by Harsh Shrivastava
0 votes · 0 answers · 21 views

Why are there NaN values for the forecast and total loss?

I am training a graph-attention-based model on a time series dataset (SWaT); while evaluating it, the dataset function for it is ...
asked by Priyanshu Singh
1 vote · 0 answers · 19 views

Implementation of TSMAE model in Keras

I’m currently implementing the TSMAE model described in the paper “TSMAE: A Novel Anomaly Detection Approach for Internet of Things Time Series Data Using Memory-Augmented Autoencoder” (https://pxl.to/...
asked by Nguyễn Hoàng Hà
2 votes · 1 answer · 84 views

How to deal with actions that complete in multiple steps (delayed reward) in reinforcement learning?

I have been exploring RL and using DQN to train an agent for a problem where I have two possible actions, but one of the actions is supposed to complete over multiple steps while the other is ...
asked by m101
5 votes · 2 answers · 128 views

Is there a conflict between NFL theorem and multimodal learning?

The definition of multimodal learning and the NFL theorem are clear to me. My question is: if a model good at one specific field might perform badly in another field, is there any need to find a multimodal ...
asked by Heartache_Doctor
0 votes · 0 answers · 12 views

Need Guidance on Gameplay Video Analysis for Storyline Graph Extraction

I'm a college student working on a project related to storyline graph extraction from gameplay videos and new player position identification in the graph. However, I'm completely clueless about how to ...
asked by 22I218 - GAYATHRI R
2 votes · 1 answer · 76 views

Why doesn't deep learning use modular arithmetic like cryptography, even though both deal with non-linear functions?

So, deep learning models are great at learning complex, non-linear patterns and seem to handle noise just fine. But under the hood, they rely on IEEE 754 floating-point numbers, which can lose ...
asked by Muhammad Ikhwan Perwira
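The contrast this question draws can be seen in a few lines: floating-point accumulation rounds, while modular integer arithmetic is exact (the modulus `2**31 - 1` below is just an example Mersenne prime, not anything the question specifies):

```python
# IEEE 754 floats accumulate rounding error.
fp_sum = sum([0.1] * 10)
print(fp_sum == 1.0)  # False: ten copies of 0.1 do not sum to exactly 1.0

# Modular integer arithmetic is exact: multiplying by the modular
# inverse recovers 1 with no precision loss (Python 3.8+ three-arg pow).
mod = 2**31 - 1
exact = (pow(7, 100, mod) * pow(7, -100, mod)) % mod
print(exact)  # 1
```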
