
I am working with the network architecture from the paper "Learning Fine-grained Image Similarity with Deep Ranking", and I am unable to figure out how the outputs from the three parallel networks are merged by the linear embedding layer. The only information the paper gives about this layer is:

Finally, we normalize the embeddings from the three parts, and combine them with a linear embedding layer. The dimension of the embedding is 4096.

Can anyone help me figure out what exactly the authors mean by this layer?

  • It's unfortunate for me that there is no answer to this question, because I am stuck on exactly the same issue. Did you figure it out?
    – LKM
    Commented Oct 10, 2017 at 16:51
  • I did not figure out the answer, but I just concatenated the outputs from the three parts and passed the result through a dense layer with 4096 nodes.
    – A. Sam
    Commented Oct 12, 2017 at 8:53

2 Answers


"Linear embedding layer" must just be a fancy name for a dense layer with no activation: "linear" means there is no activation (the activation is the identity), and "embedding" is the general concept of a vector representation of the input data (e.g. word embeddings). I believe the elements of the second vector are simply added element-wise to the first.
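For what it's worth, here is a minimal PyTorch sketch of how such a combining head could look, following the concatenate-then-dense reading from the comments above rather than element-wise addition. The class name and branch dimensions are hypothetical, not taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearEmbeddingHead(nn.Module):
    """Hypothetical combining head: L2-normalize each branch output,
    concatenate, then map to a 4096-d embedding with a linear
    (no-activation) dense layer."""

    def __init__(self, dims=(4096, 1024, 1024), out_dim=4096):
        # dims are assumed branch output sizes, not values from the paper
        super().__init__()
        self.fc = nn.Linear(sum(dims), out_dim)

    def forward(self, convnet_out, branch1_out, branch2_out):
        # Normalize each part to unit L2 norm before combining
        parts = [F.normalize(p, p=2, dim=1)
                 for p in (convnet_out, branch1_out, branch2_out)]
        # "Linear embedding layer": a dense layer with identity activation
        return self.fc(torch.cat(parts, dim=1))
```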


It's mentioned in the paper:

A local normalization layer normalizes the feature map around a local neighborhood to have unit norm and zero mean. It leads to feature maps that are robust to the differences in illumination and contrast.

They take each part of the model and normalize it separately.

As for combining them, as you commented, you can concatenate the parts and pass them through a dense layer; since the goal is to capture the most salient features with an under-complete representation, there is no need for a non-linearity.
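If it helps, here is a minimal sketch of the normalization step, under the assumption that it reduces to shifting each part's embedding to zero mean and scaling it to unit norm; the quote's "local neighborhood" detail is not specified further here, and the function name is hypothetical:

```python
import torch

def normalize_part(x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Assumed reading of the paper's normalization: per-sample
    # zero mean followed by unit L2 norm.
    x = x - x.mean(dim=1, keepdim=True)
    return x / (x.norm(dim=1, keepdim=True) + eps)
```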

