DemoCNN_MNIST error

Nice find; thank you for being so thorough in your review.
 
There is indeed an error in the code, specifically in the calculation of iNumWeight. Here is a correction; note the commented-out line:
for ( fm=0; fm<50; ++fm)
{
  for ( ii=0; ii<5; ++ii )
  {
    for ( jj=0; jj<5; ++jj )
    {
      // iNumWeight = fm * 26;  // 26 is the number of weights per feature map
      iNumWeight = fm * 156;  // 156 is the number of weights per feature map
      NNNeuron& n = *( pLayer->m_Neurons[ jj + ii*5 + fm*25 ] );
 
      n.AddConnection( ULONG_MAX, iNumWeight++ );  // bias weight

      for ( kk=0; kk<25; ++kk )
      {
        // note: max val of index == 1013, corresponding to 1014 neurons in prev layer
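        // note: layer #1 has six 13x13 feature maps (169 neurons each), so the
        // offsets 0, 169, 338, 507, 676, and 845 below select the same kernel
        // window within each of the six source feature maps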
        n.AddConnection(       2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
        n.AddConnection( 169 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
        n.AddConnection( 338 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
        n.AddConnection( 507 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
        n.AddConnection( 676 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
        n.AddConnection( 845 + 2*jj + 26*ii + kernelTemplate2[kk], iNumWeight++ );
      }
    }
  }
}
As you can see, this was a copy-and-paste error: the previous layer (#1) allocates 26 weights per feature map (a 5x5 kernel plus one bias), whereas the current layer (#2) allocates 156.
 
There is actually a further error for layer #2. The given equation for the number of weights is (5x5+1)x6x50 = 7800. In this equation, the "+1" is intended to refer to the bias weight, and yields (5x5+1)x6 = 156 weights for each of the 50 feature maps. However, the equation implies that each neuron has 6 bias weights, when in fact each neuron should have only a single bias weight. The correct equation is (5x5x6+1)x50 = 7550, i.e., 151 weights per feature map. The code above does not reflect this, and for that reason, when you re-do the math, you will find that there are still a few unused weights (more precisely, 5 unused weights per feature map).
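If it helps to double-check the arithmetic, here is a minimal, self-contained C++ sketch (my own, not part of the article's code) that tallies both counts:

#include <cstdio>

int main()
{
    // Layer #2: 50 feature maps; each neuron convolves a 5x5 kernel
    // over all 6 feature maps of layer #1, plus one bias.
    const int kernel = 5 * 5;  // 25 kernel weights per source feature map
    const int maps1  = 6;      // feature maps in layer #1
    const int maps2  = 50;     // feature maps in layer #2

    const int perMapAllocated = ( kernel + 1 ) * maps1;  // (5x5+1)x6 = 156
    const int perMapNeeded    = kernel * maps1 + 1;      // 5x5x6+1   = 151

    printf( "allocated total: %d\n", perMapAllocated * maps2 );  // 7800
    printf( "needed total:    %d\n", perMapNeeded * maps2 );     // 7550
    printf( "unused per feature map: %d\n",
            perMapAllocated - perMapNeeded );                    // 5
    return 0;
}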
 
I can't change the code in the article or the download; if I do, then the download for the trained neural network will no longer work (it's tightly coupled to the interconnections in the network architecture).
 
I'm working on a Part 2 of the article, and I will incorporate your corrections there.
 
Meanwhile, I think it is a testament to the robustness of Dr. LeCun's architecture that such a low error rate (0.74%) can still be achieved even with an error in the interconnections.
 
Best regards,
Mike
