Hopefully this helps you fix your problem. The problem is that the dimensions of the output of your last max-pooling layer don't match the input of the first fully connected layer. This is the network structure up to the last max-pool layer for input shape (3, 512, 384):
Hope this helps. If you have an nn.Linear layer in your net, you cannot decide "on the fly" what the input size of that layer will be. In your net you compute num_flat_features for every x and expect self.fc1 to handle whatever size of x you feed it. However, self.fc1 has a fixed weight matrix of size 400x120 (it expects an input of dimension 16*5*5 = 400 and outputs a 120-dim feature). In your case x translated to a 7744-dim feature vector that self.fc1 simply cannot handle. One way around this is to resize the conv output to the spatial size the linear layer expects before flattening:
x = F.max_pool2d(F.relu(self.conv2(x)), 2) # output of conv layers
x = F.interpolate(x, size=(5, 5), mode='bilinear', align_corners=False)  # resize to the size expected by the linear layer
x = x.view(x.size(0), 5 * 5 * 16)
x = F.relu(self.fc1(x)) # you can go on from here...
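As a sketch of why this works: the interpolation forces the spatial dimensions to the fixed 5x5 that self.fc1 was built for, no matter what size came out of the conv stack. An alternative with the same shape guarantee (my suggestion, not from the answer above) is adaptive pooling:

```python
import torch
import torch.nn.functional as F

# Simulated output of the conv stack: 16 channels with a large spatial size
# that a linear layer expecting 16*5*5 = 400 features cannot handle.
x = torch.randn(1, 16, 125, 93)

# Force the spatial dims to 5x5, then flatten to the 400-dim vector fc1 expects.
x = F.adaptive_avg_pool2d(x, (5, 5))
x = x.view(x.size(0), 16 * 5 * 5)
print(x.shape)  # torch.Size([1, 400])
```

nn.AdaptiveAvgPool2d is what torchvision's own classifiers use to stay input-size tolerant in front of their fully connected heads.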
RuntimeError: size mismatch, m1: [32 x 1], m2: [32 x 9]
Hope this helps. You don't need x = x.view(-1, 1) and x = x.squeeze(1) in your forward function; remove those two lines. Your output shape will then be (batch_size, 9). Also, since you are using BCEWithLogitsLoss, you need to convert the labels to one-hot encoding, with shape (batch_size, 9).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np

class LargeNet(nn.Module):
    def __init__(self):
        super(LargeNet, self).__init__()
        self.name = "large"
        self.conv1 = nn.Conv2d(3, 5, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(5, 10, 5)
        self.fc1 = nn.Linear(10 * 53 * 53, 32)  # 224 -> 110 -> 53 spatial, 10 channels
        self.fc2 = nn.Linear(32, 9)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 10 * 53 * 53)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
model2 = LargeNet()
#Loss and optimizer
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.SGD(model2.parameters(), lr=0.1, momentum=0.9)
images = torch.from_numpy(np.random.randn(2,3,224,224)).float() # fake images, batch_size is 2
labels = torch.tensor([1,2]).long() # fake labels
outputs = model2(images)
one_hot_labels = torch.eye(9)[labels]
loss = criterion(outputs, one_hot_labels)
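Putting it together, here is a self-contained runnable version of the snippet above, with loss.backward() and optimizer.step() added as a sketch of the full training step (the original stops at computing the loss):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class LargeNet(nn.Module):
    def __init__(self):
        super(LargeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 5, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(5, 10, 5)
        self.fc1 = nn.Linear(10 * 53 * 53, 32)
        self.fc2 = nn.Linear(32, 9)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 10 * 53 * 53)   # no extra view(-1, 1) / squeeze(1)
        x = F.relu(self.fc1(x))
        return self.fc2(x)

model2 = LargeNet()
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.SGD(model2.parameters(), lr=0.1, momentum=0.9)

images = torch.randn(2, 3, 224, 224)          # fake images, batch_size is 2
labels = torch.tensor([1, 2]).long()          # fake labels
one_hot_labels = torch.eye(9)[labels]         # shape (2, 9), matches the logits

outputs = model2(images)                      # shape (2, 9)
loss = criterion(outputs, one_hot_labels)
loss.backward()
optimizer.step()
```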
Beginner PyTorch : RuntimeError: size mismatch, m1: [16 x 2304000], m2: [600 x 120]
Hopefully this will be helpful for those in need. The input dimension of self.fc1 needs to match the feature (second) dimension of your flattened tensor. So instead of self.fc1 = nn.Linear(600, 120), use self.fc1 = nn.Linear(2304000, 120). Keep in mind that because you are using fully connected layers, the model cannot be input-size invariant (unlike fully convolutional networks). If you change the channel or spatial dimensions before x = x.view(x.size(0), -1) (as you did moving from the last question to this one), the input dimension of self.fc1 will have to change accordingly.
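Rather than hand-computing numbers like 2304000, you can let a dummy forward pass discover the flattened size for you. A sketch (the conv stack here is a placeholder, not the asker's exact model):

```python
import torch
import torch.nn as nn

# Placeholder conv stack standing in for the model's feature extractor.
features = nn.Sequential(
    nn.Conv2d(3, 6, 5),    # 64 -> 60 spatial
    nn.MaxPool2d(2, 2),    # 60 -> 30 spatial
)

# Run one zero batch through the stack to measure the flattened feature size
# instead of hard-coding it.
with torch.no_grad():
    dummy = torch.zeros(1, 3, 64, 64)
    n_flat = features(dummy).view(1, -1).size(1)

fc1 = nn.Linear(n_flat, 120)   # always matches whatever the conv stack emits
print(n_flat)                  # 6 * 30 * 30 = 5400
```

If the conv layers or the input resolution change, n_flat updates automatically and the size mismatch cannot recur.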
RuntimeError: size mismatch m1: [a x b], m2: [c x d]
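In general, this error comes from the matrix multiplication inside a layer (usually nn.Linear): m1 [a x b] is your input batch and m2 [c x d] comes from the layer's weight, and the product is only defined when b == c. A minimal reproduction sketch (note that recent PyTorch versions phrase the same error as "mat1 and mat2 shapes cannot be multiplied"):

```python
import torch
import torch.nn as nn

# nn.Linear(in_features, out_features) multiplies its input [batch, in_features]
# against the weight; the last input dimension must equal in_features.
layer = nn.Linear(9, 4)            # expects inputs of shape [batch, 9]

good = torch.randn(32, 9)
out = layer(good)                  # works: shape [32, 4]

bad = torch.randn(32, 1)           # last dim 1 != 9 -> the size mismatch error
try:
    layer(bad)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
```

So when you see [a x b] vs [c x d] in the message, compare b against c: either reshape the input so b == c, or rebuild the layer with in_features = b.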