Let sample.py generate captions for grayscale images.

sample.py cannot generate a caption for a grayscale image. In data_loader.py, all images are converted to 'RGB' format, but this conversion is missing when sample.py generates a caption for a single image. A grayscale image therefore raises: RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[1, 1, 224, 224] to have 3 channels, but got 1 channels instead.
2026-03-13 09:11:37 +08:00 · 2019-02-24 10:02:07 +08:00
parent 4896cefea1
commit 16691c00f8
1 changed file with 2 additions and 2 deletions
--- a/tutorials/03-advanced/image_captioning/sample.py
+++ b/tutorials/03-advanced/image_captioning/sample.py
@@ -14,7 +14,7 @@ from PIL import Image
 device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

 def load_image(image_path, transform=None):
-    image = Image.open(image_path)
+    image = Image.open(image_path).convert('RGB')
    image = image.resize([224, 224], Image.LANCZOS)
    
    if transform is not None:
@@ -78,4 +78,4 @@ if __name__ == '__main__':
    parser.add_argument('--hidden_size', type=int , default=512, help='dimension of lstm hidden states')
    parser.add_argument('--num_layers', type=int , default=1, help='number of layers in lstm')
    args = parser.parse_args()
-    main(args)
+    main(args)
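
The effect of the `.convert('RGB')` call can be checked without the model. The sketch below is only an illustration: `load_image_rgb` and `demo` are hypothetical names, not part of sample.py, and the helper omits the `transform` argument. It writes a single-channel ('L' mode) image to a temporary file and shows that it comes back with three channels after conversion, which is what the first convolution (weight of size [64, 3, 7, 7]) expects.

```python
import os
import tempfile

from PIL import Image


def load_image_rgb(image_path, size=(224, 224)):
    """Hypothetical helper mirroring the patched load_image, without the transform."""
    # convert('RGB') expands the single luminance channel into three channels
    image = Image.open(image_path).convert('RGB')
    return image.resize(size, Image.LANCZOS)


def demo():
    """Return the bands before and after conversion, plus the resized size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'gray.png')
        # Create a small grayscale ('L' mode) test image on disk
        Image.new('L', (64, 48), color=128).save(path)
        with Image.open(path) as raw:
            raw_bands = raw.getbands()
        fixed = load_image_rgb(path)
        return raw_bands, fixed.getbands(), fixed.size


if __name__ == '__main__':
    print(demo())
```

Images that are already RGB pass through `convert('RGB')` with their mode unchanged, so the call should be safe to apply to every input.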