Options:
Input size: N = 32, C = 32, H = 64, W = 32
Output size: N = 32, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.534210 sec
Validating...
Result: VALID
Avg. throughput: 26.160914 GFLOPS

Options:
Input size: N = 64, C = 64, H = 64, W = 32
Output size: N = 64, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.674964 sec
Validating...
Result: VALID
Avg. throughput: 82.821732 GFLOPS

Options:
Input size: N = 256, C = 64, H = 64, W = 32
Output size: N = 256, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.610788 sec
Validating...
Result: VALID
Avg. throughput: 366.095430 GFLOPS

Options:
Input size: N = 128, C = 64, H = 64, W = 32
Output size: N = 128, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.562826 sec
Validating...
Result: VALID
Avg. throughput: 198.646360 GFLOPS

Options:
Input size: N = 32, C = 32, H = 128, W = 32
Output size: N = 32, K = 32, OH = 113, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.683617 sec
Validating...
Result: VALID
Avg. throughput: 47.144858 GFLOPS

Options:
Input size: N = 64, C = 32, H = 32, W = 32
Output size: N = 64, K = 128, OH = 17, OW = 17
Filter size: K = 128, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.622051 sec
Validating...
Result: VALID
Avg. throughput: 62.356500 GFLOPS

Options:
Input size: N = 256, C = 32, H = 32, W = 128
Output size: N = 256, K = 32, OH = 17, OW = 113
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.628193 sec
Validating...
Result: VALID
Avg. throughput: 410.434688 GFLOPS

Options:
Input size: N = 128, C = 32, H = 32, W = 32
Output size: N = 128, K = 64, OH = 17, OW = 17
Filter size: K = 64, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.619934 sec
Validating...
Result: VALID
Avg. throughput: 62.569456 GFLOPS

Options:
Input size: N = 32, C = 128, H = 32, W = 32
Output size: N = 32, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 128, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.454889 sec
Validating...
Result: VALID
Avg. throughput: 42.635586 GFLOPS

Options:
Input size: N = 64, C = 64, H = 32, W = 32
Output size: N = 64, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.667696 sec
Validating...
Result: VALID
Avg. throughput: 29.046844 GFLOPS

Options:
Input size: N = 64, C = 64, H = 32, W = 32
Output size: N = 64, K = 64, OH = 17, OW = 17
Filter size: K = 64, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.552917 sec
Validating...
Result: VALID
Avg. throughput: 70.153247 GFLOPS

Options:
Input size: N = 128, C = 64, H = 32, W = 128
Output size: N = 128, K = 32, OH = 17, OW = 113
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.575908 sec
Validating...
Result: VALID
Avg. throughput: 447.696993 GFLOPS

Options:
Input size: N = 32, C = 32, H = 32, W = 32
Output size: N = 32, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.522080 sec
Validating...
Result: VALID
Avg. throughput: 9.287109 GFLOPS

Options:
Input size: N = 64, C = 32, H = 32, W = 32
Output size: N = 64, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.465673 sec
Validating...
Result: VALID
Avg. throughput: 20.824122 GFLOPS

Options:
Input size: N = 96, C = 32, H = 32, W = 32
Output size: N = 96, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.555241 sec
Validating...
Result: VALID
Avg. throughput: 26.197362 GFLOPS

Options:
Input size: N = 128, C = 32, H = 32, W = 32
Output size: N = 128, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.581046 sec
Validating...
Result: VALID
Avg. throughput: 33.378525 GFLOPS

Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.618939 sec
Validating...
Result: VALID
Avg. throughput: 0.000536 GFLOPS

Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.484627 sec
Validating...
Result: VALID
Avg. throughput: 0.000685 GFLOPS

Options:
Input size: N = 1, C = 1, H = 8, W = 8
Output size: N = 1, K = 1, OH = 5, OW = 5
Filter size: K = 1, C = 1, R = 4, S = 4
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.664391 sec
Validating...
Result: VALID
Avg. throughput: 0.000001 GFLOPS

Options:
Input size: N = 1, C = 1, H = 8, W = 8
Output size: N = 1, K = 1, OH = 5, OW = 5
Filter size: K = 1, C = 1, R = 4, S = 4
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.582629 sec
Validating...
Result: VALID
Avg. throughput: 0.000001 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.500472 sec
Validating...
Result: VALID
Avg. throughput: 29.418023 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.587543 sec
Validating...
Result: VALID
Avg. throughput: 25.058407 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 133, OW = 133
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.460911 sec
Validating...
Result: VALID
Avg. throughput: 33.954698 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 4, OW = 4
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.629908 sec
Validating...
Result: VALID
Avg. throughput: 0.022473 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = -242, OW = -242
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.555800 sec
Validating...
Result: VALID
Avg. throughput: 93.223611 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.618955 sec
Validating...
Result: VALID
Avg. throughput: 0.216846 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.559329 sec
Validating...
Result: VALID
Avg. throughput: 0.239962 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 5, OW = 5
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.475397 sec
Validating...
Result: VALID
Avg. throughput: 7.058190 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 6, OW = 6
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.638493 sec
Validating...
Result: VALID
Avg. throughput: 7.567566 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.615809 sec
Validating...
Result: VALID
Avg. throughput: 0.217954 GFLOPS

Options:
Input size: N = 7, C = 4, H = 64, W = 64
Output size: N = 7, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 4, R = 16, S = 16
Number of iterations: 1
Validation: on
Initializing... done!
Initializing Convolution...
Calculating...(iter=0)
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.618435 sec
Validating...
Result: VALID
Avg. throughput: 0.222631 GFLOPS
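
The reported throughput figures appear to follow from the printed sizes and timings as 2 * N * K * OH * OW * C * R * S floating-point operations (one multiply and one add per filter tap) divided by the elapsed seconds. This formula is inferred from the numbers in the log, not taken from the benchmark's source, and the function names below are hypothetical. A minimal Python sketch for cross-checking a run:

```python
def conv_flops(n, k, oh, ow, c, r, s):
    """FLOP count for a direct convolution, assuming 2 FLOPs (mul + add) per filter tap."""
    return 2 * n * k * oh * ow * c * r * s


def gflops(n, k, oh, ow, c, r, s, seconds):
    """Throughput in GFLOPS for one run, given the output and filter sizes and the time."""
    return conv_flops(n, k, oh, ow, c, r, s) / seconds / 1e9


if __name__ == "__main__":
    # Sizes and timing copied from the first run in the log.
    first = gflops(n=32, k=32, oh=49, ow=17, c=32, r=16, s=16, seconds=0.534210)
    print(f"first run: {first:.6f} GFLOPS (log reports 26.160914)")

    # The run with OH = OW = -242: the two negative factors multiply to a
    # positive FLOP count, so the formula still yields a positive figure.
    negative = gflops(n=3, k=3, oh=-242, ow=-242, c=3, r=128, s=128, seconds=0.555800)
    print(f"negative-size run: {negative:.6f} GFLOPS (log reports 93.223611)")
```

Under this reading, the 93.223611 GFLOPS reported for the run with OH = OW = -242 is probably an artifact of squaring a negative output size rather than a meaningful measurement.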