599 lines
16 KiB
Plaintext
Options:
Input size: N = 32, C = 32, H = 64, W = 32
Output size: N = 32, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.534210 sec
Validating...
Result: VALID
Avg. throughput: 26.160914 GFLOPS
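The reported throughput figures appear consistent with counting two floating-point operations (one multiply, one add) per multiply-accumulate over the full output tensor and dividing by the elapsed time. The sketch below is an inference from the logged numbers, not the benchmark's actual source; the function name `conv_gflops` and its parameter names are hypothetical.

```python
def conv_gflops(n, k, c, r, s, oh, ow, seconds):
    """Estimate convolution throughput in GFLOPS.

    Assumption (inferred from the log, not taken from the benchmark source):
    each output element needs C*R*S multiply-accumulates, and each
    multiply-accumulate counts as 2 floating-point operations.
    """
    flops = 2 * n * k * c * r * s * oh * ow
    return flops / seconds / 1e9


if __name__ == "__main__":
    # Sizes and elapsed time copied from the first run in this log.
    first = conv_gflops(n=32, k=32, c=32, r=16, s=16, oh=49, ow=17,
                        seconds=0.534210)
    print(f"Avg. throughput: {first:.6f} GFLOPS")
```

Because the elapsed time in the log is rounded to six decimals, the recomputed value can differ from the logged one in the last digits.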
Options:
Input size: N = 64, C = 64, H = 64, W = 32
Output size: N = 64, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.674964 sec
Validating...
Result: VALID
Avg. throughput: 82.821732 GFLOPS
Options:
Input size: N = 256, C = 64, H = 64, W = 32
Output size: N = 256, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.610788 sec
Validating...
Result: VALID
Avg. throughput: 366.095430 GFLOPS
Options:
Input size: N = 128, C = 64, H = 64, W = 32
Output size: N = 128, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.562826 sec
Validating...
Result: VALID
Avg. throughput: 198.646360 GFLOPS
Options:
Input size: N = 32, C = 32, H = 128, W = 32
Output size: N = 32, K = 32, OH = 113, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.683617 sec
Validating...
Result: VALID
Avg. throughput: 47.144858 GFLOPS
Options:
Input size: N = 64, C = 32, H = 32, W = 32
Output size: N = 64, K = 128, OH = 17, OW = 17
Filter size: K = 128, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.622051 sec
Validating...
Result: VALID
Avg. throughput: 62.356500 GFLOPS
Options:
Input size: N = 256, C = 32, H = 32, W = 128
Output size: N = 256, K = 32, OH = 17, OW = 113
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.628193 sec
Validating...
Result: VALID
Avg. throughput: 410.434688 GFLOPS
Options:
Input size: N = 128, C = 32, H = 32, W = 32
Output size: N = 128, K = 64, OH = 17, OW = 17
Filter size: K = 64, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.619934 sec
Validating...
Result: VALID
Avg. throughput: 62.569456 GFLOPS
Options:
Input size: N = 32, C = 128, H = 32, W = 32
Output size: N = 32, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 128, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.454889 sec
Validating...
Result: VALID
Avg. throughput: 42.635586 GFLOPS
Options:
Input size: N = 64, C = 64, H = 32, W = 32
Output size: N = 64, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.667696 sec
Validating...
Result: VALID
Avg. throughput: 29.046844 GFLOPS
Options:
Input size: N = 64, C = 64, H = 32, W = 32
Output size: N = 64, K = 64, OH = 17, OW = 17
Filter size: K = 64, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.552917 sec
Validating...
Result: VALID
Avg. throughput: 70.153247 GFLOPS
Options:
Input size: N = 128, C = 64, H = 32, W = 128
Output size: N = 128, K = 32, OH = 17, OW = 113
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.575908 sec
Validating...
Result: VALID
Avg. throughput: 447.696993 GFLOPS
Options:
Input size: N = 32, C = 32, H = 32, W = 32
Output size: N = 32, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.522080 sec
Validating...
Result: VALID
Avg. throughput: 9.287109 GFLOPS
Options:
Input size: N = 64, C = 32, H = 32, W = 32
Output size: N = 64, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.465673 sec
Validating...
Result: VALID
Avg. throughput: 20.824122 GFLOPS
Options:
Input size: N = 96, C = 32, H = 32, W = 32
Output size: N = 96, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.555241 sec
Validating...
Result: VALID
Avg. throughput: 26.197362 GFLOPS
Options:
Input size: N = 128, C = 32, H = 32, W = 32
Output size: N = 128, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.581046 sec
Validating...
Result: VALID
Avg. throughput: 33.378525 GFLOPS
Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.618939 sec
Validating...
Result: VALID
Avg. throughput: 0.000536 GFLOPS
Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.484627 sec
Validating...
Result: VALID
Avg. throughput: 0.000685 GFLOPS
Options:
Input size: N = 1, C = 1, H = 8, W = 8
Output size: N = 1, K = 1, OH = 5, OW = 5
Filter size: K = 1, C = 1, R = 4, S = 4
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.664391 sec
Validating...
Result: VALID
Avg. throughput: 0.000001 GFLOPS
Options:
Input size: N = 1, C = 1, H = 8, W = 8
Output size: N = 1, K = 1, OH = 5, OW = 5
Filter size: K = 1, C = 1, R = 4, S = 4
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.582629 sec
Validating...
Result: VALID
Avg. throughput: 0.000001 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.500472 sec
Validating...
Result: VALID
Avg. throughput: 29.418023 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.587543 sec
Validating...
Result: VALID
Avg. throughput: 25.058407 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 133, OW = 133
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.460911 sec
Validating...
Result: VALID
Avg. throughput: 33.954698 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 4, OW = 4
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.629908 sec
Validating...
Result: VALID
Avg. throughput: 0.022473 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = -242, OW = -242
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.555800 sec
Validating...
Result: VALID
Avg. throughput: 93.223611 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.618955 sec
Validating...
Result: VALID
Avg. throughput: 0.216846 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.559329 sec
Validating...
Result: VALID
Avg. throughput: 0.239962 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 5, OW = 5
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.475397 sec
Validating...
Result: VALID
Avg. throughput: 7.058190 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 6, OW = 6
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.638493 sec
Validating...
Result: VALID
Avg. throughput: 7.567566 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.615809 sec
Validating...
Result: VALID
Avg. throughput: 0.217954 GFLOPS
Options:
Input size: N = 7, C = 4, H = 64, W = 64
Output size: N = 7, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 4, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.618435 sec
Validating...
Result: VALID
Avg. throughput: 0.222631 GFLOPS