Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000319 sec
Validating...
Result: VALID
Avg. throughput: 1.040037 GFLOPS
Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003255 sec
Validating...
Result: VALID
Avg. throughput: 0.101924 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.010786 sec
Validating...
Result: VALID
Avg. throughput: 1364.993012 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.009503 sec
Validating...
Result: VALID
Avg. throughput: 1549.307137 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 131, OW = 131
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.011128 sec
Validating...
Result: VALID
Avg. throughput: 1364.398311 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 131, OW = 131
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.009547 sec
Validating...
Result: VALID
Avg. throughput: 1590.338555 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.011046 sec
Validating...
Result: VALID
Avg. throughput: 1332.850227 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.009507 sec
Validating...
Result: VALID
Avg. throughput: 1548.646617 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 3, OW = 3
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003819 sec
Validating...
Result: VALID
Avg. throughput: 2.085008 GFLOPS
Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 3, OW = 3
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.004746 sec
Validating...
Result: VALID
Avg. throughput: 1.677685 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.004154 sec
Validating...
Result: VALID
Avg. throughput: 32.310736 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.005034 sec
Validating...
Result: VALID
Avg. throughput: 26.662402 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 3, OW = 3
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.005290 sec
Validating...
Result: VALID
Avg. throughput: 228.346385 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 3, OW = 3
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.008603 sec
Validating...
Result: VALID
Avg. throughput: 140.409865 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.004286 sec
Validating...
Result: VALID
Avg. throughput: 31.315011 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.005229 sec
Validating...
Result: VALID
Avg. throughput: 25.667972 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 0, OW = 0
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003479 sec
Validating...
Result: VALID
Avg. throughput: 0.000000 GFLOPS
Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 0, OW = 0
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.004432 sec
Validating...
Result: VALID
Avg. throughput: 0.000000 GFLOPS
Options:
Input size: N = 5, C = 4, H = 64, W = 64
Output size: N = 5, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 4, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000419 sec
Validating...
Result: VALID
Avg. throughput: 234.635187 GFLOPS
Options:
Input size: N = 5, C = 4, H = 64, W = 64
Output size: N = 5, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 4, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003314 sec
Validating...
Result: VALID
Avg. throughput: 29.675443 GFLOPS
Options:
Input size: N = 4, C = 5, H = 64, W = 64
Output size: N = 4, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 5, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000429 sec
Validating...
Result: VALID
Avg. throughput: 229.287748 GFLOPS
Options:
Input size: N = 4, C = 5, H = 64, W = 64
Output size: N = 4, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 5, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003389 sec
Validating...
Result: VALID
Avg. throughput: 29.019886 GFLOPS
Options:
Input size: N = 4, C = 2, H = 127, W = 67
Output size: N = 4, K = 4, OH = 112, OW = 52
Filter size: K = 4, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000433 sec
Validating...
Result: VALID
Avg. throughput: 220.386692 GFLOPS
Options:
Input size: N = 4, C = 2, H = 127, W = 67
Output size: N = 4, K = 4, OH = 112, OW = 52
Filter size: K = 4, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003297 sec
Validating...
Result: VALID
Avg. throughput: 28.942886 GFLOPS
Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 7, OH = 49, OW = 49
Filter size: K = 7, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000336 sec
Validating...
Result: VALID
Avg. throughput: 204.926942 GFLOPS
Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 7, OH = 49, OW = 49
Filter size: K = 7, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003252 sec
Validating...
Result: VALID
Avg. throughput: 21.168773 GFLOPS
Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 7, OH = 52, OW = 44
Filter size: K = 7, C = 2, R = 13, S = 21
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000330 sec
Validating...
Result: VALID
Avg. throughput: 212.012030 GFLOPS
Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 7, OH = 52, OW = 44
Filter size: K = 7, C = 2, R = 13, S = 21
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003317 sec
Validating...
Result: VALID
Avg. throughput: 21.089963 GFLOPS
Options:
Input size: N = 3, C = 7, H = 63, W = 121
Output size: N = 3, K = 7, OH = 13, OW = 44
Filter size: K = 7, C = 7, R = 19, S = 28
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000622 sec
Validating...
Result: VALID
Avg. throughput: 143.827131 GFLOPS
Options:
Input size: N = 3, C = 7, H = 63, W = 121
Output size: N = 3, K = 7, OH = 13, OW = 44
Filter size: K = 7, C = 7, R = 19, S = 28
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.004011 sec
Validating...
Result: VALID
Avg. throughput: 22.304148 GFLOPS
Options:
Input size: N = 4, C = 3, H = 131, W = 67
Output size: N = 4, K = 12, OH = 24, OW = 8
Filter size: K = 12, C = 3, R = 21, S = 37
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.000585 sec
Validating...
Result: VALID
Avg. throughput: 73.434490 GFLOPS
Options:
Input size: N = 4, C = 3, H = 131, W = 67
Output size: N = 4, K = 12, OH = 24, OW = 8
Filter size: K = 12, C = 3, R = 21, S = 37
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Calculating...(iter=0) 0.003520 sec
Validating...
Result: VALID
Avg. throughput: 12.205922 GFLOPS
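
Note on the throughput figures: the logged "Avg. throughput" values appear consistent with counting 2 * N * K * OH * OW * C * R * S floating-point operations (one multiply and one add per filter tap) and dividing by the logged time. This is inferred from the numbers in the log, not taken from the benchmark's source, so treat the sketch below as an assumption. The function name conv_gflops is hypothetical.

```python
def conv_gflops(n, k, oh, ow, c, r, s, seconds):
    """Estimated throughput in GFLOPS for one convolution run.

    Assumes 2 FLOPs (multiply + add) per filter tap per output element;
    this formula is inferred from the logged values, not from the source.
    """
    flops = 2 * n * k * oh * ow * c * r * s
    return flops / seconds / 1e9


# First run in the log: N=8, K=8, OH=OW=6, C=8, R=S=3, 0.000319 sec.
first = conv_gflops(8, 8, 6, 6, 8, 3, 3, 0.000319)

# Run with N=3, K=3, OH=OW=129, C=3, R=S=128, 0.010786 sec.
large = conv_gflops(3, 3, 129, 129, 3, 128, 128, 0.010786)

# Run with OH=OW=0: no output elements, so the estimate is zero.
empty = conv_gflops(128, 64, 0, 0, 128, 8, 8, 0.003479)

print(f"{first:.3f} {large:.1f} {empty:.6f}")
```

The results land close to the logged 1.040037, 1364.993012, and 0.000000 GFLOPS; small differences are expected because the log prints the elapsed time rounded to six decimal places.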