Options:
Input size: N = 32, C = 32, H = 64, W = 32
Output size: N = 32, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.553664 sec
Validating...
Result: VALID
Avg. throughput: 25.241702 GFLOPS

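The output sizes printed in the run above are consistent with the conventional convolution output-size formula. The log does not print stride, padding, or dilation, so the defaults below (stride 1, padding 0, dilation 1) are assumptions, and `conv_out_size` is a hypothetical helper, not the benchmark's own code. A minimal sketch:

```python
def conv_out_size(in_size, filter_size, pad=0, stride=1, dilation=1):
    """Conventional convolution output size along one axis.

    Assumed formula; the benchmark's actual code is not shown in the log.
    Note: Python's // floors, which differs from C truncation when the
    numerator is negative and stride > 1.
    """
    return (in_size + 2 * pad - dilation * (filter_size - 1) - 1) // stride + 1


# First run in the log: H = 64, W = 32, R = S = 16.
print(conv_out_size(64, 16))  # 49, matches OH
print(conv_out_size(32, 16))  # 17, matches OW

# Hypothetical values: a dilated filter wider than the input gives a
# negative size, which may explain the "OH = -242" run later in the log.
print(conv_out_size(8, 8, dilation=2))  # -6
```

Runs later in the log that share H and R but print a different OH presumably used non-default options.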
Options:
Input size: N = 64, C = 64, H = 64, W = 32
Output size: N = 64, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.533996 sec
Validating...
Result: VALID
Avg. throughput: 104.685611 GFLOPS

Options:
Input size: N = 256, C = 64, H = 64, W = 32
Output size: N = 256, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.582360 sec
Validating...
Result: VALID
Avg. throughput: 383.966645 GFLOPS

Options:
Input size: N = 128, C = 64, H = 64, W = 32
Output size: N = 128, K = 32, OH = 49, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.605670 sec
Validating...
Result: VALID
Avg. throughput: 184.594535 GFLOPS

Options:
Input size: N = 32, C = 32, H = 128, W = 32
Output size: N = 32, K = 32, OH = 113, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.570761 sec
Validating...
Result: VALID
Avg. throughput: 56.466753 GFLOPS

Options:
Input size: N = 64, C = 32, H = 32, W = 32
Output size: N = 64, K = 128, OH = 17, OW = 17
Filter size: K = 128, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.473969 sec
Validating...
Result: VALID
Avg. throughput: 81.838527 GFLOPS

Options:
Input size: N = 256, C = 32, H = 32, W = 128
Output size: N = 256, K = 32, OH = 17, OW = 113
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.530609 sec
Validating...
Result: VALID
Avg. throughput: 485.917705 GFLOPS

Options:
Input size: N = 128, C = 32, H = 32, W = 32
Output size: N = 128, K = 64, OH = 17, OW = 17
Filter size: K = 64, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.566264 sec
Validating...
Result: VALID
Avg. throughput: 68.499727 GFLOPS

Options:
Input size: N = 32, C = 128, H = 32, W = 32
Output size: N = 32, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 128, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.454443 sec
Validating...
Result: VALID
Avg. throughput: 42.677437 GFLOPS

Options:
Input size: N = 64, C = 64, H = 32, W = 32
Output size: N = 64, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.462973 sec
Validating...
Result: VALID
Avg. throughput: 41.891118 GFLOPS

Options:
Input size: N = 64, C = 64, H = 32, W = 32
Output size: N = 64, K = 64, OH = 17, OW = 17
Filter size: K = 64, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.472275 sec
Validating...
Result: VALID
Avg. throughput: 82.132067 GFLOPS

Options:
Input size: N = 128, C = 64, H = 32, W = 128
Output size: N = 128, K = 32, OH = 17, OW = 113
Filter size: K = 32, C = 64, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.571329 sec
Validating...
Result: VALID
Avg. throughput: 451.284991 GFLOPS

Options:
Input size: N = 32, C = 32, H = 32, W = 32
Output size: N = 32, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.566248 sec
Validating...
Result: VALID
Avg. throughput: 8.562707 GFLOPS

Options:
Input size: N = 64, C = 32, H = 32, W = 32
Output size: N = 64, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.441358 sec
Validating...
Result: VALID
Avg. throughput: 21.971345 GFLOPS

Options:
Input size: N = 96, C = 32, H = 32, W = 32
Output size: N = 96, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.635037 sec
Validating...
Result: VALID
Avg. throughput: 22.905512 GFLOPS

Options:
Input size: N = 128, C = 32, H = 32, W = 32
Output size: N = 128, K = 32, OH = 17, OW = 17
Filter size: K = 32, C = 32, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.475372 sec
Validating...
Result: VALID
Avg. throughput: 40.798487 GFLOPS

Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.536754 sec
Validating...
Result: VALID
Avg. throughput: 0.000618 GFLOPS

Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.544731 sec
Validating...
Result: VALID
Avg. throughput: 0.000609 GFLOPS

Options:
Input size: N = 1, C = 1, H = 8, W = 8
Output size: N = 1, K = 1, OH = 5, OW = 5
Filter size: K = 1, C = 1, R = 4, S = 4
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.445027 sec
Validating...
Result: VALID
Avg. throughput: 0.000002 GFLOPS

Options:
Input size: N = 1, C = 1, H = 8, W = 8
Output size: N = 1, K = 1, OH = 5, OW = 5
Filter size: K = 1, C = 1, R = 4, S = 4
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.460785 sec
Validating...
Result: VALID
Avg. throughput: 0.000002 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.461354 sec
Validating...
Result: VALID
Avg. throughput: 31.912368 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 129, OW = 129
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.500026 sec
Validating...
Result: VALID
Avg. throughput: 29.444253 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 133, OW = 133
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.687652 sec
Validating...
Result: VALID
Avg. throughput: 22.758747 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = 4, OW = 4
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.455767 sec
Validating...
Result: VALID
Avg. throughput: 0.031059 GFLOPS

Options:
Input size: N = 3, C = 3, H = 256, W = 256
Output size: N = 3, K = 3, OH = -242, OW = -242
Filter size: K = 3, C = 3, R = 128, S = 128
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.450455 sec
Validating...
Result: VALID
Avg. throughput: 115.025219 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.550982 sec
Validating...
Result: VALID
Avg. throughput: 0.243597 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.455080 sec
Validating...
Result: VALID
Avg. throughput: 0.294932 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 5, OW = 5
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.515875 sec
Validating...
Result: VALID
Avg. throughput: 6.504374 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 6, OW = 6
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.490695 sec
Validating...
Result: VALID
Avg. throughput: 9.846928 GFLOPS

Options:
Input size: N = 128, C = 128, H = 8, W = 8
Output size: N = 128, K = 64, OH = 1, OW = 1
Filter size: K = 64, C = 128, R = 8, S = 8
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.460174 sec
Validating...
Result: VALID
Avg. throughput: 0.291667 GFLOPS

Options:
Input size: N = 7, C = 4, H = 64, W = 64
Output size: N = 7, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 4, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.553278 sec
Validating...
Result: VALID
Avg. throughput: 0.248849 GFLOPS

Options:
Input size: N = 7, C = 4, H = 64, W = 64
Output size: N = 7, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 4, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.520503 sec
Validating...
Result: VALID
Avg. throughput: 0.264519 GFLOPS

Options:
Input size: N = 4, C = 9, H = 64, W = 64
Output size: N = 4, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 9, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.533430 sec
Validating...
Result: VALID
Avg. throughput: 0.331854 GFLOPS

Options:
Input size: N = 4, C = 9, H = 64, W = 64
Output size: N = 4, K = 4, OH = 49, OW = 49
Filter size: K = 4, C = 9, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.575073 sec
Validating...
Result: VALID
Avg. throughput: 0.307823 GFLOPS

Options:
Input size: N = 4, C = 2, H = 123, W = 69
Output size: N = 4, K = 4, OH = 108, OW = 54
Filter size: K = 4, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.679876 sec
Validating...
Result: VALID
Avg. throughput: 0.140543 GFLOPS

Options:
Input size: N = 4, C = 2, H = 123, W = 69
Output size: N = 4, K = 4, OH = 108, OW = 54
Filter size: K = 4, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.669258 sec
Validating...
Result: VALID
Avg. throughput: 0.142772 GFLOPS

Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 9, OH = 49, OW = 49
Filter size: K = 9, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.591400 sec
Validating...
Result: VALID
Avg. throughput: 0.149663 GFLOPS

Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 9, OH = 49, OW = 49
Filter size: K = 9, C = 2, R = 16, S = 16
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.489383 sec
Validating...
Result: VALID
Avg. throughput: 0.180861 GFLOPS

Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 7, OH = 46, OW = 28
Filter size: K = 7, C = 2, R = 19, S = 37
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.577326 sec
Validating...
Result: VALID
Avg. throughput: 0.175658 GFLOPS

Options:
Input size: N = 4, C = 2, H = 64, W = 64
Output size: N = 4, K = 7, OH = 46, OW = 28
Filter size: K = 7, C = 2, R = 19, S = 37
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.451467 sec
Validating...
Result: VALID
Avg. throughput: 0.224628 GFLOPS

Options:
Input size: N = 3, C = 7, H = 63, W = 121
Output size: N = 3, K = 7, OH = 12, OW = 25
Filter size: K = 7, C = 7, R = 19, S = 28
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.436559 sec
Validating...
Result: VALID
Avg. throughput: 0.107482 GFLOPS

Options:
Input size: N = 7, C = 9, H = 67, W = 119
Output size: N = 7, K = 11, OH = 8, OW = 15
Filter size: K = 11, C = 9, R = 19, S = 28
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.451336 sec
Validating...
Result: VALID
Avg. throughput: 0.196045 GFLOPS

Options:
Input size: N = 4, C = 3, H = 129, W = 199
Output size: N = 4, K = 13, OH = 15, OW = 9
Filter size: K = 13, C = 3, R = 21, S = 39
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.623802 sec
Validating...
Result: VALID
Avg. throughput: 0.055300 GFLOPS

Options:
Input size: N = 4, C = 3, H = 111, W = 51
Output size: N = 4, K = 17, OH = 36, OW = 18
Filter size: K = 17, C = 3, R = 24, S = 19
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.523031 sec
Validating...
Result: VALID
Avg. throughput: 0.230501 GFLOPS

Options:
Input size: N = 4, C = 3, H = 135, W = 63
Output size: N = 4, K = 12, OH = 18, OW = 1
Filter size: K = 12, C = 3, R = 32, S = 37
Number of iterations: 1
Validation: on

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.560119 sec
Validating...
Result: VALID
Avg. throughput: 0.010958 GFLOPS

Options:
Input size: N = 8, C = 8, H = 8, W = 8
Output size: N = 8, K = 8, OH = 6, OW = 6
Filter size: K = 8, C = 8, R = 3, S = 3
Number of iterations: 1
Validation: off

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
Using 4 devices
[GPU 0] NVIDIA GeForce RTX 3090
[GPU 1] NVIDIA GeForce RTX 3090
[GPU 2] NVIDIA GeForce RTX 3090
[GPU 3] NVIDIA GeForce RTX 3090
0.525031 sec
Avg. throughput: 0.000632 GFLOPS
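
The "Avg. throughput" figures in this log appear consistent with counting two floating-point operations (one multiply, one add) per filter tap per output element and dividing by the elapsed time. The helper names below are hypothetical and the formula is inferred from the printed numbers, not taken from the benchmark's source. A minimal sketch using the first run's values:

```python
def conv_flops(n, c, k, r, s, oh, ow):
    """FLOP count of a direct convolution: 2 ops per multiply-accumulate.

    Inferred from the log's numbers; an assumption, not the benchmark's code.
    """
    return 2 * n * k * c * r * s * oh * ow


def gflops(flops, seconds):
    """Throughput in GFLOPS for a given FLOP count and elapsed time."""
    return flops / seconds / 1e9


# First run: N = 32, C = 32, K = 32, R = S = 16, OH = 49, OW = 17, 0.553664 sec.
flops = conv_flops(n=32, c=32, k=32, r=16, s=16, oh=49, ow=17)
print(flops)  # 13975420928
print(gflops(flops, 0.553664))  # close to the logged 25.241702 GFLOPS
```

Because elapsed time stays near 0.5 sec for every problem size in the log, the reported throughput tracks the FLOP count almost linearly, which suggests a fixed overhead dominates these single-iteration runs.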