Options:
Input size: N = 32, C = 64, H = 256, W = 256
Output size: N = 32, K = 64, OH = 241, OW = 241
Filter size: K = 64, C = 64, R = 16, S = 16
Number of iterations: 10
Validation: off

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) 0.832585 sec
Calculating...(iter=1) 0.678545 sec
Calculating...(iter=2) 0.680206 sec
Calculating...(iter=3) 0.709605 sec
Calculating...(iter=4) 0.673506 sec
Calculating...(iter=5) 0.686936 sec
Calculating...(iter=6) 0.676093 sec
Calculating...(iter=7) 0.673491 sec
Calculating...(iter=8) 0.714218 sec
Calculating...(iter=9) 0.685941 sec
Avg. throughput: 5580.243521 GFLOPS
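
The reported sizes and throughput can be cross-checked with a short script. This is a sketch under stated assumptions, not the benchmark's actual code: it assumes a stride-1, unpadded, undilated convolution (so OH = H - R + 1), counts each multiply-add as 2 FLOPs, and assumes the final figure is the mean of the per-iteration rates. The variable names are illustrative only.

```python
# Problem sizes copied from the log.
N, C, H, W = 32, 64, 256, 256
K, R, S = 64, 16, 16

# Assumption: stride 1, no padding, no dilation.
OH, OW = H - R + 1, W - S + 1

# Assumption: one multiply-add per (output element, filter tap) = 2 FLOPs.
flops = 2 * N * K * OH * OW * C * R * S

# Per-iteration wall-clock times (seconds) copied from the log.
times = [0.832585, 0.678545, 0.680206, 0.709605, 0.673506,
         0.686936, 0.676093, 0.673491, 0.714218, 0.685941]

# Two ways to summarize throughput in GFLOPS.
per_iter_gflops = [flops / t / 1e9 for t in times]
mean_of_rates = sum(per_iter_gflops) / len(per_iter_gflops)
rate_of_mean_time = flops / (sum(times) / len(times)) / 1e9

print(f"OH x OW           = {OH} x {OW}")
print(f"FLOPs per iter    = {flops}")
print(f"mean of rates     = {mean_of_rates:.3f} GFLOPS")
print(f"rate of mean time = {rate_of_mean_time:.3f} GFLOPS")
```

Under these assumptions the output size comes out as 241 x 241, and the mean of the per-iteration rates agrees with the logged 5580.24 GFLOPS to within the rounding of the printed times. Dividing the work by the mean time instead gives a lower figure, about 5559 GFLOPS, because the slower first iteration (likely warm-up) carries more weight in that average.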