Options:
  Input size: N = 32, C = 64, H = 256, W = 256
  Output size: N = 32, K = 64, OH = 241, OW = 241
  Filter size: K = 64, C = 64, R = 16, S = 16
  Number of iterations: 10
  Validation: off

Initializing... done!
Initializing Convolution...
Calculating...(iter=0) 0.833556 sec
Calculating...(iter=1) 0.602634 sec
Calculating...(iter=2) 0.606260 sec
Calculating...(iter=3) 0.607697 sec
Calculating...(iter=4) 0.605594 sec
Calculating...(iter=5) 0.607735 sec
Calculating...(iter=6) 0.605948 sec
Calculating...(iter=7) 0.607907 sec
Calculating...(iter=8) 0.606787 sec
Calculating...(iter=9) 0.607396 sec
Avg. throughput: 6252.183193 GFLOPS
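
The log does not say how the average throughput is derived from the per-iteration times, so the following is only a sketch of one plausible reconstruction. It assumes a stride-1, unpadded convolution (which reproduces OH = OW = 241), counts two floating-point operations per multiply-accumulate, and averages the per-iteration throughputs rather than dividing total work by total time. All function and variable names are hypothetical; none come from the benchmark itself.

```python
# Problem dimensions copied from the "Options" block of the log.
N, C, H, W = 32, 64, 256, 256
K, R, S = 64, 16, 16

# Assumption: stride 1 and no padding, so each output side is H - R + 1.
OH, OW = H - R + 1, W - S + 1

# Assumption: each multiply-accumulate counts as 2 floating-point operations.
flops = 2 * N * K * C * R * S * OH * OW

# Per-iteration wall-clock times (seconds) copied from the log.
times_sec = [
    0.833556, 0.602634, 0.606260, 0.607697, 0.605594,
    0.607735, 0.605948, 0.607907, 0.606787, 0.607396,
]


def mean_throughput_gflops(flops, times):
    """Arithmetic mean of the per-iteration throughputs, in GFLOPS."""
    return sum(flops / t for t in times) / len(times) / 1e9


def total_throughput_gflops(flops, times):
    """Total work divided by total time, in GFLOPS (shown for comparison)."""
    return flops * len(times) / sum(times) / 1e9


avg = mean_throughput_gflops(flops, times_sec)
total = total_throughput_gflops(flops, times_sec)
print(f"mean of per-iteration throughputs: {avg:.1f} GFLOPS")
print(f"total work / total time:           {total:.1f} GFLOPS")
```

Under these assumptions the mean of per-iteration throughputs lands close to the logged figure, while total work over total time comes out somewhat lower because the slower first iteration (iter=0) weighs more heavily on it.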