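Use static inline functions in the header to do CPU feature detection. The .c files are already compiled and linked with SIMD support enabled, so the compiler may already have emitted instructions from that feature set anywhere in them. A detection routine living in one of those files could therefore execute an unsupported instruction before it reports anything, whereas a static inline function in the header is compiled into the caller with the caller's own flags.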
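
A minimal sketch of what such a header-only check could look like, assuming GCC or Clang on x86 and using the compiler builtins `__builtin_cpu_init` and `__builtin_cpu_supports`. The file name `cpu_detect.h`, the function name `cpu_detect_has_avx2`, and the choice of AVX2 as the probed feature are illustrative assumptions, not names taken from this change.

```c
/* cpu_detect.h (hypothetical name): meant to be included by callers that
 * are built WITHOUT SIMD flags, so the check itself stays baseline code. */
#ifndef CPU_DETECT_H
#define CPU_DETECT_H

/* Returns 1 if the running CPU reports AVX2 support, 0 otherwise.
 * On compilers or targets without the builtins it conservatively returns 0. */
static inline int cpu_detect_has_avx2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /* Initialise the compiler's CPU model data; safe to call repeatedly. */
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}

#endif /* CPU_DETECT_H */
```

A caller would test `cpu_detect_has_avx2()` first and only then call into the SIMD-compiled .c file, falling back to a scalar path otherwise.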