Implement Depthwise Separable Convolution from “Xception: Deep Learning with Depthwise Separable Convolutions” (Chollet, 2017).
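A depthwise separable convolution factorizes a standard convolution into:
1. A depthwise convolution: each input channel is filtered independently with its own kernel.
2. A pointwise (1x1) convolution: the depthwise outputs are mixed across channels.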
Given (1D case for simplicity):
x: shape (C_in, L), C_in channels, length L
dw_filters: shape (C_in, K), one filter per channel, kernel size K
pw_weights: shape (C_out, C_in), pointwise mixing weights

Steps:
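1. Depthwise: convolve each channel x[c] with its own filter dw_filters[c] (no padding, stride 1), giving shape (C_in, L-K+1).
2. Pointwise: multiply by pw_weights to mix channels at every position, giving shape (C_out, L-K+1).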
Output: Tensor of shape (C_out, L-K+1).
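
One possible way to realize the two steps is sketched below in NumPy. The function name depthwise_separable_conv1d is illustrative and not part of the task statement. The sketch assumes "convolution" means cross-correlation (no kernel flip), as is usual in deep learning frameworks, with no padding, stride 1, and no bias; if a flipped kernel is intended, reverse dw_filters along its last axis first.

```python
import numpy as np


def depthwise_separable_conv1d(x, dw_filters, pw_weights):
    """Depthwise separable 1D convolution (valid padding, stride 1, no bias).

    x:          (C_in, L)
    dw_filters: (C_in, K)
    pw_weights: (C_out, C_in)
    returns:    (C_out, L-K+1)
    """
    x = np.asarray(x, dtype=float)
    dw_filters = np.asarray(dw_filters, dtype=float)
    pw_weights = np.asarray(pw_weights, dtype=float)

    c_in, length = x.shape
    k = dw_filters.shape[1]
    if dw_filters.shape[0] != c_in or pw_weights.shape[1] != c_in:
        raise ValueError("channel dimensions of x, dw_filters, pw_weights disagree")
    if k > length:
        raise ValueError("kernel size K must not exceed input length L")

    # All length-K windows of every channel: shape (C_in, L-K+1, K).
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)

    # Depthwise step: each channel uses only its own filter.
    # Assumption: cross-correlation, i.e. the kernel is not flipped.
    depthwise = np.einsum("clk,ck->cl", windows, dw_filters)  # (C_in, L-K+1)

    # Pointwise step: a 1x1 convolution is a matrix product over channels.
    return pw_weights @ depthwise  # (C_out, L-K+1)


# Small usage example: C_in=2, L=5, K=3, C_out=3.
x = np.arange(10.0).reshape(2, 5)
dw_filters = np.ones((2, 3))
pw_weights = np.array([[1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
y = depthwise_separable_conv1d(x, dw_filters, pw_weights)
print(y.shape)  # (3, 3)
```

The sliding-window view avoids an explicit Python loop over positions; it requires NumPy 1.20 or later. A plain double loop over channels and positions would give the same result and may be easier to follow.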