Towards Plausibility in Time Series Counterfactual Explanations

Marcin Kostrzewa¹ · Krzysztof Galus¹ · Maciej Zięba^1,2

¹ Wrocław University of Science and Technology · ² Tooploox

ACIIDS 2026

💻 Code

arXiv

We present a method for generating plausible counterfactual explanations (CFEs) for time series classification via gradient-based optimization in input space. Plausibility is enforced by aligning the generated CFE with \(k\)-nearest neighbors from the target class using soft-DTW — a differentiable relaxation of dynamic time warping. The optimization objective balances validity, proximity, sparsity, and the novel soft-DTW plausibility term. Across eight datasets, our method achieves perfect or near-perfect validity while outperforming baselines in distributional alignment with the target class by up to an order of magnitude in DTW distance.

Method

We seek a counterfactual \(X'\) for a time series \(X\) with predicted class \(\hat{y} = f(X)\), targeting \(y_{\text{target}} \neq \hat{y}\). The optimization objective is:

\[ \mathcal{L}_{\text{CF}} = \mathcal{L}_{\text{prox}} + \mathcal{L}_{\text{sparse}} + \lambda \cdot (\mathcal{L}_{\text{valid}} + \mathcal{L}_{\text{DTW}}), \]

\[ \mathcal{L}_{\text{prox}} = \tfrac{1}{dT}\|X' - X\|_2^2, \qquad \mathcal{L}_{\text{sparse}} = \tfrac{1}{dT}\|X' - X\|_1, \]

\[ \mathcal{L}_{\text{valid}} = \max\!\left(0,\, \tau - p_f(y_{\text{target}} \mid X')\right), \qquad \mathcal{L}_{\text{DTW}} = \frac{1}{k}\sum_{Y \in \mathcal{N}_k(X,\, y_{\text{target}})} \text{DTW}^\gamma(X', Y). \]

Soft-DTW (Cuturi and Blondel 2017) replaces the hard minimum in standard DTW with a smooth approximation parameterized by \(\gamma > 0\), making the alignment cost differentiable with respect to \(X'\). The plausibility term \(\mathcal{L}_{\text{DTW}}\) pulls \(X'\) toward the \(k\) nearest target-class training examples, encouraging realistic temporal structure rather than adversarial perturbations. We optimize \(X'\) by gradient descent with classifier weights frozen. Defaults: \(\lambda = 1\), \(k = 10\), \(\gamma = 1\).

Qualitative Results

The figures below compare counterfactuals produced by our method, Glacier (Wang et al. 2024), and M-CELS (Li et al. 2024).

On TwoLeadECG, both our method and M-CELS capture the prominent target-class peak; Glacier produces subtle changes that miss it entirely.

The contrast is sharper on CBF — three geometrically distinct classes (Cylinder, Bell, Funnel). Our method produces a CFE that clearly adopts the target shape. Glacier and M-CELS generate perturbations that resemble adversarial noise rather than meaningful class transformations.

Quantitative Results

Evaluated on eight UCR/UEA datasets (Dau et al. 2019) against Glacier (Wang et al. 2024) (univariate only) and M-CELS (Li et al. 2024). Metrics: Validity (\(\text{Val}\uparrow\)), \(L_1\)/\(L_2\) distance (\(\downarrow\)), average DTW to 10 nearest target-class neighbors (\(\downarrow\)), Isolation Forest Score (\(\uparrow\)).

Dataset	Method	\(\text{Val}\uparrow\)	\(L_1\downarrow\)	\(L_2\downarrow\)	\(\text{DTW}\downarrow\)	Iso Forest\(\uparrow\)
CBF	Ours	1.000	9.871	1.071	0.194	1.000
	Glacier	0.360	4.062	0.540	1.415	0.987
	M-CELS	0.226	1.500	0.486	2.402	0.984
TwoLeadECG	Ours	1.000	1.446	0.214	0.016	1.000
	Glacier	0.233	0.484	0.115	0.064	1.000
	M-CELS	0.970	0.245	0.119	0.302	0.879
GunPoint	Ours	0.975	4.478	0.491	0.155	1.000
	Glacier	0.000	0.639	0.170	0.436	1.000
	M-CELS	0.425	0.129	0.074	2.317	0.925
Earthquakes	Ours	1.000	48.985	2.441	0.775	0.924
	Glacier	0.000	8.528	0.661	1.907	1.000
	M-CELS	0.174	6.765	1.167	0.288	1.000
Coffee	Ours	1.000	5.979	0.489	0.064	1.000
	Glacier	0.455	9.182	0.795	1.024	1.000
	M-CELS	1.000	0.527	0.183	0.423	0.636
ItalyPowerDemand	Ours	1.000	0.869	0.222	0.015	1.000
	Glacier	0.023	0.307	0.107	0.054	1.000
	M-CELS	0.466	0.178	0.091	0.369	0.831
Cricket	Ours	1.000	475.900	12.210	0.810	0.972
	Glacier	N/A	N/A	N/A	N/A	N/A
	M-CELS	0.194	54.403	2.636	65.924	0.888
Epilepsy	Ours	1.000	68.130	3.138	3.445	1.000
	Glacier	N/A	N/A	N/A	N/A	N/A
	M-CELS	0.272	14.807	1.623	19.213	1.000

Our method achieves perfect or near-perfect validity on all datasets and the best DTW plausibility score everywhere — often by an order of magnitude (e.g. 0.016 vs. 0.302 on TwoLeadECG; 0.810 vs. 65.924 on Cricket). The higher \(L_1\)/\(L_2\) values reflect an inherent trade-off: enforcing temporal realism requires larger, more structured modifications than simply minimizing perturbation magnitude.

Conclusions

Explicit soft-DTW alignment with target-class neighbors is a simple and effective mechanism for producing plausible time series counterfactuals. It delivers near-perfect validity and substantially better distributional alignment than existing methods, at the cost of larger perturbations — confirming that meaningful temporal realism requires more structured changes than proximity-minimizing methods admit.

Supported by the National Science Centre (Poland), Grant No. 2024/55/B/ST6/02100.

References

Cuturi, Marco, and Mathieu Blondel. 2017. “Soft-DTW: A Differentiable Loss Function for Time-Series.” Proceedings of the International Conference on Machine Learning (ICML) 70: 894–903.

Dau, Hoang Anh, Anthony Bagnall, Kaveh Kamgar, et al. 2019. “The UCR Time Series Archive.” IEEE/CAA Journal of Automatica Sinica 6 (6): 1293–305.

Li, Zhengping, Ling Huang, Chang-Dong Wang, and Hongyang Chao. 2024. “M-CELS: Counterfactual Explanation for Multivariate Time Series Classification.” Proceedings of the ACM Web Conference (WWW).

Wang, Zhendong, Ioanna Miliou, Isak Samsten, and Panagiotis Papapetrou. 2024. “Learning Time Series Counterfactual Explanations: Glacier.” Proceedings of the SIAM International Conference on Data Mining (SDM).