Common spatial patterns (CSP) is a well-known spatial filtering algorithm for multichannel electroencephalogram (EEG) analysis. In this paper, we cast the CSP algorithm in a probabilistic modeling setting. Specifically, probabilistic CSP (P-CSP) is proposed as a generic EEG spatio-temporal modeling framework that subsumes the CSP and regularized CSP algorithms. The proposed framework enables us to resolve the overfitting issue of CSP in a principled manner. We derive statistical inference algorithms that can alleviate the issue of local optima. In particular, an efficient algorithm based on eigendecomposition is developed for maximum a posteriori (MAP) estimation in the case of isotropic noise. For more general cases, a variational algorithm is developed for group-wise sparse Bayesian learning for the P-CSP model and for automatically determining the model size. The two proposed algorithms are validated on a simulated data set. Their practical efficacy is also demonstrated by successful applications to single-trial classifications of three motor imagery EEG data sets and by the spatio-temporal pattern analysis of one EEG data set recorded in a Stroop color naming task.