Robust and accurate scale estimation of a target object is a challenging task in visual object tracking. Most existing tracking methods cannot accommodate large scale variations in complex image sequences, which degrades their performance. In this paper, we propose to incorporate a novel criterion, the average peak-to-correlation energy (APCE), into the multi-resolution translation filter framework to obtain robust and accurate scale estimation. The resulting system is named SITUP: Scale Invariant Tracking using Average Peak-to-Correlation Energy. SITUP effectively addresses the fixed template size of standard discriminative correlation filter (DCF) based trackers. Extensive empirical evaluation on publicly available tracking benchmark datasets demonstrates that the proposed scale searching framework handles scale variation effectively and outperforms other scale-adaptive variants of standard DCF-based trackers. SITUP also achieves favorable performance compared to state-of-the-art trackers in various scenarios while operating in real time on a single CPU.
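The abstract does not define the criterion, so the following is only an illustrative sketch. It assumes the definition of average peak-to-correlation energy commonly used in the correlation filter tracking literature: the squared difference between the maximum and minimum of a response map, divided by the mean squared deviation of the map from its minimum. It further assumes that scale is chosen by evaluating the translation filter response at several candidate scales and keeping the one with the highest score. The function names `apce` and `select_scale`, and the dictionary layout of candidate responses, are hypothetical and not taken from the paper.

```python
from typing import Dict, Sequence

ResponseMap = Sequence[Sequence[float]]


def apce(response: ResponseMap) -> float:
    """Average peak-to-correlation energy of a 2-D response map.

    Assumed definition: (F_max - F_min)^2 / mean((F - F_min)^2).
    A sharp single peak over a flat background scores high; a noisy
    or multi-peaked map scores low.
    """
    values = [v for row in response for v in row]
    f_max = max(values)
    f_min = min(values)
    # Mean squared deviation of every response value from the minimum.
    mean_energy = sum((v - f_min) ** 2 for v in values) / len(values)
    if mean_energy == 0.0:
        # Constant map: no peak at all, so report the lowest possible score.
        return 0.0
    return (f_max - f_min) ** 2 / mean_energy


def select_scale(responses: Dict[float, ResponseMap]) -> float:
    """Return the candidate scale factor whose response map has the highest APCE.

    `responses` maps a hypothetical scale factor (e.g. 0.95, 1.0, 1.05)
    to the filter response computed on the patch resampled at that scale.
    """
    return max(responses, key=lambda scale: apce(responses[scale]))


if __name__ == "__main__":
    sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    noisy = [[0.2, 0.8, 0.1], [0.9, 1.0, 0.7], [0.3, 0.6, 0.4]]
    candidates = {0.95: noisy, 1.0: sharp, 1.05: noisy}
    print("APCE sharp:", apce(sharp))
    print("APCE noisy:", apce(noisy))
    print("selected scale:", select_scale(candidates))
```

Under this assumed definition the score is invariant to a constant offset in the response map, so it rewards peak sharpness rather than raw peak height. That is one reason such a measure could be preferable to the maximum response alone when comparing responses across scales.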