Videofluoroscopy (VF) is one of the most commonly used tools for assessing oropharyngeal dysphagia and for visualizing the musculoskeletal structures of humans and animals engaged in various behaviors, including feeding. Despite its importance in clinical and scientific settings, processing VF data has historically been extremely tedious because it relies on manual frame-by-frame methods. With recent technological advances and the adoption of high-speed data capture systems, the frame rates used in scientific studies have been increasing, so the number of frames to be labeled grows accordingly. In the current study, we used non-human primates as an animal model of human feeding behavior and captured tongue movement using markers implanted in the tongue. Here, we introduce a semi-automatic marker tracking algorithm that achieves high tracking accuracy (>90%) and a dramatic speed improvement (labeling faster than real time). Furthermore, we quantify the sources of tracking errors and characterize tracking performance as a function of marker speed. Our results indicate that there is still room for methodological improvement in both the detection and the prediction of marker positions. Moreover, correspondingly faster frame rates will be required to capture faster kinematic behaviors such as those of mice, which are extensively used to study both control and pathological conditions.