There is now a well-established literature showing that people anticipate upcoming concepts and words during language processing. Commonsense knowledge about typical event sequences and verbal selectional preferences can contribute to anticipating what will be mentioned next. We here investigate how temporal discourse connectives (before, after), which signal event ordering along a temporal dimension, modulate predictions for upcoming discourse referents. Our study analyses anticipatory gaze in the visual world and supports the idea that script knowledge, temporal connectives (before eating → menu, appetizer), and the verb’s selectional preferences (order → appetizer) jointly contribute to shaping rapid prediction of event participants.