Accessibility theory (Ariel, 1988; Gundel, Hedberg, & Zacharski, 1993) proposes that the grammatical form of a referring expression depends on the accessibility of its referent, with greater accessibility permitting more reduced expressions. From whose perspective is accessibility measured? Recent experiments (Bard, Hill, & Foster, 2008; de Ruiter & Lamers, submitted) using a joint construction task suggest that the speaker's view often determines referential form. Two objections to these results would neutralize accessibility predictions in many real-world situations. First, objects in shared visual space may be so salient that all will be highly accessible, and reference to them, in whatever form, cannot fail (Smith, Noda, Andrews, & Jucker, 2005). Second, since joint action demands joint attention, the listener's and speaker's views of what is accessible should seldom differ. We use cross-recurrence analysis of interlocutors' gaze to show that neither objection applies. Gaze is not always well aligned. Dyads whose referring expressions ignored listeners' needs did not coordinate attention well. Dyads referring cooperatively coordinated attention better and in a way linked to the elaboration of their referring expressions.
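
For readers unfamiliar with cross-recurrence analysis, the general idea can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes each interlocutor's gaze has been coded as a sequence of categorical target labels sampled at equal intervals, and the function name `cross_recurrence_profile` and its parameters are hypothetical. For each lag, it computes the proportion of samples at which the listener's gaze target, shifted by that lag, matches the speaker's.

```python
from typing import Dict, Sequence


def cross_recurrence_profile(
    speaker: Sequence[str],
    listener: Sequence[str],
    max_lag: int,
) -> Dict[int, float]:
    """Proportion of matching gaze targets at each lag (hypothetical helper).

    A positive lag compares the speaker's gaze at sample t with the
    listener's gaze at sample t + lag, i.e. the listener following the
    speaker. A negative lag means the listener leads.
    """
    if len(speaker) != len(listener):
        raise ValueError("gaze series must have the same length")

    n = len(speaker)
    profile: Dict[int, float] = {}
    for lag in range(-max_lag, max_lag + 1):
        matches = 0
        total = 0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:  # only count pairs that fall inside both series
                total += 1
                if speaker[t] == listener[u]:
                    matches += 1
        profile[lag] = matches / total if total else 0.0
    return profile


# Toy data (invented for illustration): the listener tracks the same
# objects as the speaker, one sample later.
speaker_gaze = ["A", "A", "B", "B", "C", "C"]
listener_gaze = ["X", "A", "A", "B", "B", "C"]
profile = cross_recurrence_profile(speaker_gaze, listener_gaze, max_lag=1)
```

In this toy case the recurrence rate peaks at a lag of one sample, which is the kind of pattern that would indicate a listener's attention following the speaker's. A flat, low profile across lags would instead indicate poorly coordinated gaze.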