Cognitive models of web navigation, such as CoLiDeS, use only the textual information in hyperlinks to compute information scent and have so far ignored the impact of visual and graphical widgets. We conducted an experiment to study the extent to which textual and, especially, graphical information plays a role in identifying web page widgets. Four versions of a web page were created by systematically varying text and graphics. In general, task completion times were significantly shorter and the number of clicks significantly lower in the presence of graphics than in their absence. This was particularly the case when no textual information was available. We conclude that, for identifying graphical widgets, text and graphics interact and complement each other, and that a cognitive model of web navigation should therefore include information from graphics. To this end, we propose a method for integrating information extracted from pictures into CoLiDeS and demonstrate its usefulness with a simulation on a mock web site.