Words often have multiple meanings. Research in computational linguistics either attempts to capture lexical ambiguity with representations of word sense or sidesteps the issue and represents words as amalgams of their meanings.
Where sense representations are used, there is a default assumption that a word token in context can be classified with one of the senses for that word; a tagging process known as word sense disambiguation. The ease with which man or machine can determine the senses and apply them in the sense-tagging task varies tremendously depending on the word. This variation reflects findings in linguistics that suggest word meaning lies on a continuum from clear-cut cases of ambiguity, through subtler cases of polysemy, to vagueness. There are computational options for building softer, subtler, more nuanced models of word sense, but these bring additional complexity and effort, which raises the question: is the complexity worth it?

In this talk I'll present work applying measures of clusterability to datasets with alternative annotations of word meaning to determine how readily a lemma partitions into senses. The motivation behind this work is to inform the choice of representation depending on the nature of a word's ambiguity.
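As a rough illustration only, and not the method presented in the talk, a clusterability measurement of this kind might embed a lemma's occurrences in context and score how cleanly those vectors separate, for instance with a silhouette score over candidate sense counts. The embedding step and the synthetic data below are stand-in assumptions:

```python
# Illustrative sketch: score how cleanly one lemma's instances cluster.
# In practice the vectors would come from contextual embeddings of the
# lemma's occurrences; random data stands in for them here.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def clusterability(vectors: np.ndarray, k_range=range(2, 6)) -> float:
    """Best silhouette score over candidate sense counts k.
    High scores suggest a lemma that partitions readily into senses;
    scores near zero suggest vaguer, less separable meanings."""
    best = -1.0
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(vectors)
        best = max(best, silhouette_score(vectors, labels))
    return best

rng = np.random.default_rng(0)
# Two well-separated groups of contexts: a clearly ambiguous, "bank"-like lemma.
ambiguous = np.vstack([rng.normal(0, 0.3, (50, 10)), rng.normal(3, 0.3, (50, 10))])
# One diffuse cloud of contexts: a vaguer lemma with no clean sense partition.
vague = rng.normal(0, 1.0, (100, 10))

print(clusterability(ambiguous))  # high: instances partition readily into senses
print(clusterability(vague))      # low: meanings blur together
```

On data like this, a high score would point toward discrete sense representations, while a low score would support the softer, graded models discussed above.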