Benchmark tests of human category learning (e.g., Shepard, Hovland, & Jenkins, 1961) play a major role in evaluating theories and models. New benchmarks are needed to reveal psychological patterns and to better test the explanatory power of competing accounts. We sought to investigate a clean and systematic benchmark that focuses on the type and degree of regularity underlying category structures, rather than the number of predictive dimensions. Each learner was assigned to one of the nine formally distinct, two-way (unbalanced) classification schemes over the nine examples formed by crossing two nominal ternary dimensions. A clear ordering of learning difficulty was observed among the four 3-vs-6 category structures (see also Lee & Navarro, 2002), while more subtle distinctions characterized the five 4-vs-5 structures. The DIVA model (Kurtz, 2007) produced a good quantitative fit to these data. Theoretical implications of these results, including comparisons to well-known reference point models, are addressed.
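
The counts of category structures can be checked by brute-force enumeration. The following is a minimal sketch, assuming that "formally distinct" means distinct up to relabeling the three values within each dimension and interchanging the two dimensions. That equivalence is an assumption made here for illustration, and all function names are hypothetical.

```python
from itertools import combinations, permutations

# The nine examples: every pairing of values from two ternary dimensions.
CELLS = [(r, c) for r in range(3) for c in range(3)]


def symmetries():
    """Yield each relabeling of the 3x3 stimulus grid.

    ASSUMPTION: two structures count as the same if one maps onto the
    other by permuting the values of either dimension and/or swapping
    the two dimensions (72 relabelings in total).
    """
    for row_perm in permutations(range(3)):
        for col_perm in permutations(range(3)):
            for swap in (False, True):
                def transform(cell, rp=row_perm, cp=col_perm, sw=swap):
                    r, c = rp[cell[0]], cp[cell[1]]
                    return (c, r) if sw else (r, c)
                yield transform


SYMS = list(symmetries())


def canonical(category):
    """Return the smallest sorted image of a category under all relabelings."""
    return min(tuple(sorted(t(cell) for cell in category)) for t in SYMS)


def distinct_structures(size):
    """List one representative per equivalence class for a minority category of the given size."""
    return sorted({canonical(subset) for subset in combinations(CELLS, size)})


def count_structures():
    """Count distinct structures for the 3-vs-6 and 4-vs-5 splits."""
    return {size: len(distinct_structures(size)) for size in (3, 4)}


if __name__ == "__main__":
    for size, n in count_structures().items():
        print(f"{size}-vs-{9 - size}: {n} distinct structures")
```

Under this assumed equivalence, the enumeration yields four 3-vs-6 structures and five 4-vs-5 structures, nine in total. Because the two categories differ in size, each partition is fully identified by its smaller category, so only subsets of size 3 and 4 need to be enumerated.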