Fast and far generalization from sparse data
Indiana University
Much learning is slow, incremental, and limited in generalizability. However, there are clear cases of fast learning with far generalization from quite limited input data. In this talk, I will present a case from a real-world learning domain that provides insights into both incremental learning and the origins of fast learning and far generalization from minimal data. Focusing on the case of multi-digit numbers, I will show that the multiple predictive relations in the names and written forms of number symbols enable young learners, from experience with just a few examples, to generalize robustly to new instances. I will present studies of early symbol knowledge that is independent of the physical quantities to which the symbols refer, and studies that demonstrate the rapid learning of this knowledge by both preschoolers and a deep neural network. I will argue that this early learning has the usual characteristics of incremental learning and is not rule-based. Instead, rapid learning and far generalization emerge because the surface properties of multi-digit number names and their written forms present many redundant, overlapping, and co-predicting features that provide imperfect but multiple pathways to the same generalizable principles. I will conjecture that this form of data structure characterizes many of the knowledge domains that support “few-shot” learning and far generalization. I will also propose and present initial evidence that this early implicit learning about multi-digit numbers sets the stage for later learning of explicit and generative rules.