There's an interesting article in The Economist about how Bayesian statistics are increasingly being used in the cognitive sciences [via MindHacks]. Two researchers set out to test how well humans apply Bayesian reasoning, by giving them nuggets of information and asking them to draw general conclusions.
For example, many of the participants were told the amount of money that a film had supposedly earned since its release, and asked to estimate what its total “gross” would be, even though they were not told for how long it had been on release so far.
All of these things have well-established probability distributions, and all of them, together with three other items on the list—an individual's lifespan given his current age, the run-time of a film, and the amount of time spent on hold in a telephone queuing system—were predicted accurately by the participants from lone pieces of data.
Moreover, the participants in the study held their own on data drawn from many different kinds of naturally occurring prior distributions. (Priors are "assumption[s] about the way the world works...that can be expressed as a mathematical probability distribution of the frequency with which events of a particular magnitude can happen.") Psychologists therefore suggest...
...that the Bayesian capacity to draw strong inferences from sparse data could be crucial to the way the mind perceives the world, plans actions, comprehends and learns language, reasons from correlation to causation, and even understands the goals and beliefs of other minds.
Hmm. Drawing strong inferences from sparse data? Sounds a lot like a solution to the problem of the poverty of the stimulus. But if language learning is Bayesian, then what is the prior? Does the prior = UG (Universal Grammar)? What structure would it take? How are language universals encoded into it? Are language universals encoded into it at all? Certainly an interesting problem to think about.
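To make the prediction task from the study a bit more concrete, here is a minimal sketch of how a Bayesian predictor could turn one observation into an estimate of a total. It assumes the observed value is equally likely to fall anywhere between zero and the true total, and it reports the posterior median as the prediction. The function names, the grid, and the prior parameters (a Gaussian with mean 75 and standard deviation 16 for lifespans, a power law with exponent 1.5 for film grosses) are illustrative assumptions of mine, not values taken from the article or the paper.

```python
import math


def posterior_median(t_observed, prior, grid):
    """Median of p(t_total | t_observed) over a discrete grid of candidate totals.

    Assumption: t_observed is uniformly distributed on [0, t_total], so the
    likelihood is 1 / t_total when t_total >= t_observed and 0 otherwise.
    """
    weights = [
        prior(t_total) / t_total if t_total >= t_observed else 0.0
        for t_total in grid
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("observation lies beyond the grid")
    cumulative = 0.0
    for t_total, weight in zip(grid, weights):
        cumulative += weight
        if cumulative >= total / 2:
            return t_total
    return grid[-1]


def gaussian_prior(mean, sd):
    # Unnormalised Gaussian density; normalisation cancels in the posterior.
    return lambda x: math.exp(-0.5 * ((x - mean) / sd) ** 2)


def power_law_prior(exponent):
    # Unnormalised power law: small totals are common, huge ones are rare.
    return lambda x: x ** -exponent


# Candidate totals from 0.1 to 1000.0 in steps of 0.1 (hypothetical units).
grid = [x / 10 for x in range(1, 10001)]

# Same observation (40), two different priors, two very different predictions.
lifespan = posterior_median(40, gaussian_prior(75, 16), grid)
gross = posterior_median(40, power_law_prior(1.5), grid)
print(f"predicted lifespan at age 40: {lifespan:.1f}")
print(f"predicted total gross after 40 so far: {gross:.1f}")
```

With the Gaussian prior the prediction stays close to the prior mean almost regardless of the observation, while with the power-law prior it scales with the observation (about 1.6 times the observed value for this exponent). That difference is why the shape of the prior matters so much for the questions above.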
Links:
Original paper [pdf]
Tom Griffiths' webpage, which includes a Bayesian reading list
Josh Tenenbaum's webpage, which also has a paper on word-learning as Bayesian inference