“The success of large language models is the biggest surprise in my intellectual life. We learned that a lot of what we used to believe may be false and what I used to believe may be false. I used to really accept, to a large degree, the Chomskyan argument that the structures of language are too complex and not manifest in input so that you need to have innate machinery to learn them. You need to have a language module or language instinct, and it’s impossible to learn them simply by observing statistics in the environment.
If it’s true — and I think it is true — that the LLMs learn language through statistical analysis, this shows the Chomskyan view is wrong. This shows that, at least in theory, it’s possible to learn languages just by observing a billion tokens of language.”
Paper Citation: Philip Capin, Sharon Vaughn, Joseph E. Miller, Jeremy Miciak, Anna-Mari Fall, Greg Roberts, Eunsoo Cho, Amy E. Barth, Paul K. Steinle & Jack M. Fletcher (2023) Investigating the Reading Profiles of Middle School Emergent Bilinguals with Significant Reading Comprehension Difficulties, Scientific Studies of Reading, DOI: 10.1080/10888438.2023.2254871
A few months ago, a study crossed my radar that caused me to stop, print it out, mark it up, and then begin digging into related studies, which is what I do when a study grabs my attention.
Getting into research is akin to getting into Miles Davis: if you like a given song or album, you may start checking out the other musicians he plays with, and they'll lead you into a new and ever-expanding fractal universe, because Davis had a knack for collaborating with musicians who were geniuses in their own right. A few examples: John Coltrane, Tony Williams, Keith Jarrett, Herbie Hancock, John McLaughlin, Wayne Shorter, Jack DeJohnette. The list goes on and on.
This has been a great year for education research. I thought it could be fun to review some of what has come across my own limited radar over the course of 2023.
The method I used to create this wrap-up was to go back through my Twitter timeline starting in January, and pull all research-related tweets into a doc. I then began sorting those by theme and ended up with several high-level buckets, with further sub-themes within and across those buckets. Note that I didn’t go through my Mastodon or Bluesky feeds, as this was time-consuming enough!
The rough big-ticket research items I ended up with were:
Multilinguals and multilingualism
Reading
Morphology
The influence of physical or cultural environment
The content of teaching and learning
The precedence of academic skills over soft skills
In my last post, we landed on the idea of a nascent scaffold that we are born with in our brains, which is developed through our daily interactions with one another – and then further accelerated through the reinforcement and extension of written language use.
Before we venture into the wilds of the possible relations between language and thought, I wanted to build on this idea of how our inner scaffolds are most fully realized through speaking, listening, reading, and writing by geeking out about the beauty and wonder of multilingualism.
There is a fertile topsoil we are born with in our brains, imprinted by the interplay of sights and sounds and movement of those who interact with us. This immersive communicative theater, felt first in the womb, roots itself within the immediacy of each moment, even while gesturing at distant realms yet unknown. Climbing towards this mystery with our tongues and thoughts and technology bends the world toward our needs, and allows us to project our inner selves into the past and future. We ride rivers and build highways across our brains. This is our cultural inheritance, our storied legacy of language and literacy.
Language is a uniquely human phenomenon that develops in children with remarkable ease and fluency. Yet questions remain about how we acquire language. Is it innately wired in our brain, or do we learn all facets rapidly from birth?
Two books – Rethinking Innateness and The Language Game – provide us with some fascinating perspectives on language learning that bear implications for how we think about learning to read and write, and furthermore, for how we talk about the power and limitations of AI.
In my last post (yeah, it’s been a long time. I don’t get paid for these, you know), I made the case for the importance of phonics instruction, while acknowledging it should be just about 30 minutes a day in the early grades. But I also pointed out that the quality of that 30 minutes can be highly variable.
Even when you have a program that sequences phonics instruction systematically and explicitly, it needs to be acknowledged that this is only a small part of what is on most teachers’ plates each day. Kindergarten – 2nd grade teachers usually teach most core subjects, and may be drawing upon a panoply of programs they are supposed to be experts in, while managing a bunch of young Homo sapiens who have not yet fully developed a prefrontal cortex and the ability to regulate their emotions and behavior. It’s exhausting, to say the least.
Why do I keep harping on the importance of explicit, systematic phonics instruction? I know it bugs some people.
Teaching decoding and encoding of written words in English shouldn’t be much more than 30 minutes a day for most kids at a K-2 level. So what’s the big deal, right?
There is a concept termed diglossia worth exploring in relation to dialects of African American English used in the United States.
What is diglossia?
Diglossia can be defined as “the coexistence of two varieties of the same language throughout a speech community. Often, one form is the literary or prestige dialect, and the other is a common dialect spoken by most of the population.”
I am a nerd, and I skim through a fair number of research papers, both to keep current for my professional role, and because I just like learning about literacy and language.
While I use Zotero to organize some of what I come across, I tend to read through papers on my phone on buses and trains to and from work, or print something out to read later, so unfortunately I am not systematic or well-organized about what I pick up from what I read. I do post quotes from articles on social media as I read them, so I can search through my own past feed to find links to research I read. So while I might build my own schema about things as I read more and more stuff, I don’t retain the specific sources.
One of the things I have had in my head regarding literacy interventions is that multicomponent approaches in English tend to be more effective than single-component approaches for students who are learning English at school (ELLs), and for many other populations as well.