My main interests are in linguistics and applied linguistics, generally language teaching/learning and corpus linguistics. I’ve long been interested in lexis in particular, so putting all this together I stumbled across data-driven learning (DDL), i.e. the use of corpus linguistic tools and techniques for pedagogical purposes. This happened rather serendipitously through offering a course in corpus linguistics to M1 students in a distance English programme; one day I suddenly thought – hey! this could be useful for learning and not just a linguistics course for its own sake… But of course, others had beaten me to it by a decade or two.

I’m particularly keen to do and promote empirical research (which doesn’t have the same connotations as ‘recherches empiriques’), and most of my early DDL work was along these lines, examining whether learners could make sense of corpus data for learning or reference purposes, how well, in what conditions, etc. I don’t really think of myself as a corpus linguist per se, as I don’t set out to analyse language as an end in itself; but I work parasitically on what real corpus linguists do. What interests me is how teachers, learners and others can use corpus tools and techniques for their own ends. For most ‘ordinary’ users, this is only possible with simple, familiar, free resources; but even using Google to find answers to language questions from web data involves considerable parallels with DDL and should not be automatically dismissed. Other basic tools include CTRL+F within a single document or webpage, Linguee or Reverso, GoogleFights and comparing Wikipedia pages, WebCorp Live, general online corpora such as the BYU suite, BootCaT for making specialised quick’n’dirty corpora, AntConc for personal corpora, LexTutor for various corpus-related tools, and so on. There is thus something for everyone at any level, and I incorporate corpus work into most of my courses to some degree, from passing references for non-specialist undergraduates with compulsory English courses to a corpus linguistics project in a distance English Master’s degree, from teacher training courses to research methodology and scientific writing, etc.

Since about 2012, I’ve become interested in more methodological and epistemological issues, i.e. what different types of research (qualitative and quantitative, significance and effect size, emic and etic, experimental and ecological) can reasonably tell us about different aspects of DDL, and how to make sense of research in the field. With over 750 empirical studies seeking to evaluate some aspect of DDL (latest trawls in August 2023), how can we get a view of the whole?

Unfortunately, other responsibilities meant I had less time for research and publishing in the five years or so leading up to January 2023…