
Keywords: NLP, Computational Linguistics, LLMs, Language Learning (Chinese), Language Evolution
Interest in language and languages can easily become obsessive; it certainly has for me. Here are some of the questions I have been considering for a long time:
Symbolic parsing is possible! This is a large part of what I have been working on these past few years, and I will be releasing a sequence of drafts (comments welcome).
Emmanuel Roche. 2023. (I) Finite-State Representation of Geometric Transductions. DRAFT
Deep learning has long been opposed to the kind of symbolic processing developed during the first decades of natural language processing. Geoffrey Hinton declared in his Turing Award acceptance speech (2018) that deep learning had won and symbolic processing had lost. I think that language is more likely a combination of both approaches. It is possible to combine them with transformers and the kind of finite-state transducers alluded to in the previous section.
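To make the symbolic side concrete, here is a minimal sketch of a finite-state transducer in Python. Everything in it is illustrative (the states, alphabet, and rule are invented for this example, not taken from the drafts above); it only shows the general shape of the machinery: states, transitions labeled with input/output symbol pairs, and deterministic traversal.

```python
# Minimal finite-state transducer (FST) sketch. All names and the toy
# rewrite rule are hypothetical, for illustration only.

class FST:
    def __init__(self, start, finals, transitions):
        self.start = start              # start state
        self.finals = finals            # set of accepting states
        # transitions: (state, input symbol) -> (next state, output symbol)
        self.transitions = transitions

    def transduce(self, symbols):
        """Map an input sequence to an output sequence, or None if rejected."""
        state, out = self.start, []
        for s in symbols:
            key = (state, s)
            if key not in self.transitions:
                return None             # no transition: input rejected
            state, o = self.transitions[key]
            out.append(o)
        return out if state in self.finals else None

# Toy contextual rewrite rule: "a" becomes "b", but only after a "c" has
# been seen (state 1 remembers that context).
fst = FST(
    start=0,
    finals={0, 1},
    transitions={
        (0, "a"): (0, "a"),
        (0, "c"): (1, "c"),
        (1, "a"): (1, "b"),
        (1, "c"): (1, "c"),
    },
)

print(fst.transduce(["a", "c", "a"]))  # ['a', 'c', 'b']
```

A neural model could propose analyses while a transducer like this enforces hard, inspectable constraints; that division of labor is one simple way the two approaches can be combined.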
Here is a preview of this current work:
My first few years of learning Chinese were both exhilarating and incredibly frustrating. Exhilarating because of the sheer amount of knowledge about the world; frustrating because I seemed to be learning far too slowly. It was like putting gas in an engine and getting back 1% efficiency. Was it possible to push that much higher? 50%, maybe? There was only one way to find out: build a solution that tested all the hypotheses and release it as an app. That's what we did with a few friends. You can check out our FullChinese solution (still evolving; many ideas to try).
You can also check out a recent write-up that Chen Tong (who teaches Chinese at MIT) and I put together.
Full CV here.