Language Highlights

In this post, I'd like to describe a very recent idea I've had in the context of my computer program, which I've recently described here on Journaly. The post is not directly related to the program and it was actually written without it. In fact, that's part of the fun. Welcome!

For those Journaly readers who don't yet know about the computer program I've mentioned, here is a very short summary to provide the context for the rest of this post: I've written a small computer program that helps me improve and maintain my vocabulary knowledge in several foreign languages. The program takes the form of a console application and can be used either for reading texts or for questioning and reviewing vocabulary. The reading part is quite similar to the main idea behind existing products such as LingQ or LWT (Learning With Texts). The questioning and reviewing functionality, which uses the sentences of a given text document, is, as far as I'm aware, a more novel idea. In the whole software there is no such thing as flashcards or a spaced-repetition algorithm. Instead, the questioning and reviewing is driven by the frequency of words in a text (words with a higher count are presented to the user first), the last time the user reviewed the word, or both.
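To make the ordering idea a bit more concrete, here is a minimal Python sketch. It is not the actual code of my program: the function name `review_order`, the `last_reviewed` dictionary and the exact way frequency and review time are combined are assumptions made only for illustration.

```python
import re
from collections import Counter


def review_order(text, last_reviewed=None):
    """Return the distinct words of `text`, most frequent first.

    `last_reviewed` is a hypothetical dict mapping a word to the timestamp of
    its last review. Words with the same count are ordered by that timestamp
    (never reviewed or oldest first), then alphabetically.
    """
    last_reviewed = last_reviewed or {}
    # Lowercase the text and keep runs of letters only (Unicode-aware).
    counts = Counter(re.findall(r"[^\W\d_]+", text.lower()))
    return sorted(
        counts,
        key=lambda word: (-counts[word], last_reviewed.get(word, 0), word),
    )


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    print(review_order(sample))
    print(review_order(sample, {"bird": 50, "cat": 10}))
```

Here the word count is the main criterion and the review time only breaks ties; ordering by review time alone would just need a different sort key.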

Now you know what the mental starting point for this post is, so to speak. I have a computer program that helps me with my vocabulary knowledge. Words are colored according to whether they are unknown, guessed, passively known (what I call 'comprehended'), or actively known. Coloring words to reflect the state of vocabulary knowledge of a certain user is the main mechanism of the program for supporting vocabulary learning.

That's already a nice thing to have and use, I think, but I still have several ideas about possible improvements and alternative or complementary features to explore. One of those is the possibility of not only reading, but also writing texts. Beyond a simple book reader and vocabulary training functionality, I'd like to have a very straightforward text editor that colors text while typing. The words are colored, again, according to the vocabulary knowledge of the writer.

To show you what that may look like, I've recorded part of the writing process of this very Journaly post. Take a look at the following link: https://asciinema.org/a/YJ6FyqfGME4rKktEAcQP3zbgy.

The cool thing, I find, is that this time I've barely written any code for it. I'm using my favorite text editor (nano) to write inside a text console. The words are highlighted in different colors by a very simple mechanism: a syntax file. This is a plain text file with the ending ".nanorc", something like "java.nanorc". Normally these syntax files are used for highlighting the keywords of a certain programming language; they are an aid for software developers writing computer code. Other text editors, such as Sublime Text, use similar mechanisms for highlighting words. The quirky idea on my part has been to generate syntax files for natural languages instead of programming languages. More concretely, I've mapped the different knowledge states (ignored, unknown, guessed, comprehended and known) to the different available basic colors of the text editor (green, red, cyan, yellow/orange, magenta and blue). This color code is quite similar to the color code of my own computer program, so the result looks quite familiar to me.
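For the curious, generating such a syntax file could look roughly like the Python sketch below. This is an illustration and not the generator I actually used: the function name `build_nanorc`, the syntax name, the file pattern and most of the state-to-color mapping are assumptions (only "known = blue" is taken from this post). The sketch also assumes that the words consist of plain letters, so nothing needs escaping inside the regular expression.

```python
# Hypothetical mapping from knowledge state to a nano color name.
# Only "known" -> "blue" is stated in the post; the rest is assumed.
STATE_COLORS = {
    "ignored": "green",
    "unknown": "red",
    "guessed": "cyan",
    "comprehended": "yellow",
    "known": "blue",
}


def build_nanorc(vocabulary, syntax_name="english", file_regex=r"\.txt$"):
    """Return the text of a nano syntax file for `vocabulary`.

    `vocabulary` maps each word (letters only) to one of the states above.
    Every state becomes one case-insensitive `icolor` rule whose regular
    expression is an alternation of all words in that state, wrapped in the
    word-boundary markers \\< and \\>.
    """
    lines = [f'syntax "{syntax_name}" "{file_regex}"']
    for state, color in STATE_COLORS.items():
        words = sorted(w for w, s in vocabulary.items() if s == state)
        if not words:
            continue  # no rule needed for an empty state
        alternation = "|".join(words)
        lines.append(f'icolor {color} "\\<({alternation})\\>"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    vocab = {"house": "known", "tree": "known", "serendipity": "unknown"}
    print(build_nanorc(vocab), end="")
```

The resulting text can be saved as, say, "english.nanorc" and loaded from the nano configuration file with an `include` line. With one huge alternation per state, the rules grow with the size of the vocabulary.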

In short: I'm right now writing this text inside nano, which is coloring the words according to my vocabulary knowledge as I type them. That's quite neat.

Actually, I can also use nano simply to open a previously written text document (like a book) and read it. There is nothing wrong with that. A text editor can obviously be used in both ways. For me, the novelty is the approach of using an existing software tool (the text editor) in a new and useful way in the context of writing in a foreign language. It's interesting, entertaining and even motivating, for example, to see the majority of the words colored blue (i.e. known). The few words that appear in different colors tell me that I should reconsider my self-assessment, because I actually know them (otherwise I wouldn't have used them in my writing).

Using nano for this with a statically generated syntax file is not such a good idea in real life, I suppose. I mean, the file itself is not that big (328 KB), but the editor is simply not meant for such a large number of keywords to highlight (it uses several gigabytes of memory!). For me, this is mostly a fun hack to try out the idea of "writing using your own vocabulary" (i.e. simply writing), combined with the computer aid of actually seeing your own vocabulary knowledge on the screen. Additionally, I find the minimalism of a simple text environment very appealing. It should help the writer focus on the writing process and the contents. For me, it definitely does.

I think I'm going to write my own small computer program that does exactly this (coloring words as you type), but in a more dynamic way and without consuming as much memory (i.e. faster). As a counterpart to my document reader, there'll be a document writer. This is one of the ideas I mentioned for enhancing my computer program.

And with this I'm going to end this rather spontaneous Journaly post. I hope you've found it interesting. Let me know what you think about it, and if there are any questions, please don't hesitate to ask.

Caro (@MimmiCaro)

almost 4 years ago

Hi Eduard, nice to read from you so frequently the last few weeks :) (I should take you as a role model and do the same I guess :D)
The idea of being able to see the knowledge of a certain word while writing it seems very interesting, especially for languages I don't yet speak so well (i.e. all of them except German and English :D). I know there are different approaches to writing here on Journaly. I like to look up words while I'm writing, so that I actively use them and hopefully remember them better the next time. I imagine it would be a great tool to see if I ever encountered those words before and how I estimated my knowledge of them. In a future document writer, it would maybe be a helpful feature if you could categorise new words right away or change the status of words you've already encountered before.

@SaraT

Your software sounds very interesting. I like the fact that when you're reviewing the words they appear according to their frequency in the text. That can be really useful to ensure that you review high-frequency words, which are definitely more important to retain in your memory. The idea of having the words appear in different colours as you type them, according to the knowledge that you have of those words, is a smart one. I also think that the idea of having software that you can use to read, write and review words is a good one.

Eduard (@edufuga)

Caro: Thank you! I also find the idea of seeing how well you know the words while writing a really good one (obviously, that's why I wrote the text). I think that my next post will be quite related to this one, because yesterday I had another idea and, after trying it out, it works even better than what I've described here. Ah, yes, that sounds like a nice use case, seeing previously entered words you forgot about (that happens to me already, sometimes :D). About your idea: I guess it could be useful to press the down arrow key or Enter or something while writing a word and set a new state for it, though I imagined the Writer piece of software to be quite focused and minimalistic.

Sara: Nice to see you here after my rather critical comments about certain software tools :D. Thanks! Yeah, that's the exact point when going through the vocabulary of a book. I described this part of my software project in a bit more detail in my post "The War Machine": the file with the words in order of decreasing frequency (word count) for the book of the last book club can be found here: https://pastebin.com/hNwtBepk. Just as an example. The program generates and uses files like this one in order to present the relevant words to the user. Who knows, maybe one day my program will be usable for other people as well :). I hope to see you in my next post!
