The Language Glasses
In this post, I'm going to talk about a nifty idea I had yesterday. I immediately tried it out and was at once amazed by the result. Today I'd like to share my enthusiasm with you. Welcome!
In fact, this post is the logical continuation of my previous one. There, I wrote about using a weird console-based text editor that a huge number of Linux and UNIX geeks love: nano. The rest of the world uses graphical tools for writing their text documents. In case you're using NotePad or WordPad, please leave this post immediately :D.
Just kidding. Please stay.
Ok, let's get started with the show.
Nano: A console-based text editor
If you've been following my last few posts, you may have guessed the topic already. It's about coloring words according to their self-assessed state of knowledge. I'm not going to repeat the whole introduction to my own computer program for improving and maintaining vocabulary, the ideas behind it, its features and rationale (my justification for why I've written the damn thing the way it is) and all the nitty-gritty details. If you're interested in that (and I hope you are!), you're very welcome to go and read these posts. As always, I'm happy and thankful for any kind of constructive feedback you may have.
As I was saying, I recently wrote about using the text editor nano and self-created syntax files in order to highlight the words of a certain language according to my personal knowledge of them. After creating those files with a tiny computer program and testing it for a while, I noticed that the main idea really did work exactly as I thought, but the program was way too slow and consumed too much memory. For this and other reasons, I've decided to try out the same idea with a different text editor: A graphical one.
Enter SublimeText
SublimeText is a graphical text editor. And as the name tells you, it's a sublime software product. It really is.
If you want to get your hands dirty, install SublimeText on your computer and then download this ZIP file:
https://gitlab.com/edufuga/languagesyntaxfiles/-/archive/master/languagesyntaxfiles-master.zip
There you'll find my current vocabulary knowledge in different languages, for example English and German. With these files, you literally have my brain. Ok, not quite, but almost. Let's say you put my "language glasses" on. With these glasses, you can see what my knowledge of a given word you type in the text editor is. To be able to do that, all you need to do is follow the instructions in the README.md file. Mostly, all you really need to do is copy the files from the downloaded ZIP file into the right SublimeText folder, which you can open directly within SublimeText by pressing Control-Shift-P and entering "Browse Packages". On macOS, this folder is located under "/Users/$USER/Library/Application Support/Sublime Text/Packages/User".
And actually that's it. After copying the syntax files and the color definitions, SublimeText should understand files with the file ending ".cat" or ".en" as being written in Catalan and English, respectively. As a concrete example, this very same post is currently a text document named "SublimeText.en" on my computer. Yes, very original, I know.
The picture of this post shows you a concrete example of what that looks like:
The colors currently have the following meanings: blue stands for actively known words, purple stands for passively known words, orange tells you I just guessed the meaning (the word is understandable from context, or I'm almost sure that I actually know it), red denotes unknown words and rose is for ignoring stuff (like invented words or characters in novels, which are not part of the language). This color scheme is the same one that my computer program uses, but it can be changed in SublimeText by editing the color file. This is the file https://gitlab.com/edufuga/languagesyntaxfiles/-/blob/master/Mariana.sublime-color-scheme.
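To make the pairing of syntax files and color definitions a bit more tangible, here is a minimal Python sketch of how such a pair could be generated from a small vocabulary. It is not the code from the repository: the scope names (markup.vocab.*), the hex colors and the function names are made up for illustration, and mapping purple to the "comprehended" category is an assumption. Only the general layout of a .sublime-syntax file (YAML with "match" and "scope" entries) and of a .sublime-color-scheme file (JSON with a "rules" list) follows SublimeText's documented formats.

```python
import json
import re

# Category -> (scope, foreground colour). The categories and colour names
# come from the post; the scope names and hex values are assumptions.
CATEGORIES = {
    "known":        ("markup.vocab.known",        "#6699cc"),  # blue
    "comprehended": ("markup.vocab.comprehended", "#a06ad0"),  # purple (assumed mapping)
    "guessed":      ("markup.vocab.guessed",      "#f9ae58"),  # orange
    "unknown":      ("markup.vocab.unknown",      "#ec5f66"),  # red
    "ignored":      ("markup.vocab.ignored",      "#f7a8c0"),  # rose
}


def build_syntax(name, extension, vocabulary):
    """Return the text of a .sublime-syntax file for one language.

    `vocabulary` maps each word to one of the keys of CATEGORIES.
    """
    # Group the words by category so that each category becomes one rule.
    by_category = {}
    for word, category in vocabulary.items():
        by_category.setdefault(category, []).append(word)

    lines = [
        "%YAML 1.2",
        "---",
        f"name: {name}",
        f"file_extensions: [{extension}]",
        f"scope: text.vocab.{extension}",
        "contexts:",
        "  main:",
    ]
    for category, words in by_category.items():
        scope, _colour = CATEGORIES[category]
        # Escape regex metacharacters, and double any apostrophe because
        # the pattern is written as a YAML single-quoted string.
        alternation = "|".join(
            re.escape(w).replace("'", "''") for w in sorted(words)
        )
        lines.append(f"    - match: '(?i)\\b(?:{alternation})\\b'")
        lines.append(f"      scope: {scope}")
    return "\n".join(lines) + "\n"


def build_color_scheme():
    """Return JSON with one colour rule per vocabulary category."""
    rules = [
        {"name": f"Vocabulary: {category}", "scope": scope, "foreground": colour}
        for category, (scope, colour) in CATEGORIES.items()
    ]
    return json.dumps({"rules": rules}, indent=2)


if __name__ == "__main__":
    vocabulary = {"house": "known", "sublime": "comprehended", "nifty": "guessed"}
    print(build_syntax("English Vocabulary", "en", vocabulary))
    print(build_color_scheme())
```

Putting all words of one category into a single alternation keeps the file short, but with a vocabulary of many thousands of words the patterns get very long; splitting them into several rules per category would be an equally valid choice.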
Why am I doing all this?
Well, because it's really a nice thing to have. Take a look at this screen recording:
https://gitlab.com/edufuga/languagesyntaxfiles/-/blob/master/CatalanTest.mp4
I hope you see what I mean.
If not, let's try to describe it: I find it absolutely amazing to be able to write in a graphical text editor and directly see my state of knowledge being represented by a given color. Soon enough your brain assimilates these colors and you develop an intuitive and very reliable feeling for how well you understand a given text document as a whole, just by looking at it. For example, I use this when opening a book in any foreign language I speak: The "color signature", i.e. the overall color appearance of the document in the text editor, tells you something about your level in that language, or more concretely, in the language of that book. For the languages I speak the best, everything shines mostly blue, which is a good thing, because it means I know those words. As you can tell, I like blue. For the languages I just started to learn (like Italian), there is a large amount of purple and orange, as well as a decent amount of red: Opening an Italian book tells my eyes and brain "hey, this is a completely different level". To me, this kind of chromatic context helps a lot in getting into the right mindset for reading and writing in a language.
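The "color signature" described above can also be put into numbers. The following Python sketch is only an illustration, not part of the actual program: the function name color_signature is made up, and treating every word missing from the vocabulary as "unknown" is an assumption. It computes which share of the words in a text falls into each knowledge category.

```python
import re
from collections import Counter


def color_signature(text, vocabulary, default="unknown"):
    """Return the share of each knowledge category among the words of `text`.

    `vocabulary` maps lowercase words to a category name. Words that are
    not in the vocabulary are counted under `default` (an assumption).
    """
    # Runs of letters, optionally with one inner apostrophe ("don't").
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    counts = Counter(vocabulary.get(word, default) for word in words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: count / total for category, count in counts.items()}


if __name__ == "__main__":
    vocabulary = {"the": "known", "house": "known", "is": "known", "nifty": "guessed"}
    print(color_signature("The house is nifty.", vocabulary))
```

A mostly "blue" text would show a high share for the known category here, while a book in a freshly started language would spread out over the guessed, comprehended and unknown categories.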
It's not just the foreign languages. I initially started to write my computer program in order to help me improve my Dutch, but soon I started to use it for other foreign languages and eventually I ventured into my own mother tongue(s) as well. In the end, being native just means you have a bigger vocabulary base than in the other languages, but the same categories and mechanisms apply: You can always extend and strengthen your vocabulary, no matter the language.
I've also noticed that I get much more self-critical when reviewing words in my native language. All of a sudden I start marking words as "guessed" or "comprehended" even though I actually know them well, just because my response or reaction was slower than I liked, or because I want to review them in the future. In the end, the categories have a certain amount of implicit subjectivity. I tried to compensate for that by including enough of them: Beyond the mentioned "ignored", "unknown", "guessed", "comprehended" and "known" there are two additional ones: "well known" and "mastered". I'm personally not using them yet, mostly because maintaining the complete vocabulary project with the categories already in use is a decent amount of work, but I think they may be generally helpful if a given user wants even more fine-grained control over the knowledge structure representation.
All things considered, I think this is a great new way to write posts and to keep reading books. Let me know what you think about it!
That looks really cool! If I downloaded your "brain", my Catalan texts would also be nice and colourful ... 🤔
Evil plans aside, I really enjoyed reading this and especially liked how the title connects to the content.
Thank you, Caro! I'm glad you liked the post and I'm looking forward to your feedback as alpha tester number 1 on my idea with SublimeText. I really think it makes the writing process a lot more enjoyable, but that's obviously my personal opinion.
One of the next steps should be to extract the code that creates the syntax files (here it is) into a project of its own. The reason is that the WordStatisticsCrawler was originally just meant for analyzing the data and is the backbone of this post. Besides that, the code is already usable and you could theoretically create your own syntax files for the text editor with your own vocabulary. For everyone else, I'd still have to make the Questioner a bit more usable so that they can start entering their vocabulary. A possible idea could be a chat bot version of it with an inline keyboard and the option to return the ZIP file with the vocabulary, or directly the single syntax files. We need more volunteers!
This is an incredible project!! I gave up programming years ago but I've always wanted to come back to create some sort of language learning application or tool. Currently I don't have the free time yet to invest into getting into programming again, but WOW, in an alternate reality, I hope that the version of me that kept going with Computer Science is making cool stuff like this. I also love the way you write in English, it's so gripping and entertaining!
Thank you a lot, Emily, that's such a nice and motivating comment! I'm very happy to see you got lost in here :D. I'm sure the alternate you is a badass hacker. Actually, I didn't study CS, so in my opinion there's nothing wrong or inherently/overly difficult in getting at some point into programming! But tell me: what programming language(s) did you have contact with, and which would you ideally like to learn in the future?
I definitely feel like one day I could get back to it and learn some more programming! The bigger problem is that between my work, studies and hobbies, I have no time to learn programming haha. I'm always passively trying to develop my idea for an app but nothing seems to be quite...revolutionary yet. When I did program, I used Java in high school and when I went to university, it was based around C for the first 2 years and then we were starting to get into Java, JavaScript and HTML when I realized that I really want to become a language professional. In the future, I am thinking more practically, I'd just learn whatever language allows me to build the things that inspire me! I know that it is most likely not going to be C. I hate Python personally because it's so "simple" to me that it complicates things! (this is coming from someone who learned about memory management and rigid variable definitions). I know Python is super useful though and I should probably give it another shot...kind of like me with Spanish ;)
Thanks for your awesome answer! I see. You seem to have a bit of a love-hate relationship with C (I can relate to that!). Your sentence "I know that it is most likely not going to be C" even made me laugh :D. I guess you're right! I don't "hate" Python, but I just don't like it for bigger projects; I also find it too simple (and I absolutely abhor the lack of a static type system). That said, I don't find it necessary to deal with malloc and free as in C, what a horrible nightmare for application programmers. But then again I seem to have some tendency towards systems programming as well and find something like C++ to be a good combination of performance, control and high-level abstractions (the "only" problem with C++ is its exaggerated complexity). If I stay inside the JVM, I find Scala quite appealing in theory and Java is, actually, still a good practical choice. If I think about speed and code simplicity, then something like Go seems to be quite nice (Go is the new C). For my Bookwards project I'm considering a full rewrite in a different language, but I'm still struggling with the proper choice between the mentioned possibilities 🤔. That depends on something else I'm equally unsure about: What kind of application should the "finished" product be? A desktop app, a mobile app or even a web app? I'm aware that the current console app is mostly off-putting for everyone else here, which is unfortunate. Hm... What do you think? Or should I keep with the idea of this article and assume people install SublimeText for reading/writing? In that case, I mostly need to find an easy-to-use way for creating and providing the syntax files after they drill the vocabulary. That was actually what I was going to try out next: A chat bot that asks words and spills the syntax files for the text editor. In the end, I think the outcome will be simply to have several options and let people choose what they like most, because the "one-size-fits-all" version simply does not exist. Sorry for my ramblings.