Easier way to make your own programming language - Language Customizer

Introduction

What if you wanted to make your own programming language with its own unique style?

You may have heard of languages like:

Bhailang, LOLCODE, Brainfuck

Each of them has its own peculiar but funny syntax.

// bhailang
hi bhai
  bol bhai "Hello bhai";
bye bhai
// LOLCODE
HAI 1.2
CAN HAS STDIO?
VISIBLE "HAI WORLD!"
KTHXBYE
// Brainfuck
++++++++++[>+++++++>++++++++++>+++<<<-]>++.>+.+++++++
 ..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.

Or maybe you just want to make a syntax that is most comfortable to you.
For these reasons I started working on a project: a piece of software that lets you make these quirky languages on your own without needing to know much about tokenizers, parsers, ASTs, etc.

What I've got so far

Language Customizer (language-customizer.web.app)

A website where you can change the keywords of a language to anything you want.

How it works

In the top-left section, you have a text editor where you can write code for testing. The bottom-left section shows the output of that code after you press the run button.
The right side is where all the magic happens: you can change any keyword of the language by filling in the corresponding input field and hitting apply. This also updates the documentation automatically, so you can make the language look the way you want.

// setting -->
// var = take
// while = till
// print = say
take variable = 0;
till(variable < 10) {
    say variable;
    // increment, otherwise the loop never ends
    variable = variable + 1;
}

Or you can puzzle your mind by making something like this:

// setting -->
// var = not_constant
// true = not_false
// false = not_true
not_constant variable = not_true;
if(variable != not_false) {
    print "@.@";
}
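
To give a rough idea of how renaming keywords like this could work under the hood, here is a minimal sketch. It is not the project's actual code: the names DEFAULT_KEYWORDS, buildKeywordTable and classifyWord, and the token type strings, are all made up for illustration. The idea is simply that the tokenizer looks words up in a table, and the settings panel swaps the spellings in that table.

```javascript
// Hypothetical sketch: maps user-chosen keyword spellings back to the
// interpreter's internal token types. None of these names come from the
// real project.
const DEFAULT_KEYWORDS = {
  var: "VAR",
  while: "WHILE",
  print: "PRINT",
  if: "IF",
  true: "TRUE",
  false: "FALSE",
};

// Build a lookup table from each (possibly customized) spelling to a token type.
// `overrides` maps a default keyword to its replacement, e.g. { var: "take" }.
function buildKeywordTable(overrides = {}) {
  const table = new Map();
  for (const [keyword, tokenType] of Object.entries(DEFAULT_KEYWORDS)) {
    const spelling = overrides[keyword] ?? keyword;
    table.set(spelling, tokenType);
  }
  return table;
}

// Classify one word: a keyword token if it is in the table, else an identifier.
function classifyWord(word, table) {
  return table.has(word)
    ? { type: table.get(word), text: word }
    : { type: "IDENTIFIER", text: word };
}

const table = buildKeywordTable({ var: "take", while: "till", print: "say" });
console.log(classifyWord("take", table).type); // "VAR"
console.log(classifyWord("var", table).type);  // "IDENTIFIER"
```

Note that once var is renamed to take, the old spelling var becomes an ordinary identifier, so the parser and interpreter never need to know that anything changed.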

The world is your oyster! Maybe not. Despite being able to do all of this, there are some restrictions (actually, many restrictions). We will discuss those restrictions later on.

Edge cases don't bother us; after all, we're the edgy ones.

So maybe now you want to share this new amalgamation of a language that you created. You can do so by clicking the share button, which copies a very long URL to your clipboard that you can share with your friends.
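
One way a share link like this could carry the whole keyword configuration is to pack it into the URL itself. The sketch below is only a guess at the general technique, not how the site actually does it: the config query parameter and the functions buildShareUrl and readShareUrl are assumptions, and so is the https:// prefix on the site address.

```javascript
// Hypothetical sketch: pack keyword overrides into a shareable URL and read
// them back. The "config" parameter name is an assumption for illustration.
function buildShareUrl(baseUrl, overrides) {
  const url = new URL(baseUrl);
  // searchParams.set percent-encodes the JSON string for us.
  url.searchParams.set("config", JSON.stringify(overrides));
  return url.toString();
}

// Recover the overrides from a shared URL; an empty object means "defaults".
function readShareUrl(sharedUrl) {
  const raw = new URL(sharedUrl).searchParams.get("config");
  return raw === null ? {} : JSON.parse(raw);
}

const shared = buildShareUrl("https://language-customizer.web.app/", {
  var: "take",
  while: "till",
  print: "say",
});
console.log(shared);
console.log(readShareUrl(shared).var); // "take"
```

Because everything lives in the URL, nothing has to be stored on a server, which would also explain why the link gets so long.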

Hope & Despair

Limitations

This is cool! Now let's see what this tool can't do.

To start off, everything is a bit beyond its reach. The language itself is very lacking. It doesn't have classes, a built-in standard library, or useful keywords like break, continue, and goto (just thinking about implementing these gives me a headache). There are also weird edge cases that break the execution of the whole program or give you some unexpected output, just like JavaScript, which is what I used to build my interpreter. One edge case that I ran into just now: if I make an empty array like,

var a = [];

It gives me an error.
But something like,

var a = [1, 2];

works perfectly fine.
Hopefully I'll fix that before publishing this blog. (Update: I did fix the bug.)
Besides the language being buggy, there are some restrictions on what you can use as a keyword:

  • Keywords can only use letters of the English alphabet.

  • Keywords can't contain spaces.

I'm hoping to remove some of those restrictions and fix those bugs in the future.
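
The two keyword restrictions above are easy to express as a small check. This is a sketch rather than the project's real validation: the function name validateKeyword and the error messages are invented, rejecting empty keywords is my own addition, and I am assuming underscores are allowed because the earlier examples use names like not_constant.

```javascript
// Hypothetical validator for the keyword restrictions described above.
// Returns null when the keyword is acceptable, otherwise a short reason.
function validateKeyword(keyword) {
  if (keyword.length === 0) return "keyword must not be empty";
  if (/\s/.test(keyword)) return "keyword must not contain spaces";
  // Assumption: underscores are accepted in addition to English letters.
  if (!/^[A-Za-z_]+$/.test(keyword)) {
    return "keyword may only use English letters";
  }
  return null;
}

console.log(validateKeyword("take"));         // null
console.log(validateKeyword("not_constant")); // null
console.log(validateKeyword("my var"));       // "keyword must not contain spaces"
```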

Potential Future

That is most of what you can do right now. In the future I hope to add a lot more ways to customize this language.

Maybe in the future you could change the syntax of the for loop to look something like,

for {
    // body
} (update; condition; initial);

Why would you want that? That doesn't look useful for any practical reason. But it's good to think that you can do that.

Maybe in the future you could change these keywords and syntax dynamically as the code is parsed. Something like,

var v = 0;
#CHANGE var let
let t = 10;
#ARRANGE WHILE BODY EXPR
while {
    print v;
    v = v + 1;
} (v < t);

In the sense of possibilities, I suppose you could say that the world is your oyster!

Technical Details

Internal Working

Let's discuss more about the internals of how it all works:

As I mentioned above, I am currently using JavaScript for tokenizing, parsing, generating the AST, and then interpreting that AST. As you might guess, this is pretty slow, mainly for two reasons. First, JavaScript itself is slow compared to a compiled language like C. But the main reason is that I am interpreting the AST directly. (If you want to learn more about tokenizing, parsing, ASTs, and intermediate code, you can read https://www.cs.man.ac.uk/~pjj/farrell/comp3.html)

In other words, what I'm using is a tree-walk interpreter written in JavaScript. Don't judge me; it was a way for me to learn how to write an interpreter in the language that I'm most comfortable with.
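
If you have never seen a tree-walk interpreter, here is a tiny one. It is a sketch to show the general technique, not the project's code: the node shapes ("num", "var", "binary" and so on) and the evaluate function are invented for this example. The interpreter just recurses over the AST and does what each node says.

```javascript
// Minimal tree-walk interpreter sketch. The node shapes are invented for
// this example and are not the project's real AST.
function evaluate(node, env, output) {
  switch (node.kind) {
    case "num":
      return node.value;
    case "var":
      return env[node.name];
    case "binary": {
      const left = evaluate(node.left, env, output);
      const right = evaluate(node.right, env, output);
      if (node.op === "+") return left + right;
      if (node.op === "<") return left < right;
      throw new Error("unknown operator " + node.op);
    }
    case "assign":
      env[node.name] = evaluate(node.value, env, output);
      return env[node.name];
    case "print":
      output.push(evaluate(node.value, env, output));
      return undefined;
    case "while":
      // Re-walk the condition and body subtrees on every iteration.
      while (evaluate(node.condition, env, output)) {
        for (const statement of node.body) evaluate(statement, env, output);
      }
      return undefined;
    default:
      throw new Error("unknown node kind " + node.kind);
  }
}

// Hand-built AST for: var v = 0; while (v < 3) { print v; v = v + 1; }
const v = { kind: "var", name: "v" };
const program = [
  { kind: "assign", name: "v", value: { kind: "num", value: 0 } },
  {
    kind: "while",
    condition: { kind: "binary", op: "<", left: v, right: { kind: "num", value: 3 } },
    body: [
      { kind: "print", value: v },
      {
        kind: "assign",
        name: "v",
        value: { kind: "binary", op: "+", left: v, right: { kind: "num", value: 1 } },
      },
    ],
  },
];

const env = {};
const output = [];
for (const statement of program) evaluate(statement, env, output);
console.log(output); // [ 0, 1, 2 ]
```

Notice that the loop walks the same condition and body subtrees again on every iteration, following object references and a switch each time. That repeated walking is a big part of why this style of interpreter is slow.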

Another important point about this website is that it's an SPA (Single Page Application) with no server communication other than retrieving the index page. So everything, from writing code to parsing, AST generation, interpretation, and showing the output, happens inside your browser tab. A side effect of this is that when you click run, the tab freezes until your program finishes, because the interpreter runs on the page's main thread. You can try this by running this code,

var i = 0;
while (i < 100000000000) {
 i = i + 1;
}

To remove this "feature" (okay, bug), a Web Worker could be used to run the code independently of the main thread.
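
Here is roughly what the Web Worker idea could look like. This is a sketch under assumptions, not something the site does today: runInWorker and WORKER_SOURCE are made-up names, and the interpret function inside the worker is only a placeholder for the real interpreter. The browser APIs it relies on (Worker, Blob, URL.createObjectURL) are standard.

```javascript
// Hypothetical sketch: run user code in a Web Worker so the page stays
// responsive. `interpret` stands in for the real interpreter entry point.
const WORKER_SOURCE = `
  // Placeholder interpreter: the real one would tokenize, parse and run.
  function interpret(code) {
    return "ran " + code.length + " characters";
  }
  self.onmessage = (event) => {
    self.postMessage(interpret(event.data));
  };
`;

// Run `code` off the main thread. If it has not finished after `timeoutMs`,
// terminate the worker and reject, so an infinite loop cannot hang the tab.
function runInWorker(code, timeoutMs) {
  return new Promise((resolve, reject) => {
    const blobUrl = URL.createObjectURL(
      new Blob([WORKER_SOURCE], { type: "text/javascript" })
    );
    const worker = new Worker(blobUrl);
    const cleanUp = () => {
      clearTimeout(timer);
      worker.terminate();
      URL.revokeObjectURL(blobUrl);
    };
    const timer = setTimeout(() => {
      cleanUp();
      reject(new Error("program took longer than " + timeoutMs + " ms"));
    }, timeoutMs);
    worker.onmessage = (event) => {
      cleanUp();
      resolve(event.data);
    };
    worker.onerror = (event) => {
      cleanUp();
      reject(new Error(event.message));
    };
    worker.postMessage(code);
  });
}

// Usage in a browser:
//   runInWorker("var i = 0;", 2000).then(console.log, console.error);
```

The timeout is the nice bonus here: with the interpreter on the main thread there is no way to stop a runaway loop, but a worker can simply be terminated.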

To build the website I used Vue 3 along with Bulma for styling. For code highlighting I'm using Prism.js.

Performance Improvements

Now my primary goal is to do it all again, but in C! Because, as we all know, C is fast. Along with switching to C, I'm going to use bytecode instead of an AST for interpretation. But you may wonder: if my application is an SPA with little to no server-side element, how will I run C in the browser? WebAssembly!

I'll be using Emscripten to compile my C code into WebAssembly and run it using the virtual file system that it provides. I know, very fancy words, but it's simple to understand. I'll be discussing it in my future blogs.
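
To make the "bytecode instead of an AST" part a bit more concrete, here is a toy stack-based bytecode VM. The planned rewrite is in C; this sketch is in JavaScript only to match the rest of the post, and the opcode names, their numbering, and the run function are all invented, so the real design may look quite different.

```javascript
// Toy stack-based bytecode VM. Opcode names and layout are invented for this
// sketch; the planned C implementation may look quite different.
const OP = {
  PUSH: 0, LOAD: 1, STORE: 2, ADD: 3, LESS: 4,
  JUMP_IF_FALSE: 5, JUMP: 6, PRINT: 7, HALT: 8,
};

function run(code) {
  const stack = [];  // operand stack
  const slots = [];  // variable storage, addressed by index
  const output = []; // everything the program printed
  let pc = 0;        // index of the next instruction
  while (true) {
    const op = code[pc++];
    switch (op) {
      case OP.PUSH: stack.push(code[pc++]); break;
      case OP.LOAD: stack.push(slots[code[pc++]]); break;
      case OP.STORE: slots[code[pc++]] = stack.pop(); break;
      case OP.ADD: { const b = stack.pop(); const a = stack.pop(); stack.push(a + b); break; }
      case OP.LESS: { const b = stack.pop(); const a = stack.pop(); stack.push(a < b); break; }
      case OP.JUMP_IF_FALSE: { const target = code[pc++]; if (!stack.pop()) pc = target; break; }
      case OP.JUMP: pc = code[pc]; break;
      case OP.PRINT: output.push(stack.pop()); break;
      case OP.HALT: return output;
      default: throw new Error("bad opcode " + op);
    }
  }
}

// Hand-assembled bytecode for: var v = 0; while (v < 3) { print v; v = v + 1; }
const code = [
  OP.PUSH, 0, OP.STORE, 0,                      // 0:  v = 0
  OP.LOAD, 0, OP.PUSH, 3, OP.LESS,              // 4:  v < 3
  OP.JUMP_IF_FALSE, 23,                         // 9:  leave the loop when false
  OP.LOAD, 0, OP.PRINT,                         // 11: print v
  OP.LOAD, 0, OP.PUSH, 1, OP.ADD, OP.STORE, 0,  // 14: v = v + 1
  OP.JUMP, 4,                                   // 21: back to the condition
  OP.HALT,                                      // 23
];

console.log(run(code)); // [ 0, 1, 2 ]
```

Instead of chasing pointers through a tree, the VM reads a flat array of numbers in a tight loop, which is friendlier to the CPU and maps very naturally onto C.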

Resources

Educational Resources

You can check out Crafting Interpreters by Robert Nystrom. It helped me a lot to understand how interpreters work and how to make one yourself. It takes a very practical approach to learning about compiler design.

Project Repositories

Also, you can check out the code for this on GitHub:

The Interpreter: RhythmDeolus/Language_Customizer (github.com)
The Website: RhythmDeolus/Language_Customizer_website (github.com)