TANG: Creating A Template Language From Scratch

By Corey Pennycuff

Here it is: I recorded a real-time, live, unrehearsed, no-IDE (except Vim), terminal-only, modern C++, personal journey of creating a programming language from scratch.

I'm sure that there are plenty of questions that you might ask, so here are the answers to the most common ones.

The Vital Parts

YouTube playlist: Tang playlist

GitHub repository: https://github.com/Ghoti-io/Tang (edit: or, the C version that I'm currently working on: here)

Code example 1 (creating a list of names):

<ul>
<%
for (name : names) {
  print!("  <li>");
  print(name);
  print!("</li>\n");
}
%>
</ul>

or:

<ul>
<% for (name : names) { %>
  <li><%= name %></li>
<% } %>
</ul>

Code example 2 (simple variable replacement):

Welcome, <%= user %>!

What's the point of this language?

The language is meant to be an end-user interface language that can produce text (HTML) output. It's a Template Language (hence the name Tang).

Features:

  • It is Unicode aware (functioning on graphemes, not just code points).
  • Its syntax is a mixture of PHP and Python.
  • It's garbage-collected.
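
To make the grapheme versus code point distinction concrete, here is a minimal C++ sketch. It is not Tang's implementation, and the function names are hypothetical. It assumes valid UTF-8 input and treats only the combining diacritical marks U+0300 to U+036F as extending the previous character, which is a deliberate simplification of the full Unicode text segmentation rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decodes UTF-8 bytes into code points.
// Assumption: the input is valid UTF-8 (this sketch does no error recovery).
std::vector<char32_t> decode_utf8(const std::string& s) {
  std::vector<char32_t> out;
  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    assert((lead & 0xC0) != 0x80 && "unexpected continuation byte");
    // The high bits of the lead byte give the sequence length.
    const std::size_t len =
        lead < 0x80 ? 1 : (lead >> 5) == 0x6 ? 2 : (lead >> 4) == 0xE ? 3 : 4;
    // Keep only the payload bits of the lead byte.
    char32_t cp = static_cast<char32_t>(len == 1 ? lead : (lead & (0xFF >> (len + 1))));
    // Each continuation byte contributes its low six bits.
    for (std::size_t k = 1; k < len && i + k < s.size(); ++k) {
      cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
    }
    out.push_back(cp);
    i += len;
  }
  return out;
}

// Simplification: only U+0300..U+036F count as combining marks here.
bool is_combining_mark(char32_t cp) {
  return cp >= 0x0300 && cp <= 0x036F;
}

// Counts user-perceived characters under the simplification above:
// a combining mark attaches to the code point before it.
std::size_t count_graphemes_simplified(const std::string& s) {
  std::size_t count = 0;
  bool first = true;
  for (char32_t cp : decode_utf8(s)) {
    if (first || !is_combining_mark(cp)) {
      ++count;
    }
    first = false;
  }
  return count;
}
```

For the string "cafe" followed by U+0301 (a combining acute accent), this sees 6 bytes, 5 code points, and 4 graphemes. A real implementation would follow the Unicode segmentation rules, for example through a library such as ICU.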

What is an example of how this language could be used?

  • Scenario 1:
    Imagine that you create an application and you want a designer to be able to create templates for the output. For security reasons, you don't want them to have access to everything that "bigger" languages have access to, such as the file system, networking, etc. You want to give them a sandbox that is powerful enough to make their job easy, but protects them from hurting the system in the event that they write infinite loops, syntax errors, etc. Tang is meant to be that language.
  • Scenario 2:
    Imagine that you create a game. You want to give users the ability to script actions, but you do not want to give them low-level access to the underlying system itself, lest they try to do things such as mine cryptocurrency in a background thread, read the user's personal files, open network connections, etc. You need a sandboxed language and execution environment, and that's what Tang is.
  • Scenario 3:
    Imagine that you just wrote a spreadsheet application (or some other formula-centric tool) in which users can enter formulas with logic. How do you execute them? If you use an engine such as Node.js (which is a great tool, btw) then you also just gave them access to everything that Node.js has access to. Again, you need a sandboxed environment.

I'm not claiming that Tang does all of this out of the box yet, but it is a framework on which features could be extended. For example, in the spreadsheet scenario, Tang could be the starting point, and additional syntax could be added for row/column references.
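
One common way for a host to protect itself from a script that loops forever is to give the interpreter an instruction budget. The C++ sketch below illustrates that general idea only. It is not Tang's API, and the two-instruction machine and the names `Op`, `Instr`, and `run_with_budget` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-instruction machine, just enough to express a loop.
enum class Op : std::uint8_t { Jump, Halt };

struct Instr {
  Op op;
  std::size_t target;  // jump destination; ignored by Halt
};

enum class Status { Finished, BudgetExceeded };

// Runs `code` but gives up after `max_steps` instructions, so a script
// that never halts cannot monopolize the host.
Status run_with_budget(const std::vector<Instr>& code, std::size_t max_steps) {
  std::size_t pc = 0;     // index of the next instruction
  std::size_t steps = 0;  // instructions executed so far
  while (pc < code.size()) {
    if (steps++ >= max_steps) {
      return Status::BudgetExceeded;
    }
    const Instr& in = code[pc];
    switch (in.op) {
      case Op::Jump:
        assert(in.target <= code.size() && "jump target out of range");
        pc = in.target;
        break;
      case Op::Halt:
        return Status::Finished;
    }
  }
  return Status::Finished;  // ran off the end of the program
}
```

A program consisting of a single `Jump` back to index 0 returns `Status::BudgetExceeded` instead of hanging the host, while a program that reaches `Halt` returns `Status::Finished`. The host chooses the budget, so the script author never has to be trusted to write terminating code.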

The videos are boring.

That's not a question, but it is a statement of opinion that I agree with. Sometimes programming is boring. Sometimes it's reading ugly syntax errors and sitting and thinking until you understand the problem. That's real-life programming.

Consider this: I have a PhD in computer science, and that doesn't matter. It doesn't mean that all of a sudden coding is super easy and fast. Sometimes I have to sit and think for quite a while in order to figure out what's wrong or how to fix it. I'm not a bad programmer. I'm not even a particularly slow programmer (although I'm not as fast as others that I have seen live coding in front of an audience). That's just the reality of programming itself, and I wanted to capture that fact. If it's too boring for you, then that's fine. It doesn't hurt my feelings. But sometimes I myself am in the mood to watch a long-format, real-time video of someone else coding (or painting, or anything else) and it is weirdly meditative. So that's what I produced.

What did you (that is me... Corey... the guy who did all of this) get out of this?

So many things.

  • I wanted to capture what it's really like to write a nontrivially large project from the ground up. I hope it is helpful to someone some day. To be clear, it can often be difficult to see that path of how to bootstrap a project from an empty text file to something that spans many files.
  • I wanted to get better at using Vim, so I used it exclusively. I won't claim to be good at it now, but I did pick up some nice tricks by using it consistently over a long period of time.
  • I wanted to understand the structure of a C++ library better and how it needed to be structured as a project in order to have it exist in a versioned, cross-platform environment. My approach isn't perfect, but I do like what it has become so far.
  • I wanted to learn. I learned so much about, well, so much!
    • Fun. This benefit should never be underestimated.
    • Doxygen. Very nice.
    • Bison and Flex. It's sorcery!
    • Unit testing. I now live and die by it.
    • C/C++ compiling deep dive. It's not just a set of magic incantations that turn code into an executable. I've always understood it decently enough, but now I feel competent.
    • Linux shared object libraries. Not just how to make and use them, but how to add support so that other programs can use them as well.
    • Modern C++. Basically exploring all of the things that they don't have time to teach you in class.
    • Hot-loading code in C++.
    • Makefile and shell scripting.
    • GDB. I'm not an expert with it, but at least now I'm not scared of it, either.
  • I have a project that I want to use with this, so this was a stepping stone.

What's next?

This is not the first incarnation of Tang. The first time I wrote Tang, I did it in C++ as an AST walker, and it was not recorded. It was slow, probably due to my implementation relying heavily on shared pointers and polymorphism. The second version of the language is what you see here. It has a bytecode interpreter. It's not just an extension of the AST walker, though. I tried to incorporate the lessons that I learned in the first version into this second attempt. It's much better, but still has a way to go. I'm currently focusing on performance and better installation patterns.
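
For readers who have not seen the two designs side by side, here is a minimal C++ sketch of the difference. It is an illustration under stated assumptions and not code from either version of Tang: every type and function name is hypothetical, and it only handles integer addition and multiplication. The first half evaluates a tree of `shared_ptr` nodes through virtual calls. The second half runs the same expression as a flat array of instructions on a stack machine.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Style 1: an AST walker built from shared_ptr nodes and virtual dispatch.
struct Node {
  virtual ~Node() = default;
  virtual std::int64_t eval() const = 0;
};

struct NumNode : Node {
  std::int64_t value;
  explicit NumNode(std::int64_t v) : value(v) {}
  std::int64_t eval() const override { return value; }
};

struct AddNode : Node {
  std::shared_ptr<Node> lhs, rhs;
  AddNode(std::shared_ptr<Node> l, std::shared_ptr<Node> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  // Every evaluation chases two pointers and makes two virtual calls.
  std::int64_t eval() const override { return lhs->eval() + rhs->eval(); }
};

struct MulNode : Node {
  std::shared_ptr<Node> lhs, rhs;
  MulNode(std::shared_ptr<Node> l, std::shared_ptr<Node> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  std::int64_t eval() const override { return lhs->eval() * rhs->eval(); }
};

// Style 2: flat bytecode executed by a small stack machine.
enum class Opcode : std::uint8_t { Push, Add, Mul };

struct Instruction {
  Opcode op;
  std::int64_t operand;  // used only by Push
};

// Assumption: `code` is well formed and leaves one value on the stack.
std::int64_t run(const std::vector<Instruction>& code) {
  std::vector<std::int64_t> stack;
  for (const Instruction& in : code) {
    if (in.op == Opcode::Push) {
      stack.push_back(in.operand);
      continue;
    }
    assert(stack.size() >= 2 && "binary operator needs two operands");
    const std::int64_t b = stack.back();
    stack.pop_back();
    const std::int64_t a = stack.back();
    stack.pop_back();
    stack.push_back(in.op == Opcode::Add ? a + b : a * b);
  }
  assert(!stack.empty());
  return stack.back();
}

// Builds the tree for (2 + 3) * 4.
std::shared_ptr<Node> sample_ast() {
  return std::make_shared<MulNode>(
      std::make_shared<AddNode>(std::make_shared<NumNode>(2),
                                std::make_shared<NumNode>(3)),
      std::make_shared<NumNode>(4));
}

// The same expression, compiled by hand into postfix bytecode.
std::vector<Instruction> sample_bytecode() {
  return {{Opcode::Push, 2}, {Opcode::Push, 3}, {Opcode::Add, 0},
          {Opcode::Push, 4}, {Opcode::Mul, 0}};
}
```

Both `sample_ast()->eval()` and `run(sample_bytecode())` produce 20. The bytecode version keeps its instructions in one contiguous vector and avoids per-node reference counting and virtual calls, which is one plausible reason a design like it can run faster.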

Update!!!

I'm now working on a third incarnation of Tang. This one will be written in C. It will have a JIT with a bytecode fallback.
