Tcl the Misunderstood

Salvatore antirez Sanfilippo, 6 March 2006
Why Tcl is not a toy language, but a very powerful one

In an article recently linked from reddit entitled "Tour de Babel" you can read (among lots of other nonsense): "Heck, people still use Tcl as an embedded interpreter, even though Python is far superior to Tcl in every conceivable way -- except, that is, for the frost thing."

Ok, the whole article is, well, not very valid, but unfortunately, while many misconceptions are promptly recognized by the informed reader, this one against Tcl is generally taken at face value. I hope this article will convince people that Tcl is not that bad.

Prologue

In my programming life I have used a lot of languages to write different kinds of applications: many free and paid-work programs in C, a web CMS in Scheme, a number of networking/web applications in Tcl, a shop management system in Python, and so on. I have also played with a number of other programming languages like Smalltalk, Self, FORTH, Ruby, Joy... And yet I have no doubt that there is no language as misunderstood in the programming community as Tcl.

Tcl is not without faults, but most of its limitations are not hard coded in the language design. They are just the result of the fact that Tcl lost its "father" (John Ousterhout) a number of years ago, and together with him the kind of single-minded leadership that was able to take strong decisions. With the right changes it is possible to overcome most of the limitations of Tcl, and at the same time preserve the power of the language. If you don't believe Tcl is remarkably powerful, please take the time to read this article first. Maybe you still won't like it afterwards, but hopefully you will respect it, and you will certainly have strong arguments against the "Tcl is a toy language" misconception, which is even more petty than "Lisp has too many parentheses".

Before we begin, I'll spend some time explaining how Tcl works. Like the best languages in the world, Tcl has a few concepts that, combined together, allow for programming freedom and expressiveness.

After this short introduction to Tcl, you'll see how in Tcl things very similar to Lisp macros just happen using normal procedures (in a much more powerful way than Ruby blocks), how it's possible to redefine almost every part of the language itself, and how it is possible to mostly ignore types when programming. The Tcl community developed a number of OOP systems, radical language modifications, macro systems, and many other interesting things, just by writing Tcl programs. If you like programmable programming languages I bet you'll at least look at it with interest.

Tcl in five minutes

Concept 1: Programs are composed of commands

The first idea of the Tcl language is: commands. Programs are commands, one after the other. For example to set the variable 'a' to 5 and print its value you write two commands:

set a 5
puts $a

Commands are space separated words. A command ends with a newline or with a ; character. Everything is a command in Tcl - as you can see there is no assignment operator. To set a variable you need a command, the set command, that sets the variable specified as the first argument to the value specified as the second argument.

Almost every Tcl command returns a value. For example, the set command returns the value assigned to the variable. If the set command is called with just one argument (the variable name), the current value of the variable is returned.

Concept 2: Command substitution

The second idea is command substitution. In a command some arguments may appear between [ and ] brackets. If so, the argument is substituted with the return value of the code included inside the brackets. For example:

set a 5
puts [set a]

The first argument of the second command, [set a], will be substituted with the return value of "set a" (that's 5). After the substitution step the command will be converted from:

puts [set a]

to

puts 5

And, at that point, it will be executed.
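Substitutions can also be nested, and the innermost brackets are resolved first. The following is a small sketch of that idea using only core commands; the variable names are just for illustration.

```tcl
set a 5

# set with one argument returns the current value, so b becomes 5
set b [set a]
puts $b                                        ;# prints 5

# Nested substitution: string toupper runs first, then string length
puts [string length [string toupper "hello"]]  ;# prints 5
```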

Concept 3: Variable substitution

Always using the set command for variable substitution would be too verbose, so even if not strictly needed, variable substitution was introduced at some time during the early development of Tcl. If a variable name is preceded by the $ character it is substituted with its value. So instead of

puts [set a]

it's possible to write

puts $a

Concept 4: Grouping

If commands are space separated words, how to deal with the need for arguments that may contain spaces? For example:

puts Hello World

is an incorrect program as Hello and World are two different arguments. This problem is solved by grouping. Text inside "" is considered a single argument, so the right program is:

puts "Hello World"

Command and variable substitution work inside this kind of grouping. For example I can write:

set a 5
set b foobar
puts "Hello $a World [string length $b]"

And the result will be "Hello 5 World 6". Also, escapes like \t, \n will do what you think. There is, however, another kind of grouping where every kind of special character is just considered verbatim without any kind of substitution step. Everything between { and } is seen by Tcl as a unique argument where no substitutions are performed. So:

set a 5
puts {Hello $a World}

Will print Hello $a World.

Concept 1 again: Everything is a command

Concept 1 was: programs are composed of commands. Actually it is much more true than you may think. For example in the program:

set a 5
if $a {
    puts Hello!
}

if is a command, with two arguments. The first is the substituted value of the variable a; the second is the string { ... puts Hello! ... }. The if command uses a special version of Eval that we'll see in a moment to run the script passed as the second argument, and returns the result. Of course, you can write your own version of if or any other control structure if you want. You may even redefine if itself and add some feature to it!

Concept 5: Everything is a string - no types

The following program works and does what you think:

set a pu
set b ts
$a$b "Hello World"

Yes, in Tcl everything happens at runtime and is dynamic: it's the ultimate late binding programming language, and there are no types. The command name is not a special type but just a string. Numbers are also just strings, and so is Tcl code (remember we passed a string to the if command as second argument?). In Tcl what a string represents is up to the command that's manipulating it. The string "5" will be seen as a string of characters by the "string length 5" command, and as a boolean value by the "if $a ..." command. Of course commands check that values have a suitable form: if I try to add "foo" to "bar", Tcl will produce an exception because it can't parse "foo" or "bar" as numbers. These kinds of checks are very strict in Tcl, so you'll not get the PHP-like effect of silent absurd type conversions. The type conversion only happens if the string makes sense interpreted as the thing the command needs as arguments.
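A short sketch of this behavior: the same value is read as text by one command and as a number by another, while a string that is not a number is rejected instead of being silently converted. The catch command is used here only to trap the error so the script can keep running.

```tcl
set a 5

# The same value, interpreted by two different commands
puts [string length $a]    ;# prints 1 (treated as a one-character string)
puts [expr {$a + 1}]       ;# prints 6 (treated as an integer)

# "foo" and "bar" are not numbers, so expr raises an error
if {[catch {expr {"foo" + "bar"}} msg]} {
    puts "error: $msg"
}
```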

So Tcl is so dynamic, but guess what? It is more or less as fast as current Ruby implementations. There is a trick in the implementation of Tcl: objects (not in the OOP sense, but C structs representing Tcl values) cache the native value of the last use of a given string. If a Tcl value is always used as a number the C struct representing it will contain an integer inside, and as long as the next commands continue to use it as an integer, the string representation of the object is not touched at all. It's a bit more complex than this, but the result is that the programmer doesn't need to think about types, and programs still work as fast as other dynamic programming languages where types are more explicit.

Concept 6: Tcl lists

One of the more interesting types (or better.. string formats) Tcl uses is lists. Lists are mostly the central structure of a Tcl program: a Tcl list is always a valid Tcl command! (and both are just strings, in the end). In the simplest form lists are like commands: space separated words. For example the string "a b foo bar" is a list with four elements. There are commands to take a range of elements from a list, to add elements, and so on. Of course, lists may have elements containing spaces, so in order to create well formatted lists the list command is used. Example:

set l [list a b foo "hello world"]
puts [llength $l]

llength returns the length of the list, so the above program will print 4 as output. lindex will instead return the element at the specified position, so "lindex $l 2" will return "foo", and so on. Like in Lisp, in Tcl most programmers use the list type to model as many concepts as possible in programs.
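As a quick sketch, here are a few more of the core list commands applied to the same list. Note how elements containing spaces are wrapped in braces when the list is printed as a string.

```tcl
set l [list a b foo "hello world"]

puts [lindex $l 3]     ;# prints: hello world
puts [lrange $l 1 2]   ;# prints: b foo

# lappend modifies the list stored in the variable l
lappend l "one more"
puts [llength $l]      ;# prints 5
puts $l                ;# prints: a b foo {hello world} {one more}
```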

Concept 7: Math in Tcl

I bet most Lisp hackers already noted how Tcl is a prefix-notation language, so you may think that, like in Lisp, math in Tcl is performed using math operators as commands, like: puts [+ 1 2]. Instead, things work in a different way: in order to make Tcl more friendly there is a command taking infix math expressions as argument and evaluating them. This command is called expr, and math in Tcl works like this:

set a 10
set b 20
puts [expr {$a+$b}]

Commands like if and while use expr internally in order to evaluate expressions, for instance:

while {$a < $b} { puts Hello }

where the while command takes two arguments: the first string is evaluated as an expression to check whether it's true at every iteration, and the second is evaluated as a script each time. I think it's a design error that math commands are not builtins. I see expr as a cool tool to have when there is complex math to do, but to just add two numbers [+ $a $b] is more convenient. It's worth noting that this has been formally proposed as a change to the language.

Concept 8: Procedures

Naturally, nothing stops a Tcl programmer from writing a procedure (that's a user defined command) in order to use math operators as commands. Like this:

proc + {a b} {
    expr {$a+$b}
}

The proc command is used to create a procedure: its first argument is the procedure name, the second is the list of arguments the procedure takes as input, and finally the last argument is the body of the procedure. Note that the second argument, the arguments list, is a Tcl list. As you can see the return value of the last command in a procedure is used as return value of the procedure (unless the return command is used explicitly). But wait... Everything is a command in Tcl right? So we can create the procedures for +, -, *, ... in a simpler way instead of writing four different procedures:

set operators [list + - * /]
foreach o $operators {
    proc $o {a b} [list expr "\$a $o \$b"]
}

After this we can use [+ 1 2], [/ 10 2] and so on. Of course it's smarter to create these procedures as varargs like Scheme's procedures. In Tcl procedures can have the same names as built in commands, so you can redefine Tcl itself. For example, in order to write a macro system for Tcl I redefined proc. Redefining proc is also useful for writing profilers (Tcl profilers are developed in Tcl itself usually). After a built in command is redefined you can still call it if you renamed it to some other name prior to overwriting it with proc.
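Both ideas from the paragraph above can be sketched in a few lines. The first part is a variadic +, relying on the special parameter name args, which collects all the arguments into a list. The second part wraps proc itself to record the name of every procedure defined afterwards, roughly the way a profiler might start. The names original_proc, definedProcs and square are made up for this illustration.

```tcl
# A variadic "+": args receives every argument as a list
proc + {args} {
    set total 0
    foreach n $args {
        set total [expr {$total + $n}]
    }
    return $total
}
puts [+ 1 2 3 4]        ;# prints 10

# Keep the built-in proc reachable under another name, then replace it
set ::definedProcs {}
rename proc original_proc
original_proc proc {name arglist body} {
    lappend ::definedProcs $name
    # Define the procedure in the caller's context, as the real proc would
    uplevel 1 [list original_proc $name $arglist $body]
}

proc square {x} { expr {$x * $x} }
puts $::definedProcs    ;# prints: square
puts [square 7]         ;# prints 49

# Put the built-in proc back
rename proc {}
rename original_proc proc
```

The wrapper builds the command with list so that names or bodies containing spaces are passed through intact.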

Concept 9: Eval and Uplevel

If you are reading this article you already know what Eval is. The command eval {puts hello} will of course evaluate the code passed as argument, as happens in many other programming languages. In Tcl there is another beast, a command called uplevel that can evaluate code in the context of the calling procedure, or for what it's worth, in the context of the caller of the caller (or directly at the top level). What this means is that what in Lisp are macros, in Tcl are just simple procedures. Example: in Tcl there is no "built-in" for a command repeat to be used like this:

repeat 5 {
    puts "Hello five times"
}

But writing it is trivial:

proc repeat {n body} {
    set res ""
    while {$n} {
        incr n -1
        set res [uplevel $body]
    }
    return $res
}

Note that we take care to save the result of the last evaluation, so our repeat will (like most Tcl commands) return the last evaluated result. An example of usage:

set a 10
repeat 5 {incr a} ;# Repeat will return 15

As you can guess, the incr command is used to increment an integer var by one (if you omit its second argument). "incr a" is executed in the context of the calling procedure (i.e. the previous stack frame).
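To make the difference between eval and uplevel concrete, here is a small sketch with two hypothetical procedures, with_eval and with_uplevel, that run the same script. Each one has its own local variable a, so the result shows which scope the script actually ran in.

```tcl
proc with_eval {body} {
    set a "local"
    eval $body          ;# runs inside this procedure, sees the local a
}

proc with_uplevel {body} {
    set a "local"
    uplevel 1 $body     ;# runs in the caller's frame, sees the caller's a
}

set a "caller"
puts [with_eval {set a}]      ;# prints: local
puts [with_uplevel {set a}]   ;# prints: caller
```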

Congratulations, you know more than 90% of Tcl concepts!

Why is Tcl powerful?

I am not going to show you every single Tcl feature, but I want to give an idea of advanced programming tasks that are solved in a very nice way with Tcl. I want to stress that I think Tcl has a number of faults, but most of them are not in the main ideas of the language itself. I think there is room for a Tcl-derived language that can compete with Ruby, Lisp and Python today in interesting domains like web programming, network programming, GUI development, DSLs, and as a scripting language.

Simple syntax that scales

Tcl syntax is so simple that you can write a parser for Tcl in a few lines of code in Tcl itself. As I already mentioned, I wrote a macro system for Tcl in Tcl, which is able to do source level transformations complex enough to allow tail call optimization. At the same time, Tcl syntax is able to scale to appear more Algol-like, depending on your programming style.

No types, but strict format checks

There are no types, and you don't need to perform conversions. However, you aren't likely to introduce bugs, because the checks on the format of the strings are very strict. Even better, you don't need serialization. Have a big complex Tcl list and want to send it via a TCP socket? Just write: puts $socket $mylist. On the other side of the socket read it with set mylist [read $socket], and you are done.
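The reason this works is that the string form of a list is already its wire format. The sketch below leaves the socket out and only shows the round trip from list to string and back; the variable names wire and received are made up for the example.

```tcl
# A nested list: the third element is itself a list
set mylist [list alpha "hello world" [list 1 2 3]]

# This string is exactly what puts would write to a socket
set wire [format %s $mylist]
puts $wire                    ;# prints: alpha {hello world} {1 2 3}

# The receiving side simply uses the string as a list again
set received $wire
puts [llength $received]      ;# prints 3
puts [lindex $received 1]     ;# prints: hello world
puts [lindex $received 2 1]   ;# prints 2
```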

Powerful, event driven I/O model

Tcl has built-in event-driven programming, integrated with the I/O library. To write complex networking programs with just what is provided in the core language is so simple it's funny. An example: the following program is a concurrent (internally select(2) based) TCP server that outputs the current time to every client.

socket -server handler 9999
proc handler {fd clientaddr clientport} {
    set t [clock format [clock seconds]]
    puts $fd "Hello $clientaddr:$clientport, current date is $t"
    close $fd
}
vwait forever

Non-blocking I/O and events are handled so well that you can even write to a socket whose output buffer is full, and Tcl will automatically buffer the data in userland and send it in the background when there is space again in the socket's output buffer.
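As a rough sketch of how the event-driven pieces fit together, here is a line-based echo server that uses fconfigure to make each client channel non-blocking and fileevent to register a read callback. To keep the example self-contained, the client lives in the same process and the server listens on port 0, which asks the operating system for any free port. The procedure names accept and echo are arbitrary.

```tcl
# Called by Tcl for every new connection
proc accept {chan addr port} {
    fconfigure $chan -blocking 0 -buffering line
    fileevent $chan readable [list echo $chan]
}

# Called whenever a client channel has data to read
proc echo {chan} {
    if {[gets $chan line] >= 0} {
        puts $chan "echo: $line"
    } elseif {[eof $chan]} {
        close $chan
    }
}

# Port 0: let the OS choose, then ask the channel which port it got
set server [socket -server accept -myaddr 127.0.0.1 0]
set port [lindex [fconfigure $server -sockname] 2]

# A client in the same process, just to exercise the server
set client [socket 127.0.0.1 $port]
fconfigure $client -buffering line
fileevent $client readable {set ::reply [gets $client]}
puts $client "hello"

# Run the event loop until the reply arrives
vwait ::reply
puts $::reply          ;# prints: echo: hello

close $client
close $server
```

A real server would end with vwait forever, as in the time server above, instead of waiting for a single reply.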

Python users know a good idea when they see it - Python's "Twisted" framework makes use of the same select-driven IO concepts that Tcl has had natively for years.

Multiple paradigms

In Tcl you can write object oriented code, functional style code, and imperative code in a mix, more or less like in Common Lisp. A number of OOP systems and functional programming primitives were implemented in the past. There is everything from prototype-based OOP systems to Smalltalk-like ones, and many are implemented in Tcl itself (or were initially, as a proof-of-concept). Furthermore, because code in Tcl is first class, it is very simple to write functional language primitives that play well with the logic of the language. An example is lmap:

lmap i {1 2 3 4 5} {
    expr {$i*$i}
}

which will return a list of squares, 1 4 9 16 25. You can write a map-like function based on a version of lambda (also developed in Tcl itself), but Tcl already has what you need to allow for a more natural functional programming than the Lisp way (which works well for Lisp but maybe not for everything else). Note what happens when you try to add functional programming to a language that's too rigid: Python and the endless debate of its functional primitives.
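A command like lmap can be written as an ordinary procedure. What follows is a minimal sketch, not necessarily how any existing lmap is implemented: it is called mymap to avoid clashing with a command of the same name, and it uses upvar to bind the loop variable in the caller's scope and uplevel to run the body there.

```tcl
proc mymap {varName list body} {
    # Link the local name "var" to the caller's variable
    upvar 1 $varName var
    set result {}
    foreach var $list {
        # Run the body in the caller's scope and collect its result
        lappend result [uplevel 1 $body]
    }
    return $result
}

puts [mymap i {1 2 3 4 5} {expr {$i * $i}}]   ;# prints: 1 4 9 16 25
```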

Central data structure: the list

If you are a Lisp programmer you know how beautiful it is to have a flexible data structure like the list everywhere in your programs, especially when the literal is as simple as "foo bar 3 4 5 6" in most cases.

Programmable programming language via uplevel

Via eval, uplevel, upvar and the very powerful introspection capabilities of Tcl you can redefine the language and invent new ways of solving problems. For example, the following interesting command if called as first command in a function will automagically make it a memoizing version of the function:

proc memoize {} {
    set cmd [info level -1]
    if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
    if {![info exists ::Memo($cmd)]} {set ::Memo($cmd) [eval $cmd]}
    return -code return $::Memo($cmd)
}

Then, when you write a procedure just write something like:

proc myMemoizingProcedure { ... } {
    memoize
    ... the rest of the code ...
}
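Here is a sketch of memoize applied to a concrete procedure. The memoize definition is repeated unchanged from above so that the example runs on its own, and fib is just an illustrative choice: without memoization this naive recursion would recompute the same values over and over.

```tcl
# Same memoize as above
proc memoize {} {
    set cmd [info level -1]
    if {[info level] > 2 && [lindex [info level -2] 0] eq "memoize"} return
    if {![info exists ::Memo($cmd)]} {set ::Memo($cmd) [eval $cmd]}
    return -code return $::Memo($cmd)
}

# Naive recursive Fibonacci, made fast by the single memoize line
proc fib {n} {
    memoize
    if {$n < 2} { return $n }
    expr {[fib [expr {$n - 1}]] + [fib [expr {$n - 2}]]}
}

puts [fib 30]            ;# prints 832040
puts $::Memo(fib 30)     ;# the cached result, keyed by the full command
```

The cache key is the whole command string as returned by info level, so "fib 30" and "fib 29" get separate entries in the global Memo array.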

i18n just happens

Tcl is probably the language with the best internationalization support. Every string is internally encoded in UTF-8, and all the string operations are Unicode-safe, including the regular expression engine. Basically, in Tcl programs, encodings are not a problem - they just work.

Radical language modifications = DSL

If you define a procedure called unknown, it is called with a Tcl list representing the arguments of every command Tcl tried to execute but failed because the command name was not defined. You can do what you like with it, and return a value, or raise an error. If you just return a value, the command will appear to work even if unknown to Tcl, and the value returned by unknown will be used as the return value of the undefined command. Add this to uplevel and upvar, and to a language that's almost syntax free, and what you get is an impressive environment for Domain Specific Language development. Tcl has almost no syntax, like Lisp and FORTH, but there are different ways to have no syntax. Tcl looks like a configuration file by default:

disable ssl
validUsers jim barbara carmelo
hostname foobar {
    allow from 2:00 to 8:00
}

The above is a valid Tcl program, once you define the commands used, disable, validUsers and hostname.
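One possible way to define those commands is sketched below. The configuration runs in a separate interpreter created with interp create, so that replacing unknown there does not disturb the main program. Each command just appends what it saw to a list named settings, and unknown catches anything that was not defined. The settings list, the allow command's parameter names and the extra loglevel line are all made up for this illustration.

```tcl
set dsl [interp create]

# Define the vocabulary of the configuration language
$dsl eval {
    set ::settings {}
    proc disable {feature} { lappend ::settings [list disable $feature] }
    proc validUsers {args} { lappend ::settings [list validUsers $args] }
    proc hostname {name body} {
        lappend ::settings [list hostname $name]
        uplevel 1 $body
    }
    proc allow {from start to end} {
        lappend ::settings [list allow $start $end]
    }
    # Any command not defined above ends up here instead of failing
    proc unknown {name args} {
        lappend ::settings [list unrecognized $name $args]
        return
    }
}

# Run the configuration file as a plain Tcl script
$dsl eval {
    disable ssl
    validUsers jim barbara carmelo
    hostname foobar {
        allow from 2:00 to 8:00
    }
    loglevel debug
}

set settings [$dsl eval {set ::settings}]
interp delete $dsl

foreach s $settings { puts $s }
```

The last entry in the output comes from unknown, since loglevel was never defined.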

Much more

Unfortunately there isn't room to show a lot of interesting features: most Tcl commands just do one single thing well and have easy to remember names. String operations, introspection and other features are implemented as single commands with subcommands, for example string length, string range and so on. Every part of the language that takes indexes as arguments supports an end-num notation, so for example to take all the elements of a list but not the first nor the last you just write:

lrange $mylist 1 end-1
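The same notation works across list and string commands, as this short sketch shows.

```tcl
set mylist {a b c d e}

puts [lrange $mylist 1 end-1]            ;# prints: b c d
puts [lindex $mylist end]                ;# prints: e
puts [lindex $mylist end-1]              ;# prints: d
puts [string range "foobar" 0 end-3]     ;# prints: foo
```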

And in general there is a lot of good design and optimization for the common case inside. Moreover the Tcl source code is one of the best written C programs you'll find, and the quality of the interpreter is amazing: commercial grade in the best sense of the word. Another interesting thing about the implementation is that it works exactly the same in different environments, from Windows to Unix, to Mac OS X. No quality difference among different operating systems (yes, including Tk, the main GUI library of Tcl).

Conclusion

I don't claim everybody should like Tcl. What I claim is that Tcl is a powerful language and not a toy, and that it's possible to create a new Tcl-like language without most of the limitations of Tcl but with all of its power. I tried this myself, and the Jim interpreter is the result: the code is there and working and can run most Tcl programs, but then I no longer had time to work for free on language development, so the project is now more or less abandoned. Another attempt to develop a Tcl-like language, Hecl, is currently in progress as a scripting language for Java applications, where the author (David Welton) exploits the fact that the Tcl core implementation is small and the command based design is simple to use as a glue between the two languages (this is not typical of modern dynamic languages, but both ideas apply to Scheme too). I'll be very glad if, after reading this article, you no longer think of Tcl as a toy. Thank you. Salvatore.


p.s. want to learn more about Tcl? Visit the Tclers Wiki.