28 April, 2008

Is writing text the best way to code?

I do think so. But it seems that many people do not agree with me, as there are a lot of experiments with graphical programming languages. One of the best I've ever seen is Subtextual, but I still don't see the point of those languages.

There are a couple of educational languages that use graphics to ease programming, and some of them are also very nice (check out this one, Alice). For children this is probably the best way to start (even if I started coding at eight, in Basic, and I don't remember having much trouble with the syntax itself). But for most newbies the hard thing to understand is the logic of coding, not the syntax of the languages. Syntax is easy. I can't find any simpler way to express a condition than the sentence: if x>10 then print "the number is greater than ten"

In general, I find that text is a great way of expressing logical statements. And textual input is the best way of editing it, of course. While it is true that code is data (and we have known that since the beginning of computer science, from lambda calculus), I don't see any value in editing it as structured data. Modern IDEs already use that equivalence, and almost no one today edits code as raw text: even the simplest code editor has syntax highlighting and code folding, many do refactoring, and many others use reflection or parsing to run tests, do coverage analysis, static code checking... We do take advantage of the structure of the code.

Last but not least, beware of false analogies. My main motivation for writing this post was this other one about Subtextual. There, the author writes:

[...] One of the things you do most frequently in an IDE like this is type some code, which at least temporarily puts your program into a totally invalid state. As you're typing "def ", your module is syntactically invalid [...] If you're using a tool to edit something other than a program, like, say, Inkscape, as you move between different states in your drawing (add a line, change a gradient, resize a shape) each one is a valid SVG document if you were to save it. [...]

Now, the nice thing about analogies is that they take two truths and stitch them together. As you're starting from two truths, people usually don't pay much attention to the intricacies of that mechanism and often see relations that are not there. Here we are comparing coding using text to (vector) image editing, showing how textual coding goes from one valid program to another passing through many invalid ones, while image editing always maintains a valid image. But this is not true at all. When I edit a photo in Photoshop, I go through many invalid images, meaning that they don't represent anything believable; they are wrong. I can cut a piece of skin, paste it into a layer, and move it to another position in the image in order to cover a scar on the body. Until I've finished blending the new layer, the image is really invalid to me. Of course in image editing we don't have a formal means to assess validity, but when making our analogies we should still be sure we are working at the same level of abstraction. We can do that by using the correct, higher level on the image editing side of the analogy or, to make my point in a stronger way, by lowering the level on the programming side: as Inkscape goes from one valid SVG document to another, an IDE goes from one valid ASCII document to another.

Finally, let me make an analogy of my own. Coding is about expressing ideas (only algorithmic ones) in a language (a very small, rigid but extensible one). I don't see any better way of doing that than writing text, just as I wouldn't if I had to write a novel in a natural language, or a theorem in maths.


Ale said...

This is a really interesting topic, but actually I find it mostly a matter of taste.
Your analogy assumes that code can be expressed better with text, but I don't think that's true.

Text is nothing more than "graphics" combined in some way, and just as there exists a way to say "for each x in the set A", there could exist a "shorter symbol" to express it.

Text IS graphical, so it is a matter of comparing the advantages of common text with the advantages of other symbols. Text is rigid and can be handled in fixed modes (from left to right, top to bottom). Graphics is more "rule-free" and harder to handle.

Graphics is as good as text for expressing logical statements (e.g. flow diagrams), but I don't think that it is "the way". Just as I don't think that "text is the way".

Do you read? I guess yes. There are books with no pictures. There are books with only pictures, and both of them express concepts, in different ways and different levels.

I think the best way is combining both. There is no way that text could give a good representation of code structure (this is why UML DOES exist), but there is no way "graphics" could be used for expressing some abstract concepts quickly (e.g. maths symbols are graphics, but they are absolutely arbitrary and abstract, just a matter of convention. And they aren't short at all).

I'd like to *see* blocks of code which represent something, without seeing the code which does that something. Text alone can't do that.
I'd like to *write* code for a specific and particular purpose, without caring about the structure of the whole project. Graphics alone can't do that.

The Subtext idea is quite interesting, but it's not fast, imho. A good language/symbol set could do far better in a shorter time.

And... some say that a picture is worth more than 1000 words :)
I guess that graphics is not to be discarded, but to be used wisely.

DEADC0DE said...

Mhm. Of course text is made of characters that in turn are graphical glyphs that we use to compose words. I'm not discussing the merits of using those glyphs compared to others (and there are languages that use non-alphabetic symbols while still being textual, like APL or Mathematica), but of using a keyboard, a textual input device, compared to using a graphical GUI, to edit code.

It's true that some hybrids could well exist, but I was talking more about the idea of graphical languages that use graphics as their core syntax and input method, not about organizing code as is possible with UML roundtrip engineering tools, for example. I'd like to see more graphical representations of the textual code, visualizations of the code structure (dependency graphs etc.) and execution (e.g. the DDD debugger). But I like to write code, not to draw it.

Anonymous said...

I think that current "graphic" programming tools are not high level enough.

You could imagine describing a data structure (let's say in UML) where you add features, behaviour...

I know I may be too abstract here.
But what I mean is that "text" programming does not represent the code itself.

If you take the word "class" in your language, it does not mean anything until you have typed "c/l/a/s/s", and even then it is still characters with no meaning until you have tried to compile it.

Graphic systems allow you to manipulate data/processes directly and keep the meaning and its integrity. It is a wonderful step.

It is not something that your text editor is able to understand when you move things around.

Of course, I agree that IDEs tend to get smarter and smarter and, by having background compilation, are able to understand what you want to say, or to move things around, refactor and such...

I think it makes the difference here. Of course, I agree that we are talking about theory here.
I do not think that text-based IDEs are going to be replaced soon, but I still believe they are archaic; we just do not have a better way to get rid of them now.

AI may be the next evolutionary step of computing to allow such a thing.

For now, data structures, state machines / graphs, hardware communication / process communication are things that can be described very well with a graphical tool and poorly in a programming language.

Formal verification tools also use graphical definitions.

Extracting a state machine from a switch-case in source code is quite difficult.
(but hardware languages such as VHDL do it)
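
As a rough illustration of why the switch-case form hides the machine, here is a minimal Python sketch of the alternative: keeping the transitions as plain data. The door controller, its states and its events are all made up for the example; the point is only that when the table is explicit, the graph edges can be listed without parsing any control flow.

```python
# Hypothetical door controller, written as a transition table instead of a switch.
# Each entry maps (current state, event) -> next state.
TRANSITIONS = {
    ("closed", "open"): "opened",
    ("opened", "close"): "closed",
    ("closed", "lock"): "locked",
    ("locked", "unlock"): "closed",
}

def step(state, event):
    """Return the next state; an event with no matching entry leaves the state unchanged."""
    return TRANSITIONS.get((state, event), state)

def edges():
    """List the graph edges straight from the table, with no source parsing needed."""
    return sorted(f"{src} --{event}--> {dst}" for (src, event), dst in TRANSITIONS.items())

state = "closed"
for event in ["lock", "open", "unlock", "open"]:
    state = step(state, event)

print(state)
print("\n".join(edges()))
```

With a switch statement the same four edges would be spread across case labels and assignments, which is what makes recovering the graph from source hard.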

The biggest problem I see with graphical programming is the usage of "expressions". It is just too much pain in the a... to write something like:
if (a+abs(b) > c)

This is why mathematics uses symbols and is still "written", and not drawn as a tree.

The other problem we are facing is the "compactness".
One line of code may take a lot of graphical surface.

Finally, things like variables/registers and anything related to memory access are just very difficult to express in such a graphical environment.
(I would not like to program in assembler with such tools ! :-P)

Ale said...

Well, but graphical tools are at the same level as text is.
Both are valid when it comes to expressing something: you could express "if (a+abs(b) > c)" with a few blocks.
To be more precise: compilers actually represent the whole source code as a syntax tree, so code can easily be represented via graphical methods.
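
To make the syntax tree point concrete, here is a small sketch using Python's standard `ast` module on the expression discussed above. The `outline` helper is a name invented for this example; it just prints one node type per line, indented by depth, which is roughly the set of "blocks" a graphical editor would have to draw.

```python
import ast

# Parse the expression from the discussion into a syntax tree.
tree = ast.parse("a + abs(b) > c", mode="eval")

def outline(node, depth=0):
    """Return the tree as indented lines, one node type per line."""
    label = type(node).__name__
    if isinstance(node, ast.Name):
        label += f"({node.id})"
    lines = ["  " * depth + label]
    for child in ast.iter_child_nodes(node):
        # Skip Load/Store context markers; they add noise, not structure.
        if isinstance(child, ast.expr_context):
            continue
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(tree.body)))
```

The one-line expression expands into nine nodes (a comparison, an addition, a call, four names and two operators), which also illustrates the compactness concern raised earlier in the thread.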

I'm not talking here about "powerfulness" or "level", because even with text you could go to the highest level possible.
And I don't think it's a matter of keyboard or mouse: the mouse may take far less time if with one drag and one click you could create a whole switch-case conditional, instead of writing the whole code.

I think that, since both ways are equivalent, one should simply have tools that allow both input methods: if someone would like to program the whole application with the mouse, he can do so (wasting a lot of time), just as someone else could use just the keyboard.

This tool should keep the program internally in a binary structure, and give users many ways to show and edit it, graphical or textual.

Anonymous said that some things are described poorly with text: well, I don't think so. Text gives you the same ability to describe something.
The difference isn't how well a structure can be described, but how it can be handled.

I'm talking about semantics and structure: when you code you may think of your program in a certain way, but when you write your code you're doing it in another way. For example, many programmers think about "modularity" in their programs, but when they use global variables the module boundaries are broken: something which belongs to one module ends up in another module, producing side effects, logical bugs, maintenance difficulties and so on.

(This happens mostly when patching programs and making later modifications to the code: something new breaks the structure of the code. Often the program is not self-documented enough, and it's very time-consuming to reverse-engineer the code, even if many tools help.)

I think that having a "second" way to see things could help in organizing the logical structure of the program, avoiding conflicts and "boundary breaking": if you see that boundary, you can avoid going to the other side.

UML already gives a "second way" to organize your program, but often it's not consistent with the program: you design things in UML, and when developing them it's a different matter.
I believe they should (and could, in a rather short time) be synchronized and equivalent, or, even better, they should just be the same, unified method for writing a program.

Ok, sorry for the long post :D

Anonymous said...

Hi !

Yes, UML is the first step and I agree that somehow "structural integrity" should be protected through synchronisation with the source code.

Still, things like memory read/write and expressions (while they are POSSIBLE graphically) are not really usable in the current state of graphical programming.
It is good for research stuff, not for production code.
You can play at doing the equivalent of maybe 300 lines of code, but not a whole game project such as next-gen games.

This is why we are still "stuck" with text.

Imagine a super smart computer, as smart as you.
You give it the data structure in UML.

What is going to happen in the next step?
The computer will ask you: ok, what is that function doing, give me a description...

You can say: hey, this part is an iteration doing this, this part is a state machine, this part is a series of conditional statements...

Of course, each time the definition is not explicit enough, the computer will recursively ask questions until everything is clear.

For now, doing that fully graphically simply does not work in terms of realistic usage. It is too much pain.
As long as the computer is not smart enough to write 90% of these graphics automatically based on your answers to those questions, we will not escape from text input for logic.

But that does not mean that text is the best input system for logic.

Anyway, that's just my two cents.

DEADC0DE said...

Another interesting thing to notice is how graphical languages tend to be less expressive than textual ones, as graphically encoding a complex but powerful syntax is often very hard. Most graphical languages used in the real world are state machine editors. State machines are probably the lowest possible level of representing computation. And, as a side note, I do think that most of the time those tools are abused. It's my personal opinion, and it's not my field so I could be really wrong, but I think that we're using state machines too much to represent game logic. I've always wondered if they could be replaced by another paradigm; I think that logic programming (à la Prolog) could do the job...

Ale said...

Mhh, I'm not an expert either, but too bad... It seems that research is going in the wrong direction if they're trying to make "graphics" an alternative to "text".

Anyway, deadc0de, as you said, a complex and powerful syntax is hard (most probably: useless) to implement with graphical methods, so they mostly aim at creating graphical tools which ease the implementation of such patterns.

I never found state machines to be really "clean" in code, whether implemented with switch-case or with classes, so a graphical representation could be "more usable". It's just a matter of taste, I think.
So they did it with a GUI, which tends to be easier (everyone knows that state graphs are far more readable than state matrices or state-spaghetti-code).

About the abuse of states and logic programming... Well, abusing states is quite simple :) because it's easy to think that way. It's a pity that it becomes hard to maintain as the code (and the number of states) grows larger.

But why do you think that logic programming could do the job? I'm interested in this point (if this is a long topic, you could write a post about it, if you like :)

DEADC0DE said...

I think that state machines encode logical propositions and that, especially in games, having a logical language can be more useful to encode logic. Think about encoding combos in a Street Fighter-like game: it could be rather complex to do that with an explicit state machine. The only state machine that I find useful is the one that handles the menu system, but that one is implicit in the menu system itself: the current menu page is the state, the buttons/links are the transitions.
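
To give a rough feel for the combo example, here is a small sketch in Python rather than in an actual logic language, so it only approximates the idea. Each combo is declared as a rule (the input sequence that triggers it) and a generic matcher checks the rules against the recent input history; the move names and sequences are invented for the illustration.

```python
# Hypothetical combo rules: each move is declared as the input sequence that triggers it.
COMBOS = {
    "fireball": ["down", "down-forward", "forward", "punch"],
    "uppercut": ["forward", "down", "down-forward", "punch"],
}

def matched_combos(history):
    """Return the names of all combos whose sequence ends the input history."""
    return sorted(
        name for name, seq in COMBOS.items()
        if history[-len(seq):] == seq
    )

inputs = ["back", "down", "down-forward", "forward", "punch"]
print(matched_combos(inputs))
```

Adding a new combo means adding one rule to the table, whereas an explicit state machine would need a new state for every prefix of every sequence, plus the transitions between overlapping prefixes.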