Syntax Matters...? (smallcultfollowing.com)
55 points by stalled on April 16, 2012 | hide | past | favorite | 42 comments


It is important, however, to distinguish between syntax and expressiveness. I don’t care (too much) whether you write function(x) { ... }, \x -> ... or { |x| ... } to denote a closure, but there had better be a way to write a closure somehow! (Java, I’m looking at you here)

Java is the refutation, IMO: it does have a way to express a closure, it's just that the syntax (anonymous class) is laborious. You need to wrap up any mutable variables to be captured in final single-element arrays, and your closure type will be a single-method interface, but the expressiveness is there. The syntax just sucks.

The laborious syntax is what stops you from writing what may be expressed in C# as 'foo.Where(x => x.Name.Length > 10).OrderBy(x => x.Name)' in Java. The equivalent, even using the smaller Javascript syntax of 'function(x) { return x.name.length > 10; }', is tedious. The (current) Java syntax is just silly.
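To make that weight concrete, here is a rough sketch of what the C# one-liner costs in pre-lambda Java. `Predicate` and `where` are hypothetical stand-ins (not JDK API) for what a LINQ-style library would have to define:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical single-method interface; pre-Java-8 has no Func<T, bool>.
interface Predicate<T> {
    boolean test(T x);
}

public class ClosureWeight {
    // Hypothetical LINQ-style Where().
    public static <T> List<T> where(List<T> xs, Predicate<T> p) {
        List<T> out = new ArrayList<T>();
        for (T x : xs) {
            if (p.test(x)) out.add(x);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        Collections.addAll(names, "Al", "Grace Hopper", "Barbara Liskov Jr.");

        // The "closure": a whole anonymous class for x => x.Name.Length > 10.
        List<String> longNames = where(names, new Predicate<String>() {
            public boolean test(String s) { return s.length() > 10; }
        });
        // ...and another one for OrderBy(x => x).
        Collections.sort(longNames, new Comparator<String>() {
            public int compare(String a, String b) { return a.compareTo(b); }
        });
        System.out.println(longNames); // prints [Barbara Liskov Jr., Grace Hopper]
    }
}
```

Two short lambdas in C#; roughly a dozen lines of ceremony in Java. The expressiveness is there, the invitation to use it is not.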

The syntax "weight" of a feature matters. Syntax that's light and unobtrusive invites use. Syntax that's large and clumsy dissuades use, to the point that whole libraries may not come into existence, not because of a lack of expressiveness in the language, but because of aesthetic ugliness.

Lisp has an out here because of macros; almost everything is equally ugly. Many people appear to consider this base level of ugliness unacceptable; others may see it as a flaw of Lisp, because the very malleability of the syntax makes it harder to encode implementation costs in the form of the program.


I think you're abusing the word "syntax." I think there is a difference between being able to implement the same functionality of an abstraction, and having direct language support for that abstraction.

I don't think it's accurate to say that Java has clunky syntax for closures. Rather, I think it's more accurate to say that Java has no concept of closures. If you want that functionality, you need to roll your own support using existing concepts. C++ before C++11 is the same.


But isn't that abuse of the word "concept"? The same logic would argue that C has no concept of a string, or that lisp has no concept of an object (despite lots of text handling and OO code being written in both languages). Not every "closure" works the same, even between languages that support things called "closures" and use them in roughly the same way. Anonymous Java classes are clearly a related concept, and in fact as the grandparent points out there's a straightforward mapping between them.


I would argue that C has no concept of a string: there is no direct support in C (or C++, for that matter) for concepts related to strings. They are just arrays that happen to contain characters. All of the stringiness of those arrays exists solely in the mind of the programmer. C itself does not help you at all. That is, I think, the hallmark of a language-level abstraction: does the language allow you to think of the entity only as its abstraction, or do you need to handle that yourself? This fact is also, I think, a major reason why there are so many string-related bugs in C code.

Having never used Common Lisp, I can't say if that's also accurate for your other example.

Edit: ARGH. I forgot string literals in C. So C clearly does have the concept of strings. The problem is that's the only support in the language for that concept. Aside from being able to define string literals, C does not help you with strings at all.


Fair enough. Then I suspect we'll just have to agree to disagree here. To me, that's a near-meaningless semantic distinction. Language concepts are squishier in my mind.


I'm not sure we can make a distinction here; since it's possible to use any Turing-complete language to augment itself (if necessary, in the worst case, by writing a new virtual machine for itself in itself), every language, technically, includes every concept. We have to draw the line somewhere, don't we? :)


Anonymous classes use lexical closure to capture names in the outer environment. The feature that Java lacks is not so much closures, as first class functions (which you can have without closures). Defining a method which binds to lexically enclosing symbols, and when evaluated has access to an environment mapping these symbols to values, seems to me to be possible in Java. It's just that the method needs to be contained in a class.


I think you're spot on with your comment on syntactic weight.

Coffeescript vs Javascript forms a good case study because they largely have the same semantics but with different syntax for key features.

To my mind, the big difference with Coffeescript is the syntax for functions, e.g.

(x) -> x * x

vs.

function (x) { return x * x; }

Coffeescript's syntax eliminates "function" and "return", which significantly lightens the syntactic weight for functions.

I would assume that, on average, Coffeescripters write significantly more functions than Javascripters (normalized, say, per byte of minified JS). This assumption is based on reinforcing selective pressures: Coffeescript encourages programmers to write more functions, and programmers who want to write more functions choose Coffeescript.


You make a very good point. However, I think he addresses your concern over "syntactic weight" in this line:

Basically, there are some languages whose syntax is so distinctive that it makes a qualitative difference to the experience of programming.

I believe his argument then is that of the following three syntaxes (syntaxen?) to describe closures:

  function(x) { ... }
  \x -> ...
  { |x| ... }
The differences are so slight as to not be worth quibbling over. Perhaps his point, like yours, is that Java's closure-equivalents are so onerous to employ that they might as well not exist at all. :)


Part of my comment is arguing that there is actually a significant difference in aesthetics between "\x -> y" and "function(x) { return y; }"; and that a query library like LINQ is natural to use with the former syntax, but so awkward to use with the latter that it's unlikely to be written that way at all, because of how ugly it would be in use.


The first | in the third example, i.e. {|x| ...}, is redundant. This is the first thing I noticed when I first read up about Ruby over 10 years ago, and it put me off what I now know is a fairly good language. The same thing happened with the leading __ and trailing __ for certain names in Python. Syntax matters as newcomers notice it.


I don't know about Ruby, but it's not redundant in Rust; you can enclose arbitrary expressions in braces, and | is the bitwise OR operator, and Rust allows implicit returns such that the following is the identity closure:

  {|x| x}
Trying to rewrite this in your syntax as:

  {x| x}
could cause ambiguity if you had defined a variable x somewhere up-scope: is that a bitwise OR or is it a closure? True, the correct behavior could probably (maybe?) be inferred from context, but that might require Rust to switch from a simple LR(1) parser to an arbitrary-lookahead parser, which would inflate compilation times.

In any case, Ruby's hamburger-notation for closures is well-known by this point, and I'd hardly call it novel anymore. In the case of Python's __special__ methods, the whole point was to discourage people from heavily using them in the first place. Making ugly operations ugly, as it were.

Your point that newcomers might be turned off by a language's syntax remains valid, but you've also demonstrated that newcomers lack the context necessary to be aware of the tradeoffs inherent in language design. But hey, tough shit for language developers, huh? Yes, tough shit indeed. Languages must be familiar enough to attract people, but they must also be expressive enough to retain the interest of the people in the community who really matter (the elite hackers and the devoted evangelists), while at the same time seeming novel enough to justify switching away from whatever language you were previously using in that domain.

TL;DR: It's a thorny issue, and simply saying "syntax matters" is an unjust simplification. :)


While I agree with your ideas about the "syntax weight", I think that you're wrong about Java.

Java definitely has no syntax for closures, because it has no semantics for function types. It has interfaces and anonymous classes, which you can use to accomplish the same requirement, but the semantics of interfaces are not 100% the same as that of function types. A small example: there's only one function type for a function that takes one int and returns one int, whereas there are infinitely many interfaces you can define to represent that function type and none of them are necessarily interchangeable among themselves.
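A minimal sketch of that point (the interface names below are made up for illustration): two Java interfaces with exactly the same shape are still distinct, incompatible types.

```java
// Two hypothetical interfaces, both representing int -> int.
interface IntFunc1 { int apply(int x); }
interface IntFunc2 { int apply(int x); }

public class NominalTypes {
    public static void main(String[] args) {
        IntFunc1 square = new IntFunc1() {
            public int apply(int x) { return x * x; }
        };
        // IntFunc2 f = square;  // does not compile: incompatible types
        System.out.println(square.apply(5)); // prints 25
    }
}
```

With a real function type, anything of type int -> int would be usable anywhere an int -> int is expected; here, interchangeability depends entirely on which named interface each library happened to declare.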

I think this distinction is important and that this is what the author is talking about when he says "there had better be a way to write a closure somehow!" In Java, there isn't.


You're confusing structural / nominal typing with first class functions. What Java lacks is the latter, while closures do not depend on the former, which is what you're talking about.

For example, C# supports function types, but it can have hundreds of function types (it calls them delegate types) for a function that takes one int and returns one int:

  delegate int Func1(int x);
  delegate int Func2(int x);
  delegate int Func3(int x);
and so on; all these function types are incompatible with one another.

I believe the distinctive feature of closures is closing over (hence the name) the outer lexical environment and binding to variables in the outer environment, in such a way that those variables are available during evaluation. Which is exactly what an anonymous class lets you have (so long as locals in the outer scope are declared final).
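For what it's worth, that capture is easy to demonstrate (the names below are hypothetical, not JDK API). The anonymous class closes over the final parameter `step`, and mutable state uses the final one-element array workaround mentioned upthread:

```java
interface Counter {
    int next();
}

public class CaptureDemo {
    public static Counter makeCounter(final int step) {
        // Mutable captured state must hide inside a final cell.
        final int[] count = {0};
        return new Counter() {
            public int next() {
                count[0] += step; // reads both captured variables
                return count[0];
            }
        };
    }

    public static void main(String[] args) {
        Counter c = makeCounter(3);
        System.out.println(c.next()); // prints 3
        System.out.println(c.next()); // prints 6
    }
}
```

The returned object keeps its environment alive after makeCounter returns, which is the essence of lexical closure; what's missing is only a lightweight way to write it.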


I agree with all you said, except that writing the equivalent of "function(x) { return x.name.length > 10; }" in Java is merely a syntax problem. A lot is implicit when you declare this function in Javascript, for example its visibility. There's also the typing, which is resolved at runtime. So maybe I don't have the concept entirely clear, but is this really a syntax difference or a language-features difference?


You realize you're making exactly the same point that the blog post was trying to make...


I don't agree that surface syntax doesn't matter unless it affects language functionality. Syntax is UI for programming languages, and UI for programming languages matters as surely as UI for other software matters. Fortunes have been won and lost on UI.

To me, the problem is that there is no answer to programming language syntax that pleases everybody. There are general aesthetic principles to follow, but the fundamental questions - semicolons vs. newlines, indentation vs. braces, short vs. long keywords - generate a lot more heat than light. We have to pick one and disappoint somebody.

Note that even in this thread, there is disagreement about whether JavaScript is too terse or too verbose!


Frankly, I've stopped using languages, or lost interest in learning languages, entirely over "small" syntax issues. Syntax is a bigger potential deal-breaker to me than functionality in many cases - I spend many times more time reading code than molding my mental model of what I want to do into something I can write down as code in a particular language. If I can't read and scan code quickly, the language is dead to me.


I empathize. I have a hugely hard time reading code, which is why I tend to use Python wherever I'm able (and even Python's a bit hard for me to read, but it's certainly better than most).

However, I also think it's a little sad to limit your selection of languages strictly due to syntactic concerns, without any consideration of features whatsoever. Some languages that are utterly unreadable (I'm looking at you, Haskell! Like trying to eat soup with your hands tied behind your back) have some really cool concepts that, if nothing else, are great to poach for your own favorite language (list comprehensions spring to mind).


Haskell's syntax is actually quite readable — I'd say it's maybe second to Python's. Where Haskell gets hard to read is in library code with lots of custom operators, where you have to remember the difference between

  **>
  *>
  *>*
  >*>
  >**
But that's not really Haskell's syntax, it's just a combination of libraries with lots of unfamiliar operators.


I'd prefer to read assembly over Haskell any day.


I find well-written Haskell more readable than well-written Python.

Average Haskell is probably less readable than average Python, because the Haskell community places less emphasis on readability, unfortunately.


It's not that I won't read up on their inner workings. I do. Compiler writing is a hobby of mine, and I love reading up on new concepts in relation to that. I've read a lot of the original papers for Haskell for example. It's that I won't bother learning to program in them any more - I find it wasted effort.

Haskell was a particularly obnoxious one - I love a lot of the concepts, but I find it horribly slow for me to read. I got so frustrated with it that I implemented a small little lambda calculus interpreter to play with the concepts because that ended up being an easier way for me to learn them than to wrestle with Haskell.

Personally I don't think Haskell will ever become mainstream to a large extent because it is too hard to read for people who are not maths geeks.


I suspect you find Haskell difficult to read not because of its syntax--which is actually rather readable and surprisingly simple--but because it basically shares almost nothing with other languages you know (e.g. Python).


This is likely true! I was born and raised on Java, crashed headlong into Scheme (with painful, traumatic consequences), and just plain didn't "get" functional programming until I was gradually reintroduced to its concepts via R, Python, and Javascript. Consequently, I've found that the best way to learn a new language is in terms of ones that I already know (and I don't think I'm alone here). If you know of any Haskell tutorials that specifically cater to the mindset of Python programmers, I'd be happy to give it another shot. :)


Syntax matters. Take Erlang, a great language that nobody learns because it looks and feels alien. Damien Katz, creator of CouchDB & now Couchbase, is pretty convinced this is the main reason for its lack of adoption: http://qconlondon.com/dl/qcon-london-2012/slides/DamienKatz_... (~8 MB)


Well, there are two things here: abstract syntax and lexical form (lexical syntax) -- the underlying structure versus its textual representation. The term 'lexical form/syntax' ought to be more popularly used, which would make it clearer to talk about exactly what we like or do not.


This touches on some really important relationships between programming languages and cognition.

Alan Kay has been studying how children take to programming and many of the ideas from Smalltalk come from these studies.

When designing for children, every part of the user interface, including the syntax of the programming language, matters a lot. For example, he writes, "If we take functional relationships as an example, it has been shown that children readily understand them but have considerable difficulty with variables, and much more difficulty with parameters. The standard math syntax for functions with parameters requires some extra trained chunks to associated dummy names with actual parameters. Some computer languages allow conventions for prefixing the actual parameters with the dummy names. This is good. For younger children, it's likely that making these into complete assignment statements is an even better idea. An object oriented language can use instance variables for a long time before introducing the idea of passing parameters in a method, etc. Having really good trace and single-step visualizations is critical." http://www.donhopkins.com/drupal/node/140

Children have different cognitive needs than adult programmers (and different needs from each other depending on age). But cognitive needs matter in designing programming languages for adults too. I personally take the optimistic (or cynical?) view that we have a lot of room to improve our programming interfaces (not just languages, but the entire programming experience viewed holistically).


What about languages that have no syntax? Or have very little syntax? In my experience, I find them easier to reason about, therefore easier to read and easier to write. My sense is that children would find this to be true as well.


Could you explain or give examples of what you mean by no or little syntax?


Lisp languages have virtually no syntax.

Example: Write a function that generates a string of all the numbers in a range from a start and end value.

Java (lots of syntax, many tokens):

  String rangeString(int start, int end) {
    String s = "";
    for (int i = start; i <= end; i++) { s += String.valueOf(i); }
    return s;
  }

Clojure (almost no syntax):

  (defn range-string [start end]
    (apply str (range start (inc end))))


Little lexical/concrete syntax. Quite a bit of abstract syntax.

I personally don't think that having less lexical syntax is useful for the things you mention. Lisp has extensible abstract syntax, so overall, it probably has the largest syntax of all.


Ah, okay.

So by less syntax you mean fewer syntactic forms (parentheses, curly braces, do notation, etc.) and keywords (class, public, etc.) in the language.

I think in general, languages with fewer syntactic forms also have fewer but more powerful abstraction features. Lisps are pretty high up by this measure. They basically only have one abstraction feature: lambda expressions. Other languages have powerful abstraction features but more of them, like Haskell (lambdas, pattern matching, monadic do notation). And others have many but less powerful abstraction features (e.g. Java).

On the one hand, I certainly agree with you that more powerful abstractions make it easier to program. Once you've grokked the abstraction, you only need one brain "chunk" to deal with it freeing up your other chunks to work on the problem. If you need to juggle several different abstractions (e.g. wrap up your closure inside an object inside a class), you have fewer chunks to work on your problem.

On the other hand, people tend to have difficulty learning these higher abstractions in the first place. Alan Kay, in that quote, points out that young children (I think less than ~8 years old) have difficulty with the abstraction of parameterization. In particular they have trouble with mentally corresponding the placement of a parameter in a list with its role in the function. Named parameters make things a bit better, they help the children keep track of the role. Explicit variable assignment (imperative programming) makes it even easier. Of course, as less powerful abstractions, these styles don't tend to scale as well when working on more complex programs.

In general, I think people need help keeping track of specifics while working with higher abstractions until the abstractions become "natural" to them. Programming interfaces, including debugging interfaces, can help with this.

It's a bit of a paradox. When you truly understand an abstraction it tremendously improves the clarity of your programming experience. But without this understanding, abstractions can be a major hindrance.

You may be interested in further exploring "point-free" abstractions. It's kind of the next level up the abstraction ladder from explicitly assigned variables in imperative code, to lambda abstractions with parameters, to functions without the parameters at all. http://www.haskell.org/haskellwiki/Pointfree

Or approach from the concatenative side, http://evincarofautumn.blogspot.com/2012/02/why-concatenativ...


Yes. I mean less syntax (you call it "syntactic forms", sure). I can think in this language because it's only the semantics that I have to remember, because there is no syntax to also remember. Any syntax you come up with will be inferior to data. But I think it could be a worthy challenge to teach kids Lisp. For instance, you know, you can make named parameters in your functions to help kids really easily in Lisp. It's, what, at most a 15 minute task to add that to any function.


Read that link. Concatenative programming sounds like an interesting subset of functional programming. You could easily write such a language in Lisp, too.


I have real problems reading a large volume of code in certain programming languages. With a lot of code, even if I'm not familiar with the language, I can scan it and understand the broad strokes and what it's doing very quickly.

There are two languages I can't do that with: Javascript and Lisp. For me, their syntax is too sparse. I've used Javascript a few days every week for the last 10 years and I still can't 'scan' a large code file. I know how it works, I know how to use it, I know the good parts and all the tricks. But I still dread cracking open a misbehaving jQuery plugin or any home-grown, even self-written, non-trivial code base in it.

With both of them it's a laborious process of actually having to load the whole code into my head, to remember what other functions are doing, how things are initialized. Perhaps you can call it laziness, but personally I only 'load' a whole code base into my memory when I'm doing significant work with it. It takes time and effort I don't want to have to put in just to make a tweak.

So not being able to just skim read code is intensely annoying.

And that's down to syntax. So to me syntax does matter and it's why I'll keep railing against javascript being the only available language in the browser. It's not just a matter of preference, it's also a matter of productivity. I at least, and I suspect many others, are not very productive with sparse syntactical languages because they're so hard to skim read.

EDIT: Some people have the gift/skill of memorizing things very quickly. I don't, I have the gift/skill of understanding things very quickly, which generally means I can get away with a weak memory. I think a lot of programmers have one of those two skills, but rarely both. Sparse syntax is a bane for the latter.


Could you define "sparse"? Perhaps illustrate it with a counterexample in a language that does not exhibit that property.

(Also, have you tried Dart, Coffeescript, or any of the dozens of other languages that compile to Javascript?)


Sparse in that "function" can actually mean a function, a closure, a class, or a class method.

Javascript also suffers from a cowboy layout problem where, for example, the initializer of a class may be anywhere. Most programming languages have established a convention for where it goes: at the top of the class declaration (e.g. Ruby, Python) or underneath class variables/properties (e.g. Java, C#).

In Lisp it's even worse: it's not even immediately obvious what is a function and what is an argument, since that depends solely on the position of the words.

I do keep meaning to give Coffeescript a whirl, but it feels like it solves the symptoms rather than the underlying cause, so it's not that high on my 'try it' list.


Syntax matters for people who are just learning a new language. When I was learning to program I thought about code in terms of the pascal statements I had to write. Now I think about more abstract ideas, about data flow and control flow. I can map those ideas to the language I am using at any given moment but that only came after many years of experience.

On a different note, I have abstained from commenting on Rust's attitude toward keyword names, but aside #2 just bothers me. Naming conventions matter, and since Rust may well see real use in the future, it may inspire some new generation of users to learn to program. 20 years later they will start writing their own languages and think "Hey, I used 'ret' in Rust and that brings back good memories, so I'll just keep using it!". Truly an embarrassing part of hacker culture.


The Smalltalk difference is not just syntax. Smalltalk uses messages instead of function calls. It's a different concept because a Smalltalk object can handle messages that are not declared in the class body. If you changed the parameter order it would be a different message, and you would not get a compile error, because it would still be possible for an instance to answer that message.


I think good syntax won't make a language better, but bad syntax can make a language worse.

It's like how you wouldn't take a job at a company just because it has clean toilets, but if it has dirty toilets you won't work there.

If you find this logic familiar, it's because I'm basing it on motivation theory.


Brilliant



