I have admired Douglas Crawford's excellent JSON from the moment I saw it, it is a model of simplicity. I also like TOML and wish it all the best. By contrast, YAML is complex and could use a hair cut.
When I say "JSON didn't exist", what I mean is that it wasn't popular or known to us when we were working on YAML. So, please excuse my sloppy wording. For me, the work on what would become YAML started with a few of us in 1999 (from SML-DEV list). In January of 2001 we picked the name and had early releases. It took a few years of iteration before we had a specification the collaborators (Perl, Python, and Ruby) could all bless.
Anyway, with regard to Crawford's excellent work, JSON. It is a coincidence that YAML's in-line format happened to align. Although, it's probably because of a "C" ancestor, not JavaScript. The main influence on the YAML syntax was RFC0822 (e-mail), only that from my perspective, it needed to be a typed graph. In fact, we documented where we stole ideas from, to the best we could recall at that time: http://yaml.org/spec/1.0/#id2488920.
That was my attempt at giving YAML a haircut. I'd be curious to know what you thought.
Thank you for creating YAML, by the way. Even though part of that rant was quoted from me, I'm not negative on it like the author - I think the core was brilliantly designed. If you put two hierarchical documents side by side - one in TOML and another in YAML the YAML one is much, much clearer and cleaner.
Thank you for StrictYAML I might just use it. It does look like a nice hair cut. You might wish to give Ingy a ring. He has been itching to move forward on a reduced/secure YAML subset.
That said, StrictYAML seems to be a tad bit more of a hair cut than I'd imagine. I'd keep nodes/anchors, since I think a graph storage model is underrated; I think that data processing techniques just haven't caught up with graph structures.
Further, I'm not sure everything can be easily typed based upon a schema. Hence, I'm not sure about completely dropping implicit types, perhaps you may want to provide a way for applications to resolve them if they wish. For example, an application may want to attempt to treat anything starting with "[" or "{" as JSON sub-tree. Perhaps keeping "!tag" but handing it off to the application to resolve might also be a good idea in this regard. Even so, typing should be done at the application level and default to something very boring.
> I'd keep nodes/anchors, since I think a graph model is underrated
Well, you can create graph models without it (and I do) - you can just use string identifiers to identify nodes and let the application decide what that means.
I always thought the intent behind nodes/anchors was not so much graph models but rather to take repetitive YAML and make it DRY. That appears to be how it is used, e.g. in gitlab's ci YAML.
>I'm not sure about completely dropping implicit types, perhaps you may want to provide a way for applications to resolve them if they wish. For example, an application may want to attempt to treat anything starting with [ or { as JSON.
I think that would cause surprise type conversions. There will be plenty of times when you want something to start with a [ or { and you won't want it parsed as JSON.
I embed snippets of JSON in YAML multiline strings sometimes and I usually just parse it directly as a string. Then I run that string through a JSON parser elsewhere in the code.
> I think that would cause surprise type conversions.
YAML has traditionally been used as the basis of higher-level configuration files for particular applications. What I'm saying is that implicit typing should be permitted, but delegated to those applications.
Conversely, I'm not saying that StrictYAML should do anything by default with unquoted values, except reporting them to the application as being an unquoted value. This way the application could choose to process the value differently from those that are quoted.
An interesting idea, but it's not clear that this will be less confusing or that application authors will make better at avoiding config languages gotchas than config language designers such as yourself (and existing app specific config languages suggest otherwise).
I think a reason this won't necessarily fix the problem with unmet expectations is that identical constructs in different but analogous yaml files would be likely to end up with very different semantics and users effectively have to remember which particular idiosyncratic YAML dialect choices various apps make. Say
version: 1.3
means the string "1.3" in app a), the float 1.3 in app b) and a version number in app c) one. Furthermore let's assume that app c) required a version number, whereas a) and b) required strings.
Another, more subtle problem, is that such a scheme would make it more likely that applications would end up parsing raw string representations themselves (with ensuing subtle differences even for things which are nominally meant to be identical, say dates or numbers and possibly security problems as well).
> I always thought the intent behind nodes/anchors was not so much graph models but rather to take repetitive YAML and make it DRY. That appears to be how it is used, e.g. in gitlab's ci YAML.
That's how I use it too. When I read about competing formats, that's the first feature I check for. It's really key for readability and usability in some use cases.
I don't have much to suggest. For YAML, the use of whitespace, colons and dashes primarily emerged from usability testing with domain experts who are not programmers. In particular, testing was done in the context of an application that needed a configuration and data auditing interface, an accounting application. Even anchors/aliases worked in this context and supported the application's use by making the audit records less repetitive without introducing artificial handles.
Other use cases such as dumping any in-memory data structure from memory, perhaps out of a sense that we needed full completeness, actually didn't have any end-user usability testing. Round-tripping data seems in retrospect to be a diversion from the primary value that YAML provided.
If you are writing a new YAML implementation, then yeah, you want a simpler spec to follow.
If on the other hand you are using a YAML library... I've had pretty good success using YAML compatibly across Python, Ruby, C# and Go projects. Do you have a particular issue in mind that the existing Ruby implementation doesn't address?
YAML is an invented serialization format, JSON is a discovered one. As CrOCKford points out, JSON existed as long as JS existed, he just called it out and put a name on it.
Anyway, XML is a strong anti-pattern (too much security, even if you get it right on your end, the other party likely screwed something up). YAML seems to be going down that path too.
TOML seems to be "the JSON of *.ini" (ie: discovering old conventions, rather than inventing new ones), and I'm glad to have been exposed to it.
If you define JSON as the underlying practice that Crawford later named and documented, then sure, what I wrote reads completely wrong headed. However, when we were working on YAML, JSON was not yet called out and given a name.
I believe the most important convention that YAML and JSON shared was a recognition of the typed map/list/scalar model used by modern languages. Further, as far as conventions go, I think there's quite a bit to be said about languages that use light-weight structural markers such as: indentation, colon and dash.
When I say "JSON didn't exist", what I mean is that it wasn't popular or known to us when we were working on YAML. So, please excuse my sloppy wording. For me, the work on what would become YAML started with a few of us in 1999 (from SML-DEV list). In January of 2001 we picked the name and had early releases. It took a few years of iteration before we had a specification the collaborators (Perl, Python, and Ruby) could all bless.
Anyway, with regard to Crawford's excellent work, JSON. It is a coincidence that YAML's in-line format happened to align. Although, it's probably because of a "C" ancestor, not JavaScript. The main influence on the YAML syntax was RFC0822 (e-mail), only that from my perspective, it needed to be a typed graph. In fact, we documented where we stole ideas from, to the best we could recall at that time: http://yaml.org/spec/1.0/#id2488920.