I have admired Douglas Crockford's excellent JSON from the moment I saw it, it is...

crdoconnor · on June 20, 2018

>YAML is complex and could use a hair cut.

Out of curiosity, did you see the parser linked to at the end of the article? ( https://github.com/crdoconnor/strictyaml )

That was my attempt at giving YAML a haircut. I'd be curious to know what you thought.

Thank you for creating YAML, by the way. Even though part of that rant was quoted from me, I'm not negative on it like the author - I think the core was brilliantly designed. If you put two hierarchical documents side by side - one in TOML and another in YAML - the YAML one is much, much clearer and cleaner.

clarkevans · on June 20, 2018

Thank you for StrictYAML, I might just use it. It does look like a nice hair cut. You might wish to give Ingy a ring. He has been itching to move forward on a reduced/secure YAML subset.

That said, StrictYAML seems to be a tad bit more of a hair cut than I'd imagine. I'd keep nodes/anchors, since I think a graph storage model is underrated; I think that data processing techniques just haven't caught up with graph structures.

Further, I'm not sure everything can be easily typed based upon a schema. Hence, I'm not sure about completely dropping implicit types, perhaps you may want to provide a way for applications to resolve them if they wish. For example, an application may want to attempt to treat anything starting with "[" or "{" as JSON sub-tree. Perhaps keeping "!tag" but handing it off to the application to resolve might also be a good idea in this regard. Even so, typing should be done at the application level and default to something very boring.

crdoconnor · on June 20, 2018

>Thanks for StrictYAML, I might just use it.

Thanks, that's very flattering.

> I'd keep nodes/anchors, since I think a graph model is underrated

Well, you can create graph models without it (and I do) - you can just use string identifiers to identify nodes and let the application decide what that means.
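To illustrate the string-identifier approach, here is a minimal sketch. The `document` layout, the `depends_on` key and the `resolve_edges` helper are all made up for the example; the point is only that the parser returns plain strings and the application decides they are node references.

```python
# Hypothetical parsed document: plain nested dicts, lists and strings.
# The "depends_on" values are ordinary strings; only the application
# treats them as identifiers of other nodes.
document = {
    "build": {"depends_on": []},
    "test": {"depends_on": ["build"]},
    "deploy": {"depends_on": ["build", "test"]},
}


def resolve_edges(doc):
    """Turn string identifiers into (source, target) edges, rejecting unknown ids."""
    edges = []
    for name, node in doc.items():
        for target in node["depends_on"]:
            if target not in doc:
                raise KeyError(f"{name!r} refers to unknown node {target!r}")
            edges.append((name, target))
    return edges


edges = resolve_edges(document)
```

The graph lives entirely in application code here, so a dangling reference is caught by the application's own check rather than by the parser.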

I always thought the intent behind nodes/anchors was not so much graph models but rather to take repetitive YAML and make it DRY. That appears to be how it is used, e.g. in gitlab's ci YAML.

>I'm not sure about completely dropping implicit types, perhaps you may want to provide a way for applications to resolve them if they wish. For example, an application may want to attempt to treat anything starting with [ or { as JSON.

I think that would cause surprise type conversions. There will be plenty of times when you want something to start with a [ or { and you won't want it parsed as JSON.

I embed snippets of JSON in YAML multiline strings sometimes and I usually just parse it directly as a string. Then I run that string through a JSON parser elsewhere in the code.
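Roughly, that looks like the sketch below. The `config` dict stands in for whatever the YAML parser returns (the `payload` key and `load_payload` helper are invented for illustration); the embedded JSON stays a string until the code that needs it parses it explicitly.

```python
import json

# Stand-in for the parser's output: the multiline block arrives as an
# ordinary string, with no attempt to interpret its contents.
config = {
    "name": "example",
    "payload": '{\n  "retries": 3,\n  "hosts": ["a", "b"]\n}\n',
}


def load_payload(cfg):
    """Parse the embedded JSON only at the point where the application needs it."""
    return json.loads(cfg["payload"])


payload = load_payload(config)
```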

>You might wish to give Ingy a ring.

I would like that.

clarkevans · on June 20, 2018

> I think that would cause surprise type conversions.

YAML has traditionally been used as the basis of higher-level configuration files for particular applications. What I'm saying is that implicit typing should be permitted, but delegated to those applications.

Conversely, I'm not saying that StrictYAML should do anything by default with unquoted values, except reporting them to the application as being an unquoted value. This way the application could choose to process the value differently from those that are quoted.
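A rough sketch of what I mean, in Python. The `Scalar` type and the `resolve` function are hypothetical, not part of StrictYAML or any existing parser: the parser would only report the raw text and whether it was quoted, and each application would supply its own policy.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    """Hypothetical parser output: the raw text plus whether it was quoted."""
    text: str
    quoted: bool


def resolve(scalar):
    """One application's policy (an assumption for this example):
    quoted values stay strings; unquoted values starting with '[' or '{'
    are tried as JSON; everything else stays a string."""
    if scalar.quoted:
        return scalar.text
    if scalar.text.startswith(("[", "{")):
        try:
            return json.loads(scalar.text)
        except json.JSONDecodeError:
            return scalar.text  # not valid JSON after all; keep the boring default
    return scalar.text
```

An application that does not care about any of this simply never calls `resolve` and treats every scalar as its text.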

patrec · on June 21, 2018

An interesting idea, but it's not clear that this will be less confusing or that application authors will be better at avoiding config language gotchas than config language designers such as yourself (and existing app-specific config languages suggest otherwise).

I think a reason this won't necessarily fix the problem with unmet expectations is that identical constructs in different but analogous YAML files would be likely to end up with very different semantics, and users would effectively have to remember which particular idiosyncratic YAML dialect choices various apps make. Say

   version: 1.3

means the string "1.3" in app a), the float 1.3 in app b) and a version number in app c). Furthermore let's assume that app c) required a version number, whereas a) and b) required strings.
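To make the divergence concrete, here is a small sketch of the three readings. The `parse_version` helper is invented for the example and assumes a simple dotted major.minor form.

```python
def parse_version(text):
    """Hypothetical app c) reading: dotted text becomes a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


raw = "1.3"

as_string = raw                   # app a): the text, untouched
as_float = float(raw)             # app b): a floating-point number
as_version = parse_version(raw)   # app c): (major, minor)

# The readings disagree on ordering for a value such as 1.10:
# as a float it is smaller than 1.3, as a version it is later.
float_says_older = float("1.10") < float("1.3")
version_says_newer = parse_version("1.10") > parse_version("1.3")
```

Both of the last two comparisons come out true, so the same pair of lines in two config files can sort in opposite orders depending on which app reads them.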

Another, more subtle problem is that such a scheme would make it more likely that applications would end up parsing raw string representations themselves (with ensuing subtle differences even for things which are nominally meant to be identical, say dates or numbers, and possibly security problems as well).

daveFNbuck · on June 21, 2018

> I always thought the intent behind nodes/anchors was not so much graph models but rather to take repetitive YAML and make it DRY. That appears to be how it is used, e.g. in gitlab's ci YAML.

That's how I use it too. When I read about competing formats, that's the first feature I check for. It's really key for readability and usability in some use cases.

veli_joza · on June 21, 2018

Great to have you here elaborating on various design choices. Are you perhaps familiar with OGDL [1] and what's your opinion?

[1] http://ogdl.org/spec

clarkevans · on June 21, 2018

I don't have much to suggest. For YAML, the use of whitespace, colons and dashes primarily emerged from usability testing with domain experts who are not programmers. In particular, testing was done in the context of an application that needed a configuration and data auditing interface, an accounting application. Even anchors/aliases worked in this context and supported the application's use by making the audit records less repetitive without introducing artificial handles.

Other use cases such as dumping any in-memory data structure from memory, perhaps out of a sense that we needed full completeness, actually didn't have any end-user usability testing. Round-tripping data seems in retrospect to be a diversion from the primary value that YAML provided.

freedomben · on June 20, 2018

Is there an implementation of strict yaml that you know of for Ruby?

bmurphy1976 · on June 20, 2018

If you are writing a new YAML implementation, then yeah, you want a simpler spec to follow.

If on the other hand you are using a YAML library... I've had pretty good success using YAML compatibly across Python, Ruby, C# and Go projects. Do you have a particular issue in mind that the existing Ruby implementation doesn't address?

dillnumber0 · on June 21, 2018

It's an implementation of YAML, not StrictYAML which has different semantics.

freedomben · on June 21, 2018

Yes, strict YAML is different than YAML. If you take a look at the github page linked in the GP, it explains the differences.

ramses0 · on June 21, 2018

"JSON didn't exist because Us and We"?

YAML is an invented serialization format, JSON is a discovered one. As Crockford points out, JSON existed as long as JS existed, he just called it out and put a name on it.

Anyway, XML is a strong anti-pattern (too many security pitfalls: even if you get it right on your end, the other party likely screwed something up). YAML seems to be going down that path too.

TOML seems to be "the JSON of *.ini" (ie: discovering old conventions, rather than inventing new ones), and I'm glad to have been exposed to it.

clarkevans · on June 21, 2018

> "JSON didn't exist because Us and We"?

If you define JSON as the underlying practice that Crockford later named and documented, then sure, what I wrote reads as completely wrongheaded. However, when we were working on YAML, JSON was not yet called out and given a name.

I believe the most important convention that YAML and JSON shared was a recognition of the typed map/list/scalar model used by modern languages. Further, as far as conventions go, I think there's quite a bit to be said for languages that use light-weight structural markers such as indentation, colons and dashes.