HTTP/2 removes some of the overhead of requests, but there is still the problem of multiple round trips.
For example, if you request your top three friends and their most recent post using REST you'll probably need to do four requests. And you can't parallelize them, because you need to know your friends' IDs before you can construct the post requests.
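To make that dependency concrete, here's a sketch of the four round trips. Everything here is hypothetical (the endpoints, the `fetch` stub, the data) — it just shows why the last three requests can't start until the first one returns.

```python
def fetch(path, db):
    """Stand-in for an HTTP GET against a hypothetical REST API."""
    return db[path]

# Fake server-side data keyed by URL.
db = {
    "/me/friends?first=3": [1, 2, 3],
    "/users/1/posts?first=1": {"id": 101, "text": "hi"},
    "/users/2/posts?first=1": {"id": 102, "text": "yo"},
    "/users/3/posts?first=1": {"id": 103, "text": "hey"},
}

# Round trip 1: we have to learn the friend IDs first...
friend_ids = fetch("/me/friends?first=3", db)

# ...only then can we construct round trips 2-4 (these three are
# parallelizable among themselves, but all blocked on round trip 1).
posts = [fetch(f"/users/{fid}/posts?first=1", db) for fid in friend_ids]
```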
Actually, this can be addressed with HTTP/2, although I think the solution may be just as complicated. If the yet-to-be-known parameters are encoded in the query string, then the responses can be pushed to the client before the client knows it will be requesting them. This could be done with middleware that uses the Referer header (and maybe some fancy isomorphism) to determine what should be pushed.
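A minimal sketch of that middleware idea, assuming the server keeps a hand-written (or learned) map from a referring request to the follow-up requests its client code will make. The routes and the rule shape are hypothetical, and this only computes *what* to push, not the actual HTTP/2 PUSH_PROMISE frames:

```python
# Map from a request path to a rule that, given that request's response
# body, yields the resources the client will ask for next. (Hypothetical
# routes, mirroring the friends-then-posts example above.)
PUSH_MAP = {
    "/me/friends?first=3": lambda friend_ids: [
        f"/users/{fid}/posts?first=1" for fid in friend_ids
    ],
}

def resources_to_push(request_path, response_body):
    """List the resources to push alongside a response, before the
    client has had a chance to request them itself."""
    rule = PUSH_MAP.get(request_path)
    return rule(response_body) if rule else []
```

The awkward part, as noted above, is that `PUSH_MAP` is client knowledge duplicated on the server.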
True, but then you have the server duplicating logic from the client. This is very similar to the custom endpoint solution, which breaks down when you have multiple clients needing different data. You end up either under or over fetching.
To prevent the over/under fetching you're describing you could partition your endpoints and make multiple requests. Although that's definitely a code-maintenance win for GraphQL.
It seems like if you were co-executing the client on the server you could trivially achieve perfect fetching. GraphQL may actually over fetch in many situations. Here's an example: the client fetches a list of objects, filters it, and then fetches more data referenced by the results. With GraphQL, if you don't automagically parse the filter out of the client code, you over-fetch. However, the HTTP/2 solution could just push the 2nd fetch as it was made by the co-executed client.
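Here's the fetch-filter-fetch case as a sketch, with made-up data. A single combined query that fetches `detail` for every list item pulls back data the client's filter would have discarded, while the two-step client only asks for the survivors:

```python
# Hypothetical data: five list items and a detail record for each.
items = [{"id": i, "score": i * 10} for i in range(1, 6)]
details = {i: f"detail-{i}" for i in range(1, 6)}

# One combined query: details for all 5 items come back, filter or not.
combined = [{**it, "detail": details[it["id"]]} for it in items]

# Fetch-filter-fetch: the client filters first (a predicate the server
# never sees), then requests details only for the 2 survivors.
survivors = [it for it in items if it["score"] > 30]
second_fetch = {it["id"]: details[it["id"]] for it in survivors}
```

Unless the server can see that `score > 30` filter, the combined query over-fetches three detail records; the co-executed-client approach would push only the second fetch.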
All that being said, GraphQL certainly alleviates the server-side load co-execution would imply and that's likely more suitable to the scale Facebook operates at.
Yes, that's a tricky problem. Generally GraphQL solves it by filtering on the server. You'd request something like `friends(age: 25, order: dob, first: 10) { name, profilePicture }` and pass that straight through to the UI.
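For illustration, a server-side resolver for that query might look like the sketch below. The resolver shape and data are hypothetical — this isn't a real GraphQL library, just the filter/order/limit pass-through idea:

```python
# Hypothetical friend records.
FRIENDS = [
    {"name": "Ana", "age": 25, "dob": "1999-01-02", "profilePicture": "a.png"},
    {"name": "Bo",  "age": 31, "dob": "1993-06-09", "profilePicture": "b.png"},
    {"name": "Cy",  "age": 25, "dob": "1998-12-30", "profilePicture": "c.png"},
]

def resolve_friends(age=None, order=None, first=None):
    """Sketch of friends(age:, order:, first:) { name, profilePicture }:
    filter, sort, and limit on the server, return only requested fields."""
    rows = [f for f in FRIENDS if age is None or f["age"] == age]
    if order:
        rows.sort(key=lambda f: f[order])
    return [{"name": f["name"], "profilePicture": f["profilePicture"]}
            for f in rows[:first]]
```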
There are some situations where this doesn't work too well. For search suggestions, for example, you might not want to request `searchSuggestion(query: "Tom Smi") { <lots of data> }` on every keystroke because sequential queries will have a lot of duplication. In this case we can just fetch the IDs of the results and do a separate query to fetch the data for the people we don't know about yet.
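The client-side half of that two-step scheme is basically a cache diff. A sketch, with a hypothetical cache shape:

```python
# Hypothetical local cache of people we've already fetched, keyed by ID.
cache = {7: {"name": "Tom Smith"}, 9: {"name": "Tom Smits"}}

def ids_to_fetch(suggested_ids, cache):
    """Given the ID-only suggestion results for the latest keystroke,
    return only the IDs whose data we don't already have locally."""
    return [i for i in suggested_ids if i not in cache]
```

So consecutive queries for "Tom Sm" and "Tom Smi" mostly hit the cache, and the heavyweight data query stays small.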
Having the server know about client queries (and therefore preemptively running queries) is something we specifically avoided with GraphQL. If the server knows what the client wants then sending any query across the wire doesn't make sense at all. It also falls down if the client's data requirements change across client versions. You quickly end up in a place where the server has to know about all possible client queries, which is complex and difficult to maintain.
I assume Relay performs the diff-outward-query/patch-inward-data thing. Frankly, that's fucking brilliant. I hope the people at Facebook are extremely pleased with themselves. If I understand correctly, once the number of items shown is less than the initial range of requested items, further refinements wouldn't need a second request: Relay could detect from the IDs of the first request that it already had all the data it needed.
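That refinement check is, at its core, a subsumption test. A sketch of the idea — this is my reading of the behavior described above, not Relay's actual internals:

```python
def needs_request(cached_ids, first):
    """A query for the first `first` items of a range can be answered
    from the cache if a prior query already returned at least that many
    IDs (their data is cached alongside them)."""
    return first > len(cached_ids)
```

So after an initial `first: 5` query, narrowing the view to `first: 3` touches no network; widening to `first: 8` triggers a (diffed) request for the missing tail.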
HTTP/2 optimizations could just be layered on top of GraphQL post hoc. GraphQL/Relay is likely better for that purpose than a bag of endpoints. You get all the benefits you've mentioned and extremely bearable unoptimized performance. I guess I just needed to tease the two problems apart in my head.
Yup, and the same optimization can apply to two different queries requesting different fields on a single object, not just ranges.
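The field-level version of the diff is even simpler — compare the new query's fields against what a prior query already fetched for the same object, and only request the remainder. A sketch:

```python
def missing_fields(requested, cached):
    """Fields the new query asks for that no prior query on this
    object has fetched yet; only these go over the wire."""
    return [f for f in requested if f not in cached]
```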
GraphQL endpoints don't have to return a single JSON object either, you could hypothetically stream back objects as they become available (Netflix's Falcor seems to be heading in this direction too).
Ah, so it goes the other way too. Sort of like a request inliner/outliner that can perform dynamic optimizations. The client can inline requests using view-state unknown to the server. Then the server can outline them in whatever desirable way to provide an immediate and eventually complete response. That's clever.