If you can't get the data from the endpoints the javascript hits then write your scraper in javascript and have it run in a headless browser, and it's the webkit engine so most sites test their site against it heavily.
Either pull the data out of the javascript objects or trigger your extraction from the html by attaching to the events in the javascript.
I've started playing with zombie.js recently as well - much lighter and faster than the ones that instrument a completely full browser. But has a full Javascript engine.
zombie.js is not a full browser. It's a poor emulation using jsdom as its backing. http://zombie.labnotes.org/guts Beware, for some applications, jsdom is super buggy.
We're also trying it for integration tests, as it is much quicker than Phantom or Selenium. Even there, where we control the standards-compliant site, it isn't quite good enough yet.
Would love to see more people helping make it so, though!
If you can't get the data from the endpoints the javascript hits then write your scraper in javascript and have it run in a headless browser, and it's the webkit engine so most sites test their site against it heavily.
Either pull the data out of the javascript objects or trigger your extraction from the html by attaching to the events in the javascript.