Hacker News | vklmn's comments

You need to validate complicated optimization ideas, but it seems to be fine at extracting data from documents you already have, asking for missing info / documents, and calculating numbers.


I don't think that it's google's fault. Google sometimes trades ads through auctions, meaning they issue an HTTP request to partners asking "Hey, do you want to show an ad here?", and partners respond with a price and HTML code. The highest bidder wins and its HTML code is inserted.
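
For readers unfamiliar with the flow, here is a minimal sketch of the auction described above, assuming a plain highest-price-wins rule. The `Bid` class, the `run_auction` function, and the partner names are hypothetical and only illustrate the idea; real exchanges use more elaborate protocols and pricing rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bid:
    partner: str  # hypothetical partner name
    price: float  # price offered for this ad slot
    html: str     # markup the partner wants inserted if it wins


def run_auction(bids: list[Bid]) -> Optional[Bid]:
    """Return the bid with the highest price, or None if nobody bid."""
    return max(bids, key=lambda bid: bid.price, default=None)


bids = [
    Bid("partner-a", 1.20, "<div>ad from A</div>"),
    Bid("partner-b", 2.50, "<div>ad from B<script>/* arbitrary JS */</script></div>"),
]
winner = run_auction(bids)
if winner is not None:
    # The winner's markup, including any script it carries, goes into the page.
    print(winner.partner, winner.html)
```

The point of the sketch is that the winning markup is opaque to the page: whatever script the highest bidder returned is what ends up running in the browser.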

The HTML contains JavaScript, and theoretically anything can be executed within the browser (I've seen people mining bitcoins!).

Google can't monitor and execute every HTML snippet, but they are doing a pretty great job sampling responses and evaluating some of them. Fraudsters are smart and try to detect whether the code is being executed on Google's servers, but overall they are losing.

It seems like a case where Google's system didn't work.

By the way, all Google partners are listed here: https://developers.google.com/third-party-ads/adx-vendors. Usually, it's possible to track down exactly who is responsible by looking at the dev console.


> I don't think that it's google's fault

Of course it is. It's their ad network.

> Google can't monitor an execute every HTML snippet

Of course they can. There's no excuse for allowing this nonsense on their network.


Well, they do monitor snippets. There's a lot more going on here than meets the eye.

The problem is that bad actors are really good at evading detection through obfuscation and by dynamically serving different code depending on the IP address, so the creative behaves normally if it thinks you're a server-side Chrome instance and does bad stuff for real people.

To make matters worse, bad actors have automated their process, so when they discover they're blocked everywhere, they rotate to a new account and domain, change their obfuscated code to look different, and are back up in a few hours. This leaves everyone else playing whack-a-mole.

And even if Google sees through all of that, the code might never actually touch Google, but come from one of the many marketplaces or resellers being rendered through Google's Ad Server. For any given site, the list of what markets they work with is usually public. This site, https://techsparx.com/ads.txt, is doing business with way too many markets - 680 of which are resellers of other markets' inventory.
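
A rough way to get counts like this yourself is to tally the relationship field of each ads.txt record. The sketch below assumes the usual comma-separated record layout (ad system domain, publisher account ID, DIRECT or RESELLER, optional certification ID); the function name and the sample entries are hypothetical and not taken from any real site.

```python
def count_relationships(ads_txt: str) -> dict[str, int]:
    """Count DIRECT and RESELLER records in the text of an ads.txt file."""
    counts = {"DIRECT": 0, "RESELLER": 0}
    for raw_line in ads_txt.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # blank line, comment, or a variable such as contact=...
        relationship = fields[2].upper()
        if relationship in counts:
            counts[relationship] += 1
    return counts


# Hypothetical sample, for illustration only.
SAMPLE = """\
# example entries
exchange-a.example, pub-1, DIRECT, abc123
exchange-b.example, 42, RESELLER
exchange-c.example, 77, reseller  # trailing comment
contact=ads@publisher.example
"""

print(count_relationships(SAMPLE))  # {'DIRECT': 1, 'RESELLER': 2}
```

A high RESELLER count is not proof of a problem by itself, but it gives a quick sense of how many indirect paths exist into a site's ad slots.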

This means if you're a bad actor, you can evade anyone capable of seeing through your obfuscation entirely, select for marketplaces that have extremely poor quality control (I see a few), and wind up on this website.


That still is Google's fault as far as I'm concerned as an end user.


If you're looking at it from an end-user perspective then it's the fault of techsparx.com.


If Google can't guarantee no malicious JavaScript, then they should strip all JavaScript.

If I serve any content to my users, then I'm responsible for any malware it contains.


Hi! We don't have plans to add Vertica yet. But if your team is willing to help or send a PR, we can reconsider! Feel free to reach out directly to me at vladimir@jitsu.com


Yes, just add more Jitsu nodes. It's hard to say how many nodes you need (it depends on transformations, CPU/RAM, etc.), but you can count on at least thousands of requests per second per node.


How is event ordering handled when multiple instances of Jitsu are involved?


Thanks, we will fix that!


That's indeed what the website is missing. We have a few words about that in the docs, but it's still not enough: https://jitsu.com/docs/internals/jitsu-server#mapping-step

Overall, Jitsu tries to decompose (aka flatten) JSON as deeply as possible. E.g. {a: {b:1, c:2}} will become a_b=1, a_c=2. If a column is missing, it will be created. We don't decompose arrays so far.
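
To illustrate the behaviour described above, here is a minimal sketch of that kind of flattening. It is not Jitsu's actual implementation; the `flatten` function and its `sep` parameter are made up for this example, and lists are simply passed through untouched.

```python
from typing import Any


def flatten(obj: dict[str, Any], prefix: str = "", sep: str = "_") -> dict[str, Any]:
    """Flatten nested dicts into one level of column names; keep lists as-is."""
    flat: dict[str, Any] = {}
    for key, value in obj.items():
        column = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the column prefix along.
            flat.update(flatten(value, column, sep))
        else:
            # Scalars and arrays become the column's value unchanged.
            flat[column] = value
    return flat


print(flatten({"a": {"b": 1, "c": 2}}))  # {'a_b': 1, 'a_c': 2}
```

Each key in the result corresponds to one column in the destination table, which is why a previously unseen key implies creating a new column.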


Can one disable the flattening? BigQuery, for example, supports nested objects just fine, and flattening them for no particular reason seems counter-productive.

I work on an application where we already have a schema in BQ, but we'd like to start moving events through something like Jitsu or Rudderstack. This uses nested objects extensively. Looking at Jitsu, it looks like we wouldn't be able to keep our existing table schema.

PS. Whoever wrote your BigQuery code does not understand Go contexts. Only functions should take a context argument; you should almost never store contexts in structs!


Two reasons: a) you can push events from your apps; b) you get more connectors (Singer, plus a few native connectors).


Airbyte uses Singer too, so why would it have fewer connectors? That doesn't make sense.


Check the answer in another thread! https://news.ycombinator.com/item?id=29106531


Mixpanel will store the data for you and do visualization. Jitsu just helps you get your data to your data warehouse.

Downside: you'll need to build all the visualizations yourself. Fortunately, that's easy with tools such as Looker, Mode, Metabase, etc.

Upside: you can do whatever you want with your data: build any reports, join with other datasets, etc. You're not limited to the reports the MixPanel team builds.

In reality, Jitsu and MixPanel could co-exist. Jitsu supports MixPanel as a destination (e.g. you send data to Jitsu; Jitsu sends it to MixPanel and the data warehouse).


Feedback on "Jitsu supports MixPanel as a destination"

This is not really clear from the website. Mixpanel is mentioned in https://jitsu.com/sources but NOT in https://jitsu.com/destinations. Also the docs seem very clear about that.

We are currently looking at Segment, Jitsu and others. While we generally liked Jitsu, this was kind of a big deal for us and made us lean towards Segment.

However, no final decision has been made yet. ;-)


Mixpanel employee here, I actually just emailed Jitsu this morning to ask about adding Mixpanel as a destination. We're willing to work with them (or any of you!) to get the PR open. Feel free to reach out directly to josh@mixpanel.com!


You can hack almost anything using the inbound Event API (https://jitsu.com/docs/sending-data/api) and JavaScript transformations (https://jitsu.com/blog/javascript-transform).


A standard webhook source abstraction that captures the URI, POST payload, and HTTP headers would be very useful.

This way I can set up my source in Jitsu, get a unique URL, and then paste that URL into the tool generating webhook events (e.g. Shopify). A normalized schema based on the JSON payload doesn't need to be created for this to be useful.


Ok cool.

As a bit of feedback, I highly suggest adding Webhooks as a source on your marketing site.

The first thing I did was navigate to the Sources page and search for "webhook", which brought up no results.

I then searched your docs, which only mention Webhooks in the context of being a destination rather than a source.

I realise now that you have quite a flexible ingestion API, but it took quite a while (and your confirmation above) to understand this!

The product looks awesome though! Good luck with the launch.


Thanks for the observation! We will add it. A fresh look at our marketing materials is always appreciated!

