
I'm curious to know why you think they'd be any more accurate than public blacklists. About a fifth of the spam that hits my domains originates from gmail servers, and the majority of people I know complain (when asked) about gmail's tendency to spam-bin legit mail.


I would wager that Google handles a large enough cross section of all email traffic that its internal statistics on the trustworthiness of domains/IPs would rival any other blacklist system.

I've never had issues inboxing on gmail accounts using SPF/DKIM/DMARC + Sendgrid, even when sending 125,000+ emails (legit!) per day.


Accuracy in spam filters is a difficult thing to measure, particularly if you don't have access to internal metrics on filter performance.

The primary limiting factor for most blacklists is not scale, but simply the fact that most of them have no more than a few data sources - most commonly spamtrap data. Spamtrap data is useful, but on its own it's not comprehensive enough to accurately evaluate mail.

Having dozens or hundreds of data points available - things like how many recipients open a message and spend time reading it, how quickly they seek out a message when initially opening their inbox, or a sending domain's PageRank - gives Google a considerable edge in assessing overall mail quality.
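
To make the single-source vs. many-signals point concrete, here is a rough Python sketch of how several signals might be folded into one quality score. Everything in it is an assumption for illustration: the signal names, the weights, and the logistic squashing are made up, and nothing here reflects how Google or any real blacklist actually scores mail.

```python
import math

# Hypothetical signals and weights, chosen only for illustration.
# Each signal is assumed to be pre-normalized to the range [0, 1].
WEIGHTS = {
    "spamtrap_hits": -2.0,      # the one signal a typical blacklist has
    "open_rate": 1.5,           # share of recipients who open the message
    "read_time": 1.0,           # how long recipients spend reading it
    "domain_reputation": 1.2,   # stand-in for something like PageRank
}


def quality_score(signals, weights=WEIGHTS):
    """Combine normalized signals into a score in (0, 1).

    Higher means "more likely wanted mail". Unknown signal names are
    ignored; missing signals simply contribute nothing.
    """
    total = sum(
        weights[name] * value
        for name, value in signals.items()
        if name in weights
    )
    # Logistic function maps the weighted sum onto (0, 1).
    return 1.0 / (1.0 + math.exp(-total))


# A sender judged on spamtrap data alone vs. the same sender with
# engagement signals available.
spamtrap_only = quality_score({"spamtrap_hits": 0.3})
many_signals = quality_score({
    "spamtrap_hits": 0.3,
    "open_rate": 0.8,
    "read_time": 0.7,
    "domain_reputation": 0.9,
})
```

With only the spamtrap signal, the sender can never score above neutral; the extra engagement signals are what let a legitimate bulk sender with a few trap hits recover.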

(Caveat: outbound filtering is often more difficult than inbound filtering - perhaps in part because there are fewer data points available when assessing outbound mail.)


Agreed on all points!


> majority of people I know complain (when asked) about gmail's tendency to spam-bin legit mail.

I observed a former boss, who was very technically competent otherwise, using "mark as spam" instead of delete.

I guess if he could underestimate that button, so can a million other people.



