Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Thanks for this clarification

>> What do you mean by half a hash table?

What I'm trying to get across is my impression that you don't have to load an entire B-tree index into memory to use it... you can traverse it. Whereas a hash index can't be "streamed" or traversed like that, you have to use the whole thing.

I read some papers on hash indexes and hash joins several months ago, and know little about the Postgres specifics... just wanting to wrap my head around it, so please help me understand if I'm off base with this.



>> That means you don't load it into memory at all, you just jump to Bucket[Hash(KEY)] and then check there for KEY (it may or may not be there due to false positives and the bucket being full).

Right, of course. You're making me realize this really is my confusion between hash indexes and hash joins. The issue I'm thinking of is all about dynamically constructed hash joins. Sorry, you called it right from the start but it took me a minute to see... thanks for helping talk through it.


A hash index allows you to determine which bucket an arbitrary key is mapped to without having to look anything else up.

That means you don't load it into memory at all, you just jump to Bucket[Hash(KEY)] and then check there for KEY (it may or may not be there due to false positives and the bucket being full).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: