
So this could be used to find similar images in a database?

For now I've tested a naive method where I convert all images to 64x64 black and white and use a simple Levenshtein distance. It's not efficient, but for not-too-large datasets it works.
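A minimal sketch of that naive method (the resize step assumes Pillow; the edit distance is plain Python). Note the DP is O(n·m), which gets expensive on 4096-character signatures:

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, O(len(a) * len(b)).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def image_signature(img) -> str:
    # Downscale to 64x64 1-bit black and white and flatten to a string;
    # img is assumed to be a PIL.Image (Pillow is an assumption here).
    small = img.convert("1").resize((64, 64))
    return "".join("1" if p else "0" for p in small.getdata())
```

Since these signatures are all the same length, a plain Hamming distance (count of differing positions) gives similar results far more cheaply than Levenshtein.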

I guess I could just add the image histogram.

I'm still curious how TinEye works.



It's not only about visual similarity, but also about the semantics of the images (for example, all the dog photos should be close to each other, regardless of colours or scene).

The simplest way to do that is probably to use a pretrained neural network (like ResNet) to convert the images into embeddings, then index the embeddings and use them for search.
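A minimal sketch of the index-and-search step, assuming the embeddings have already been extracted with a pretrained network such as ResNet (the extraction itself is omitted; everything below is plain NumPy):

```python
import numpy as np

def build_index(embeddings: np.ndarray) -> np.ndarray:
    # L2-normalise once so cosine similarity reduces to a dot product.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.clip(norms, 1e-12, None)

def search(index: np.ndarray, query: np.ndarray, k: int = 5) -> list:
    # Rank all images by cosine similarity to the query embedding.
    q = query / max(np.linalg.norm(query), 1e-12)
    scores = index @ q
    return np.argsort(-scores)[:k].tolist()
```

This is a brute-force scan; for large collections an approximate-nearest-neighbour index (e.g. Faiss) would replace the full argsort.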

I'm sharing my article describing how to implement it: https://medium.com/p/5515270d27e3


This is really great. Thanks for sharing.


Have you looked at CLIP? You can use that to create a vector embedding for an image that includes semantic information (what's actually in the image - animals, colours, etc) - those could then be used with pgvector to find similar images.
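A sketch of the pgvector side, assuming CLIP embeddings (512 dimensions for the ViT-B/32 variant — an assumption here). These functions just build the SQL; `<=>` is pgvector's cosine-distance operator, so ascending order means most similar first:

```python
def create_table_sql(dim: int = 512) -> str:
    # Requires the pgvector extension: CREATE EXTENSION vector;
    return (f"CREATE TABLE images (id bigserial PRIMARY KEY, "
            f"path text, embedding vector({dim}));")

def knn_sql(k: int = 10) -> str:
    # %s is the parameter placeholder for the query embedding
    # (e.g. via psycopg); lower cosine distance = more similar.
    return (f"SELECT id, path, embedding <=> %s AS distance "
            f"FROM images ORDER BY distance LIMIT {k};")
```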

CLIP is how search engines like https://lexica.art/?q=6dc768e2-7a7c-494d-9a39-fd8f27e69248 work.


Using CLIP would also let you search with text sentences. And it can be used to accelerate the embedding-similarity sorting step. Like https://mazzzystar.github.io/2022/12/29/Run-CLIP-on-iPhone-t...
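The speedup comes from precomputing the image embeddings once; because CLIP maps text and images into one space, each new text query is then just a single matrix multiply. A sketch, assuming the embeddings are already L2-normalised CLIP outputs:

```python
import numpy as np

def cache_embeddings(path: str, image_embs: np.ndarray) -> None:
    # Precompute image embeddings once and store them; text queries
    # then never need to touch the image encoder again.
    np.save(path, image_embs)

def text_search(path: str, text_emb: np.ndarray, k: int = 5) -> list:
    image_embs = np.load(path)
    # Text and image embeddings share one space, so a single matrix
    # multiply scores every image against the sentence.
    scores = image_embs @ text_emb
    return np.argsort(-scores)[:k].tolist()
```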



