
It is confusing, but they have different calls for PDFs vs. images. In their example Google Colab: https://colab.research.google.com/drive/11NdqWVwC_TtJyKT6cmu...

The first couple of sections are for PDFs, and you need to skip all of that (search for "And Image files...") to find the image extraction portion. Basically, the image call needs ImageURLChunk instead of DocumentURLChunk.
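To make the difference concrete, here is a rough sketch of choosing the chunk shape by file type. It is only an illustration: the helper name `build_ocr_document`, the extension list, and the example URLs are made up, and the dict layouts (`type` of `image_url` or `document_url`) are my assumption about how ImageURLChunk and DocumentURLChunk serialize, so check them against the Colab before relying on this.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical list of extensions to treat as images; adjust as needed.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}


def build_ocr_document(url: str) -> dict:
    """Return an image-style or document-style chunk dict for the given URL.

    Assumption: the dict shapes mirror ImageURLChunk / DocumentURLChunk.
    """
    # Look only at the path so query strings do not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return {"type": "image_url", "image_url": url}
    # Anything else (e.g. a PDF) goes through the document-style chunk.
    return {"type": "document_url", "document_url": url}


if __name__ == "__main__":
    print(build_ocr_document("https://example.com/receipt.png"))
    print(build_ocr_document("https://example.com/paper.pdf"))
```

The resulting dict would then be passed as the document argument of the OCR call, in place of the PDF-style chunk used in the earlier sections of the notebook.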


