
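    import csv
    
    def first_value_repeats(path):
        # Stream rows one at a time so memory stays flat regardless of file size.
        with open(path, newline='') as f:
            r = csv.reader(f)
            first_row = next(r)  # r.next() only exists in Python 2
            for row in r:
                if row[0] == first_row[0]:
                    return True
        return False
    
    print(first_value_repeats('file.csv'))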
287MB file, 100m rows: 15.804s, peak memory usage 7MB


Thank you, that is beautiful. I've been proved decisively wrong and I've gained a greater appreciation for Python's libraries.


It's not even a library thing. Reading a file this way performs reads in approximately optimal (for the FS) sized blocks. The CSV parser just rides on top of stdio.
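To make the buffering point concrete, here is a rough sketch, assuming Python 3, where the block-sized reads come from the `io` module's buffered reader rather than C stdio. The names `first_value_repeats_buffered` and `buffer_size` are made up for illustration and are not from the snippet above.

```python
import csv
import io


def first_value_repeats_buffered(path, buffer_size=io.DEFAULT_BUFFER_SIZE):
    """Return True if a later row starts with the same field as the first row.

    buffer_size is a hypothetical knob: it sets the size of the underlying
    read buffer, so the file is pulled from disk in blocks of roughly that
    size while the csv reader consumes one row at a time.
    """
    with open(path, newline="", buffering=buffer_size) as f:
        reader = csv.reader(f)
        first_row = next(reader, None)
        if not first_row:
            # Empty file or blank first line: nothing to compare against.
            return False
        # any() stops at the first match, so the rest of the file is not read.
        return any(row and row[0] == first_row[0] for row in reader)
```

Memory stays flat because only the current buffer and the current row are held at any time. Passing a larger `buffer_size` changes how many bytes each underlying read requests, not how many rows are kept around.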



