
If anyone is interested in using spaced repetition for language learning, you might like to give http://readlang.com a shot (disclaimer - my site). It generates a flashcard every time you come across an unknown word or phrase while reading a book or website, and each flashcard includes the context sentence. You can also export to other software like Anki.


I checked out the site. Congrats on the great work. I'll recommend it to people I know.

You support so many language pairs. What sources (dictionaries) do you use for translations? Wiktionary?

Since each word might translate differently in different contexts, do you also select the most appropriate translation to display? It would be interesting to discuss the best methods for that. Google Translate often gets this wrong. (I am working on NLP/AI for language learning myself, although we will mostly use it for conversations at first.) I believe many people here can give interesting perspectives on this hard problem (semantic disambiguation for translation).


Thanks!

For the inline translations I use the Google Translate API, which is surprisingly good for words and short phrases. Unfortunately the API doesn't provide multiple translations, even though they show them on their own web site. Grrr.

Users are encouraged to edit translations before learning them, and I often think it would be awesome to crowdsource appropriate translations for given contexts using this data. I'm getting ahead of myself though: I'd need orders of magnitude more active users and development time for that to work.


Just an idea. For some individual words and common phrases, you can supplement the API translation with info from Wiktionary and many freely available dictionaries on the Web.

When you have a lot of users, serving static translations for individual words with a single meaning should also save you money in the long run, as the Google Translate API charges you for every translation the user activates.
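A minimal sketch of that idea in Python, assuming a hypothetical in-memory `static_dict` of single-meaning words and a hypothetical `paid_lookup` callable standing in for the paid translation API (neither name comes from Readlang or from Google's client library). The static dictionary is checked first, then a cache of earlier paid results, and only then is the paid call made:

```python
from typing import Callable, Dict, Tuple

# (source language, target language, normalized text); all names here are hypothetical.
Key = Tuple[str, str, str]


class DictionaryFirstTranslator:
    """Serve translations from a free static dictionary before paying for an API call."""

    def __init__(self, static_dict: Dict[Key, str],
                 paid_lookup: Callable[[str, str, str], str]) -> None:
        self.static_dict = static_dict   # free, pre-built entries
        self.paid_lookup = paid_lookup   # stand-in for the paid translation API
        self.cache: Dict[Key, str] = {}  # results of earlier paid calls
        self.paid_calls = 0              # how many billable calls were made

    def translate(self, src: str, dst: str, text: str) -> str:
        key = (src, dst, text.strip().lower())
        if key in self.static_dict:
            return self.static_dict[key]
        if key in self.cache:
            return self.cache[key]
        # Only reached when neither free source has the answer.
        self.paid_calls += 1
        result = self.paid_lookup(src, dst, text)
        self.cache[key] = result
        return result


if __name__ == "__main__":
    def fake_api(src: str, dst: str, text: str) -> str:
        # Placeholder for a real API request.
        return f"<{text} in {dst}>"

    translator = DictionaryFirstTranslator({("es", "en", "gato"): "cat"}, fake_api)
    print(translator.translate("es", "en", "Gato"))
    print(translator.translate("es", "en", "perro"))
    print(translator.translate("es", "en", "perro"))
    print("billable calls:", translator.paid_calls)
```

Lowercasing the key is a simplification; it would be wrong for languages where capitalization carries meaning (German nouns, for example), so a real lookup would likely need per-language normalization.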


I did look into Wiktionary briefly, but it looked horrible to use since the data isn't stored in a structured format. It seemed I'd need to parse the wiki markup, and the formatting conventions differ between the language editions, which would make it a tricky job to support all the languages that Readlang does.

It made me wish for an alternative open bilingual dictionary project which is structured and machine readable.
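To illustrate the parsing problem, here is a rough Python sketch that pulls translations out of a fragment of wiki markup. It assumes the `{{t|lang|word}}` / `{{t+|lang|word}}` translation templates used on English Wiktionary; the sample text is made up for the example, the function name is hypothetical, and other language editions use different templates, which is exactly the difficulty described above.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches {{t|es|gato|m}} and {{t+|es|gato|m}}; captures the language code and the word.
# Assumption: English Wiktionary conventions only. Other editions need other patterns.
TRANSLATION_TEMPLATE = re.compile(r"\{\{t\+?\|([a-z-]+)\|([^|}]+)")


def extract_translations(wikitext: str) -> Dict[str, List[str]]:
    """Group translated words by language code, in order of appearance."""
    found: Dict[str, List[str]] = defaultdict(list)
    for lang, word in TRANSLATION_TEMPLATE.findall(wikitext):
        found[lang].append(word.strip())
    return dict(found)


if __name__ == "__main__":
    sample = (
        "* Spanish: {{t+|es|gato|m}}, {{t|es|gata|f}}\n"
        "* French: {{t+|fr|chat|m}}\n"
    )
    print(extract_translations(sample))
```

A regex like this ignores nested templates, qualifiers, and sense labels, so it is a starting point for experimentation rather than a usable dictionary extractor.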



