Idea: the next Google Books

1) Take a collection of books from Project Gutenberg or Google Books on a given time period

2) Apply Named Entity Recognition to build a cross-book NLP index

3) Load this into a public Roam graph

4) View what N books have to say about a given event https://twitter.com/Conaw/status/1307107745397604354
You want to do this for one time period to debug the NER and demonstrate the value proposition.
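
A rough sketch of steps 1-4, under loud assumptions: the "NER" here is just a lookup against a tiny hand-written gazetteer (a real pipeline would swap in a trained NER model), the book texts are made-up sample sentences rather than real Gutenberg content, and the Roam-style output with `[[page]]` links is only a guess at a convenient import shape. All function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a tiny hand-written gazetteer stands in for a real NER model.
GAZETTEER = ("Waterloo", "Napoleon", "Wellington")


def split_sentences(text):
    """Naive sentence splitter: break on whitespace after ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_index(books):
    """Map entity -> book title -> list of sentences mentioning the entity."""
    index = defaultdict(lambda: defaultdict(list))
    for title, text in books.items():
        for sentence in split_sentences(text):
            for entity in GAZETTEER:
                if re.search(rf"\b{re.escape(entity)}\b", sentence):
                    index[entity][title].append(sentence)
    return index


def what_books_say(index, entity):
    """Step 4: collect what each book says about one entity."""
    return {title: list(sents) for title, sents in index.get(entity, {}).items()}


def to_roam_page(index, entity):
    """Step 3 (hypothetical format): nested bullets with [[page]] links."""
    lines = [f"- [[{entity}]]"]
    for title in sorted(index.get(entity, {})):
        lines.append(f"    - [[{title}]]")
        for sentence in index[entity][title]:
            lines.append(f"        - {sentence}")
    return "\n".join(lines)


# Made-up sample texts, not quotations from real books.
books = {
    "Book A": "Napoleon marched north. He was defeated at Waterloo.",
    "Book B": "Wellington held the ridge at Waterloo. The rain had stopped.",
}
index = build_index(books)
print(to_roam_page(index, "Waterloo"))
```

Keeping the index as entity -> book -> sentences makes the "what do N books say about X" query a single lookup, and the NER step can be replaced without touching the rest.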

As you add genres, you'll need to address new difficulties (e.g. Unicode). Once it works for all old books... then extend to Sci-Hub & Kindle?

See related tweet: https://twitter.com/mekarpeles/status/1307546539376566274
If this adds enough value, perhaps the next generation of authors will open source their books, load them into Roam, encourage annotation/updates/interaction, and monetize by charging for the community rather than the content.
You can follow @balajis.