Section 1.9 "Index, please meet neuron" is really enlightening, esp. the bullet points.

(Disclaimer: I won't post them since it'll be a big spoiler 😅)
For some awkward reason, I'm really glad to see Java code when chapter 2 features Lucene!! Also, the Word2vec synonym expansion reminds me of @VeredShwartz's blog post at http://veredshwartz.blogspot.com/2017/08/paraphrasing.html?m=1
Also, the Spider-Man quote is paraphrased in section 2.4.2 😆
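
To jot the gist down for myself, a rough sketch of the chapter 2 idea in DL4J + Lucene (not the book's exact code; the "body" field, the model path, and the example term are my own stand-ins):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.TermQuery;
    import org.deeplearning4j.models.embeddings.loader.WordVectorSerializer;
    import org.deeplearning4j.models.word2vec.Word2Vec;
    import java.io.File;

    // load pre-trained vectors, then OR the query term with its nearest neighbours
    Word2Vec vec = WordVectorSerializer.readWord2VecModel(new File("vectors.bin"));
    BooleanQuery.Builder query = new BooleanQuery.Builder();
    query.add(new TermQuery(new Term("body", "aeroplane")), BooleanClause.Occur.SHOULD);
    for (String synonym : vec.wordsNearest("aeroplane", 2)) {   // e.g. "plane", "aircraft"
        query.add(new TermQuery(new Term("body", synonym)), BooleanClause.Occur.SHOULD);
    }
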
Chapter 3's walkthrough of DL4J is pretty neat. If we do away with some of the OOP inits, it'll look like C++; further removing the explicit types, semicolons, and camelCase, it can easily be parsed as Python code (in my mind) lol...
The line-by-line code explanation in snippets 3.7 and 3.8 for alternate query expansion is really nice to read.

// where the "magic" happens 😆
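
From memory, the shape of the idea is roughly this (my own sketch, not the book's snippets; assumes a char-level LSTM net already trained on past queries over alphabet, and greedy decoding for brevity):

    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;

    static String altQuery(MultiLayerNetwork net, char[] alphabet, String userQuery, int maxLen) {
        net.rnnClearPreviousState();
        INDArray last = null;
        for (char c : userQuery.toCharArray()) {      // warm up the RNN state on the original query
            last = net.rnnTimeStep(oneHot(alphabet, c));
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < maxLen; i++) {            // greedily decode an alternative query
            int best = Nd4j.argMax(last.reshape(alphabet.length), 0).getInt(0);
            out.append(alphabet[best]);
            last = net.rnnTimeStep(oneHot(alphabet, alphabet[best]));
        }
        return out.toString();
    }

    static INDArray oneHot(char[] alphabet, char c) { // 1 x |alphabet| x 1: one time step
        INDArray x = Nd4j.zeros(1, alphabet.length, 1);
        x.putScalar(new int[] {0, new String(alphabet).indexOf(c), 0}, 1.0);
        return x;
    }
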
Chapter 4: Autocomplete is a nice way to nudge users towards queries that the search engine is more confident about.
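
The Lucene side of this is surprisingly compact; a minimal sketch with the suggest module (assuming an ngram FreeTextSuggester and a made-up queries.txt of past queries):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.search.spell.PlainTextDictionary;
    import org.apache.lucene.search.suggest.Lookup;
    import org.apache.lucene.search.suggest.analyzing.FreeTextSuggester;
    import java.io.IOException;
    import java.nio.file.Paths;

    static void suggest(String prefix) throws IOException {
        // build an ngram language model over past queries, then complete the prefix
        FreeTextSuggester suggester = new FreeTextSuggester(new StandardAnalyzer());
        suggester.build(new PlainTextDictionary(Paths.get("queries.txt")));
        for (Lookup.LookupResult result : suggester.lookup(prefix, false, 5)) {
            System.out.println(result.key + " (" + result.value + ")");  // suggestion + weight
        }
    }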

Google does the same "Did you mean ...?" trick in #neuralempty too!
Table 4.1 comparing different autocomplete outputs is a very good example of how #nlproc papers should present their picked cherries.

Each column tells a different system's story, and each row shows how the systems' stories change as more context is given.
Section 5.1 on the importance of ranking is also very enlightening!! Never thought of users that way.

Users are ____ and un________.

(P/S: Avoiding spoilers 😁)
Feeling a strange itch to start putting up @huggingface's Transformers + @srchvrs' nmslib snippets to complement the DL4J + Lucene code in the book...

Note to self: Must resist starting more side-projects...
Section 5.5 on metrics is a good reminder that foundations don't change much. It has been ~7 years since I last dealt with search, and I'm glad the same metrics I learnt from the awesome tutors at COLI (Saarland) are still relevant today.
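
For my own notes, the NDCG@k flavour of those metrics fits in a few lines (my sketch, using the common exponential-gain form; rel[i] is the graded relevance of the result at rank i+1):

    // DCG@k with exponential gain and a log2 rank discount
    static double dcgAtK(double[] rel, int k) {
        double dcg = 0.0;
        for (int i = 0; i < Math.min(k, rel.length); i++) {
            dcg += (Math.pow(2, rel[i]) - 1) / (Math.log(i + 2) / Math.log(2)); // log2(rank + 1)
        }
        return dcg;
    }

    // NDCG@k = system DCG@k divided by the DCG@k of the ideal (relevance-sorted) ranking
    static double ndcgAtK(double[] rel, int k) {
        double[] ideal = rel.clone();
        java.util.Arrays.sort(ideal);                    // ascending...
        for (int i = 0; i < ideal.length / 2; i++) {     // ...reversed to get the ideal order
            double tmp = ideal[i];
            ideal[i] = ideal[ideal.length - 1 - i];
            ideal[ideal.length - 1 - i] = tmp;
        }
        return dcgAtK(rel, k) / dcgAtK(ideal, k);
    }
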
Chapter 6's section on recommenders and MoreLikeThis is really interesting; it looks like Google has been using similar mechanisms in the "People also ask" feature.

I wonder what the overlap is between "Did you mean ...?" and "People also ask ...?".
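
The MoreLikeThis bit really is small in Lucene; a minimal sketch (the field names, the open reader, and docId are my stand-ins, and IOExceptions are left undeclared):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queries.mlt.MoreLikeThis;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TopDocs;

    // given the doc the user is reading, ask the index for similar ones
    MoreLikeThis mlt = new MoreLikeThis(reader);          // reader: an open IndexReader
    mlt.setAnalyzer(new StandardAnalyzer());
    mlt.setFieldNames(new String[] {"title", "body"});    // hypothetical field names
    Query like = mlt.like(docId);                         // docId: the current doc's internal id
    TopDocs similar = new IndexSearcher(reader).search(like, 10);
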
Ah, at last! Was wondering when Seq2Seq would appear in the book, and voilà, Part 3 #neuralempty !!!
Oh, really nice section on "Working with parallel corpora" in section 7.3; the introduction to TMX is a must-read for all #neuralempty folks who haven't heard of TMX before 😄
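
For the uninitiated: TMX is just XML of translation units, so a (lang, segment) extractor is tiny. A minimal StAX sketch (file name is made up; exceptions left undeclared for brevity):

    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamReader;
    import java.io.FileInputStream;

    // a TMX body is a list of <tu> units, each holding per-language <tuv><seg> segments
    XMLStreamReader xml = XMLInputFactory.newInstance()
            .createXMLStreamReader(new FileInputStream("corpus.tmx"));
    String lang = null;
    while (xml.hasNext()) {
        if (xml.next() == XMLStreamReader.START_ELEMENT) {
            if ("tuv".equals(xml.getLocalName())) {       // remember the segment's language
                lang = xml.getAttributeValue("http://www.w3.org/XML/1998/namespace", "lang");
            } else if ("seg".equals(xml.getLocalName())) {
                System.out.println(lang + "\t" + xml.getElementText());
            }
        }
    }
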
Snippets 7.7 to 7.12 are a very good introduction to unsupervised #neuralempty!!

Food for thought: multilingual Sesame Street language models already do some pseudo joint learning; what would a projection matrix learn in these pre-trained models?
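
Sketching the classic version of that thought with ND4J (my own sketch, à la Mikolov's translation matrix, not from the book): stack n seed translation pairs as X (n x d source vectors) and Y (n x d target vectors), then solve the normal equations W = (XᵀX)⁻¹XᵀY.

    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.inverse.InvertMatrix;

    // least-squares fit of a linear map from the source embedding space to the target one
    INDArray xtx = X.transpose().mmul(X);                                      // d x d
    INDArray W = InvertMatrix.invert(xtx, false).mmul(X.transpose()).mmul(Y);  // d x d
    INDArray projected = srcVec.mmul(W);    // project a new 1 x d source vector into target space
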
Chapter 8 is yet another reminder that lost knowledge exists in our field. When I first worked with images, CNNs were already popular. I'm embarrassed to say that this is the first time I've learnt about LIRE, and I had only barely heard of SIFT before.
Chapter 8 is also really nice in introducing many small concepts that are applicable to most ML tasks:

Representation, compression, nearest-neighbour search, locality-sensitive hashing, variational approaches, and latent spaces (covered briefly, but a good way to inject new knowledge).
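
Of those, LSH is the one that fits in a tweet; a self-contained toy sketch of the random-hyperplane flavour for cosine similarity (my own version, nothing library-specific):

    import java.util.Random;

    // vectors whose dot-product signs agree on every hyperplane share a bucket,
    // so nearest-neighbour search only has to scan one bucket instead of everything
    static int lshBucket(float[] vec, float[][] hyperplanes) {
        int bucket = 0;
        for (int i = 0; i < hyperplanes.length; i++) {
            float dot = 0f;
            for (int j = 0; j < vec.length; j++) dot += vec[j] * hyperplanes[i][j];
            if (dot >= 0) bucket |= (1 << i);   // one bit per hyperplane
        }
        return bucket;
    }

    static float[][] randomHyperplanes(int bits, int dim, long seed) {
        Random rng = new Random(seed);
        float[][] h = new float[bits][dim];
        for (float[] row : h)
            for (int j = 0; j < dim; j++) row[j] = (float) rng.nextGaussian();
        return h;
    }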