Thread by @kzwa, Do me a favor? Drop what you're doing RIGHT NOW, and take [...]

Do me a favor? Drop what you're doing RIGHT NOW, and take a second to play with this new @LC_Labs project:

Search photos from millions of historic newspapers by keyword from their captions THEN train a machine learning algorithm to find more visually similar items https://twitter.com/LC_Labs/status/1305884978253901825

Congratulations to Innovator in Residence, @lee_bcg, who managed to make something important, useful, and fun in his short residency, which is the magic middle of the happy software Venn diagram

There are a few things I love about this experiment: one is that it's a tangible example of machine learning in our context, following several reports, experiments, and meetings over the past year or so https://labs.loc.gov/work/reports/

This multi-modal approach (including practical experimentation, consultation with researchers and the literature, collaboration with implementors) will hopefully allow us to consider many facets of this technology before we think about production applications.

ANOTHER thing that I absolutely LOVE about this project is that it highlights the value of "the village" in innovation. Obviously, what makes this experiment possible is @lee_bcg's creativity, insight, drive, collaborative approach & hard-won technical ability

What's less obvious are all of the things that came first and during, which I'll list (incompletely!) here:

1. The NDNP newspaper program, a partnership between @NEHgov & @librarycongress which has enabled millions of historic newspaper pages to be published on the open web

2. The newspapers from NDNP are served as a corpus on the Chronicling America website, which later added an API. Both the website and the API were built by Library developers in partnership with users and product owners https://chroniclingamerica.loc.gov/about/api/

When @LC_Labs was first considering an Innovator in Residence program (led by @JaimeMears and later joined by @JakewayEileen), we piloted it internally, inviting Library staff to propose a short-term project.

3. @tongwang, a senior software developer at the Library, proposed making a crowdsourcing tool that would make the images in Chronicling America more discoverable. (We OCR the text in the newspapers, but captions are often smaller, so their OCR has more errors.)

So he made, and LC Labs launched, Beyond Words, one of the first experiments in Labs http://beyondwords.labs.loc.gov/

It invited anyone to draw bounding boxes around images (photos, cartoons, maps, etc) in the newspapers and transcribe the captions.

This experiment was LC Labs' (though not the Library's) first crowdsourcing project, and the learnings we took were super useful when we later spun up @Crowd_LOC. I should also mention this launch was @MeghaninMotion's first assignment, which she CRUSHED

4. I also want to highlight all of the consultation and guidance that folks were so generous with, including folks from our Serials division, other colleagues in Labs I haven't already mentioned (@opba, @leahwg, @librlaurie), Ben's academic colleagues, and deployment help from @acdha

5. Not to mention all of the open-source software, libraries and tools that all of this rests on, like a foundation.

There's beauty in seeing this experiment as a product of genius. There's also beauty in seeing it as a shared story with many narrators, each handing it off to another to build on. Both are true and not in conflict.

Thanks for listening to KZ story time. Please add on here if you think of elements of this story that are important to tell (or tag people relevant to it) <3
