Kragen Sitaker did amazing work back in 2005/2006 'liberating' the first edition of the OED, which is now (mostly) in the public domain [2]. He posted fairly good scans of volumes 1-6 (see [2]). At the time, however, he was unable to do much on the OCR front, no doubt because open source OCR performed poorly, particularly on a text as complex as the OED, with its many font changes and large amount of non-standard English. With a better open source OCR engine it would be possible to convert the OED scans back into text, and perhaps to wikify the result to allow gradual proofreading and correction.


  1. OCR the scanned pages to text
  2. Load the text into a wiki (or the like) for proofreading and correction
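
The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the OCR engine is left as a pluggable callable (`ocr`, hypothetical) since no specific engine is settled on here, the page title scheme `OED1/Volume N/Page M` is invented for the example, and the output uses MediaWiki-style markup, which may not match whatever wiki ends up being used.

```python
import re
from typing import Callable, Iterable, Iterator, Tuple


def join_hyphenated(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line.

    Caveat: this also joins genuine hyphenated compounds that happen to
    break at a line end, so proofreaders would still need to check them.
    """
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def page_to_wikitext(volume: int, page: int, ocr_text: str) -> Tuple[str, str]:
    """Return (wiki page title, wiki page body) for one OCR'd page.

    The title scheme and the MediaWiki-style markup are assumptions
    made for this sketch.
    """
    title = f"OED1/Volume {volume}/Page {page}"
    body = (
        f"== Volume {volume}, page {page} ==\n"
        "''Uncorrected OCR output; please proofread against the scan.''\n\n"
        + join_hyphenated(ocr_text).strip()
        + "\n"
    )
    return title, body


def build_pages(
    volume: int,
    scans: Iterable[Tuple[int, str]],
    ocr: Callable[[str], str],
) -> Iterator[Tuple[str, str]]:
    """Step 1 (OCR) and step 2 (prepare wiki pages) for one volume.

    `scans` yields (page number, image path) pairs; `ocr` is whatever
    function turns an image path into raw text (hypothetical here).
    """
    for page, image_path in scans:
        yield page_to_wikitext(volume, page, ocr(image_path))
```

Keeping one wiki page per scanned page means each page can be proofread against its scan independently, which suits gradual correction by many people.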


Was originally recorded as:

submitted 02 Jan '11, 19:31

rgrp ♦♦