UW English Document Image Database I

               A Database of Document Images for OCR Research

  A 2-CD-ROM set containing 1147 document page images from English
  scientific and technical journals, featuring:

- Binary images scanned from first- and later-generation photocopies

- Binary and grayscale images scanned directly from technical journals

- Synthetic noise-free images generated from LaTeX files

- Document images from the UNLV ISRI database

- All document images zoned and tagged

- Software for OCR performance evaluation

- Software for simulation of photocopy degradation

- Text ground truth generated by two independent data-entry
    operators, followed by three independent verifications

Each document page has associated with it:

- Text ground truth data for each text zone

- Bounding box information for each zone on the page

- Coarse-level attributes for each document page

- Finer-level attributes (such as font size, alignment, etc.) for each zone

- Qualitative information on the condition of each page

AT&T Bell Labs degraded character images database

Price: $200 plus $10 shipping and handling.

Make your purchase order (P.O.) out to:

        Intelligent Systems Laboratory
        Dept. of Electrical Engineering, FT-10
        University of Washington
        Seattle, WA 98195
        Attention: Dr. Robert M. Haralick

        Phone: (206) 685-4974
        FAX: (206) 543-3842
        e-mail: haralick@ee.washington.edu

Please make checks payable to Intelligent Systems Laboratory,
Univ. of Washington.