Newsgroups: comp.lang.java.programmer
From: chad altenburg
Date: Wed, 12 Feb 2020 08:38:42 -0800 (PST)
Subject: Optical Character Recognition Algorithm question
Message-ID: <5c823915-a4fb-49c5-a24f-bb2e741f69eb@googlegroups.com>

I have a camera that reads text from an image, for example

12:05 PM

and converts it to the corresponding text so that the computer can process the time further. The problem is that the "0" has a line through it (a slashed zero). As a result, the algorithm sometimes interprets the slashed "0" as either an "8" or an "e".

Should I handle these misreads as special cases in the OCR processing system itself, or should I try to train the model further on such images at the machine learning code level?
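
To make the first option concrete, here is a rough sketch of what special-casing in a post-processing step might look like. It is only an illustration and assumes things the post does not state: that the OCR output is always a 12-hour time of the form "hh:mm AM/PM", and that "e" and "8" are the only misreads of the slashed zero. The class name `OcrTimeFixer` and the method `normalize` are hypothetical. The idea is that an "e" can never be a digit, while an "8" is rewritten only in slots where an 8 cannot occur (tens of minutes, tens of hours, or an hour of "18").

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: repairs slashed-zero misreads in a 12-hour time string.
final class OcrTimeFixer {
    // Loose shape of the raw OCR output: digit slots may also contain 'e'.
    private static final Pattern LOOSE =
        Pattern.compile("\\s*([0-9e]{1,2}):([0-9e])([0-9e])\\s*([AaPp][Mm])\\s*");
    // Strict shape of a valid 12-hour time after repair.
    private static final Pattern STRICT =
        Pattern.compile("(0?[1-9]|1[0-2]):[0-5][0-9] (AM|PM)");

    // A letter can never be a digit, so 'e' in a digit slot is read as '0'.
    private static char fixLetter(char c) {
        return c == 'e' ? '0' : c;
    }

    // Use only for slots where an 8 is impossible; there '8' is also read as '0'.
    private static char fixImpossibleEight(char c) {
        char fixed = fixLetter(c);
        return fixed == '8' ? '0' : fixed;
    }

    // Returns the repaired time, or empty if the text still is not a valid time.
    static Optional<String> normalize(String raw) {
        Matcher m = LOOSE.matcher(raw);
        if (!m.matches()) {
            return Optional.empty();
        }
        String hour = m.group(1);
        String fixedHour;
        if (hour.length() == 2) {
            char tens = fixImpossibleEight(hour.charAt(0));
            char ones = fixLetter(hour.charAt(1));
            if (tens == '1' && ones == '8') {
                ones = '0'; // 18 is not a 12-hour value, so assume 10
            }
            fixedHour = "" + tens + ones;
        } else {
            fixedHour = String.valueOf(fixLetter(hour.charAt(0)));
        }
        char minuteTens = fixImpossibleEight(m.group(2).charAt(0));
        char minuteOnes = fixLetter(m.group(3).charAt(0)); // an 8 here may be real
        String candidate = fixedHour + ":" + minuteTens + minuteOnes + " "
            + m.group(4).toUpperCase(Locale.ROOT);
        return STRICT.matcher(candidate).matches()
            ? Optional.of(candidate)
            : Optional.empty();
    }
}
```

With this sketch, `OcrTimeFixer.normalize("12:e5 PM")` and `OcrTimeFixer.normalize("12:85 PM")` both give "12:05 PM", while "12:08 PM" is left alone because an 8 in the last minute digit may be genuine. That leftover ambiguity is the limit of the special-case approach: where a real 8 is possible, only a better-trained recognizer can tell the two apart.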