02 Nov 01:34

junhoyeo

5b17ef8

v1.2.0 (🔍 Pororo OCR)

BetterOCR

🔍 BetterOCR combines results from multiple OCR engines with an 🧠 LLM to correct & reconstruct the output.

Before / After comparison (✨ latest at v1.2.0)

Improved English/Korean text recognition with new Pororo OCR support! 🎉
Special thanks to @black7375 for https://github.com/black7375/korean_ocr_using_pororo, which suggested using EasyOCR for text detection and BrainOCR (Pororo's OCR module) for text recognition, and for #2.
Also kudos to the @kakaobrain team and @yunwoong7.

Notes

Pororo is used only if the specified language options (lang) include 🇺🇸 English (en) or 🇰🇷 Korean (ko). The additional dependencies listed in [tool.poetry.group.pororo.dependencies] must also be available; if they are not, Pororo is automatically excluded from the enabled engines.
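The rule in the note above can be sketched roughly as follows. This is a minimal illustration, not BetterOCR's actual code: the function names, the engine labels, and the choice of `torch` as the module that signals "Pororo dependencies are installed" are all assumptions.

```python
import importlib.util

# Languages for which the Pororo-based engine is considered (from the note above).
PORORO_LANGS = {"en", "ko"}


def module_available(name: str) -> bool:
    """Return True if the module `name` can be found in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def select_engines(lang, pororo_module="torch", is_available=module_available):
    """Pick the OCR engines to enable for the requested languages.

    `pororo_module` is a hypothetical stand-in for the optional dependency
    group; the engine labels are illustrative only.
    """
    engines = ["easyocr", "tesseract"]
    wants_pororo = any(code in PORORO_LANGS for code in lang)
    # Pororo is added only when a supported language is requested AND
    # its optional dependencies are importable; otherwise it is skipped.
    if wants_pororo and is_available(pororo_module):
        engines.append("easy_pororo")
    return engines
```

Passing `is_available` as a parameter keeps the dependency check swappable, which makes the selection rule easy to exercise without installing the optional group.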

What's Changed

[ImgBot] Optimize images by @imgbot in #7
Write parser tests by @junhoyeo in #9
EasyPororoOCR Integration by @junhoyeo in #8

New Contributors

@imgbot made their first contribution in #7

Full Changelog: v1.1.2...v1.2.0

Contributors

black7375, kakaobrain, and 3 other contributors


29 Oct 06:52

junhoyeo

v1.1.2

198d4c1

v1.1.2 (🛠️ Bug Fix)

BetterOCR

🔍 BetterOCR combines results from multiple OCR engines with an 🧠 LLM to correct & reconstruct the output.

What's Changed

Fix Incorrect Inclusion of API_KEY in chat.completions.create Call by @snacsnoc in #3
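One way to guard against this class of bug is to keep credentials separate from per-request parameters before anything is forwarded to the completion call. The sketch below assumes the options arrive as a single dictionary; `split_openai_options` and the key names are hypothetical, and the code deliberately does not call the OpenAI SDK.

```python
# Keys that configure the client itself and must never be forwarded
# as request parameters. The exact key names are an assumption.
CLIENT_ONLY_KEYS = {"API_KEY", "api_key"}


def split_openai_options(options):
    """Split a user-supplied options dict into (client_options, request_options)."""
    client_options = {k: v for k, v in options.items() if k in CLIENT_ONLY_KEYS}
    request_options = {k: v for k, v in options.items() if k not in CLIENT_ONLY_KEYS}
    return client_options, request_options


client_opts, request_opts = split_openai_options(
    {"API_KEY": "sk-placeholder", "model": "gpt-4", "temperature": 0}
)
# Only `request_opts` would be passed on to the completion request.
```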

New Contributors

@snacsnoc made their first contribution in #3

Full Changelog: v1.1.1...v1.1.2

Contributors

snacsnoc


29 Oct 05:59

junhoyeo

v1.1.1

618d7be

v1.1.1 (Bug Fixes for 📦 Box Detection)

BetterOCR

🔍 BetterOCR combines results from multiple OCR engines with an 🧠 LLM to correct & reconstruct the output.

What's Changed

Improve the prompt used in Box Detection
Fix a bug in the fallback logic of detect_boxes (used when the LLM output format is invalid)
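A fallback of this kind can be sketched as below. This is only an illustration of the general pattern, not the project's implementation: the expected reply shape (a JSON list of objects with `text` and a four-point `box`) and the names `parse_boxes` and `is_valid_box` are assumptions.

```python
import json


def is_valid_box(box):
    """A box is assumed to be four [x, y] points with numeric coordinates."""
    return (
        isinstance(box, list)
        and len(box) == 4
        and all(
            isinstance(point, list)
            and len(point) == 2
            and all(isinstance(value, (int, float)) for value in point)
            for point in box
        )
    )


def parse_boxes(llm_output, fallback_boxes):
    """Return boxes parsed from the LLM reply, or `fallback_boxes` if it is invalid."""
    try:
        data = json.loads(llm_output)
    except (json.JSONDecodeError, TypeError):
        # Not JSON at all (or not a string): use the raw OCR boxes instead.
        return fallback_boxes
    if not isinstance(data, list):
        return fallback_boxes
    for item in data:
        if not (
            isinstance(item, dict)
            and isinstance(item.get("text"), str)
            and is_valid_box(item.get("box"))
        ):
            # One malformed entry invalidates the whole reply.
            return fallback_boxes
    return data
```

Rejecting the whole reply on the first malformed entry is a conservative choice; a more lenient variant could keep the valid entries and drop the rest.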

Full Changelog: v1.1.0...v1.1.1


29 Oct 04:59

junhoyeo

v1.1.0

3054a85

v1.1.0 (📦 Box Detection)

BetterOCR

🔍 BetterOCR combines results from multiple OCR engines with an 🧠 LLM to correct & reconstruct the output.

What's Changed

Box Detection (detect_boxes) was implemented by @junhoyeo in #1
Demo: https://github.com/junhoyeo/BetterOCR#-box-detection (Example Script: https://github.com/junhoyeo/BetterOCR/blob/main/examples/detect_boxes.py)

Original / Detected comparison

Full Changelog: https://github.com/junhoyeo/BetterOCR/commits/v1.1.0

Contributors

junhoyeo


28 Oct 08:06

junhoyeo

v1.0.1

a47f842

v1.0.1 (Initial Release 📖)

BetterOCR

🔍 Better text detection by combining multiple OCR engines with an 🧠 LLM.

OCR still sucks! ... Especially when you're from the other side of the world (and face a significant lack of training data in your language) — or just not thrilled with noisy results.

BetterOCR combines results from multiple OCR engines with an LLM to correct & reconstruct the output.

🔍 OCR Engines: Currently supports EasyOCR and Tesseract.
🧠 LLM: Supports Chat models from OpenAI.
📒 Custom Context: Allows users to provide an optional context to use specific keywords such as proper nouns and product names. This assists in spelling correction and noise identification, ensuring accuracy even with rare or unconventional words.
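To make the idea concrete, here is a minimal sketch of how results from several engines and an optional context string might be merged into a single LLM prompt. The prompt wording and the function name `build_prompt` are assumptions for illustration; they are not the prompt BetterOCR actually sends.

```python
def build_prompt(engine_results: dict, context: str = "") -> str:
    """Build one prompt from per-engine OCR text and an optional context.

    `engine_results` maps an engine label to the raw text it produced.
    """
    lines = [
        "Combine and correct the OCR results below into a single clean text.",
        "",
    ]
    for engine, text in engine_results.items():
        lines.append(f"[{engine}] {text.strip()}")
    if context.strip():
        # The context carries proper nouns or product names that the
        # model can use to fix spelling and discard noise.
        lines.append("")
        lines.append(f"Context (keywords that may appear): {context.strip()}")
    return "\n".join(lines)


prompt = build_prompt(
    {"easyocr": " Helo wrld ", "tesseract": "Hello world"},
    context="Hello",
)
```

The resulting string would then be sent to a chat model as the user message; the context block is omitted entirely when no context is given.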

Head over to 💯 Examples to view performance by language (🇺🇸, 🇰🇷, 🇮🇳).

Coming Soon: improved interface, async support, box detection, and more.
