This project is a web-based application that lets users upload images and automatically extracts any visible text using Optical Character Recognition (OCR). The backend is built with Flask and integrates OpenCV for image preprocessing and Tesseract OCR for text extraction. The frontend provides a responsive interface for uploading images and viewing the extracted text.

shakiliitju/Image-to-Text-Converter

Image-to-Text Converter

A web application that extracts visible text from images using Optical Character Recognition (OCR). It is built with Flask, OpenCV, Pillow, and Tesseract OCR, and accepts images in several common formats.


Features

  • Upload images in multiple formats (PNG, JPG, JPEG, GIF, BMP, TIFF)
  • Automatic image preprocessing with OpenCV for enhanced OCR accuracy
  • Advanced text extraction using Tesseract OCR
  • Real-time text extraction and display
  • Clean, responsive web interface
  • Cross-platform compatibility (Windows, Linux, macOS)
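
The format list above implies some kind of extension check on upload. The following is a minimal sketch of such a check, not the code in app.py; the names `ALLOWED_EXTENSIONS` and `allowed_file` are hypothetical.

```python
from pathlib import Path

# Formats listed in this README; the constant name is an assumption
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "gif", "bmp", "tiff"}


def allowed_file(filename: str) -> bool:
    """Return True if the filename ends in one of the supported extensions."""
    suffix = Path(filename).suffix  # e.g. ".PNG", or "" when there is no extension
    return suffix.lower().lstrip(".") in ALLOWED_EXTENSIONS
```

The comparison is case-insensitive, so `scan.PNG` passes while `report.pdf` is rejected. An extension check alone does not prove the file is a valid image, so the decoding step should still handle failures.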

Installation

1. Install Tesseract OCR

Windows:
Download and install from UB Mannheim’s Tesseract page (commonly recommended for Windows users).
Default path: C:\Program Files\Tesseract-OCR

Linux (Ubuntu/Debian):

sudo apt update
sudo apt install tesseract-ocr

macOS:

brew install tesseract

2. Add Tesseract to your System PATH (Windows)

  • Open Start Menu → search "Environment Variables" → Edit the system environment variables.
  • Click "Environment Variables".
  • In "System Variables", select Path → Edit → New.
  • Add your Tesseract install directory (e.g. C:\Program Files\Tesseract-OCR).
  • Restart your terminal.

Test installation:

tesseract -v
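
The same check can be run from Python, which is handy when the Flask process sees a different PATH than your terminal. This is a small sketch using only the standard library; the function name `tesseract_version` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version(executable: str = "tesseract") -> Optional[str]:
    """Return the first line of the version banner, or None if not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some builds print the banner to stderr instead of stdout
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    print(tesseract_version() or "tesseract not found on PATH")
```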

3. Clone the Repository

git clone https://github.com/shakiliitju/Image-to-Text-Converter.git
cd Image-to-Text-Converter

4. Install Python Dependencies

Install dependencies using the included requirements.txt file:

pip install -r requirements.txt

The requirements.txt includes:

Flask
opencv-python
pytesseract
Pillow
numpy
requests
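
To confirm the packages landed in the interpreter you plan to run, a quick check like the one below may help. It is a sketch: the mapping from package name to import name (`opencv-python` to `cv2`, `Pillow` to `PIL`) follows those libraries' usual conventions, and `missing_packages` is a hypothetical helper.

```python
from importlib.util import find_spec
from typing import Dict, List, Optional

# pip package name -> module name used in import statements
REQUIRED = {
    "Flask": "flask",
    "opencv-python": "cv2",
    "pytesseract": "pytesseract",
    "Pillow": "PIL",
    "numpy": "numpy",
    "requests": "requests",
}


def missing_packages(required: Optional[Dict[str, str]] = None) -> List[str]:
    """Return the pip names whose modules cannot be found by the interpreter."""
    required = REQUIRED if required is None else required
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("All dependencies found" if not missing else f"Missing: {missing}")
```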

Usage

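  1. Configure Tesseract Path: Make sure Tesseract is installed and the path set in app.py points to your tesseract executable. The value shown is the Windows default; on Linux or macOS, use the location of the tesseract binary instead (for example, the output of which tesseract):

    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'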
  2. Start the Flask Application:

    python app.py
  3. Access the Application: Open your web browser and navigate to http://127.0.0.1:5000

  4. Upload and Process: Select an image file and click upload to extract text from the image.
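
Because the path in step 1 is Windows-specific, one option is to resolve it at startup rather than hard-coding it. The sketch below is an assumption about how that could look, not what app.py does; `resolve_tesseract_cmd` and the `TESSERACT_CMD` environment variable are hypothetical names.

```python
import os
import shutil
import sys
from typing import Optional

# Default install location named in this README (Windows only)
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def resolve_tesseract_cmd(env_var: str = "TESSERACT_CMD") -> Optional[str]:
    """Pick an executable: environment override, then PATH, then Windows default."""
    override = os.environ.get(env_var)  # hypothetical variable name
    if override:
        return override
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    if sys.platform.startswith("win") and os.path.isfile(WINDOWS_DEFAULT):
        return WINDOWS_DEFAULT
    return None


# In app.py this could replace the hard-coded assignment:
#   cmd = resolve_tesseract_cmd()
#   if cmd:
#       pytesseract.pytesseract.tesseract_cmd = cmd
```

When the function returns None, leaving `tesseract_cmd` untouched lets pytesseract fall back to its own default lookup.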


Project Structure

Image-to-Text-Converter/
│
├── app.py                 # Main Flask application
├── requirements.txt       # Python dependencies
├── LICENSE               # MIT License file
├── README.md            # Project documentation
├── uploads/             # Directory for uploaded images
├── templates/           # HTML templates
│   └── index.html      # Main web interface
└── static/             # Static files (CSS, JS)
    ├── style.css       # Stylesheet
    └── script.js       # JavaScript functionality

Troubleshooting

Common Issues

  1. Tesseract not found error:

    • Ensure Tesseract is properly installed
    • Verify the path in app.py matches your installation
    • Add Tesseract to your system PATH
  2. Poor OCR results:

    • Use high-resolution images with clear text
    • Ensure good contrast between text and background
    • Try preprocessing the image manually if needed
  3. Flask server not starting:

    • Check if port 5000 is available
    • Ensure all dependencies are installed
    • Run pip install -r requirements.txt again
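
To check whether port 5000 is available before starting the server, a short standard-library test can be used. This is a sketch; `port_is_free` is a hypothetical helper, and it only checks the loopback address given in the Usage section.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"
            return False
    return True


if __name__ == "__main__":
    print("Port 5000 is free" if port_is_free(5000) else "Port 5000 is in use")
```

If the port is taken, either stop the other process or run the app on a different port. Assuming app.py starts the server with `app.run()`, passing `port=5001` there is one way to do that.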

Notes

  • This tool extracts visible text overlays from images, not diagnostic scan data
  • For optimal results, use high-resolution images with clear, readable text
  • Supported image formats: PNG, JPG, JPEG, GIF, BMP, TIFF
  • For PDF or DICOM support, convert pages/files to image format before uploading
  • The application automatically preprocesses images using OpenCV for better OCR accuracy

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

MIT License


Acknowledgements
