Flaskraper

Flaskraper is a Flask web app that searches job boards and displays scraped listings in the browser. Pick a source, enter a keyword, and get back job titles, companies, descriptions, and links — all fetched live from the target site.

Features

Search-engine-style home page with centered title and search bar
Multi-site support via a scraper registry
Results table with title, company, description, and a link to the original posting
Pluggable scraper architecture — each site defines its own URL builder and HTML parsers

Supported sources

| Source | Example queries |
| --- | --- |
| Berlin Startup Jobs | `python`, `javascript`, `typescript` |
| Web3 Careers | `python`, `rust`, `solidity` |
| We Work Remotely | `python`, `react`, `design` |

Tech stack

Flask — web framework
BeautifulSoup + lxml — HTML parsing
requests — HTTP client
gunicorn — production WSGI server

Project structure

flaskraper/
├── app.py                  # WSGI entry point for gunicorn
├── main.py                 # Local dev entry point
├── requirements.txt
├── render.yaml             # Render deployment config
├── templates/
│   ├── home.html           # Home / search form
│   └── search.html         # Search results
└── flaskraper/
    ├── __init__.py         # create_app() and routes
    ├── pages/
    │   ├── home.py         # Home page context
    │   └── search.py       # Search page context
    └── scrapers/
        ├── registry.py     # Scraper registry and config
        ├── scrapper.py     # Generic pagination scraper
        ├── runner.py       # Runs a scraper with error handling
        ├── berlin_startup_jobs.py
        ├── web3_careers.py
        └── we_work_remotely.py

How it works

1. The user submits a keyword and source from the home page (`/`).
2. Flask routes the request to `/search?q=...&site=...`.
3. The selected scraper builds a target URL from the query.
4. The Scrapper class fetches the page, detects pagination, and collects job listings.
5. Each job is normalized to a dict with title, company, description, and link.
6. Results are rendered in search.html.
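
The steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: `SCRAPERS`, `search`, `parse_jobs`, and `fake_fetch` are hypothetical names, and the HTTP request is replaced by a canned function so the example runs offline.

```python
from typing import Callable
from urllib.parse import quote_plus

JOB_KEYS = ("title", "company", "description", "link")


def fake_fetch(url: str) -> str:
    """Stand-in for the HTTP request; returns canned text instead of HTML."""
    return "Backend Engineer|Acme|Build APIs|https://example.com/jobs/1"


def parse_jobs(page: str) -> list[dict[str, str]]:
    """Turn each line of the canned page into a four-key job dict."""
    return [dict(zip(JOB_KEYS, line.split("|"))) for line in page.splitlines() if line]


# Hypothetical registry: one entry per source, keyed by the `site` parameter.
SCRAPERS: dict[str, dict[str, Callable]] = {
    "example": {
        "build_url": lambda q: f"https://example.com/jobs?q={quote_plus(q)}",
        "parse_jobs": parse_jobs,
    }
}


def search(site: str, query: str, fetch: Callable[[str], str] = fake_fetch) -> list[dict[str, str]]:
    """Look up the scraper for `site`, build the URL, fetch the page, parse jobs."""
    scraper = SCRAPERS.get(site)
    if scraper is None:
        return []  # unknown source: nothing to show
    url = scraper["build_url"](query)
    return scraper["parse_jobs"](fetch(url))
```

Passing `fetch` as a parameter is a design choice for this sketch only: it keeps the network call replaceable so the lookup and parsing steps can be exercised without hitting a real site.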

Getting started

Prerequisites

Python 3.11+

Install

git clone <repository-url>
cd flaskraper
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run locally

python3 main.py

Open http://localhost:3333.

The dev server binds to 0.0.0.0 and uses port 3333 by default. Override with the PORT environment variable:

PORT=8000 python3 main.py
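
For illustration, the port override could be read along these lines. This is a sketch under assumptions: `resolve_port` and `DEFAULT_PORT` are hypothetical names, and the real main.py may do this differently.

```python
import os
from typing import Mapping

DEFAULT_PORT = 3333  # default port stated in this README


def resolve_port(environ: Mapping[str, str] = os.environ) -> int:
    """Return the port from the PORT variable if set, else the default."""
    return int(environ.get("PORT", DEFAULT_PORT))
```

The result would then presumably be passed to Flask's `app.run(host="0.0.0.0", port=resolve_port())`.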

Run with gunicorn

gunicorn --bind 0.0.0.0:3333 app:app

Deployment

The repo includes a Render blueprint in render.yaml. Render installs dependencies and starts the app with:

gunicorn --bind 0.0.0.0:$PORT app:app

Adding a new scraper

1. Create a module under flaskraper/scrapers/ with two functions:
   - scrap_pages_in_<site>(document): returns pagination elements
   - scrap_jobs_in_<site>(scrapper): returns a list of job dicts
2. Register the scraper in flaskraper/scrapers/registry.py:

"mysite": ScraperConfig(
    id="mysite",
    name="My Site",
    # quote_plus (from urllib.parse) escapes spaces and reserved characters in the query
    build_url=lambda query: f"https://example.com/jobs?q={quote_plus(query)}",
    scrap_pages=scrap_pages_in_mysite,
    scrap_jobs=scrap_jobs_in_mysite,
    search_placeholder="Try keyword, role, or skill",
),
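
To make the registration entry easier to picture, here is one possible shape for `ScraperConfig` and its registry. The field list is inferred only from the keyword arguments in the entry above. The dataclass definition, the `SCRAPERS` dict name, and the placeholder parser bodies are assumptions, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Any, Callable
from urllib.parse import quote_plus


@dataclass(frozen=True)
class ScraperConfig:
    """Assumed shape, inferred only from the keyword arguments shown above."""
    id: str
    name: str
    build_url: Callable[[str], str]
    scrap_pages: Callable[[Any], Any]
    scrap_jobs: Callable[[Any], list]
    search_placeholder: str = ""


def scrap_pages_in_mysite(document):
    """Placeholder: a real parser would return the pagination elements."""
    return []


def scrap_jobs_in_mysite(scrapper):
    """Placeholder: a real parser would return a list of job dicts."""
    return []


# Hypothetical registry dict keyed by scraper id.
SCRAPERS = {
    "mysite": ScraperConfig(
        id="mysite",
        name="My Site",
        # quote_plus escapes spaces and reserved characters in the keyword.
        build_url=lambda query: f"https://example.com/jobs?q={quote_plus(query)}",
        scrap_pages=scrap_pages_in_mysite,
        scrap_jobs=scrap_jobs_in_mysite,
        search_placeholder="Try keyword, role, or skill",
    ),
}
```

A frozen dataclass is used here so that registry entries cannot be mutated by accident at request time. That is a choice made for this sketch, not something this README states.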

Each job dict should include:

{
    "title": "...",
    "company": "...",
    "description": "...",
    "link": "...",
}
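
A small helper can enforce this shape before results reach the template. The following is a hedged sketch: `normalize_job` and `JOB_KEYS` are hypothetical names, and the project may normalize jobs differently.

```python
JOB_KEYS = ("title", "company", "description", "link")


def normalize_job(raw: dict) -> dict[str, str]:
    """Return a dict with exactly the four expected keys.

    Missing or None values become empty strings, runs of whitespace are
    collapsed to single spaces, and any extra keys are dropped.
    """
    return {key: " ".join(str(raw.get(key) or "").split()) for key in JOB_KEYS}
```

For example, `normalize_job({"title": "  Senior\n Dev ", "link": "https://example.com/1"})` yields a dict whose `title` is `"Senior Dev"` and whose `company` and `description` are empty strings.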

License

MIT
