A production-ready FastAPI backend for text processing with user authentication, paragraph submission, and efficient word search functionality.
- User registration and authentication with JWT tokens
- Submit and store paragraphs of text
- Automatic word indexing and frequency analysis
- Search paragraphs by word with relevance ranking
- RESTful API with automatic documentation
- User Authentication: Secure JWT-based authentication with access and refresh tokens
- Text Processing: Efficient word indexing and frequency analysis
- Search Capabilities: Fast search with relevance ranking by word frequency
- Asynchronous Processing: Background tasks for non-blocking indexing operations
- Containerized: Ready for Docker deployment
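The indexing and ranking features listed above can be illustrated with a small standard-library sketch. It is not the project's actual `indexing.py`: the tokenization rule (lowercase runs of letters, digits, and apostrophes) and the function names `count_words`, `index_paragraphs`, and `search` are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a "word" is a lowercase run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def count_words(paragraph: str) -> Counter:
    """Return case-insensitive word frequencies for one paragraph."""
    return Counter(WORD_RE.findall(paragraph.lower()))


def index_paragraphs(paragraphs: list[str]) -> tuple[list[Counter], Counter]:
    """Return (per-paragraph counts, totals across all paragraphs)."""
    per_paragraph = [count_words(p) for p in paragraphs]
    totals: Counter = Counter()
    for counts in per_paragraph:
        totals.update(counts)
    return per_paragraph, totals


def search(per_paragraph: list[Counter], word: str) -> list[tuple[int, int]]:
    """Return (paragraph index, count) pairs, most frequent first."""
    word = word.lower()
    hits = [(i, c[word]) for i, c in enumerate(per_paragraph) if c[word] > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    per_paragraph, totals = index_paragraphs(
        ["I like Python.", "Python is great. Python is popular."]
    )
    print(search(per_paragraph, "python"))  # [(1, 2), (0, 1)]
```

The per-paragraph counters correspond roughly to the `paragraph_word_counts` table and the totals to `word_counts`; how the service actually persists them is not shown here.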
| Layer | Technology |
|---|---|
| Framework | FastAPI (Python 3.11+) |
| ORM | SQLAlchemy |
| Database | SQLite (dev) / PostgreSQL (prod) |
| Auth | JWT + bcrypt password hashing |
| Background Processing | FastAPI BackgroundTasks |
| API Docs | Auto-generated OpenAPI / Swagger UI |
| Containerization | Docker |
erDiagram
users ||--o{ paragraphs : "1-to-many"
users ||--o{ word_counts : "1-to-many"
users ||--o{ refresh_tokens : "1-to-many"
paragraphs ||--o{ paragraph_word_counts : "1-to-many"
users {
int id PK
string email "UQ, indexed"
string hashed_password
datetime created_at
}
paragraphs {
int id PK
int user_id FK
text content
datetime created_at
}
word_counts {
int id PK
int user_id FK
string word "indexed"
int count "indexed"
}
paragraph_word_counts {
int id PK
int user_id FK
int paragraph_id FK
string word "indexed"
int count "indexed"
}
refresh_tokens {
int id PK
int user_id FK
string token "UQ, indexed"
datetime expires_at
}
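To make the diagram concrete, here is a minimal sketch of part of the schema using the standard-library `sqlite3` module rather than the project's SQLAlchemy models. It covers only `users`, `paragraphs`, and `paragraph_word_counts`; the column types, index names, and the helpers `make_db` and `search_paragraphs` are assumptions, not the project's code.

```python
import sqlite3

# Subset of the diagram. Column types are assumptions; datetimes are
# stored as TEXT in this sketch.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT UNIQUE,
    hashed_password TEXT,
    created_at TEXT
);
CREATE TABLE paragraphs (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    content TEXT,
    created_at TEXT
);
CREATE TABLE paragraph_word_counts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    paragraph_id INTEGER REFERENCES paragraphs(id),
    word TEXT,
    "count" INTEGER
);
CREATE INDEX ix_pwc_word ON paragraph_word_counts (word);
CREATE INDEX ix_pwc_count ON paragraph_word_counts ("count");
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def search_paragraphs(conn: sqlite3.Connection, user_id: int, word: str) -> list:
    """Return (paragraph id, content, count) rows, highest count first."""
    query = '''
        SELECT p.id, p.content, pwc."count"
        FROM paragraph_word_counts AS pwc
        JOIN paragraphs AS p ON p.id = pwc.paragraph_id
        WHERE pwc.user_id = ? AND pwc.word = ?
        ORDER BY pwc."count" DESC
    '''
    return conn.execute(query, (user_id, word.lower())).fetchall()
```

Ordering by the stored per-paragraph count is one plausible reading of "relevance ranking by word frequency"; the indexes on `word` and `count` mirror the "indexed" annotations in the diagram.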
Text Processing API/
├── app/
│ ├── core/
│ │ ├── database.py # SQLAlchemy engine, session, Base, init_db
│ │ ├── models.py # ORM models: User, Paragraph, ParagraphWordCount
│ │ └── schemas.py # Pydantic request/response schemas
│ ├── routers/
│ │ ├── auth.py # Register, login endpoints
│ │ └── paragraphs.py # Submit, list, search endpoints
│ ├── services/
│ │ ├── auth.py # Password hashing, JWT token logic, create_user
│ │ └── indexing.py # Background word-count indexing logic
│ ├── utils/
│ │ └── dependencies.py # get_current_user dependency (JWT validation)
│ ├── __init__.py
│ └── main.py # FastAPI app entry point, router registration
├── tests/
├── .env.example
├── .gitignore
├── database.db # SQLite database (local dev only, not committed)
├── Dockerfile
├── requirements.txt
└── README.md
- Python 3.11+
- pip (or Docker for containerized setup)
- Clone the repository:
  git clone <repository-url>
  cd "py API pj"
- Create and activate a virtual environment:
  python -m venv .venv
  # Windows
  .venv\Scripts\activate
  # macOS/Linux
  source .venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
- Set up environment variables:
  cp .env.example .env
  # Edit .env with your values
- Run the development server:
  uvicorn app.main:app --reload --port 8000
- Access the API:
  - Swagger UI: http://localhost:8000/docs
  - Root: http://localhost:8000/
docker build -t text-processing-api .
docker run -p 8000:8000 --env-file .env text-processing-api

Create a .env file in the root directory:
SECRET_KEY=your-secret-key-here
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=15
REFRESH_TOKEN_EXPIRE_DAYS=7
DATABASE_URL=sqlite:///./database.db

Generate a secure secret key:
openssl rand -hex 32

| Method | Endpoint | Description |
|---|---|---|
| POST | /auth/register | Register a new user |
| POST | /auth/login | Login and receive JWT token |

| Method | Endpoint | Description |
|---|---|---|
| POST | /paragraphs/ | Submit one or more paragraphs |
| GET | /paragraphs/ | List your paragraphs (paginated) |
| GET | /paragraphs/search?word=xyz | Search paragraphs by word frequency |
All /paragraphs/ endpoints require a Bearer token in the Authorization header.
- Open http://localhost:8000/docs
- Register via POST /auth/register → {"email": "test@example.com", "password": "password123"}
- Login via POST /auth/login → copy the access_token
- Authorize → click the "Authorize" button → enter Bearer YOUR_ACCESS_TOKEN
- Submit via POST /paragraphs/ → {"paragraphs": ["Python is great. Python is popular."]}
- Search via GET /paragraphs/search?word=python
# Register
curl -X POST "http://localhost:8000/auth/register" \
-H "Content-Type: application/json" \
-d '{"email": "test@example.com", "password": "password123"}'
# Login (save the token)
curl -X POST "http://localhost:8000/auth/login" \
-H "Content-Type: application/json" \
-d '{"email": "test@example.com", "password": "password123"}'
# Submit paragraphs (replace TOKEN)
curl -X POST "http://localhost:8000/paragraphs/" \
-H "Authorization: Bearer TOKEN" \
-H "Content-Type: application/json" \
-d '{"paragraphs": ["Test paragraph with words."]}'
# Search (replace TOKEN)
curl -X GET "http://localhost:8000/paragraphs/search?word=test" \
-H "Authorization: Bearer TOKEN"

- database.db is a local SQLite file for development; do not commit it (it is already in .gitignore)
- For production, swap DATABASE_URL to a PostgreSQL connection string and update the engine config
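
One way the engine configuration might vary by database is sketched below. The helper names `engine_options` and `database_config` are hypothetical, and the specific options are assumptions: `connect_args={"check_same_thread": False}` is a common SQLite setting for multi-threaded servers, and `pool_pre_ping=True` is an optional SQLAlchemy `create_engine` flag often enabled for PostgreSQL.

```python
import os

# Default taken from the sample .env shown earlier.
DEFAULT_DATABASE_URL = "sqlite:///./database.db"


def engine_options(database_url: str) -> dict:
    """Return keyword arguments intended for SQLAlchemy's create_engine."""
    if database_url.startswith("sqlite"):
        # SQLite connections are tied to their creating thread by default;
        # this lifts that check so request handlers in other threads can use them.
        return {"connect_args": {"check_same_thread": False}}
    # Assumption: for PostgreSQL, test pooled connections before reuse.
    return {"pool_pre_ping": True}


def database_config(env=os.environ) -> tuple[str, dict]:
    """Read DATABASE_URL from the environment and pair it with engine options."""
    url = env.get("DATABASE_URL", DEFAULT_DATABASE_URL)
    return url, engine_options(url)


# Intended use (requires SQLAlchemy, not imported in this sketch):
#   url, options = database_config()
#   engine = create_engine(url, **options)
```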