🔍 Code Snippet Generation using Fine-Tuned T5

This project is a code snippet generation system built using a fine-tuned T5-small model. It takes natural language programming queries and returns relevant code snippets. A simple web interface is built using Streamlit.

📌 Features

Fine-tuned T5 on natural language → code mapping
Streamlit-based frontend for live query testing
Easily extendable to other languages or domains

📂 Directory Structure

├── app/
│   └── streamlit_app.py                          # Streamlit frontend
├── src/
│   └── train_t5.py                               # Model training script
├── data/
│   └── extended_programming_code_snippets.csv    # Dataset
├── t5_finetuned_model/                           # Fine-tuned model (excluded from Git)
├── requirements.txt
├── .gitignore
└── README.md

📊 Dataset

File: data/extended_programming_code_snippets.csv
The dataset consists of three columns:

| Column | Description |
| --- | --- |
| `Query` | Natural language programming query |
| `Code_Snippet` | Expected output code |
| `Tags` | (Optional) Tags such as language or domain |
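
A quick sanity check of the CSV before training can catch missing columns or empty rows. The following is a minimal sketch using only the standard library. The helper name `load_pairs` and the in-memory sample rows are hypothetical and not part of the repository; the sketch assumes the file has a header row with the column names listed above.

```python
import csv
import io

# Tags is optional according to the column description, so it is not required here.
REQUIRED_COLUMNS = {"Query", "Code_Snippet"}


def load_pairs(csv_file):
    """Read (query, code) pairs from an open CSV file object.

    Hypothetical helper for illustration; not part of the repository.
    """
    reader = csv.DictReader(csv_file)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")

    pairs = []
    for row in reader:
        query = (row["Query"] or "").strip()
        code = (row["Code_Snippet"] or "").strip()
        # Skip rows where either side is empty; they carry no training signal.
        if query and code:
            pairs.append((query, code))
    return pairs


# Made-up sample rows standing in for the real dataset file.
sample = io.StringIO(
    "Query,Code_Snippet,Tags\n"
    "how to sort a list in python,my_list.sort(),python\n"
    "row with no code,,python\n"
)
print(load_pairs(sample))
```

To check the real file, pass `open("data/extended_programming_code_snippets.csv", newline="", encoding="utf-8")` instead of the in-memory sample.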

🏋️‍♂️ Training the Model

1. Prepare your dataset at ./data/extended_programming_code_snippets.csv
2. Run the training script:

python src/train_t5.py


This will save the fine-tuned model and tokenizer to:
./t5_finetuned_model/
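
The contents of `src/train_t5.py` are not reproduced in this README. As a rough illustration of what a preprocessing step for sequence-to-sequence fine-tuning commonly looks like with Hugging Face tokenizers, here is a sketch under stated assumptions: the function name `encode_pair` and the `max_length` value are hypothetical, and the actual script may differ.

```python
def encode_pair(tokenizer, query, code, max_length=128):
    """Turn one (query, code) pair into model inputs with labels.

    Hypothetical helper; `tokenizer` is expected to behave like a
    Hugging Face tokenizer (callable, with a `pad_token_id` attribute).
    """
    # Encode the natural language query as the encoder input.
    model_inputs = tokenizer(
        query, max_length=max_length, truncation=True, padding="max_length"
    )
    # Encode the expected code snippet as the decoder target.
    targets = tokenizer(
        text_target=code, max_length=max_length, truncation=True, padding="max_length"
    )
    # Replace padding ids with -100 so the loss ignores padded positions.
    model_inputs["labels"] = [
        token_id if token_id != tokenizer.pad_token_id else -100
        for token_id in targets["input_ids"]
    ]
    return model_inputs


# Example usage (requires the transformers package and a model download):
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("t5-small")
#   example = encode_pair(tokenizer, "how to sort a list in python", "my_list.sort()")
```

Masking padded label positions with -100 is a common convention because the cross-entropy loss used by the Transformers seq2seq models skips that index.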

🌐 Running the Streamlit App
After training, you can run the frontend locally:

1. Install dependencies

pip install -r requirements.txt

2. Start the app

streamlit run app/streamlit_app.py

Sample Output
Query: how to sort a list in python

my_list = [3, 1, 4, 1, 5, 9]
my_list.sort()
print(my_list)
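
For readers who want to query the model outside the Streamlit app, the sketch below shows one way to load the saved model and generate a snippet. It is an illustration, not the code in `app/streamlit_app.py`: the function names, `max_length`, and `num_beams` are assumptions, and it presumes the directory `./t5_finetuned_model/` was produced by the training step.

```python
def generate_snippet(query, tokenizer, model, max_length=128, num_beams=4):
    """Generate one code snippet for a natural language query."""
    # Tokenize the query into PyTorch tensors, truncating overly long input.
    inputs = tokenizer(query, return_tensors="pt", truncation=True)
    # Beam search decoding; max_length and num_beams are illustrative values.
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    # Drop special tokens (such as the end-of-sequence marker) from the text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def load_and_generate(query, model_dir="./t5_finetuned_model"):
    """Load the fine-tuned model from disk and answer a single query."""
    # Imported here so generate_snippet stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
    return generate_snippet(query, tokenizer, model)
```

Calling `load_and_generate("how to sort a list in python")` would then return the generated text as a string. For repeated queries, load the tokenizer and model once and call `generate_snippet` directly.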


🧰 Tech Stack
Hugging Face Transformers

PyTorch

Streamlit

Google Colab (for training)

📌 Notes
The model directory (t5_finetuned_model/) is excluded from the repo to avoid large file uploads. Train the model yourself, or copy a trained model into that directory, before running the app.

You can upload the trained model to Hugging Face Hub or Google Drive for deployment.


📄 License
MIT License © 2025 Vishal Meena

👨‍💻 Author
Vishal Meena
IIT Kharagpur
🔗 [LinkedIn](https://linkedin.com/in/vishalmeenaiit)
