📌 Summary
We currently have publication metadata under data/bib/*.bib (one file per PI).
We want to automatically infer people info from these BibTeX files and:
- Export a JSON file with basic people info.
- Generate Astro content files under src/content/people/*.md with minimal frontmatter.
This should be implemented as a GitHub Actions workflow.
🎯 Goals
- Parse all data/bib/*.bib files.
- Extract unique people (authors) across all publications.
- Infer an advisor for each person (based on co-authorship frequency with PIs).
- Export:
  - A machine-readable JSON file with people info.
  - Astro-compatible markdown files in src/content/people/ with minimal frontmatter.
📂 Inputs
- BibTeX files:
  Each file corresponds to a PI (e.g., arzucan-ozgur.bib, tunga-gungor.bib, suzan-uskudarli.bib) and contains multiple @article, @inproceedings, etc. entries with author fields.
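- PI mapping (explicit or inferred):
  Names / slugs of PIs (e.g., from data/googlescholar.json or filename):
  - arzucan-ozgur
  - tunga-gungor
  - suzan-uskudarli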
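Reading the inputs could look roughly like the sketch below. It assumes the PI slug is the .bib filename stem and uses a small regex as a stand-in for a real BibTeX parser such as bibtexparser; the regex only handles one level of nested braces. The helper names author_fields and collect_author_fields are hypothetical.

```python
import re
from pathlib import Path

# Matches `author = {...}` or `author = "..."`; tolerates one level of nested
# braces inside the value. A stand-in for a real BibTeX parser, not a full one.
_AUTHOR_RE = re.compile(
    r'\bauthor\s*=\s*(?:\{((?:[^{}]|\{[^{}]*\})*)\}|"([^"]*)")',
    re.IGNORECASE,
)


def author_fields(bib_text: str) -> list[str]:
    """Return the raw author field of every entry found in bib_text."""
    return [
        " ".join((m.group(1) or m.group(2) or "").split())
        for m in _AUTHOR_RE.finditer(bib_text)
    ]


def collect_author_fields(bib_dir: str = "data/bib") -> dict[str, list[str]]:
    """Map each PI slug (assumed to be the .bib filename stem) to its author fields."""
    return {
        path.stem: author_fields(path.read_text(encoding="utf-8"))
        for path in sorted(Path(bib_dir).glob("*.bib"))
    }
```

With this layout, the keys of collect_author_fields() double as the inferred PI list when no explicit config is provided.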
🧠 Logic / Requirements
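1. People extraction
For each BibTeX entry in data/bib/*.bib:
- Read the author field.
- Split it into individual authors (e.g., "A. Özgür and T. Güngör and A. Köksal" → 3 people).
Normalize names:
- Handle Unicode characters correctly (e.g., Özgür, Köksal).
Build a global map:
{
  "abdullatif-koksal": {
    "name": "Abdullatif Köksal",
    "advisor": "Arzucan Özgür" | null,
    "category": "student" | "alumni" | etc. (initially just one value),
    // other fields can be added later
  },
  ...
}
Slug generation:
- "Abdullatif Köksal" → abdullatif-koksal.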
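A minimal sketch of the extraction helpers, assuming authors are separated by " and " (the BibTeX convention), that "Last, First" should become "First Last", and that slugs are plain ASCII. The function names and the extra character table are hypothetical, and LaTeX accent macros such as {\"o} are not decoded here.

```python
import re
import unicodedata

# Characters with no Unicode decomposition that NFKD alone would drop
# (assumption: slugs should be plain ASCII, e.g. "abdullatif-koksal").
_EXTRA_ASCII = str.maketrans({"ı": "i", "İ": "I", "ß": "ss", "ø": "o", "Ø": "O"})


def normalize_name(raw: str) -> str:
    """Strip braces, collapse whitespace, and turn 'Last, First' into 'First Last'."""
    raw = " ".join(raw.replace("{", "").replace("}", "").split())
    if "," in raw:
        last, first = [part.strip() for part in raw.split(",", 1)]
        raw = f"{first} {last}"
    return raw


def split_authors(author_field: str) -> list[str]:
    """Split a BibTeX author field on the ' and ' separator."""
    parts = re.split(r"\s+and\s+", author_field.strip())
    return [normalize_name(p) for p in parts if p.strip()]


def slugify(name: str) -> str:
    """Lowercase ASCII slug: diacritics removed, other characters become '-'."""
    text = unicodedata.normalize("NFKD", name.translate(_EXTRA_ASCII))
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")
```

Note that abbreviated and full forms ("A. Köksal" vs "Abdullatif Köksal") produce different slugs under this sketch; merging them would need an explicit alias table or a matching heuristic.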
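2. Advisor inference
A set of PIs is known (from filenames or config).
For each non-PI person:
- Count co-authored publications with each PI and set advisor to the most frequent co-author PI.
- If the person has no PI co-author, leave advisor empty.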
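The co-authorship counting can be sketched as follows. It assumes each publication is already a list of normalized author names and that PIs are given as a set of display names. Ties are broken alphabetically by PI name, which is an arbitrary choice the issue does not specify. infer_advisors is a hypothetical name.

```python
from collections import Counter, defaultdict


def infer_advisors(publications, pi_names):
    """Map each non-PI author to the PI they co-author with most often.

    publications: iterable of author-name lists, one list per BibTeX entry.
    pi_names: collection of PI display names (from filenames or config).
    Returns {person: advisor_name}, with "" when no PI co-author exists.
    """
    pi_names = set(pi_names)
    counts = defaultdict(Counter)  # person -> Counter of PI co-authors
    people = set()
    for authors in publications:
        authors = set(authors)  # count each person once per publication
        people |= authors
        pis_on_paper = authors & pi_names
        for person in authors - pi_names:
            counts[person].update(pis_on_paper)

    advisors = {}
    for person in sorted(people - pi_names):
        # Highest count first; alphabetical PI name as tie-breaker (assumption).
        ranked = sorted(counts[person].items(), key=lambda kv: (-kv[1], kv[0]))
        advisors[person] = ranked[0][0] if ranked else ""
    return advisors
```

Frequency alone can mislabel external collaborators as students, so the result is best treated as a first guess to be reviewed in the generated PR.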
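3. JSON export
Write a JSON file, e.g.:
data/people.generated.json
JSON schema (minimal for now):
{
  "abdullatif-koksal": {
    "name": "Abdullatif Köksal",
    "advisor": "Arzucan Özgür",
    "category": "student"
  },
  "tunga-gungor": {
    "name": "Tunga Güngör",
    "advisor": "",
    "category": "pi"
  }
}
Category:
For now, set only one categorical field:
- PIs: "pi" (or "faculty")
- Everyone else: "student" by default (can be refined later).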
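The JSON export step can be sketched as below, assuming the slug-keyed map with name, advisor, and category fields and the output path data/people.generated.json named in the workflow section. The helpers person_record and write_people_json are hypothetical. Keys are sorted so repeated runs produce stable diffs.

```python
import json
from pathlib import Path


def person_record(name: str, advisor: str, is_pi: bool) -> dict:
    """Build the minimal record: name, advisor, category."""
    return {
        "name": name,
        "advisor": "" if is_pi else advisor,
        "category": "pi" if is_pi else "student",
    }


def write_people_json(people: dict, out_path: str = "data/people.generated.json") -> Path:
    """Write the slug-keyed people map as UTF-8 JSON with stable key order."""
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps names like "Köksal" readable in the file.
    text = json.dumps(people, ensure_ascii=False, indent=2, sort_keys=True)
    path.write_text(text + "\n", encoding="utf-8")
    return path
```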
4. Astro markdown generation
For each person (key = slug), create a file:
src/content/people/<slug>.md
- Example: src/content/people/abdullatif-koksal.md
- Frontmatter rules:
  - Fill only:
    name
    advisor (if inferred)
    category
  - Leave the rest EMPTY/blank for now (title, photo, bio, email, order, degree, body content).
- Template format:
---
name: "Abdullatif Köksal"
title: ""
photo: ""
bio: ""
email: ""
category: "student"
order:
advisor: "Arzucan Özgür"
degree: ""
---
- If advisor is unknown:
- Do not auto-generate description text for now; keep the body empty.
- The example currently used in the site for reference (not to be fully filled now):
---
name: "Abdullatif Köksal"
title: "MS Student"
photo: "/images/people/abdullatif-koksal.jpg"
bio: "MS student at Boğaziçi University, working on natural language processing under the supervision of Arzucan Özgür."
email: "abdullatif.koksal@boun.edu.tr"
category: "student"
order: 10
advisor: "Arzucan Özgür"
degree: "MS"
---
Abdullatif Köksal is an MS student at the Computer Engineering Department of Boğaziçi University, working under the supervision of Arzucan Özgür.
## Research Interests
- Natural Language Processing
- Machine Learning
- Cross-lingual NLP
- Text Classification
## Advisor
- **Advisor:** Arzucan Özgür
👉 In this issue, we only want name, advisor, and category populated.
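A minimal sketch of the markdown generation, following the template's field order. Two points are assumptions rather than requirements from this issue: an unknown advisor is written as an empty string, and existing files are skipped so hand-edited profiles are not overwritten. Function names are hypothetical.

```python
import json
from pathlib import Path


def _quote(value: str) -> str:
    """Double-quote a string with JSON escaping, which is also valid YAML."""
    return json.dumps(value, ensure_ascii=False)


def render_frontmatter(name: str, category: str, advisor: str = "") -> str:
    """Render the minimal frontmatter; only name, category and advisor are filled."""
    lines = [
        "---",
        f"name: {_quote(name)}",
        'title: ""',
        'photo: ""',
        'bio: ""',
        'email: ""',
        f"category: {_quote(category)}",
        "order:",
        f"advisor: {_quote(advisor)}",  # assumption: "" when advisor is unknown
        'degree: ""',
        "---",
        "",  # empty body; join() turns this into a trailing newline
    ]
    return "\n".join(lines)


def write_person_file(slug, person, out_dir="src/content/people", overwrite=False):
    """Write <out_dir>/<slug>.md; returns the path, or None if the file was skipped."""
    path = Path(out_dir) / f"{slug}.md"
    if path.exists() and not overwrite:
        return None  # assumption: never clobber an existing, possibly hand-edited file
    path.parent.mkdir(parents=True, exist_ok=True)
    content = render_frontmatter(
        person["name"], person["category"], person.get("advisor") or ""
    )
    path.write_text(content, encoding="utf-8")
    return path
```

Skipping existing files keeps the currently filled-in profile for abdullatif-koksal intact; pass overwrite=True if regenerated files should replace them.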
⚙️ GitHub Actions workflow
Implement a workflow, e.g. .github/workflows/generate-people-from-bib.yml:
- Trigger:
  workflow_dispatch (manual for now; later optionally schedule).
- Steps:
  - Checkout repo.
  - Set up Python.
  - Install dependencies (bibtexparser or similar).
  - Run a Python script, e.g. scripts/generate_people_from_bib.py, that:
    - Parses data/bib/*.bib.
    - Builds the people map.
    - Writes:
      data/people.generated.json
      src/content/people/*.md
- Optionally:
  - Commit changes on a new branch and open a PR (similar style to existing Scholar workflows).
✅ Acceptance Criteria
- Running the workflow on existing data/bib/*.bib generates:
  - data/people.generated.json
  - src/content/people/<slug>.md for all discovered people.
- Each .md file has:
  - name.
  - advisor populated with the most frequent co-author PI, if any.
  - category set appropriately (at least pi vs student).
  - Other fields left blank ("" or omitted as agreed).
- Existing Astro site builds successfully using these generated people files.
- Workflow is documented in the repo (short note in README or CONTRIBUTING.md).