This repository contains the official implementation of GLMask, a structural image representation designed for semantic-to-instance segmentation with minimal manual annotation. The method was introduced in our paper:
From Semantic To Instance: A Semi-Self-Supervised Learning Approach
GLMask replaces standard RGB inputs with a three-channel structural representation composed of Grayscale (G), CIELAB Lightness (L), and a semantic mask (M). This design encourages the model to focus on shape, texture, and structural cues rather than color, improving generalization across diverse acquisition domains.
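As a rough illustration of how the three channels can be assembled (a minimal NumPy sketch assuming sRGB inputs and a binary semantic mask; the repository's wheathead-rgb2glm tool is the canonical implementation, and `rgb_to_glmask` is a hypothetical helper name):

```python
import numpy as np

def rgb_to_glmask(rgb, mask):
    """Stack Grayscale, CIELAB L*, and a semantic mask into 3 channels.

    rgb  : uint8 array of shape (H, W, 3), sRGB.
    mask : uint8 array of shape (H, W), values in {0, 255}.
    Simplified sketch; not the repository's exact conversion.
    """
    rgb_f = rgb.astype(np.float64) / 255.0
    # Channel 1 (G): luma-weighted grayscale (ITU-R BT.601 coefficients).
    gray = rgb_f @ np.array([0.299, 0.587, 0.114])
    # Channel 2 (L): CIELAB lightness L* from linearized sRGB luminance Y.
    linear = np.where(rgb_f <= 0.04045, rgb_f / 12.92,
                      ((rgb_f + 0.055) / 1.055) ** 2.4)
    y = linear @ np.array([0.2126, 0.7152, 0.0722])
    eps = 216 / 24389
    f = np.where(y > eps, np.cbrt(y), (24389 / 27 * y + 16) / 116)
    lightness = 116 * f - 16  # L* in [0, 100]
    # Channel 3 (M): the semantic mask produced by the segmentation stage.
    glm = np.stack([gray * 255, lightness / 100 * 255,
                    mask.astype(np.float64)], axis=-1)
    return np.clip(glm, 0, 255).astype(np.uint8)
```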
Our framework combines:
- Synthetic data generation via cut-and-paste simulation
- GLMask-based structural representation learning
- Rotation-based domain adaptation
- YOLOv9 instance segmentation
The approach achieves:
- 98.5% mAP@50 on wheat head instance segmentation
- Consistent cross-domain robustness across 18 acquisition environments
- Significant performance gains on the MS COCO dataset
Figure G1: Overall pipeline: semantic mask generation, GLMask construction,
and instance segmentation using GLMask with synthetic pretraining.
Table G1: Per-domain mAP@50 comparison between the RGB-based BaseModel
and the proposed RoAModel across the 18 acquisition domains of the GHDte test
set. The BaseModel exhibits substantial performance variability across domains.
Figure G2: Prediction performance of RoAModel across the 18 acquisition domains.
Figure G3: Prediction performance of the COCO models on the Microsoft
COCO 2017 dataset. Our proposed GLMask approach consistently achieved superior
segmentation performance (columns B, C, D, and E) and obtained higher
confidence scores for detected objects (columns A through F).
In some cases both the RGB and ColorMap models failed to detect objects of
interest, e.g., the truck in column D and the people in column F. In
addition, near-perfect segmentation was observed in cases such as column A.
- Python 3.11+
- CUDA-enabled GPU (for training and inference)
Clone the repository:

```bash
git clone https://github.com/your-username/glmask-semantic2instance.git
cd glmask-semantic2instance
```

Create a virtual environment (recommended):

```bash
python -m venv venv
source venv/bin/activate
```

Install dependencies:

```bash
pip install -e .
```
Place your datasets in the data/ directory following the structure outlined in the configuration files.
Sample metadata files are provided in the data/ directory
(background_videos_metadata.csv, foreground_videos_metadata.csv, and
segmented_samples_metadata.csv) to illustrate the expected input format
for running the data synthesis pipeline.
Training and evaluation datasets must be organized in the standard YOLO
format (image files with corresponding label .txt files following the
`class x_center y_center width height` convention).
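The normalized label convention above can be decoded as follows (a small sketch; `parse_yolo_label` is a hypothetical helper, not part of the repository):

```python
def parse_yolo_label(line, img_w, img_h):
    """Convert one normalized YOLO label line to absolute corner coordinates.

    Each line is 'class x_center y_center width height', with all four
    geometry values normalized to [0, 1] relative to the image size.
    """
    cls, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    x1 = (xc - w / 2) * img_w  # left edge in pixels
    y1 = (yc - h / 2) * img_h  # top edge in pixels
    x2 = (xc + w / 2) * img_w  # right edge in pixels
    y2 = (yc + h / 2) * img_h  # bottom edge in pixels
    return int(cls), (x1, y1, x2, y2)
```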
This stage generates synthetic training data by overlaying extracted objects onto backgrounds.
- Extract Frames from Videos:
wheathead-sim-frames --config configs/simulator/frames_extractor.yaml
- Extract Objects from Images:
wheathead-sim-objects --config configs/simulator/objects_extractor.yaml
- Run Simulator:
wheathead-simulator --config configs/simulator/simulator.yaml
- Visualize Simulated Data:
wheathead-sim-visualizer --config configs/simulator/visualizer.yaml
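The cut-and-paste idea behind the simulator can be sketched as a masked composite (a simplified NumPy version; the actual wheathead-simulator additionally handles scaling, rotation, and blending, and `paste_object` is a hypothetical helper):

```python
import numpy as np

def paste_object(background, obj, obj_mask, top, left):
    """Composite a masked foreground crop onto a background image.

    background : uint8 array (H, W, 3)
    obj        : uint8 array (h, w, 3), the extracted object crop
    obj_mask   : uint8 array (h, w), nonzero where the object is present
    """
    out = background.copy()
    h, w = obj.shape[:2]
    region = out[top:top + h, left:left + w]
    # Keep object pixels where the mask is set, background elsewhere.
    m = (obj_mask > 0)[..., None]
    out[top:top + h, left:left + w] = np.where(m, obj, region)
    return out
```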
The semantic mask channel (M) in the GLMask representation is produced by this stage.
Place pretrained weights at model_weights/S5Seg_Best.pt before running.
Pretrained weights can be downloaded from: Download Weights
- Train (optional — skip if using pretrained weights):
wheathead-s5seg-train --config configs/s5seg/train.yaml
- Evaluate:
wheathead-s5seg-eval --config configs/s5seg/eval.yaml
- Predict (generates semantic masks required for GLMask construction):
wheathead-s5seg-predict --config configs/s5seg/predict.yaml
Update data_path in the config to point to your image metadata CSV, and set
predict.prediction_dir to the directory where masks should be written.
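For orientation, the relevant part of a predict config might look like the following sketch (only `data_path` and `predict.prediction_dir` are named above; the paths are illustrative, not the repository's exact schema):

```yaml
data_path: data/image_metadata.csv        # CSV listing the images to segment (illustrative path)
predict:
  prediction_dir: outputs/semantic_masks  # directory where predicted masks are written (illustrative path)
```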
GLMask replaces RGB input with a three-channel structural representation (Grayscale, LAB-L, Mask) to reduce color dependency.
- Convert RGB to GLMask:
wheathead-rgb2glm --config configs/yolo/process_confs/rgb2glm.yaml
- Convert Masks to Contours:
wheathead-mask2contour --config configs/yolo/process_confs/mask2contour.yaml
- Rotate Images (Domain Adaptation):
wheathead-rotator --config configs/yolo/process_confs/rotator.yaml
- Visualize Annotations:
wheathead-visualizer --config configs/yolo/process_confs/visualizer.yaml
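The rotation step can be illustrated with a 90° example that rotates the image and remaps its normalized labels (a simplified sketch; the wheathead-rotator tool is the canonical implementation, and `rotate90_with_labels` is a hypothetical helper):

```python
import numpy as np

def rotate90_with_labels(image, boxes):
    """Rotate an image 90 degrees counter-clockwise and remap YOLO boxes.

    boxes is a list of (cls, xc, yc, w, h) with normalized coordinates.
    Under a 90-degree CCW rotation: new xc = yc, new yc = 1 - xc, and the
    normalized width/height swap because the image dimensions swap.
    """
    rotated = np.rot90(image)  # CCW rotation in the image plane
    new_boxes = [(cls, yc, 1.0 - xc, h, w) for cls, xc, yc, w, h in boxes]
    return rotated, new_boxes
```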
To train a new model, use the wheathead-train script with a configuration file:
- Modify configs/yolo/train.yaml
- Modify configs/yolo/model_confs/simulated.yaml
wheathead-train --config configs/yolo/train.yaml

All training hyperparameters, including model weights, data paths, and learning rates, are defined in the YAML configuration file.
To evaluate a trained model, use the wheathead-eval script:
- Modify configs/yolo/eval.yaml
- Modify configs/yolo/model_confs/gwhd_centers.yaml
wheathead-eval --config configs/yolo/eval.yaml

To run inference with a trained model, use the wheathead-pred script:
- Modify configs/yolo/pred.yaml
wheathead-pred --config configs/yolo/predict.yaml

All scripts (train, eval, predict) are controlled by YAML configuration
files in the configs/yolo/ directory. Modify these files to change
hyperparameters, paths, and other settings.
Key Configuration Fields:
- `model_weights`: Path to the model weights file (relative to model_weights/).
- `data`: Path to the data configuration file.
- All other YOLOv9 hyperparameters.
Refer to the provided *.yaml files for examples.
@article{najafian2025semantic,
title={From Semantic To Instance: A Semi-Self-Supervised Learning Approach},
author={Najafian, Keyhan and Maleki, Farhad and Jin, Lingling and Stavness, Ian},
journal={arXiv preprint arXiv:2506.16563},
year={2025}
}