18 changes: 9 additions & 9 deletions .github/workflows/references.yml
@@ -36,7 +36,7 @@ jobs:
pip install -r references/requirements.txt
sudo apt-get update && sudo apt-get install fonts-freefont-ttf -y
- name: Train for a short epoch
-run: python references/classification/train_pytorch_character.py vit_s -b 32 --val-samples 1 --train-samples 1 --epochs 1
+run: python references/classification/train_character.py vit_s -b 32 --val-samples 1 --train-samples 1 --epochs 1

train-orientation-classification:
runs-on: ${{ matrix.os }}
@@ -75,9 +75,9 @@ jobs:
sudo apt-get update && sudo apt-get install unzip -y
unzip toy_recogition_set-036a4d80.zip -d reco_set
- name: Train for a short epoch (document orientation)
-run: python references/classification/train_pytorch_orientation.py resnet18 --type page --train_path ./det_set --val_path ./det_set -b 2 --epochs 1
+run: python references/classification/train_orientation.py resnet18 --type page --train_path ./det_set --val_path ./det_set -b 2 --epochs 1
- name: Train for a short epoch (crop orientation)
-run: python references/classification/train_pytorch_orientation.py resnet18 --type crop --train_path ./reco_set --val_path ./reco_set -b 4 --epochs 1
+run: python references/classification/train_orientation.py resnet18 --type crop --train_path ./reco_set --val_path ./reco_set -b 4 --epochs 1

train-text-recognition:
runs-on: ${{ matrix.os }}
@@ -111,7 +111,7 @@ jobs:
sudo apt-get update && sudo apt-get install unzip -y
unzip toy_recogition_set-036a4d80.zip -d reco_set
- name: Train for a short epoch
-run: python references/recognition/train_pytorch.py crnn_mobilenet_v3_small --train_path ./reco_set --val_path ./reco_set -b 4 --epochs 1
+run: python references/recognition/train.py crnn_mobilenet_v3_small --train_path ./reco_set --val_path ./reco_set -b 4 --epochs 1

evaluate-text-recognition:
runs-on: ${{ matrix.os }}
@@ -137,7 +137,7 @@ jobs:
python -m pip install --upgrade pip
pip install -e .[viz,html] --upgrade
- name: Evaluate text recognition
-run: python references/recognition/evaluate_pytorch.py crnn_mobilenet_v3_small --dataset SVT -b 32
+run: python references/recognition/evaluate.py crnn_mobilenet_v3_small --dataset SVT -b 32

latency-text-recognition:
runs-on: ${{ matrix.os }}
@@ -163,7 +163,7 @@ jobs:
python -m pip install --upgrade pip
pip install -e .[viz,html] --upgrade
- name: Benchmark latency
-run: python references/recognition/latency_pytorch.py crnn_mobilenet_v3_small --it 5
+run: python references/recognition/latency.py crnn_mobilenet_v3_small --it 5

train-text-detection:
runs-on: ${{ matrix.os }}
@@ -197,7 +197,7 @@ jobs:
sudo apt-get update && sudo apt-get install unzip -y
unzip toy_detection_set-bbbb4243.zip -d det_set
- name: Train for a short epoch
-run: python references/detection/train_pytorch.py db_mobilenet_v3_large --train_path ./det_set --val_path ./det_set -b 2 --epochs 1
+run: python references/detection/train.py db_mobilenet_v3_large --train_path ./det_set --val_path ./det_set -b 2 --epochs 1

evaluate-text-detection:
runs-on: ${{ matrix.os }}
@@ -224,7 +224,7 @@ jobs:
pip install -e .[viz,html] --upgrade
pip install -r references/requirements.txt
- name: Evaluate text detection
-run: python references/detection/evaluate_pytorch.py db_mobilenet_v3_large
+run: python references/detection/evaluate.py db_mobilenet_v3_large

latency-text-detection:
runs-on: ${{ matrix.os }}
@@ -250,4 +250,4 @@ jobs:
python -m pip install --upgrade pip
pip install -e .[viz,html] --upgrade
- name: Benchmark latency
-run: python references/detection/latency_pytorch.py db_mobilenet_v3_large --it 5 --size 512
+run: python references/detection/latency.py db_mobilenet_v3_large --it 5 --size 512
2 changes: 1 addition & 1 deletion docs/source/using_doctr/sharing_models.rst
@@ -48,7 +48,7 @@ It is also possible to push your model directly after training.

.. code:: bash

-python3 ~/doctr/references/recognition/train_pytorch.py crnn_mobilenet_v3_large --name doctr-crnn-mobilenet-v3-large --push-to-hub
+python3 ~/doctr/references/recognition/train.py crnn_mobilenet_v3_large --name doctr-crnn-mobilenet-v3-large --push-to-hub


Pretrained community models
8 changes: 4 additions & 4 deletions references/classification/README.md
@@ -16,15 +16,15 @@ pip install -r references/requirements.txt
You can start your training in PyTorch:

```shell
-python references/classification/train_pytorch_character.py mobilenet_v3_large --epochs 5 --device 0
+python references/classification/train_character.py mobilenet_v3_large --epochs 5 --device 0
```

## Usage orientation classification

You can start your training in PyTorch:

```shell
-python references/classification/train_pytorch_orientation.py resnet18 --type page --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5
+python references/classification/train_orientation.py resnet18 --type page --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5
```

The type can be either `page` for document images or `crop` for word crops.
@@ -58,11 +58,11 @@ Feel free to inspect the multiple script option to customize your training to yo
Character classification:

```shell
-python references/classification/train_pytorch_character.py --help
+python references/classification/train_character.py --help
```

Orientation classification:

```shell
-python references/classification/train_pytorch_orientation.py --help
+python references/classification/train_orientation.py --help
```
6 changes: 3 additions & 3 deletions references/detection/README.md
@@ -16,7 +16,7 @@ pip install -r references/requirements.txt
You can start your training in PyTorch:

```shell
-python references/detection/train_pytorch.py db_resnet50 --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5
+python references/detection/train.py db_resnet50 --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5
```

### Multi-GPU support
@@ -41,7 +41,7 @@ By default all visible GPUs will be used. To limit which GPUs participate, set t

```shell
CUDA_VISIBLE_DEVICES=0,2 \
-torchrun --nproc_per_node=2 references/detection/train_pytorch.py \
+torchrun --nproc_per_node=2 references/detection/train.py \
db_resnet50 \
--train_path path/to/train \
--val_path path/to/val \
@@ -124,5 +124,5 @@ You can follow this page on [how to create a Slack App](https://api.slack.com/qu
Feel free to inspect the multiple script option to customize your training to your own needs!

```python
-python references/detection/train_pytorch.py --help
+python references/detection/train.py --help
```
File renamed without changes.
10 changes: 5 additions & 5 deletions references/recognition/README.md
@@ -16,7 +16,7 @@ pip install -r references/requirements.txt
You can start your training in PyTorch:

```shell
-python references/recognition/train_pytorch.py crnn_vgg16_bn --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5
+python references/recognition/train.py crnn_vgg16_bn --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5
```

### Multi-GPU support
@@ -41,7 +41,7 @@ By default all visible GPUs will be used. To limit which GPUs participate, set t

```shell
CUDA_VISIBLE_DEVICES=0,2 \
-torchrun --nproc_per_node=2 references/recognition/train_pytorch.py \
+torchrun --nproc_per_node=2 references/recognition/train.py \
crnn_vgg16_bn \
--train_path path/to/train \
--val_path path/to/val \
@@ -94,18 +94,18 @@ You can follow this page on [how to create a Slack App](https://api.slack.com/qu
Feel free to inspect the multiple script option to customize your training to your own needs!

```shell
-python references/recognition/train_pytorch.py --help
+python references/recognition/train.py --help
```

## Using custom fonts

If you want to use your own custom fonts for training, make sure the font is installed on your OS.
Do so on linux by copying the .ttf file to the desired directory with: ```sudo cp custom-font.ttf /usr/local/share/fonts/``` and then running ```fc-cache -f -v``` to build the font cache.

-Keep in mind that passing fonts to the training script will only work with the WordGenerator which will not augment or change images from the dataset if it is passed as argument. If no path to a dataset is passed like in this command ```python3 doctr/references/recognition/train_pytorch.py crnn_mobilenet_v3_small --vocab french --font "custom-font.ttf"``` only then is the WordGenerator "triggered" to create random images from the given vocab and font.
+Keep in mind that passing fonts to the training script will only work with the WordGenerator which will not augment or change images from the dataset if it is passed as argument. If no path to a dataset is passed like in this command ```python3 doctr/references/recognition/train.py crnn_mobilenet_v3_small --vocab french --font "custom-font.ttf"``` only then is the WordGenerator "triggered" to create random images from the given vocab and font.

Running the training script should look like this for multiple custom fonts:

```shell
-python references/recognition/train_pytorch.py crnn_vgg16_bn --epochs 5 --font "custom-font-1.ttf,custom-font-2.ttf"
+python references/recognition/train.py crnn_vgg16_bn --epochs 5 --font "custom-font-1.ttf,custom-font-2.ttf"
```
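
Every rename in this diff drops the `_pytorch` infix from a script name (`train_pytorch.py` becomes `train.py`, `train_pytorch_orientation.py` becomes `train_orientation.py`, and so on). The snippet below is a minimal sketch of how leftover references to the old names could be spotted in other text, such as local notes or downstream CI configs. It is not part of this change: the helper names `modernize` and `find_stale` are hypothetical, and the pattern is only inferred from the renames visible here, so it may miss scripts that follow a different naming scheme.

```python
import re

# Pattern inferred from the renames in this diff (an assumption, not a repo rule):
# <train|evaluate|latency>_pytorch[_<suffix>].py  ->  <train|evaluate|latency>[_<suffix>].py
OLD_NAME = re.compile(r"\b(train|evaluate|latency)_pytorch((?:_[a-z]+)?)\.py\b")


def modernize(text: str) -> str:
    """Return text with old-style script names rewritten to the new names."""
    return OLD_NAME.sub(r"\1\2.py", text)


def find_stale(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention an old script name."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if OLD_NAME.search(line)
    ]


if __name__ == "__main__":
    sample = "python references/recognition/train_pytorch.py crnn_vgg16_bn --epochs 5"
    print(find_stale(sample))
    print(modernize(sample))
```

Running `find_stale` over a file's contents lists the lines that still need updating, and `modernize` shows what each line would look like after the rename.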