Sign recognition network that predicts a digit given an image of the sign.
The project layout follows the folder structure recommended by Tensorflow Best Practice.
- Project structure
- Download pretrained models
- Dependencies
- Config file
- How to train
- How to predict
- Implementation details
├── Configs
│ └── config_model.json - Contains the paths used and config of the models(learning_rate, num_epochs, ...)
│
├── base
│ ├── base_model.py - This file contains the abstract class of all models used.
│ ├── base_train.py - This file contains the abstract class of the trainer of all models used.
│ └── base_test.py - This file contains the abstract class of the testers of all models used.
│
├── models - This folder contains 1 model for sign language detection.
│ └── gesture_recognition_model.py - Contains the architecture of the gesture/sign recognition model used
│
├── trainer - This folder contains trainers used which inherit from BaseTrain.
│ └── gesture_recognition_trainer.py - Contains the trainer class of the gesture recognition model.
│
├── testers - This folder contains testers used which inherit from BaseTest.
│ └── sentiment_tester.py - Contains the tester class of the gesture recognition model.
│
├── mains
│ └── main.py - responsible for the whole pipeline.
│
├── data_loader
│ ├── data_generator.py - Contains DataGenerator class which handles Sign Language dataset.
│ └── preprocessing.py - Contains helper functions for preprocessing Sign Language dataset.
│
└── utils
  ├── config.py - Contains utility functions to handle json config file.
  ├── logger.py - Contains Logger class which handles tensorboard.
  └── utils.py - Contains utility functions to parse arguments and handle pickle data.
Pretrained models can be found at saved_models/checkpoint
- Python 3.x
- Tensorboard (optional)
- Numpy
pip3 install numpy
- scipy version 1.1.0
pip3 install scipy==1.1.0
- Bunch
pip3 install bunch
- Pandas
pip3 install pandas
- tqdm
pip3 install tqdm
To train, pretrain, or test the model, first edit the config file:
{
"num_epochs": 200, - Number of epochs to train the model when in train mode.
"learning_rate": 0.001, - Learning rate used for training the model.
"state_size": [64, 64, 1], - Input size (our state).
"batch_size": 256, - Batch size for the training, validation and testing sets (#TODO: use a separate batch_size per mode).
"val_per_epoch": 1, - Compute validation set accuracy and loss every val_per_epoch epochs (can be ignored).
"max_to_keep":1, - Maximum number of checkpoints to keep.
"per_process_gpu_memory_fraction":1, - Fraction of GPU memory used for training.
"train_data_path":"path_to_training_set", - Path to training data.
"test_data_path":"path_to_test_set", - Path to test data.
"checkpoint_dir":"path_to_store_the_model_checkpoints", - Directory where checkpoints are stored, or from which the model is loaded.
"summary_dir":"path_to_store_model_summaries_for_tensorboard", - Directory where summaries are stored.
}
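The ` - ...` descriptions above are annotations only; JSON has no comment syntax, so the actual file should contain just the keys and values. As a rough sketch of how such a file could be read into an attribute-style object, the snippet below uses `json` together with the standard library's `SimpleNamespace` as a stand-in for Bunch. The function name `load_config` is hypothetical and is not taken from the project's `utils/config.py`.

```python
import json
from types import SimpleNamespace


def load_config(json_path):
    """Read a JSON config file and expose its keys as attributes."""
    with open(json_path, "r") as config_file:
        config_dict = json.load(config_file)
    # SimpleNamespace stands in for Bunch here: config.learning_rate, config.batch_size, ...
    return SimpleNamespace(**config_dict)
```

With this, `load_config("path_to_config_file").num_epochs` would return the configured number of epochs.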
To train, pretrain, or test the model, first edit the config file as described in the Config file section.
To train a Gesture Recognition model:
set:
"num_epochs":200,
"learning_rate":0.0001,
"batch_size":256,
"state_size": [64, 64, 1],
"train_data_path": path to the training data, e.g. "/content/train"
"checkpoint_dir": path to store checkpoints, e.g. "/content/saved_models/tiny_vgg_model/checkpoint/"
"summary_dir": path to store the model summaries for tensorboard, e.g. "/content/saved_models/tiny_vgg_model/summary/"
Then change directory to the project's folder and run:
python3.6 -m src.mains.main --config path_to_config_file
To predict a single image, set "checkpoint_dir" in the config file to the path of the model checkpoint, then change directory to the project's folder and run:
python3.6 -m src.mains.main --config "path_to_config_file" -i "path_to_image"
To run the model on a folder of images, set "checkpoint_dir" in the config file to the path of the model checkpoint, then change directory to the project's folder and run:
python3.6 -m src.mains.main --config "path_to_config_file" -t "path_to_images_folder"
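The three commands above differ only in their flags. The sketch below shows one way such flags could be parsed and mapped to a pipeline mode with `argparse`. Only `--config`, `-i` and `-t` come from the commands above; the function names, the `dest` names and the mode strings are hypothetical, and the project's own `utils.py` may handle this differently.

```python
import argparse


def get_args(argv=None):
    """Parse the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Sign recognition pipeline")
    parser.add_argument("--config", required=True, help="Path to the JSON config file")
    # The dest names are hypothetical; only the flags themselves appear in the commands.
    parser.add_argument("-i", dest="image_path", default=None, help="Single image to predict")
    parser.add_argument("-t", dest="images_folder", default=None, help="Folder of images to run on")
    return parser.parse_args(argv)


def select_mode(args):
    """Pick a mode: predict one image, run on a folder, or train."""
    if args.image_path is not None:
        return "predict"
    if args.images_folder is not None:
        return "test"
    return "train"
```

Defaulting to training when neither `-i` nor `-t` is given mirrors the training command, which passes only `--config`.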
TODO: Talk about preprocessing.
To reduce the risk of overfitting, the training set is shuffled every epoch:
indices_list = list(range(self.x_train.shape[0]))  # Indices of the training examples.
shuffle(indices_list)
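To show how the shuffled indices might then be consumed, here is a minimal sketch of a generator that walks the training set once per call in a fresh random order. It uses NumPy's `permutation` in place of the list-based shuffle above. The function name and its parameters are assumptions for illustration, not the project's `DataGenerator` API.

```python
import numpy as np


def epoch_batches(x_train, y_train, batch_size, rng):
    """Yield mini-batches covering the training set once, in a new random order."""
    # A new permutation is drawn on every call, i.e. once per epoch.
    indices = rng.permutation(x_train.shape[0])
    for start in range(0, len(indices), batch_size):
        batch_idx = indices[start:start + batch_size]
        yield x_train[batch_idx], y_train[batch_idx]
```

Calling `epoch_batches(x, y, 256, np.random.default_rng())` at the start of each epoch gives every example exactly once per epoch, with the last batch possibly smaller than `batch_size`.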
The dataset is split into three segments: training set (80%), validation set (10%), and test set (10%).
Input images are 100x100x3; they are downsampled to 64x64x1 grayscale images, as color information is redundant and meaningless in this task (we don't want to differentiate between light and dark hands):
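An 80/10/10 split of this kind can be sketched as follows. The function name, the fixed seed, and the choice to give any rounding remainder to the test set are assumptions made for illustration; the project's own splitting code is not shown here.

```python
import numpy as np


def split_dataset(x, y, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle once, then cut into train/val/test; the remainder goes to test."""
    n = x.shape[0]
    order = np.random.default_rng(seed).permutation(n)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx]), (x[test_idx], y[test_idx])
```

Shuffling before cutting keeps the three segments disjoint while avoiding any ordering bias in the stored dataset.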
img = scipy.misc.imresize(img, (64, 64))
To bring the input image values to a common scale, each image is min-max scaled to the range [0, 1]:
img = (img - img.min()) / (img_range + 1e-5) #1e-5 is to prevent division by zero
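The downsampling and scaling steps above can be combined into one function. The sketch below is NumPy-only and rests on several assumptions: grayscale is computed with luminance weights, nearest-neighbour sampling stands in for `scipy.misc.imresize` (which is not available in newer SciPy releases), and `img_range` is taken to be `max - min`. None of these choices is confirmed by the project's `preprocessing.py`.

```python
import numpy as np


def preprocess(img, size=64, eps=1e-5):
    """Turn an HxWx3 RGB image into a size x size x 1 array scaled to [0, 1]."""
    # Grayscale via luminance weights (ITU-R BT.601); the weighting is an assumption.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = img[..., :3].astype(np.float32) @ weights
    # Nearest-neighbour downsampling, a NumPy-only stand-in for scipy.misc.imresize.
    rows = np.arange(size) * gray.shape[0] // size
    cols = np.arange(size) * gray.shape[1] // size
    small = gray[rows[:, None], cols[None, :]]
    # Min-max scaling; img_range is assumed to be max - min.
    img_range = small.max() - small.min()
    scaled = (small - small.min()) / (img_range + eps)
    return scaled[..., np.newaxis]
```

The `eps` term means a completely uniform image maps to all zeros instead of raising a division-by-zero warning.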
TODO: Augment training set.
Data augmentation should help improve the model's performance.
I trained the Sign Language model for 300 epochs, splitting training_data into train/val/test with an 8:1:1 ratio.
Achieved a validation accuracy of 97.5781261920929% with a val_loss of 0.06723332,
and a training accuracy of 99% with a training_loss of 0.26895.
Achieved a test accuracy of 97.38948345184326% on 10% of the dataset (unseen during training), with a test_loss of 0.049882233.


