Sign-Language-Recognition

A sign recognition network that predicts a digit given an image of the sign.

The project structure follows the TensorFlow best-practice folder structure described in Tensorflow Best Practice.

Project structure


├── Configs
│   └── config_model.json   - Paths used and configuration of the models (learning_rate, num_epochs, ...).
│
├── base
│   ├── base_model.py       - Abstract base class of all models used.
│   ├── base_train.py       - Abstract base class of the trainers of all models used.
│   └── base_test.py        - Abstract base class of the testers of all models used.
│
├── models                  - Contains one model for sign language detection.
│   └── gesture_recognition_model.py   - Architecture of the gesture/sign recognition model.
│
├── trainer                 - Trainers used, which inherit from BaseTrain.
│   └── gesture_recognition_trainer.py - Trainer class of the gesture recognition model.
│
├── testers                 - Testers used, which inherit from BaseTest.
│   └── sentiment_tester.py - Tester class of the gesture recognition model.
│
├── mains
│   └── main.py             - Responsible for the whole pipeline.
│
├── data_loader
│   ├── data_generator.py   - DataGenerator class, which handles the Sign Language dataset.
│   └── preprocessing.py    - Helper functions for preprocessing the Sign Language dataset.
│
└── utils
    ├── config.py           - Utility functions to handle the JSON config file.
    ├── logger.py           - Logger class, which handles TensorBoard.
    └── utils.py            - Utility functions to parse arguments and handle pickle data.

Download pretrained models

Pretrained models can be found at saved_models/checkpoint

Install dependencies

  • Python 3.x
  • TensorFlow
  • TensorBoard [optional]
  • NumPy: pip3 install numpy
  • SciPy 1.1.0: pip3 install scipy==1.1.0
  • Bunch: pip3 install bunch
  • Pandas: pip3 install pandas
  • tqdm: pip3 install tqdm
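
All of the pip-installable dependencies above can also be installed in one go:

pip3 install numpy scipy==1.1.0 bunch pandas tqdm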

Config File

In order to train, pretrain, or test the model, you first need to edit the config file:

{
  "num_epochs": 200,               - Number of epochs to train the model in train mode.
  "learning_rate": 0.001,          - Learning rate used for training the model.
  "state_size": [64, 64, 1],       - Input size of the network (height, width, channels).
  "batch_size": 256,               - Batch size for the training, validation and test sets (#TODO: separate batch_size per mode).
  "val_per_epoch": 1,              - Report validation accuracy and loss every val_per_epoch epochs (can be ignored).
  "max_to_keep": 1,                - Maximum number of checkpoints to keep.
  "per_process_gpu_memory_fraction": 1, - Fraction of GPU memory the training process may use.

  "train_data_path": "path_to_training_set",                       - Path to the training data.
  "test_data_path": "path_to_test_set",                            - Path to the test data.
  "checkpoint_dir": "path_to_store_the_model_checkpoints",         - Path for storing/loading model checkpoints.
  "summary_dir": "path_to_store_model_summaries_for_tensorboard"   - Path for storing TensorBoard summaries.
}
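
Loading this file boils down to parsing the JSON into an attribute-style object. The project's actual helper lives in utils/config.py; the sketch below (function name chosen for illustration) shows the idea:

import json
from bunch import Bunch

def get_config_from_json(json_file):
    # Parse the JSON config into a Bunch so fields are attribute-accessible.
    with open(json_file, 'r') as f:
        config_dict = json.load(f)
    return Bunch(config_dict)

config = get_config_from_json("Configs/config_model.json")
print(config.learning_rate)  # -> 0.001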

How to Train

In order to train, pretrain, or test the model, first edit the config file as described in the Config File section.
To train a gesture recognition model, set:

"num_epochs":200,
"learning_rate":0.0001,
"batch_size":256,
"state_size": [64, 64, 1],

"train_data_path": set it to path of the training data e.g: "/content/train"
"checkpoint_dir": path to store checkpoints, e.g: "/content/saved_models/tiny_vgg_model/checkpoint/"
"summary_dir": path to store the model summaries for tensorboard, e.g: "/content/saved_models/tiny_vgg_model/summary/"

Then change directory to the project's folder and run:

python3.6 -m src.mains.main --config path_to_config_file

Make predictions with pretrained models

To make predictions using a single image as input:

Set checkpoint_dir in the config file to the path of the model checkpoint, cd to the project folder, and run:

python3.6 -m src.mains.main --config "path_to_config_file" -i "path_to_image"

To make predictions on a folder of test images:

Set checkpoint_dir in the config file to the path of the model checkpoint, cd to the project folder, and run:

python3.6 -m src.mains.main --config "path_to_config_file" -t "path_to_images_folder"
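
For example, with the config under Configs/ and a folder of test images (both paths hypothetical):

python3.6 -m src.mains.main --config "Configs/config_model.json" -t "/content/test_images"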

Implementation details

Gesture recognition model preprocessing

Preprocessing for the gesture recognition model consists of the steps below.

Shuffling dataset

To decrease the probability of overfitting, the training set is reshuffled at the start of every epoch:

from random import shuffle

indices_list = [i for i in range(self.x_train.shape[0])]  # One index per training example.
shuffle(indices_list)  # In-place shuffle; the training examples are then reindexed in this order.

Splitting dataset

The dataset is split into three segments: training set (80%), validation set (10%) and test set (10%).
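
A minimal sketch of such an 80/10/10 split, assuming x and y are NumPy arrays holding the full dataset (the names and helper are illustrative, not the repo's actual code):

import numpy as np

def split_dataset(x, y, train_frac=0.8, val_frac=0.1, seed=0):
    # Shuffle once, then carve out contiguous 80/10/10 slices.
    idx = np.random.RandomState(seed).permutation(len(x))
    n_train = int(train_frac * len(x))
    n_val = int(val_frac * len(x))
    train, val, test = idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
    return (x[train], y[train]), (x[val], y[val]), (x[test], y[test])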

Changing input dimensions

Input images are 100x100x3; they are downsampled to 64x64x1 grayscale images, as color information is redundant for this task (we don't want to differentiate between light and dark hands).

import scipy.misc  # imresize requires scipy<=1.1.0 (hence the pinned version above); it was removed in later releases.

img = scipy.misc.imresize(img, (64, 64))  # Downsample to 64x64.
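
The resize call above keeps the color channels; a possible grayscale conversion to reach the final 64x64x1 shape (an illustrative sketch; the repo's actual preprocessing lives in data_loader/preprocessing.py):

import numpy as np

# Standard ITU-R BT.601 luminance weights for RGB -> grayscale.
img = np.dot(img[..., :3], [0.299, 0.587, 0.114])
img = img[..., np.newaxis]  # (64, 64) -> (64, 64, 1)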

Normalizing batches

To bring the input values to a common scale, each image is min-max normalized to the [0, 1] range:

img_range = img.max() - img.min()
img = (img - img.min()) / (img_range + 1e-5)  # 1e-5 prevents division by zero.
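
Applied per batch, that per-image scaling might look like the following (a sketch, assuming batch is a NumPy array of shape (N, 64, 64, 1)):

import numpy as np

def normalize_batch(batch):
    # Scale each image in the batch to [0, 1] independently.
    out = np.empty_like(batch, dtype=np.float32)
    for i, img in enumerate(batch):
        img_range = img.max() - img.min()
        out[i] = (img - img.min()) / (img_range + 1e-5)
    return out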

Data augmentation

TODO: Augment the training set. Data augmentation should help improve model performance.
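
A possible starting point for this TODO, using small random rotations and shifts from scipy.ndimage (illustrative only; not part of the repo, and the ranges are assumptions):

import numpy as np
from scipy import ndimage

def augment(img, rng=np.random):
    # Random small rotation (degrees) in the image plane; reshape=False keeps 64x64.
    img = ndimage.rotate(img, rng.uniform(-15, 15), reshape=False, mode='nearest')
    # Random translation of up to ~4 pixels; the channel axis is left untouched.
    img = ndimage.shift(img, (rng.uniform(-4, 4), rng.uniform(-4, 4), 0), mode='nearest')
    return img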

Gesture recognition model architecture

(Architecture diagram not shown.)

Model Training

I trained the sign language model for 300 epochs, splitting the data into train/val/test with 8:1:1 ratios.
Achieved a validation accuracy of 97.58% and a validation loss of 0.0672, with a training accuracy of 99% and a training loss of 0.269.

(Plots of validation accuracy and validation loss not shown.)

Model testing

Achieved a test accuracy of 97.39% on the held-out 10% of the dataset (unseen during training), with a test loss of 0.0499.
