
In-Depth: Data Collection

Assembling the Dictionary

Since we were working primarily with medical terms, we contacted a medical expert, who is also a friend of one of us, to help us identify which medical terms are most common among patients. This gave us a considerable number of sentences relevant to our topic. The second step was to figure out the sign language interpretations of these medical terms and sentences, so we paid a visit to our partner ATILS to find the translation of each word and start collecting sign language data for our model. We can confidently say that we hit the jackpot with this one: one of the interpreters gave us an entire book of medical sign language interpretations. With this first data source in hand, the next steps are to collect real-life data and then train the model on it.

Here are some pictures of the medical sign language terms from the book: 




Collecting Video Data

This part was rather tedious, but it was essential: we could not move on to training the model without it. Collecting sign language data basically means shooting videos of real people performing different signs. We decided to start with a dictionary of 60 words and 10 videos per word, for a total of 600 videos. That is not a huge amount, but it is enough for a first test of the model's performance. Each of us had to shoot 10 videos for 10 different people each. We managed to collect about 60% of the amount we had agreed on and started feeding the data into the model to test it. We concluded that we still need to collect more data, and that we should all use the same camera quality, resolution and frame rate, so that the data is consistent and the model can pick up on the relevant differences between signs rather than on unimportant features such as lighting or background color. We also decided that we should all film the gestures with a 2-second interval between gestures, so that the pacing is constant across all the data and does not cause confusion or misinterpretation during training.
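The consistency rules above lend themselves to an automated check before training. What follows is only a minimal sketch, not the pipeline we actually used: the `Clip` record, the `off_spec` and `missing_per_word` helpers, the 1280x720 at 30 fps target, and the words "fever" and "pain" are all hypothetical and chosen for illustration. It assumes the width, height and frame rate of each clip have already been read from the video files with whatever tool is at hand.

```python
from collections import Counter
from dataclasses import dataclass

# Targets taken from the text: 60 words, 10 videos per word.
NUM_WORDS = 60
VIDEOS_PER_WORD = 10
TARGET_TOTAL = NUM_WORDS * VIDEOS_PER_WORD  # 600 videos

# Hypothetical shared recording settings (the post does not state real values).
TARGET_SPEC = (1280, 720, 30.0)  # width, height, frames per second


@dataclass(frozen=True)
class Clip:
    """Metadata of one recorded video (hypothetical record layout)."""
    word: str
    width: int
    height: int
    fps: float


def off_spec(clips, spec=TARGET_SPEC, fps_tolerance=0.5):
    """Return the clips whose resolution or frame rate differs from the spec."""
    width, height, fps = spec
    return [
        c for c in clips
        if (c.width, c.height) != (width, height)
        or abs(c.fps - fps) > fps_tolerance
    ]


def missing_per_word(clips, words, per_word=VIDEOS_PER_WORD):
    """Map each word to the number of clips still needed to reach the target."""
    counts = Counter(c.word for c in clips)
    return {w: per_word - counts[w] for w in words if counts[w] < per_word}


# Tiny made-up sample: one good clip and two that break the shared settings.
clips = [
    Clip("hello", 1280, 720, 30.0),
    Clip("hello", 1920, 1080, 30.0),  # wrong resolution
    Clip("fever", 1280, 720, 24.0),   # wrong frame rate
]

print(TARGET_TOTAL)                                         # 600
print([c.word for c in off_spec(clips)])                    # ['hello', 'fever']
print(missing_per_word(clips, ["hello", "fever", "pain"]))  # {'hello': 8, 'fever': 9, 'pain': 10}
```

Running a check like this on every new batch would flag clips to re-shoot and show which words are still short of their 10 videos. With about 60% of the 600 videos collected, roughly 360 clips are in hand and around 240 remain.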

Here is an example of the videos we shot; it simply means "Hello, how are you?":


Here is the link for our dictionary which includes most of the medical terms we are working on: 

https://drive.google.com/drive/folders/1nU110eig5hiGW4zsld611qXNUJs5Bxj-?usp=sharing


