Home
Today, the plan of execution for the project was discussed. Monday and Tuesday will revolve around learning concepts and skills, such as an introduction to Python and neural networks. Around Tuesday and Wednesday we will then apply that knowledge and split the group in two to work on two different methods.
The objective for next week is to have formed the two groups that will each work on their chosen method.
The basic theoretical concepts behind Neural Networks and Convolutional Neural Networks were discussed. Everyone then familiarised themselves with the PyTorch library by watching a tutorial together in the classroom, after which each of us continued working on it at home.
Tomorrow's objective is to build a preliminary Neural Network to get familiar with the tasks that lie ahead in the coming weeks.
The group was split into two. One team focused on processing the data into numpy arrays that could then be fed into the Neural Network, whilst the other laid the groundwork for the overall structure of the Convolutional Neural Network.
Tomorrow's objective is to continue today's work, as well as to start brainstorming on the two discussed methods for computing depth maps.
The group continued to work in two teams. The Neural Network now has a definite structure, but several issues still need to be addressed: the NN is too slow and requires too much RAM to run, the training of the network has not been tackled yet, and the procedure for feeding data into the NN is still unclear.
Tomorrow's objectives are to address the issues related to the NN, to order a stereo Raspberry Pi, and to contact Jacco for his insights on the aforementioned NN problems.
Giovanni and I were assigned the task of data processing. We chose to convert the images and depth maps into numpy arrays for ease of use downstream. We split the images into categories based on their illumination conditions and on whether they came from the left or right camera, and named the files accordingly. After discussing with the neural network team, we chose to further separate the data into training and testing sets, with 90% of the data for training and the remaining 10% for testing. To achieve variety in the testing dataset, we took every tenth image rather than the first (or last) 10% of the data. We converted the files into numpy arrays, ensuring that they were named correctly and matched to their original .png files, and then combined the numpy arrays into a single array. After combining the separate numpy arrays, the array's shape was 1620x1x480x640x3. We removed the unwanted x1 dimension and reshaped the array to 1620x3x480x640. The different illumination conditions were of the same 3D environment, so the two depth map files can be used with either camera's data, as the depth of an image is independent of the lighting conditions.
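The reshaping and every-tenth train/test split described above can be sketched as follows. This is a scaled-down illustration, not our actual script: the real arrays are (1800, 1, 480, 640, 3), shrunk here to keep the example light on memory.

```python
import numpy as np

# Scaled-down stand-in for the real dataset (1800 images of 480x640 RGB).
n, h, w = 20, 48, 64
images = np.zeros((n, 1, h, w, 3), dtype=np.float32)

# Drop the unwanted singleton axis and move the RGB channels before
# height/width, giving the (N, 3, H, W) layout used for the network.
images = images.squeeze(axis=1).transpose(0, 3, 1, 2)

# Every tenth image goes to the test set, so the test data samples the whole
# sequence rather than only its beginning or end.
test_mask = np.arange(n) % 10 == 9
train, test = images[~test_mask], images[test_mask]
print(train.shape, test.shape)  # (18, 3, 48, 64) (2, 3, 48, 64)
```

With the real dimensions the same two lines yield the 1620x3x480x640 training and 180x3x480x640 testing arrays.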
The final dimensions of the processed numpy arrays were 1620x3x480x640 for each camera in each condition used for training, and 180x3x480x640 for each camera in each condition used for testing. The depth map data was also converted into numpy arrays and separated into training and testing sets. The depth maps lack RGB channels and therefore have no x3 dimension, so their dimensions were 1620x480x640 for the training data and 180x480x640 for the testing data. The final size of the 10 testing files (4x2 illumination conditions + 2 depth maps) was 5.75GB, and the final size of the 10 training files was 51.76GB. These sizes are consistent: the testing data is 10% of the total, so its files are 9x smaller than the training files.
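The reported totals can be cross-checked from the array dimensions, assuming the arrays were stored as float32 (4 bytes per value) — an assumption, since the dtype is not recorded above, but one the numbers agree with exactly:

```python
# 8 camera files = 4 illumination conditions x 2 cameras, plus 2 depth-map
# files per split; 4 bytes per float32 value, sizes in decimal gigabytes.
GB = 1e9
train_gb = (8 * 1620 * 3 * 480 * 640 + 2 * 1620 * 480 * 640) * 4 / GB
test_gb = (8 * 180 * 3 * 480 * 640 + 2 * 180 * 480 * 640) * 4 / GB
print(round(train_gb, 2), round(test_gb, 2))  # 51.76 5.75, matching the totals
```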
To create a suitable dataset for training the neural network, we had to cut 9x9 pixel selections from the images. For the 9x9 selection that would be compared to the original, we chose not only to shift the selection slightly, but also to add a bit of noise. We chose 10 images with a large variety of colours and patterns, and from each of them picked 100 random points around which we created the 9x9 pixel selections. We then took the noisy version of each image and shifted the chosen locations by ±2 pixels on the y-axis and ±4 pixels on the x-axis. These shifted locations became the centres of the second 9x9 pixel selections, and each pair was assigned a score from 0 to 1 based on how many of the 81 pixels overlapped. This gave us 10 datasets, each with a 100x3x9x9 array of original-image pixel selections, a 100x3x9x9 array of shifted noisy-image pixel selections, and 100 similarity scores.
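The patch-pair construction can be sketched as below. This is an illustration, not our exact script: the function names are made up, the shifts are interpreted as uniform integers within the ±2 / ±4 ranges, and for two axis-aligned 9x9 windows offset by (dx, dy) the overlap score follows directly as max(0, 9−|dx|)·max(0, 9−|dy|)/81.

```python
import numpy as np

rng = np.random.default_rng(0)
PATCH = 9  # side length of the square pixel selections

def overlap_score(dx, dy):
    """Fraction of the 81 pixels shared by two 9x9 windows offset by (dx, dy)."""
    return max(0, PATCH - abs(dx)) * max(0, PATCH - abs(dy)) / PATCH ** 2

def make_pairs(image, noisy, n_points=100):
    """Cut n_points patch pairs: one from `image`, a shifted one from `noisy`."""
    h, w = image.shape[1:]  # images are (3, H, W), channels first
    half = PATCH // 2
    originals, shifted, scores = [], [], []
    for _ in range(n_points):
        # Random centre, kept far enough from the border for both patches.
        y = int(rng.integers(half + 2, h - half - 2))
        x = int(rng.integers(half + 4, w - half - 4))
        dy = int(rng.integers(-2, 3))  # vertical shift in [-2, 2]
        dx = int(rng.integers(-4, 5))  # horizontal shift in [-4, 4]
        originals.append(image[:, y - half:y + half + 1, x - half:x + half + 1])
        shifted.append(noisy[:, y + dy - half:y + dy + half + 1,
                                x + dx - half:x + dx + half + 1])
        scores.append(overlap_score(dx, dy))
    return np.stack(originals), np.stack(shifted), np.array(scores)

# Small synthetic image plus a noisy copy, standing in for one real image pair.
img = rng.random((3, 48, 64), dtype=np.float32)
noisy = img + 0.05 * rng.standard_normal(img.shape).astype(np.float32)
a, b, s = make_pairs(img, noisy)
print(a.shape, b.shape, s.shape)  # (100, 3, 9, 9) (100, 3, 9, 9) (100,)
```

Per image this yields exactly the 100x3x9x9 original array, the 100x3x9x9 shifted noisy array, and the 100 similarity scores described above.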