Computer Science and Engineering, Department of

 

First Advisor

Dr. Vinod Variyam

Second Advisor

Dr. Stephen Scott

Third Advisor

Dr. Ashok Samal

Date of this Version

Summer 8-5-2022

Citation

Archit Srivastava, "Feed Forward Neural Networks with Asymmetric Training," MS thesis, University of Nebraska-Lincoln, Lincoln, Nebraska, August 2022.

Comments

A THESIS Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Master of Science, Major: Computer Science, Under the Supervision of Professor Vinod Variyam. Lincoln, Nebraska: August 2022

Copyright © 2022 Archit Srivastava

Abstract

Our work presents a new perspective on training feed-forward neural networks (FFNNs). We introduce and formally define the notions of symmetry and asymmetry in the context of FFNN training. We provide a mathematical definition that generalizes the idea of sparsification and demonstrate how sparsification can induce asymmetric training in an FFNN.

In an FFNN, training consists of two phases: the forward pass and the backward pass. We define symmetric training as follows: if a neural network uses the same parameters for both the forward pass and the backward pass, the training is said to be symmetric.

The definition of asymmetric training in artificial neural networks follows naturally as the negation of the definition of symmetric training: training is asymmetric if the neural network uses different parameters for the forward and backward passes.

We conducted experiments that induce asymmetry during the training phase of a feed-forward neural network: the network uses all of its parameters during the forward pass, but only a subset of parameters in the backward pass when computing the gradient of the loss function via sparsified backpropagation.
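To make the idea concrete, the following minimal NumPy sketch (not the thesis code) performs a full forward pass but applies a random mask to the gradients, so only a subset of parameters participates in the backward pass. The layer sizes, mask rate, and squared-error loss are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# One hidden layer: 4 inputs -> 8 hidden units -> 1 output.
W1 = rng.normal(scale=0.1, size=(4, 8))
W2 = rng.normal(scale=0.1, size=(8, 1))

def forward(x):
    h = np.maximum(0.0, x @ W1)   # ReLU hidden layer; all weights are used
    y = h @ W2                    # linear output; all weights are used
    return h, y

def asymmetric_backward(x, h, y, target, keep_prob=0.5):
    """Backward pass that keeps gradients for only a random subset of weights."""
    dy = 2.0 * (y - target)       # gradient of the squared-error loss
    dW2 = h.T @ dy
    dh = (dy @ W2.T) * (h > 0)    # backprop through the ReLU
    dW1 = x.T @ dh
    # Asymmetry: zero out gradients of weights excluded from the backward pass.
    mask1 = rng.random(W1.shape) < keep_prob
    mask2 = rng.random(W2.shape) < keep_prob
    return dW1 * mask1, dW2 * mask2

# One illustrative training step.
x = rng.normal(size=(1, 4))
target = np.array([[1.0]])
h, y = forward(x)
dW1, dW2 = asymmetric_backward(x, h, y, target)
W1 -= 0.01 * dW1
W2 -= 0.01 * dW2
```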

We explore three strategies to induce asymmetry in neural networks.

The first method is somewhat analogous to dropout: the sparsified backpropagation algorithm drops specific neurons, along with their associated parameters, while calculating the gradient.

The second method is excessive sparsification. It induces asymmetry by dropping both neurons and connections, making the neural network behave as if it were partially connected while calculating the gradient in the backward pass.

The third method is a refinement of the second method; it also induces asymmetry by dropping both neurons and connections while calculating the gradient in the backward pass.
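To contrast neuron-level sparsification with the neuron-plus-connection sparsification used by the second and third methods, here is a short sketch, again with illustrative shapes and drop rates rather than the thesis' actual settings.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 16))     # weights into a 16-unit hidden layer
dW = rng.normal(size=W.shape)    # gradient computed in the backward pass

# Method 1: drop whole neurons (entire columns of the gradient),
# similar in spirit to dropout applied only to the backward pass.
neuron_keep = rng.random(W.shape[1]) < 0.7      # keep ~70% of hidden units
dW_neuron_sparse = dW * neuron_keep[np.newaxis, :]

# Methods 2 and 3: additionally drop individual connections, so the
# backward pass sees a partially connected network.
conn_keep = rng.random(W.shape) < 0.7           # keep ~70% of connections
dW_excessive_sparse = dW_neuron_sparse * conn_keep
```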

In our experiments, the FFNN with asymmetric training reduced overfitting, achieved better accuracy, and reduced backpropagation time compared to the FFNN with symmetric training using dropout.

Adviser: Vinod Variyam
