by Agnishwar
Introduction:
Artificial neural networks (ANNs) are biologically inspired computer programs designed to simulate the way the human brain processes information. ANNs gather their knowledge by detecting patterns and relationships in data, and they learn (or are trained) through experience, not from explicit programming. An ANN is formed from hundreds of simple units, called artificial neurons or processing elements (PEs), connected by coefficients (weights); these constitute the neural structure and are organized in layers.
Artificial Neural Network
The power of neural computation comes from connecting neurons in a network. Each PE has weighted inputs, a transfer function, and one output. The behavior of a neural network is determined by the transfer functions of its neurons, by the learning rule, and by the architecture itself. The weights are the adjustable parameters; in that sense, a neural network is a parameterized system. The weighted sum of the inputs constitutes the activation of the neuron. The activation signal is passed through the transfer function to produce the neuron's single output. The transfer function introduces non-linearity into the network.
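As a minimal sketch of the computation just described, a single neuron forms the weighted sum of its inputs and passes it through a transfer function. The sigmoid transfer function and the example weights below are illustrative choices, not taken from the text:

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """One processing element: weighted sum (the activation), then a transfer function."""
    activation = np.dot(weights, inputs) + bias   # weighted sum of the inputs
    return 1.0 / (1.0 + np.exp(-activation))      # sigmoid transfer function (non-linearity)

# Three weighted inputs, one output
out = neuron_output(np.array([0.5, -1.0, 2.0]),
                    np.array([0.1, 0.4, 0.2]),
                    bias=0.0)
```

Because the sigmoid squashes the activation, the single output always lies strictly between 0 and 1, whatever the weights.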
During training, the inter-unit connections are optimized until the prediction error is minimized and the network reaches the specified level of accuracy. Once the network is trained and tested, it can be given new input information to predict the output. Many types of neural networks have already been designed, and new ones are invented every week, but all can be described by the transfer functions of their neurons, by the learning rule, and by the connection formula.
Many algorithms are available for training neural networks (http://data-flair.training/blogs/artificial-neural-network/). Let us now look at some important algorithms for training neural networks:
- Gradient Descent — Used to find the local minimum of a function.
- Evolutionary Algorithms — Based on the concept of natural selection or survival of the fittest in Biology.
- Genetic Algorithms — Select the rules best suited to the solution of a problem, so that they pass their ‘genetic material’ on to ‘child’ rules. We will learn about them in detail below.
Gradient Descent
We use the gradient descent algorithm to find the local minimum of a function. The algorithm converges to a local minimum by taking steps proportional to the negative of the gradient of the function. To find a local maximum, take steps proportional to the positive gradient instead; that process is called gradient ascent.
In linear models, the error surface is a well-defined and well-known mathematical object in the shape of a parabola, so the lowest point can be found by direct calculation. Unlike linear models, neural networks are complex nonlinear models, and the error surface has an irregular layout, crisscrossed with hills, valleys, plateaus, and deep ravines. To find the lowest point on this surface, for which no maps are available, the user must explore it.
In gradient descent, you move over the error surface by following the line of greatest slope, which offers the possibility of reaching the lowest point. You then have to work out the optimal rate at which to travel down the slope.
The correct speed is proportional to the slope of the surface and to the learning rate. The learning rate controls the extent to which the weights are modified during the learning process.
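A minimal sketch of this update rule on a one-dimensional function; the quadratic objective, starting point, and learning rate below are illustrative assumptions:

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step proportional to the negative gradient toward a local minimum."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)   # step size = learning rate * slope
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3); the minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Because the learning rate scales each step, a rate that is too large can overshoot the valley, while one that is too small makes the descent needlessly slow.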
Evolutionary Algorithms
This algorithm is based on the concept of natural selection, or *survival of the fittest*, in biology. The concept of natural selection states that, *for a given population, environmental conditions exert a pressure that results in the rise of the fittest individuals in that population*.
To measure fitness in a given population, you apply a fitness function as an abstract measure.
In the context of evolutionary algorithms, recombination is an operator applied to two or more candidates, known as parents, and it produces one or more new candidates, known as children. Mutation is applied to a single candidate and produces a new candidate. By applying recombination and mutation, we obtain a set of new candidates, which are placed in the next generation based on their fitness.
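As an illustration of this selection-and-variation loop, here is a toy evolutionary algorithm on real-valued candidates. The fitness function, population size, and Gaussian mutation are assumptions chosen for the sketch (mutation only, no recombination, to keep it short):

```python
import random

def evolve(fitness, population, generations=100, mutation_scale=0.1):
    """Keep the fittest half each generation; mutated copies of them fill the rest."""
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)           # selection pressure
        survivors = population[: len(population) // 2]
        children = [p + random.gauss(0, mutation_scale)      # mutation: small random change
                    for p in survivors]
        population = survivors + children                    # next generation
    return max(population, key=fitness)

random.seed(0)
# Fitness peaks at x = 5; candidates start spread over [-10, 10].
best = evolve(lambda x: -(x - 5) ** 2, [random.uniform(-10, 10) for _ in range(20)])
```

Over the generations, the population drifts toward the region of highest fitness, exactly the "rise of the fittest" described above.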
Genetic Algorithm
Genetic algorithms were developed by John Holland's group from the early 1970s. They enable the rules most appropriate to the solution of a problem to be selected, so that these rules pass their ‘genetic material’ (their variables and categories) to ‘child’ rules.
Here, a rule is like a set of categories of variables: for example, customers aged between 36 and 50, having financial assets of less than $20,000 and a monthly income of more than $2,000.
A rule is the equivalent of a branch of a decision tree; it is also analogous to a gene. You can think of genes as units inside cells that control how living organisms inherit the features of their parents. Genetic algorithms thus aim to reproduce the mechanisms of natural selection, selecting the rules best adapted to prediction and crossing and mutating them until a predictive model is obtained.
Together with neural networks, they form the second type of algorithm that mimics natural mechanisms to explain phenomena that are not necessarily natural.
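As an illustration of crossing and mutating, here is a toy genetic algorithm on bit strings. The OneMax fitness (count of 1-bits), single-point crossover, and per-bit mutation rate are conventional textbook choices, not taken from the text:

```python
import random

def genetic_algorithm(n_bits=20, pop_size=30, generations=60):
    """Evolve bit strings toward all ones via selection, crossover, and mutation."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    fitness = sum                                      # OneMax: count of 1-bits
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # fittest half survives
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_bits)          # single-point crossover:
            child = a[:cut] + b[cut:]                  # child mixes parents' 'genetic material'
            for i in range(n_bits):                    # bit-flip mutation
                if random.random() < 1.0 / n_bits:
                    child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

random.seed(1)
best = genetic_algorithm()
```

Each child inherits a prefix from one parent and a suffix from the other, mirroring how child rules inherit the variables and categories of their parent rules.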
ANN Layers
An artificial neural network is typically organized in layers. Layers are made up of many interconnected ‘nodes’, each of which contains an ‘activation function’. A neural network may contain the following three layers:
a. Input layer
The purpose of the input layer is to receive as input the values of the explanatory attributes for each observation. Usually, the number of nodes in the input layer equals the number of explanatory variables. The input layer presents the patterns to the network, which communicates them to one or more ‘hidden layers’.
The nodes of the input layer are passive, meaning they do not change the data. Each node receives a single value on its input and duplicates that value to its many outputs; every value from the input layer is thus duplicated and sent to all of the hidden nodes.
b. Hidden Layer
The hidden layers apply given transformations to the input values inside the network. Each hidden node has incoming arcs from input nodes or from other hidden nodes, and outgoing arcs to output nodes or to other hidden nodes. In a hidden layer, the actual processing is done via a system of weighted ‘connections’.
There may be one or more hidden layers. The values entering a hidden node are multiplied by weights, a set of predetermined numbers stored in the network. The weighted inputs are then added to produce a single number.
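The multiply-by-weights-and-add step for a whole hidden layer can be sketched as follows; ReLU for the hidden nodes and a sigmoid output node are illustrative choices, and the random weights stand in for trained ones:

```python
import numpy as np

def layer_forward(x, weights, biases):
    """Each hidden node: multiply incoming values by weights, sum, apply ReLU."""
    return np.maximum(0.0, weights @ x + biases)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                         # input layer just passes values on
hidden = layer_forward(x,                      # 3 hidden nodes, each seeing all 4 inputs
                       rng.normal(size=(3, 4)),
                       np.zeros(3))
output = 1 / (1 + np.exp(-(rng.normal(size=3) @ hidden)))  # single sigmoid output node
```

The matrix product computes every hidden node's weighted sum at once, which is why each input value is effectively "duplicated and sent to all the hidden nodes".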
c. Output layer
The hidden layers then link to an ‘output layer’. The output layer receives connections from the hidden layers or from the input layer and returns an output value that corresponds to the prediction of the response variable. In classification problems, there is usually only one output node. The active nodes of the output layer combine and modify the data to produce the output values.
The ability of the neural network to provide useful data manipulation lies in the proper selection of the weights. This is what distinguishes it from conventional information processing.
Implementation:
```python
# Assumes TensorFlow's bundled Keras; X_train and y_train are the prepared training data.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(10, input_shape=(10,), activation='relu'),  # input layer
    keras.layers.Dense(300, activation='relu'),
    keras.layers.Dropout(rate=0.4),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dropout(rate=0.4),                                # hidden layers
    keras.layers.Dense(1, activation='sigmoid'),                   # output layer
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])  # model compilation

model.fit(X_train, y_train, epochs=100)  # training the model with train data
```