Neural Network
As predictor, we use an ensemble of feedforward neural networks with backpropagation learning. The ensemble has:
- the population of networks produced by the genetic algorithm;
- each network has its own fitness value, and only those with fitness > 1 are included in the ensemble;
- the final prediction is the weighted average of all network predictions:
final = (w_0*p_0 + w_1*p_1 + ...)/(w_0 + w_1 + ...)
excluding outliers with the weighted median absolute deviation method; and
- the weight of each network is defined as the logarithm of its fitness, ln(fitness).
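The combination step above can be sketched as follows (Python is used for illustration; the k = 3 cutoff for the weighted-MAD outlier exclusion is an assumption, since the original does not state the exact threshold):

```python
import math

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    order = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    acc = 0.0
    for v, w in order:
        acc += w
        if acc >= half:
            return v
    return order[-1][0]

def ensemble_predict(predictions, fitnesses, k=3.0):
    """Weighted average of member predictions, excluding outliers
    by the weighted median absolute deviation (MAD) method."""
    # Keep only networks whose fitness exceeds 1; weight = ln(fitness).
    pairs = [(p, math.log(f)) for p, f in zip(predictions, fitnesses) if f > 1]
    if not pairs:
        return None
    med = weighted_median([p for p, _ in pairs], [w for _, w in pairs])
    mad = weighted_median([abs(p - med) for p, _ in pairs], [w for _, w in pairs])
    # Exclude predictions farther than k weighted MADs from the weighted median
    # (k = 3 is an assumed cutoff, not taken from the original).
    kept = [(p, w) for p, w in pairs if mad == 0 or abs(p - med) <= k * mad]
    return sum(p * w for p, w in kept) / sum(w for _, w in kept)
```

For example, a stray prediction of 5.0 among members clustered near 1.0 is discarded before averaging.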
The neural network has:
- an input layer with WNDOW.length nodes,
- a hidden layer with HIDDENS nodes, and
- an output layer with 1 node.
As input it uses a window of WNDOW.length previous normalized rates taken at specified positions before the predicted point, for example at positions 1, 2, 5, 8, 10; hence the input layer has WNDOW.length nodes.

The normalized rate of point i is defined as:

norm[i] = ln(rate[i]/rate[i-1])
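A minimal sketch of this normalization and of building the input window (the function names are illustrative):

```python
import math

def normalize(rates):
    """norm[i] = ln(rate[i] / rate[i-1]) -- the log return of the rate series."""
    return [math.log(rates[i] / rates[i - 1]) for i in range(1, len(rates))]

def window(norm, positions, t):
    """Input vector for predicting point t: the normalized rates at the
    given offsets before t, e.g. positions (1, 2, 5, 8, 10)."""
    return [norm[t - p] for p in positions]
```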
The training of the neural network is performed with:
- the AdaDelta algorithm, with L2 decay = L2_DECAY and batch size = BATCH_SIZE ‰ (permil) of the training size,
- the latest data set from past exchange rates, where the training size is SIZE_K x neural network size, and
- no validation, to save time.
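One AdaDelta parameter update, with the L2 decay folded into the gradient, can be sketched as follows (ρ = 0.95 and ε = 1e-6 are common defaults, not values from the original):

```python
import math

def adadelta_step(x, grad, state, l2_decay=0.0, rho=0.95, eps=1e-6):
    """One AdaDelta update for a single parameter x.
    `state` holds the running averages E[g^2] and E[dx^2]."""
    g = grad + l2_decay * x  # L2 decay adds lambda*x to the gradient
    state["eg2"] = rho * state["eg2"] + (1 - rho) * g * g
    dx = -math.sqrt(state["edx2"] + eps) / math.sqrt(state["eg2"] + eps) * g
    state["edx2"] = rho * state["edx2"] + (1 - rho) * dx * dx
    return x + dx

# Usage: minimize f(x) = x^2 (gradient 2x) starting from x = 1.
state = {"eg2": 0.0, "edx2": 0.0}
x = 1.0
for _ in range(2000):
    x = adadelta_step(x, 2 * x, state)
```

Note how AdaDelta needs no global learning rate: the step scale adapts from the ratio of the two running averages.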
Neural network size can be calculated as:

net size = (input nodes + 1)*(hidden nodes) + (hidden nodes + 1)*(output nodes)

For example, with the above nodes:

net size = (WNDOW.length + 1)*HIDDENS + (HIDDENS + 1)*1
training size = SIZE_K * net size
batch size = BATCH_SIZE * training size / 1000
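These formulas can be checked numerically; the sample values below (a 5-point window, 10 hidden nodes, SIZE_K = 5, BATCH_SIZE = 100 ‰) are illustrative, not the optimized parameters:

```python
def net_size(input_nodes, hidden_nodes, output_nodes=1):
    # Each layer's weight count includes one bias weight per node (+1).
    return (input_nodes + 1) * hidden_nodes + (hidden_nodes + 1) * output_nodes

WNDOW = (1, 2, 5, 8, 10)  # example input positions (window of length 5)
HIDDENS = 10
SIZE_K = 5
BATCH_SIZE = 100          # permil of training size

size = net_size(len(WNDOW), HIDDENS)             # (5+1)*10 + (10+1)*1 = 71
training_size = SIZE_K * size                    # 5 * 71 = 355
batch_size = BATCH_SIZE * training_size // 1000  # 100 * 355 / 1000 -> 35
```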
Genetic Algorithm
Further, we use genetic algorithm optimization to find the best parameters for the above neural network. The parameters to optimize are:
- the number of nodes in the input layer (also the window size and positions) [3 ~ 13],
- the number of nodes in the hidden layer [3 ~ 23],
- the value of L defining the L2 decay as 10^{-L} [0 ~ 5],
- the batch size in ‰ (permil) of the training size [1 ~ 999], and
- the multiplier of training size to network size [3 ~ 13].
The chromosome type is a binary chromosome, so all parameters, being integers, are converted to a binary representation and back during evolution.
- The crossover type is one-point crossover with crossover rate 0.9.
- The mutation policy is binary mutation with mutation rate 0.15.
- The selection policy is tournament selection with tournament size 3.
- The population is an elitistic population with population size 100 and elitism rate 0.1.
- For the stopping condition, a fixed generation count is used.
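The binary chromosome round-trip and the two evolution operators can be sketched as follows (the fixed bit width per parameter is an assumption; the original does not specify the encoding layout):

```python
import random

def encode(value, bits):
    """Integer -> list of bits (most significant first)."""
    return [(value >> i) & 1 for i in reversed(range(bits))]

def decode(chrom, bits):
    """List of bits -> integer."""
    v = 0
    for b in chrom[:bits]:
        v = (v << 1) | b
    return v

def one_point_crossover(a, b, rate=0.9, rng=random):
    """With probability `rate`, swap the tails after a random cut point."""
    if rng.random() < rate and len(a) > 1:
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]

def binary_mutation(chrom, rate=0.15, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in chrom]
```

For example, a hidden-node count in [3 ~ 23] fits in 5 bits, since 23 < 32.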
For the fitness function, we use the Standard Error of the Estimate (SEE) of the forecast:

SEE = √(Σ(rate − prediction)² / N)

The better the prediction, the lower the SEE value, so we invert it to get the fitness value.
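The fitness computation can be sketched as follows (taking 1/SEE as the inversion is an assumption; the original only says the SEE is inverted):

```python
import math

def see(rates, predictions):
    """Standard Error of the Estimate: sqrt of the mean squared forecast error."""
    n = len(rates)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(rates, predictions)) / n)

def fitness(rates, predictions):
    """Lower SEE is better, so the fitness is its inverse (1/SEE assumed)."""
    return 1.0 / see(rates, predictions)
```

With this form, the ensemble's fitness > 1 membership rule corresponds to SEE < 1.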
Best Parameters
The genetic algorithm is used to find the best parameters of the neural networks, specifically for the 5 top currencies and bases below:
 a. SIZE_K
 b. WNDOW
 c. HIDDENS
 d. L2_DECAY
 e. BATCH_SIZE
 f. SEE
 g. GENERATION
 h. net size
 i. training size
 j. batch size
[Table: best parameters a–j for each currency (rows: USD, EUR, GBP, JPY, AUD) against each base (columns: USD, EUR, GBP, JPY, AUD); the cell values were not preserved in this text.]