The homework for this week:
Using the Multilayer Perceptron implementation I’ve provided (or your own), identify a learning problem and dataset that is of interest to you, determine how to integrate this data with the MLP implementation, and write any code necessary to train and test the MLP on your data.
After browsing the UCI dataset repository, I decided to use the Wine Quality dataset. Each row in the dataset holds 11 measurements of a wine, together with a 12th value, the “quality” of that wine. For this task I would therefore be solving a regression problem: given a set of measurements, what is the numerical quality of the wine?
Getting the CSV data into Python was relatively easy thanks to the csv module documentation. However, I had to try different approaches before loading it successfully. At first I loaded the inputs and outputs separately, but then I realized that shuffling them independently would break the pairing between each row of measurements and its quality value. I changed to loading everything into one array, shuffling the rows, and only then separating inputs from outputs.
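The load, shuffle, then split order can be sketched on a tiny scale. This is a minimal sketch, not the homework code: the CSV text is made up (with quality equal to the sum of the two measurements so the pairing is easy to check), and `load_rows` and `shuffle_and_split` are hypothetical helper names.

```python
import csv
import io

import numpy as np

# Synthetic stand-in for a wine CSV: two measurements plus a "quality" column.
# The values are made up so that quality == a + b, which makes pairing easy to check.
CSV_TEXT = "a;b;quality\n1;2;3\n4;5;9\n0;7;7\n2;2;4\n6;1;7\n"


def load_rows(text):
    """Parse semicolon-separated text into a float matrix, skipping the header."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader)  # skip the header row
    return np.array([[float(v) for v in row] for row in reader])


def shuffle_and_split(data, train_fraction, seed=0):
    """Shuffle whole rows, then split into (inputs, labels) for train and test."""
    rng = np.random.default_rng(seed)
    shuffled = data.copy()
    rng.shuffle(shuffled)  # permutes rows in place; each row stays intact
    inputs, labels = shuffled[:, :-1], shuffled[:, -1]
    n_train = int(len(shuffled) * train_fraction)
    return (inputs[:n_train], labels[:n_train]), (inputs[n_train:], labels[n_train:])


data = load_rows(CSV_TEXT)
(train_x, train_y), (test_x, test_y) = shuffle_and_split(data, 0.8)
print(train_x.shape, train_y.shape, test_x.shape, test_y.shape)
```

Because the label column travels with its row during the shuffle, every input still lines up with its own quality value after the split.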
I struggled for a while with the array dimensions. In the end I had to do a weird reshape followed by a transpose so that the outputs, which should have two dimensions with one of them of size 1, actually reported that shape. I suspect there is another way of achieving it, but I couldn’t find the right wording for my problem or a good answer.
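For reference, NumPy has a few one-step ways to turn a flat array of labels into a single-row 2-D array. The sketch below uses made-up labels and compares the reshape plus transpose route with three alternatives; which one reads best is a matter of taste.

```python
import numpy as np

labels = np.array([5.0, 6.0, 7.0, 5.0])  # shape (4,): one label per example

# The reshape-then-transpose route described above.
a = np.reshape(labels, (labels.shape[0], 1)).T

# One-step alternatives that give the same result.
b = labels.reshape(1, -1)   # -1 lets NumPy infer the number of examples
c = labels[np.newaxis, :]   # insert a new leading axis
d = np.atleast_2d(labels)   # promotes a 1-D array to shape (1, N)

for arr in (a, b, c, d):
    print(arr.shape)
```

All four arrays have one row and one column per example, which is the (outputs, examples) layout the MLP code expects.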
I adapted the code from the two examples given: the one in the GitHub repository had strange dependencies, and the other lacked the function that evaluates the error, which is needed in order to use a testing set.
Once everything appeared to be working, I ran some tests with different learning rates to see how the training error and the testing error would change. The full results are here, but below is a summary table.
| Experiment | Learning rate | Final training error | Final testing error |
|---|---|---|---|
| #1, 1 hidden layer, 100k epochs | 0.01 | 4.817587 | 24.049231 |
| #2, 1 hidden layer, 20k epochs | 0.01 | 4.817587 | 24.049233 |
| #3, 1 hidden layer, 20k epochs | 0.10 | 4.820666 | 23.893846 |
| #4, 1 hidden layer, 20k epochs | 0.90 | 4.812007 | 24.211538 |
| #5, 1 hidden layer, 20k epochs | 0.05 | 4.823552 | 23.825385 |
| #6, 2 hidden layers, 20k epochs | 0.05 | 4.819319 | 23.893077 |
The lowest training error came from the learning rate of 0.9, which also had the worst testing error. The lowest testing error, on the other hand, came from the learning rate of 0.05. I tried the same learning rate with a second hidden layer, but the improvement showed up in the training error and not in the testing error. In any case, I still don’t completely understand the units of the error and what they imply in a real-world case. In the end I wanted to include an explicit test of one example, comparing the network’s output with the expected output, but I ran into more problems with the dimensions and had to let it go.
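One thing that can be read off the code is that the two columns are not measured the same way: `trainEpoch` returns a mean absolute error, while `getErrorRate` returns a mean squared error. The sketch below applies both formulas to made-up quality scores (not taken from the dataset). It also assumes, as an unverified guess about the runs above, that the tanh output layer saturates near its upper bound of 1, which would give errors of roughly the size seen in the table.

```python
import numpy as np


def mean_absolute_error(target, output):
    # Same formula as the value returned by trainEpoch in the listing.
    return np.sum(np.absolute(target - output)) / target.shape[1] / target.shape[0]


def mean_squared_error(labels, guesses):
    # Same formula as getErrorRate in the listing.
    return np.mean(np.square(labels - guesses))


# Made-up quality scores, shape (1, N); an assumption for illustration only.
target = np.array([[5.0, 6.0, 6.0, 7.0, 5.0]])

# tanh never exceeds 1, so a saturated tanh output layer predicts about 1.0.
saturated = np.tanh(np.full_like(target, 50.0))

mae = mean_absolute_error(target, saturated)
mse = mean_squared_error(target, saturated)
print(mae, mse)
```

Under these assumptions the absolute error is in quality points and the squared error is in squared quality points, so the two columns cannot be compared directly. Rescaling the targets into the range of the activation function, or using a linear output layer, would be two ways to test whether saturation is really what happens.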
The code that I wrote/remixed can be found here.
import numpy as np
import csv

#
# Activation function definitions:
#
def sigmoid_fn(x):
    return 1.0 / ( 1.0 + np.exp( -x ) )

def sigmoid_dfn(x):
    y = sigmoid_fn( x )
    return y * ( 1.0 - y )

def tanh_fn(x):
    # np.tanh avoids the overflow that sinh( x ) / cosh( x ) hits for large |x|
    return np.tanh( x )

def tanh_dfn(x):
    return 1.0 - np.power( tanh_fn( x ), 2.0 )

#
# Remix between https://github.com/Hebali/learning_machines/blob/master/hyperparameter_hunt/Supervised.py and http://www.patrickhebron.com/learning-machines/code/mlp.txt
# MLP Layer Class:
#
class MlpLayer:
    def __init__(self,input_size,output_size):
        # Weights start uniformly in [-1, 1); biases start at zero
        self.weights = np.random.rand( output_size, input_size ) * 2.0 - 1.0
        self.bias = np.zeros( ( output_size, 1 ) )

#
# MLP Class:
#
class Mlp:
    def __init__(self,layer_sizes,activation_fn_name):
        # Create layers:
        self.layers = []
        for i in range( len( layer_sizes ) - 1 ):
            self.layers.append( MlpLayer( layer_sizes[ i ], layer_sizes[ i + 1 ] ) )
        # Set activation function:
        if activation_fn_name == "tanh":
            self.activation_fn = tanh_fn
            self.activation_dfn = tanh_dfn
        else:
            self.activation_fn = sigmoid_fn
            self.activation_dfn = sigmoid_dfn

    def predictSignal(self,input):
        # Setup signals (input has shape: features x examples):
        activations = [ input ]
        outputs = [ input ]
        # Feed forward through layers:
        for i in range( 1, len( self.layers ) + 1 ):
            # Compute activations:
            curr_act = np.dot( self.layers[ i - 1 ].weights, outputs[ i - 1 ] ) + self.layers[ i - 1 ].bias
            # Append current signals:
            activations.append( curr_act )
            outputs.append( self.activation_fn( curr_act ) )
        # Return signals:
        return activations, outputs

    def predict(self,input):
        # Feed forward:
        activations, outputs = self.predictSignal( input )
        # Return final layer output:
        return outputs[ -1 ]

    def getErrorRate(self, labels, guesses):
        # Mean squared error
        return np.mean( np.square( labels - guesses ) )

    def trainEpoch(self,input,target,learn_rate):
        num_outdims = target.shape[ 0 ]
        num_examples = target.shape[ 1 ]
        # Feed forward:
        activations, outputs = self.predictSignal( input )
        # Setup deltas:
        deltas = []
        count = len( self.layers )
        # Back propagate from final outputs:
        deltas.append( self.activation_dfn( activations[ count ] ) * ( outputs[ count ] - target ) )
        # Back propagate remaining layers:
        for i in range( count - 1, 0, -1 ):
            deltas.append( self.activation_dfn( activations[ i ] ) * np.dot( self.layers[ i ].weights.T, deltas[ -1 ] ) )
        # Compute batch multiplier:
        batch_mult = learn_rate * ( 1.0 / float( num_examples ) )
        # Apply deltas (deltas are stored from the last layer to the first):
        for i in range( count ):
            self.layers[ i ].weights -= batch_mult * np.dot( deltas[ count - i - 1 ], outputs[ i ].T )
            self.layers[ i ].bias -= batch_mult * np.expand_dims( np.sum( deltas[ count - i - 1 ], axis=1 ), axis=1 )
        # Return error rate (mean absolute error, measured before the update):
        return ( np.sum( np.absolute( target - outputs[ -1 ] ) ) / num_examples / num_outdims )

    def train(self,input,target,validation_samples,validation_labels,learn_rate,epochs,batch_size = 10,report_freq = 10):
        num_examples = target.shape[ 1 ]
        print ("Epoch,\tTraining Error,\tValidation Error")
        # Iterate over each training epoch:
        for epoch in range( epochs ):
            error = 0.0
            # Iterate over each training batch:
            for start in range( 0, num_examples, batch_size ):
                # Compute batch stop index:
                stop = min( start + batch_size, num_examples )
                # Perform training epoch on batch:
                batch_error = self.trainEpoch( input[ :, start:stop ], target[ :, start:stop ], learn_rate )
                # Add scaled batch error to total error:
                error += batch_error * ( float( stop - start ) / float( num_examples ) )
            # Report error, if applicable:
            if epoch % report_freq == 0:
                validation_guesses = self.predict( validation_samples )
                validation_error = self.getErrorRate( validation_labels, validation_guesses )
                # Print report:
                print ("%d,\t%f,\t%f" % ( epoch, error, validation_error ))

#
# Testing with Wine Quality (?)
# https://archive.ics.uci.edu/ml/datasets/Wine+Quality
#
# Hyperparameters
number_of_samples = 6497
input_size = 11
hidden_size = 20
hidden_size_2 = 20
output_size = 1
percentage_training_data = 0.8
batch_size = 100
epoch_cnt = 20000
report_freq = 100
learn_rate = 0.05

training_size = int(number_of_samples*percentage_training_data)
testing_size = number_of_samples - training_size

# Create MLP
mlp = Mlp([input_size, hidden_size, output_size],"tanh")

# Dataset
# Create empty matrix for inputs and outputs together
data = np.zeros( (number_of_samples, input_size+output_size) )
counter = 0
# CSV reference: https://docs.python.org/3/library/csv.html
with open("winequality-red.csv",newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=';')
    for index,row in enumerate(reader):
        if index != 0: # If it is not the first (header) row
            data[counter] = [float(value) for value in row]
            counter += 1

with open("winequality-white.csv",newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=';')
    for index,row in enumerate(reader):
        if index != 0: # If it is not the first (header) row
            data[counter] = [float(value) for value in row]
            counter += 1

# After the data has been loaded, shuffle it (whole rows, so inputs stay paired with outputs)
np.random.shuffle(data)

# Separate the data in inputs and outputs
inputs = data[:, 0:-1]
outputs = data[:, -1]

# Separate further into training and testing
training_inputs = inputs[0:training_size]
training_outputs = outputs[0:training_size]

# Adjusting dimensions (?): the MLP expects (features, examples) and (outputs, examples)
training_inputs = training_inputs.T
training_outputs = np.reshape(training_outputs, (training_outputs.shape[0],1)).T

testing_inputs = inputs[training_size:]
testing_outputs = outputs[training_size:]

testing_inputs = testing_inputs.T
testing_outputs = np.reshape(testing_outputs, (testing_outputs.shape[0],1)).T

# Train
#mlp.train( training_inputs, training_outputs, learn_rate, epoch_cnt, batch_size, report_freq )
mlp.train( training_inputs, training_outputs, testing_inputs, testing_outputs, learn_rate, epoch_cnt, batch_size, report_freq )

# Predict (not working)
#print("Given output")
#print(mlp.predict(testing_inputs))
#print("Expected output")
#print(testing_outputs)
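The single-example check that was left commented out at the end of the listing may have failed because of how one column was selected. The original error message isn't recorded, so this is only a guess. The sketch below uses a hypothetical one-layer stand-in (`forward`, with made-up weights and data) that follows the same shape arithmetic as `predictSignal`, and compares integer indexing with slicing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one layer of the network: 3 units, 11 inputs.
weights = rng.uniform(-1.0, 1.0, size=(3, 11))
bias = np.zeros((3, 1))


def forward(x):
    """Same shape arithmetic as predictSignal: weights . x + bias."""
    return np.tanh(np.dot(weights, x) + bias)


# Made-up data laid out as (features, examples), like testing_inputs.
testing_inputs = rng.uniform(0.0, 1.0, size=(11, 5))

flat = testing_inputs[:, 0]      # shape (11,): the example axis is dropped
column = testing_inputs[:, 0:1]  # shape (11, 1): still a single column

print(forward(flat).shape)    # adding the (3, 1) bias broadcasts this to (3, 3)
print(forward(column).shape)  # (3, 1): one output column for one example
```

If this is the cause, selecting the example with a slice such as `testing_inputs[:, 0:1]` keeps the 2-D layout, so the prediction for that example can be compared with `testing_outputs[:, 0:1]`.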