Colin Kern

From Earlham CS Department
Jump to navigation Jump to search

Abstract

The problem of pattern recognition, especially in complicated grammars such as natural languages, is difficult to solve using algorithmic methods. In a human child, however, this is one of the first problems that is solved. Therefore, emulating the human brain with a neural net is an effective method of solving the grammar recognition problem. This paper describes using a neural net with a perceptron architecture, which is well suited to pattern recognition, along with the back-propogation learning algorithm to recognize a number of simple grammars. A>s more layers are added to the perceptron, the complexity of the grammars able to be recognized increased. The Octave language and the Netlab plugin were used to simulate and analyze these neural nets.


Journal

November 26, 2005

I discovered an error in my script which explained why the error percentage was remaining around 50%. It was always doing only 1 iteration. I fixed it and tested some more. The error rate on the training data converges to 0%, but the network still performs around 50% error for the test data. The network seems to be learning the training data, but not adapting to recognize the pattern outside of the training data.

November 24, 2005

I decided to make a training file with equal amounts of positive and negative tests. The error percentage was around 50%, regardless of number of iterations.

November 14-19, 2005

I had a few tests this week, and I also needed to get some stuff ready for graduate school applications. I did some reading about training regiments, but I didn't get any useful information.

November 13, 2005

I changed the training file so that it only included strings that were in the grammar. Now the network anwers close to 1 for every string. It seems like the network is only learning which answer is the most probable and not actually look at the strings.

November 12, 2005

I still haven't gotten word of the extra functions being installed on Cairo. In response to some suggestions in class, I changed the training and test files so that it alternates putting each string in either the test group or the training group (I still want to test with strings not in the training group). The network still maxes out at approx. 19% errors. I looked at what the network was responding and discovered that almost all of the network's output was in the range of 0.00 - 0.09, and all of it was less thatn 0.3. This means that the network was rejecting every string, and 19% of them happened to be in the grammar.

November 10, 2005

I found a problem with the Octave installation on Cairo. There are some additional functions that aren't installed. I put a tarball for the admins to use so that I can start running things on Cairo. While I'm not parallelizing the functions, I can still run multiple tests simultaneously, which will speed things up.

November 9, 2005

The error rate for |a| = |b| seems to stop going down at about 20%. After that, more iterations don't seem to improve the performance. I will modify my script to display which strings were incorrectly identified and try to see what is going on.

I noticed that while training, Octave only shows 50% CPU usage. Running the training script with time shows this ouput: real 123m25.420s user 23m1.108s sys 90m8.374s

I find it odd that so much of the time was in system mode.

Parallelizing seems like it may be too much work to be worth it (unless my senior project will just be parallelizing the algorithm). Maybe I will use Cairo to just run multiple tests at the same time.

October 25, 2005

I used the script to generate input data for the |a| = |b| language with maximum string length of 8. I started to experiment with training the net only using the string of length 4 or less, and then seeing how well it recognized the longer strings. I did 100 training iterations, recorded the error, then did 1000 iterations. The error with 1000 iterations was higher. I then ran the 100 iterations test multiple times and got different errors. I think that the weights are being initialized to random values, so I need to figure out how to save an untrained net and use it for each experiment.

The error I got with 100 iterations was 46%, the error with 100 is 60%. It seems like after even 100 iterations the initial weights shouldn't make that big of a difference if the training is working.

October 24, 2005

I wrote a Perl script that I give a maximum length and the number of symbols in an alphabet. The script will output all possible strings in the input style I am using for my neural nets.

October 22, 2005

Using Netlab, I created a 2-layer perceptron with 4 input nodes, 4 hidden nodes, and 1 output node. I trained it with all possible strings (max length 4) in the grammar where |a| = |b|. After 100 iterations, the net correctly recognized 87% of the strings. After 1000 iterations, the percentage increased to 97%.

Since the output node returned a floating point number, I rounded the values to determine the output on a "1 = accept, 0 = reject" scale. These decimals were generally between 0 and 1, however some were very small negative decimals, which rounded to 0. I didn't see any above 1.

October 17, 2005

I picked up Neural Networks for Pattern Recognition from Lilly.

October 15, 2005

Played around with Netlab. I was able to create a 2-layer perceptron and forward-propogate input values to get output values. I tried a few things with training functions, but they didn't behave as expected.

October 11, 2005

I filled out the form to get Neural Networks for Pattern Recognition through Interlibrary Loan.

October 10, 2005

I have reversed the order of this section so that now the most recent posts are at the top and the older ones are at the bottom. Studies show that this reduces inefficient scrolling by 83%.

The paper that the writers of Netlab used as their source turns out to be a 500 page book. It is Neural Networks for Pattern Recognition by Christopher M. Bishop. It is $53.37 on Amazon.com. I have to decide if it is worth buying or trying to get on Interlibrary loan.

Update: Google Print Beta seems to have scans of all the pages. This would be an inconvenient way to read it, though.

September 30, 2005

I have finished the first outline of my paper.

September 25, 2005

I searched Google for Octave neural net packages. I didn't see all that many, but I chose one called Netlab, which is actually a Matlab package. It is compatible with Octave, though. The next thing I have to do is actually learn Octave.

September 22, 2005

I have installed Octave in my Cygwin environment. Now I have to find the neural net plugin for it.

September 14, 2005

I just wrote some code for the basic data structures that would make up a neural net and some functions to set input and calculate the output. I need to add methods to create the network structure and the training algorithm. Those are going to be the hard parts.

September 13, 2005

I finished my abstract, but I'm having trouble getting excited about simply creating a neural net and teaching it to play Tic Tac Toe. I don't know how complicated that will turn out to be, so I don't know if I have the time to do anything more. I've had two ideas that might make the project more interesting.

First, I could try to make the neural net code I write able to support as many different kinds of neural nets as possible. I'd just write the basic data structures and algorithms, then write a program that takes a file specifying the shape of the neural net and creates the net from that. It can then take other input, such as the algorithm to use for training and a training script. A user would issue a command similar to "./neuralnet structure.txt algorithm.txt training.txt net.txt". This would create the neural net in structure.txt and train it using algorithm.txt and training.txt. The trained net would be output to net.txt. Another command, "./neuralnet net.txt", would load the trained net and stdin would be the input given to the net, whose output would be written to stdout.

The other idea is to experiment with a neural net's ability to learn grammars. Give it strings such as 'ab', 'aabb', 'aaabbb' that are in the grammar and 'a', 'abb', 'aba' that aren't, then see if it can correctly say whether other strings are also in the grammar. It would be interesting to see what grammars can be learned and how adding more layers to a perceptron would increase the complexity of the grammars that can be learned. A complication I see is how to input variable length strings. I can see either having a large set of input nodes, some of which aren't used, or giving the net the string one character at a time and the net saying "yes" or "no" for each character (and if it is still saying yes on the last character, it accepts the string).

I'd probably be more interested in doing the second of these ideas.

Feedback for Colin, 13 Sep 2005

Colin, I would be very interested to see how your NN works with grammars as you have described. Indeed, knowing next to nothing about your project, the second suggestion about grammars grabbed me much more than the first.

Again, I know not-a-lot about NN, but I can ostensibly see the training data for grammars being much easier to create on the fly or algorithmically.

A variation on the TTT idea might be to teach it 3d TTT. I've seen this both with sets of 3x3x3 and 4x4x4 boards. The way that Charlie described NNs last Wednesday, leads me to think that what you might do is let the NN take care of the individual steps, and just teach it what is a good or bad outcome. In this manner you could simply have it play a heck of a lot of games with another program that you write that simply goes through all the possible iterations of a game of TTT (2d or 3d), and returns to your NN if the NN has won or lost each game.

--hunteke 23:57, 13 Sep 2005 (EST)