
Philipia Simmons


Pulse Coupled Neural Networks and the Image-to-Sound Converter

It is well known that the human brain is a very sophisticated processing unit, capable of receiving, transmitting, processing, and storing vast amounts of information. These abilities are due to the billions of individual processing units in the brain. These units are called neurons (nerve cells connected for the purpose of processing information), and they are responsible for virtually all of the pattern recognition of which humans are capable. For some years now there has been interest in simulating neural networks in order to create "intelligent" machines, such as computers, capable of processing information in much the same manner as the human brain. This interest has developed into a variety of studies, including digital image processing, parallel distributed processing, and the one of interest here at AAMU, pattern recognition.

The concept behind pattern recognition is simple. How do we recognize someone's handwriting, or a face in a partially damaged picture? We draw on those images with which we are already familiar. The same is true of machine pattern recognition. The task is simple enough for humans because we possess somewhere on the order of 1E9 to 1E12 neurons, which leaves a lot of room for storing information. Artificial neural networks have considerably fewer (an average artificial network of X neurons has only X^2, the square of X, connections). If such a network drew only on images it had already recognized, there would be room for only a few to be "programmed" into the system. That is not good enough in this vast world in which we live, where the same chair looks different if it is turned upside down, not to mention all the other kinds of chairs there are. Even small variations would warrant a whole new pattern.

However, there now exist pulse-coupled neural networks (PCNNs). PCNNs can be thought of as a combination of two kinds of pattern recognition: statistical pattern recognition (in which a set of features is extracted from the pattern and grouped into feature classes, and recognition is based on partitioning the feature space so that new views are classified properly) and syntactic pattern recognition (which deals with the relationships between the features as well as the features themselves, so that the face in that partially damaged picture would still be recognizable). The problem with statistical recognition is its somewhat limited processing ability, and the problem with syntactic recognition lies in the difficulty of obtaining information in real time (it is very slow). When the two are combined, however, the end result is a system that comes extremely close to the level of pattern recognition that humans possess. The development of PCNNs is the center of ongoing research here at AAMU.
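
To make the idea of a pulse-coupled network a little more concrete, here is a minimal sketch in Python. It is not the network used at AAMU. It assumes a commonly used simplified model in which each pixel drives one neuron, each neuron is nudged by pulses from its 8 neighbours, and a neuron's threshold jumps after it fires and then decays. The function name pcnn_signature, the parameters beta, decay, and v_theta, and all of the numeric values are assumptions chosen for illustration.

```python
import math


def pcnn_signature(image, steps=12, beta=0.2, decay=0.3, v_theta=20.0):
    """Count how many neurons fire at each step of a simplified PCNN.

    image is a list of rows of pixel intensities in [0, 1].
    All parameter values are illustrative assumptions.
    """
    rows, cols = len(image), len(image[0])
    fired = [[0] * cols for _ in range(rows)]      # pulses from the previous step
    theta = [[0.5] * cols for _ in range(rows)]    # assumed initial threshold
    signature = []
    for _ in range(steps):
        new_fired = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Linking input: number of the 8 neighbours that pulsed last step.
                link = sum(
                    fired[rr][cc]
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))
                    if (rr, cc) != (r, c)
                )
                # Internal activity: the pixel value, boosted by neighbouring pulses.
                activity = image[r][c] * (1.0 + beta * link)
                if activity > theta[r][c]:
                    new_fired[r][c] = 1
        for r in range(rows):
            for c in range(cols):
                # Threshold decays every step and jumps when the neuron fires.
                theta[r][c] = theta[r][c] * math.exp(-decay) + v_theta * new_fired[r][c]
        fired = new_fired
        signature.append(sum(sum(row) for row in new_fired))
    return signature


tee = [[1.0, 1.0, 1.0],
       [0.0, 1.0, 0.0],
       [0.0, 1.0, 0.0]]
upright = pcnn_signature(tee)
flipped = pcnn_signature(tee[::-1])   # the same T turned upside down
```

Because this sketch records only how many neurons fire at each step, not where they are, the upright and upside-down T give the same list of counts. That holds for this toy model; how well it carries over to real images is a separate question.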

My personal research involves the image-to-sound conversion of various objects recognized by the PCNNs. The basis for this extension of PCNN research is the belief that patterns can be recognized not only by visual means, but by auditory means as well. To create the image-to-sound conversion, we simply gather the data produced by the PCNN (usually in the form of icons that can be represented by only a few bits) and encode it for the sound generator. However, since there are several arbitrary steps involved, we cannot say that a sound directly resembles a particular scene. At best, we believe that similar scenes produce similar sounds. So far, the research (still in its initial stages) has yielded the following results, based on sounds created from handwritten A's and M's from two different people, a computer-generated T, and a computer-generated +: the A's sound very similar to each other and are easily distinguished from the M's, the T, and the +, yet they can be told apart once the subject has listened to them more than once (likewise for the M's); the handwritten images have a more complex sound than the computer-generated images; and there are some similarities among the letters written by the same person.
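
The encoding step described above can be sketched as follows. This is only one arbitrary choice among many, not the encoding used in the research: it assumes the icon is a short list of bits, gives each bit position its own pitch one semitone above the previous one, and plays a short tone for every set bit and silence for every clear bit. The names icon_to_tones, synthesize, and write_wav, the 220 Hz base pitch, and the 8000 Hz sample rate are all assumptions made for illustration.

```python
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second (assumed value)


def icon_to_tones(bits, base_hz=220.0):
    """Map each bit of an icon to a frequency in Hz; a clear bit becomes 0.0 (silence)."""
    return [base_hz * 2 ** (k / 12) if bit else 0.0 for k, bit in enumerate(bits)]


def synthesize(tones, note_seconds=0.1, amplitude=0.5):
    """Render the tones one after another as sine-wave samples in [-amplitude, amplitude]."""
    per_note = int(SAMPLE_RATE * note_seconds)
    samples = []
    for hz in tones:
        for n in range(per_note):
            samples.append(amplitude * math.sin(2 * math.pi * hz * n / SAMPLE_RATE))
    return samples


def write_wav(path, samples):
    """Write the samples to a mono 16-bit WAV file that a sound player can open."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(SAMPLE_RATE)
        out.writeframes(frames)


# Two hypothetical 8-bit icons that differ in a single bit.
icon_a = [1, 0, 1, 1, 0, 0, 1, 0]
icon_b = [1, 0, 1, 1, 0, 1, 1, 0]
tones_a = icon_to_tones(icon_a)
tones_b = icon_to_tones(icon_b)
different_notes = sum(1 for a, b in zip(tones_a, tones_b) if a != b)
```

With this mapping, two icons that differ in one bit produce tone sequences that differ in one note, which is the sense in which similar scenes would produce similar sounds. Calling write_wav("icon_a.wav", synthesize(tones_a)) would save the first sequence for listening.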

Why do all this? Well, the "perfect" image-to-sound converter will allow pilots to transfer some of the burden of all those gauges, dials, and meters from their eyes to their ears; it will allow the visually impaired to "see" without a cane or a trained dog. These are only two of the applications for the image-to-sound converter. As technology advances, the world will probably see the need for more and more such devices.

 

 

 

COPYRIGHT 2004 - 2005 ALABAMA A & M UNIVERSITY HJF-CIM