Introduction
A neural network is an information-processing model that operates in a manner similar to the natural nervous system of the human brain (Stergiou and Siganos, 1997). The novel structure is the primary component of this model: numerous interconnected processing components (neurons) that work together to solve specific problems. Generally, an artificial neural network is designed for a particular use, such as pattern detection or data categorization, through a learning process. Neural networks are used as statistical tools in several disciplines, including psychology, statistics, medicine, and physics. Besides, both neural and cognitive scientists apply them as representations of mental processes. Essentially, neural networks are built from simple elements called neurons, which are interlinked by weighted connections. Learning proceeds by modifying these connection weights. The units are organized in layers (Abdi, Valentin and Edelman, 1999, p. 1).
Artificial neural network simulation appears to be a recent innovation. However, the field was established even before the advent of computers, and it has survived for a long time despite at least one major setback. The availability of cheap computing power has greatly favored neural network emulation. During an earlier period of interest, the discipline went through a phase of disappointment and disrepute; at that time, both financial and professional support were minimal, and important innovations were carried out by only a handful of researchers. These pioneering researchers developed advanced techniques that overcame the limitations identified by Minsky and Papert. Of late, the neural network discipline attracts not only enthusiasm but financial support as well (Stergiou and Siganos, 1997). This work thus describes the architecture of neural networks, their applications, and their shortcomings.
The architecture of Neural Networks
Feed-Forward Networks
Feed-forward networks allow signals to travel in one direction only, from input to output. Such a network has several layers: examinable information is processed first by the neurons of the input layer, then passed on through any intermediate layers until it reaches the output layer. Feedback does not take place; the output of each layer is passed forward to the next layer in succession (Stergiou and Siganos, 1997). Feed-forward networks are uncomplicated networks that associate inputs with outputs, and they are applied mainly in pattern recognition.
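As a minimal sketch (not drawn from the sources cited above), the strictly one-directional flow of a feed-forward network can be illustrated as follows; the weights and layer sizes are arbitrary assumptions chosen only for demonstration:

```python
import math

def forward(x, layers):
    """Propagate an input vector through successive layers.

    Each layer is a (weights, biases) pair; signals move strictly
    forward, and no layer ever feeds back into an earlier one.
    """
    for weights, biases in layers:
        x = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

# A 2-input, 2-hidden, 1-output network (weights chosen arbitrarily).
layers = [
    ([[0.5, -0.2], [0.3, 0.8]], [0.1, -0.1]),   # input -> hidden
    ([[1.0, -1.0]],             [0.0]),          # hidden -> output
]
output = forward([1.0, 0.5], layers)  # a single output value
```

Each call to `forward` visits every layer exactly once, which is precisely the property that distinguishes this architecture from the feedback networks described later.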
The feed-forward neural network is among the most popular neural network architectures. The primary challenge in using it is establishing the network architecture. A network that is too small may be unable to learn the problem effectively; on the other hand, a network that is too large may over-fit and generalize poorly. Algorithms that can find a suitable network architecture automatically are therefore greatly preferred. The three main families of algorithms for this problem are pruning, constructive, and regularization algorithms (Wang, 2006, p. 524). Pruning algorithms have lately attracted interest: they begin with an oversized network and then eliminate unneeded network parameters, either during training or after reaching a local minimum.
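The idea behind pruning can be sketched in a few lines; the magnitude threshold used here is a hypothetical simplification (real pruning algorithms use more principled criteria), intended only to show the "start big, then remove" pattern:

```python
def prune_small_weights(weights, threshold=0.05):
    """Zero out connections whose magnitude falls below a threshold.

    A crude stand-in for pruning: start from an oversized network and
    remove parameters that contribute little to the output.
    """
    return [[0.0 if abs(w) < threshold else w for w in row]
            for row in weights]

w = [[0.8, 0.01], [-0.03, 0.6]]
pruned = prune_small_weights(w)   # small weights removed, large ones kept
```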
Feedback Networks
These are designed in a way that allows them to transmit signals in both directions (Mandic and Chambers, 2001, p. 43); they are very powerful but can become very complicated. They are dynamic, with their state changing continually until they reach a point of equilibrium. After reaching the equilibrium point, no further changes occur until the input changes, at which point a new equilibrium must be found. Feedback within a recurrent neural network may be applied locally or globally: placing feedback within the hidden layer produces local feedback, whereas connecting a network output back to the network input produces global feedback (Mandic and Chambers, 2001, p. 43). Interconnections among neurons can also occur within the hidden layer.
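The settling behavior described above, where the state evolves until it stops changing, can be illustrated with a single recurrent unit whose output is fed back into its own input (global feedback). The weights and tolerance are illustrative assumptions; with a feedback weight below one, the iteration contracts to a fixed point:

```python
import math

def settle(x, w_in, w_fb, tol=1e-6, max_steps=1000):
    """Iterate one recurrent unit until its state reaches equilibrium.

    The unit's output is fed back into its own input; the state keeps
    changing until consecutive values differ by less than `tol`.
    """
    state = 0.0
    for _ in range(max_steps):
        new_state = math.tanh(w_in * x + w_fb * state)
        if abs(new_state - state) < tol:
            return new_state
        state = new_state
    return state

eq = settle(1.0, w_in=0.5, w_fb=0.3)   # equilibrium for this input
```

A change in the input `x` would drive the unit to a different equilibrium, mirroring the behavior the text describes.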
Network Layers
Three main layers are involved: the input layer, which reflects the raw data introduced; a hidden layer, which follows the input layer; and the output layer, whose behavior depends on the activity of the hidden elements and on the weights of the connections between the hidden and output elements. The hidden elements are not tied to a fixed role; they are free to construct their own representations of the input. They operate under the influence of the weights between the input and hidden elements, and by adjusting these weights, each hidden element can choose what it represents. There are single-layer and multi-layer architectures. In the former, all units are linked to each other; it constitutes the more general case and has greater potential computational power than the multi-layer organization. In multi-layer structures, there is less reliance on numbering individual units; instead, the layer is used as the unit of organization.
Applications of Neural Networks
Character Recognition
Character recognition is the process of identifying a character, which may take the form of a letter or an image. This recognition entails categorizing a specific input on the basis of the available characters, and it is one of the most quickly growing fields. Neural networks offer one of the best ways of learning from past raw data, after which the trained model is applied to fresh input. The fundamental method involves breaking the available sample into a grid in which every element is either a 0 or a 1, depending on whether something was imprinted on that spot or not. Neural networks are normally applied at the inner layer, while the results are provided through the outer layer (Yu, He and Zang, 2009, p. 822). After the class of characters has been ascertained, a neural network for the particular language becomes vital, since it forms the basis upon which the network is applied. For instance, English lower-case and upper-case characters can share the same network, which necessitates applying a number of distinct neural networks to various regions of the image: the top region, the central area, and the bottom part. The right and left segments have only two possibilities each (a vertical line is either present or not), and hence no neural network is created for these two segments (Yu, He and Zang, 2009, p. 828).
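The grid encoding described above can be sketched as follows; the character rendering and grid size are illustrative assumptions, and each cell becomes 1 if something was imprinted on that spot and 0 otherwise:

```python
def to_grid(char_rows):
    """Flatten a printed character sample into a 0/1 input vector.

    Each grid cell is 1 where ink is present and 0 where it is not,
    which is the form in which the sample is fed to the network.
    """
    return [1 if cell != ' ' else 0 for row in char_rows for cell in row]

# A crude 5x3 rendering of the letter "T".
letter_T = ["###",
            " # ",
            " # ",
            " # ",
            " # "]
vector = to_grid(letter_T)   # 15 binary inputs for a 5x3 grid
```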
Image Compression
Image compression utilizes a very important network, the Multilayer Perceptron (MLP). The MLP comprises three main layers: the input, hidden, and output layers. Generally, the number of neurons present in the hidden layer has a significant influence on image compression (Annadurai, 2007, p. 187). The input image is divided into blocks of pixels such that the number of neurons in the input layer equals the number of pixels in each block. In addition, to obtain the best possible output values, the compression must be designed so as to minimize the quadratic error between the two extreme layers (Annadurai, 2007, p. 187).
One component of the compression process is normalization, which is performed on the input and output pixel values. This is followed by two phases of compression: training and encoding. In the first phase, the network is trained on several image samples using the backpropagation learning rule. Under this rule, every input vector also serves as the required output; this amounts to "compressing the input into the narrow channel represented by the hidden layer and then reconstructing the input from the hidden to the output layer" (Annadurai, 2007, p. 187). The second phase entails the "binary coding of the state vector in the hidden layer" (Annadurai, 2007, p. 187).
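The structure of this scheme, normalization, a narrow hidden channel, and reconstruction, can be sketched as below. The weights here are random and untrained, so the reconstruction is not faithful; the sketch only shows the data path, with block size and layer widths as illustrative assumptions:

```python
import math
import random

def normalize(pixels, max_val=255):
    # Pixel values are normalized before entering the network.
    return [p / max_val for p in pixels]

def layer(x, weights):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in weights]

random.seed(0)
block = [120, 200, 64, 30]                       # one 2x2 pixel block
n_in, n_hidden = len(block), 2                   # narrow hidden channel
enc = [[random.uniform(-1, 1) for _ in range(n_in)]
       for _ in range(n_hidden)]
dec = [[random.uniform(-1, 1) for _ in range(n_hidden)]
       for _ in range(n_in)]

x = normalize(block)
code = layer(x, enc)      # compressed state vector (2 values, not 4)
recon = layer(code, dec)  # reconstruction from the narrow channel
```

Training with backpropagation, using each input vector as its own target, would adjust `enc` and `dec` so that `recon` approximates `x`; the hidden state `code` is what the encoding phase would then binary-code.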
Stock Market
Neural networks are designed in a manner that allows them to capture non-linear relationships in input data. This property makes them suitable for non-linear dynamic systems like the stock market. Several network configurations have been created to model the stock market. The purpose of these systems is both to test the validity of the Efficient Market Hypothesis (EMH) and to compare neural networks with statistical approaches like regression (Lawrence, 1997, p. 10). Normally, these networks use data from both technical and fundamental analysis. Most of them aim at deciding the appropriate time to purchase securities by relying on past information about the stock market, that is, the stock market trend over past years. The main hurdle lies in selecting the indicators and input data to be used, followed by collecting sufficient data for effective training; the input data could comprise raw data like volume, price, and daily variations, but it may also include technical parameters such as the moving average and fundamental parameters like the fiscal environment.
The first step in network training is finding suitable input data. The second is feeding the input data to the network in a manner that enables it to learn well without overtraining (Lawrence, 1997, p. 10). The multilayer feed-forward network, trained by backpropagation, is the architecture most often applied in financial neural networks. Besides backpropagation, other stock market prediction networks have been trained using genetic algorithms, and recurrent and modular networks have also been applied (Lawrence, 1997, p. 15). Generally, in applying neural networks to the stock market, networks are evaluated based on their environment and training data, organization, and performance (Lawrence, 1997, p. 10).
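One common way to "learn well and not overtrain" is early stopping on a validation set. This is a generic technique rather than the specific procedure of the cited source; the error curves below are invented numbers that mimic the typical pattern of training error falling while validation error turns back up:

```python
def early_stop(val_errors, patience=2):
    """Pick the epoch at which to stop training.

    Halt once validation error has not improved for `patience`
    consecutive epochs, and report the best epoch seen so far.
    """
    best_epoch, best_err, wait = 0, float('inf'), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err, wait = epoch, err, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best_epoch

# Validation error bottoms out at epoch 2, then rises: overtraining.
val = [0.9, 0.6, 0.4, 0.45, 0.5, 0.6]
stop = early_stop(val)   # -> 2
```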
Food Processing
In the recent past, there has been an increasing number of publications on the application of neural networks in food processing. Many studies have applied feed-forward multi-layer neural networks trained via backpropagation. There are four main ways in which neural networks may be applied in food processing. The first is product grading and classification, where both food and agricultural products are placed into various grades and categories using machine vision. Secondly, there is food quality evaluation, where neural networks are used to assess the quality of food in industry. Third, there is food process modeling, used in cases where conventional food processing models have failed; this includes areas such as the drying of cooked rice and the thermal processing of canned meals. Fourth, neural networks are used in the control of food processing operations (Irudayaraj, 2001, p. 310-312).
Medicine
In recent years, neural networks have been among the most researched tools in medicine. Presently, various studies train networks on medical scans of living beings with the aim of detecting related illnesses (Stergiou and Siganos, 1997). Neural networks learn to detect diseases from examples rather than from hand-crafted algorithms. The second application is in modeling the cardiovascular system of the human body. Here, diagnosis is done by comparing actual measurements taken from an ailing person to a model of the system and then establishing any variance between them (Stergiou and Siganos, 1997). Thirdly, artificial neural networks are applied in implementing electronic noses. Electronic noses find their use in telemedicine, the practice of medicine over long distances through a communication channel; the electronic noses then "identify odors in the remote surgical environment" (Stergiou and Siganos, 1997).
Target Recognition
Neural networks have been applied to many radar target recognition problems. For instance, aircraft, ships, and vehicles have been classified using multi-layered, feed-forward, radial basis, and fuzzy neural networks (Wang, 2006, p. 370). The backpropagation algorithm is applied with various modifications. In classifying aircraft, for example, SARPROP is used; it applies both simulated annealing and weight decay to the training of the neural network. This approach not only performs well at pattern classification but also helps the network escape from local minima.
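The two ingredients mentioned here, weight decay and simulated annealing, can be sketched in a single update step. This is not SARPROP itself (which builds on the RPROP algorithm); it is a hypothetical illustration of how decay pulls weights toward zero while annealed noise, which cools over time, lets the search escape local minima:

```python
import random

def noisy_decay_step(w, grad, lr=0.1, decay=0.01, temperature=1.0):
    """One gradient step combining weight decay with annealed noise.

    The decay term shrinks the weight toward zero; the noise term,
    scaled by a falling temperature, perturbs the search early on.
    """
    noise = random.gauss(0.0, temperature)
    return w - lr * (grad + decay * w) + lr * noise

random.seed(1)
w = 0.5
for step in range(10):
    temp = 1.0 / (1 + step)   # annealing schedule: noise cools over time
    w = noisy_decay_step(w, grad=0.2, temperature=temp)
```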
Machine Diagnostics
Many scholars have suggested using artificial neural networks to perform automated machine diagnostics, achieved by learning the features that characterize specific kinds of machine faults. However, to characterize each condition effectively and thus classify it, neural networks must be trained on a substantial quantity of data. In practice, the only condition likely to provide enough data is the normal condition. Therefore, artificial neural networks can be applied to the detection of this condition and of any departure from it (Randall, 2009, p. 1).
The most effective way of using artificial neural networks in machine diagnosis is to train them on simulated signals. Generally, simulations should cover diverse kinds of conditions, although some cases require simulations to be matched to specific machines. Beyond training, simulation methods may also be used to generate signals for testing and comparing several diagnostic methods. In addition, simulations of machine defects provide a proper understanding of the measured signals; for instance, nonlinearities may produce interactions that are hard to predict, and these can be exploited in anticipating anomalies (Randall, 2009, p. 1).
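Since the normal condition is usually the only one with plentiful data, detecting a departure from it amounts to novelty detection. The following sketch, with invented sample values and a conventional three-sigma threshold, illustrates that idea in its simplest statistical form:

```python
import math

def fit_normal_condition(samples):
    """Summarize the 'normal' machine condition from its measurements,
    the only state for which plenty of data is usually available.
    """
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, math.sqrt(var)

def is_fault(value, mean, std, k=3.0):
    # Flag a departure from the normal condition, not a specific fault.
    return abs(value - mean) > k * std

mean, std = fit_normal_condition([1.0, 1.1, 0.9, 1.05, 0.95])
```

A neural network plays the same role with richer signal features, but the logic is identical: model the normal condition, then flag anything that falls outside it.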
Signature Analysis
Signature recognition is a field that has attracted much attention in recent years, mainly due to its significance in the financial sector. However, there has been no fully reliable signature recognition method, especially with rising cases of forgery. The signature recognition task faces several obstacles. For one, the signature is only an image, devoid of meaning in itself, and in most cases the signature is in printed form, so it carries no information about the way in which it was written. Neural networks are therefore applied to the verification of signatures. They are suitable for signature segmentation and for both static and dynamic signature verification. Moreover, they have the advantage that they can be trained to detect diverse patterns from signature features (Mira, 2001, p. 192-193).
The proposed signature recognition system has a feature extraction phase and an identification phase. The feature extraction phase obtains, from the printed signature, the upper, lower, central, and two side envelopes of the signature, and also extracts other statistical components relevant to identification. These components are fed to the identification phase, which comprises multi-layered neural networks operating in parallel. The multi-layered networks are trained to identify the signatures being analyzed, and their outputs are then linked to a two-layer neural network that makes the final decision (Mira, 2001, p. 193).
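The envelope features can be pictured with a small sketch; the signature image below is a toy binary grid of my own invention, and the envelopes are simply the first and last inked row in each column:

```python
def envelopes(image):
    """Extract the upper and lower envelopes of a binary signature image:
    for each column, the first and last rows containing ink (-1 if none).
    """
    upper, lower = [], []
    for col in range(len(image[0])):
        inked = [r for r in range(len(image)) if image[r][col]]
        upper.append(inked[0] if inked else -1)
        lower.append(inked[-1] if inked else -1)
    return upper, lower

sig = [[0, 1, 0],
       [1, 1, 0],
       [0, 1, 1]]
up, lo = envelopes(sig)   # per-column first and last ink rows
```

Vectors like `up` and `lo` are the kind of shape descriptors that would be fed to the parallel networks of the identification phase.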
Monitoring
The use of neural networks in monitoring has been demonstrated in several cases, such as sensor network analysis and user context recognition. After training, a neural network can recognize conditions that were previously unknown to it. An issue of paramount importance is how to train a neural network effectively and appropriately. For machine monitoring, this depends greatly on the parameters chosen, which should reveal as much as possible about the state of the machine. Too many parameters, however, increase the intricacy of both the network design and the computational load (Ablameyko, 2003, p. 169). On the contrary, too few parameters will not give the neural network accurate enough information about the system to rely on.
Limits of Neural Networks
Despite their diverse applications, neural networks have several shortcomings. First, building a model requires a very large quantity of data, which may be expensive to obtain. Second, all information about the functioning of the network must come from the data present within the model; no extrapolations can be made beyond the data used in constructing it. Third, neural networks rely on historical data that may be uneven or noisy. Moreover, the choice of neural network structure and of input and output parameters is likely to influence the outcome (Irudayaraj, 2001, p. 310).
Conclusion
Artificial neural networks are models of how other information processors, such as the computer and the human brain, do their work. To carry out its functions, a network has to be trained using algorithms such as backpropagation, among many others. Its architecture comprises network layers arranged as feed-forward or feedback networks. Neural networks have numerous applications in medicine, the stock market, food processing, target and character recognition, signature analysis, image compression, machine diagnostics, and monitoring. Despite all these uses, neural networks have their own shortcomings.
References
Abdi, H., Valentin, D. and Edelman, B. (1999). Neural Networks, Issue 124. London: Sage Publications.
Ablameyko, S. (2003). Neural networks for instrumentation, measurement and related industrial applications. New York: IOS Press.
Annadurai. (2007). Fundamentals of Digital Image Processing. New Delhi: Dorling Kindersley.
Irudayaraj, J.M. (2001). Food Processing Operations Modeling Design and Analysis, Second Edition. New York: Marcel Dekker, Inc.
Lawrence, R. (1997). Using Neural Networks to Forecast Stock Market Prices. Department of Computer Science, University of Manitoba. Web.
Mira, J. (2001). Bio-inspired applications of connectionism: 6th International Work-Conference on Artificial and Natural Neural Networks, IWANN 2001, Granada, Spain, 2001: proceedings. New York: Springer Verlag.
Randall, R. B. (2009). The Application of Fault Simulation to Machine Diagnostics and Prognostics. Web.
Stergiou, C. and Siganos, D. (1997). Neural Networks. Web.
Wang, J. (2006). Advances in neural networks: ISNN 2006: Third International Symposium on Neural Networks, Chengdu, China, 2006: proceedings. New York: Springer Verlag.
Yu, W., He, H. and Zang, N. (2009). Advances in Neural Networks – ISNN 2009: 6th International Symposium on Neural Networks, ISNN 2009, Wuhan, China, 2009 Proceedings, Part 2. New York: Springer Verlag.