Analytical Technology

What is Analytical Technology?

Analytical technologies are techniques that, relying on known data, models, algorithms and mathematical theorems, make it possible to estimate the values of unknown parameters and characteristics.

The methods by which the human brain processes information are an interesting example of analytical techniques. Even a child's brain can solve problems inaccessible to modern computers, such as recognizing familiar faces in a crowd or effectively controlling several dozen muscles when playing football.

The brain is unique in that it can learn to solve new problems: playing chess, driving a car, and so on. However, it is poorly adapted to processing large amounts of numerical information: a person cannot even find the square root of 3010225 without a calculator or a pencil-and-paper calculation algorithm. In practice, numerical problems are often much more complex than root extraction, so people need additional methods and tools to solve them.
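As an illustration, the "calculation algorithm" for such a root fits in a few lines. Below is a minimal Python sketch using Newton's integer iteration (the function name is ours):

    def isqrt(n: int) -> int:
        """Integer square root by Newton's iteration."""
        x = n
        y = (x + 1) // 2
        while y < x:
            x = y
            y = (x + n // x) // 2
        return x

    print(isqrt(3010225))  # -> 1735, since 1735 * 1735 == 3010225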

For whom are analytical technologies designed?

Analytical technologies are needed first of all by people who make important decisions: managers, analysts, experts and consultants. A company's revenue is largely determined by the quality of these decisions: the accuracy of its forecasts and the optimality of its chosen strategies.

Forecast: estimating unknown future values of important quantities (for example, next quarter's sales or prices) from historical data.

Optimization: choosing the values of controllable parameters (for example, a production plan or a budget allocation) that give the best value of some objective, such as profit.

Real-world business and forecasting problems usually have no clear-cut solution algorithm. In the past, leaders and experts solved such problems relying entirely on personal experience. Analytical technologies make it possible to build systems that significantly improve the quality of such decisions.

Traditional (deterministic) technologies

Analytical technologies have been in use for many centuries. Over this time, a huge number of formulas, theorems and algorithms have been created for solving classical problems: computing volumes, solving systems of linear equations, finding the roots of polynomials. Sophisticated and effective methods have been developed for optimal control problems, differential equations, and so on. But for such an algorithm to be applicable, the given problem must be completely described by a certain deterministic model (some set of known functions and parameters). In that case, the algorithm gives an exact answer.
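Solving a system of linear equations is a typical deterministic problem: the model (the coefficient matrix and right-hand side) is fully known, and the algorithm returns an exact answer. A minimal Python sketch:

    import numpy as np

    # Fully specified deterministic model: A x = b with known A and b.
    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([5.0, 10.0])

    x = np.linalg.solve(A, b)  # Gaussian elimination, a classical exact algorithm
    print(x)  # -> [1. 3.], since 2*1 + 1*3 = 5 and 1*1 + 3*3 = 10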

Probabilistic technologies

In practice, we often face problems connected with the observation of random variables, for example, the problem of forecasting a stock price. For such problems no deterministic model can be built, so a fundamentally different, probabilistic approach is applied. The parameters of probabilistic models are the distributions of the random variables, their expected values, variances, and so on. Typically these parameters are initially unknown, and estimating them requires statistical methods applied to a sample of observed values (the historical data). But such methods, too, require some known probabilistic model of the problem. For example, in the stock price forecasting problem we can assume that the future price depends only on the prices over the last two days (an autoregressive model). If this is true, then observing the price for a few months allows a good estimate of the coefficients of this dependence, and hence a prediction of the future price.
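A minimal Python sketch of this idea, with a synthetic series standing in for the historical prices (the coefficients and noise level are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic "historical" series generated by a known AR(2) process:
    # p[t] = a1*p[t-1] + a2*p[t-2] + c + noise.
    a1_true, a2_true, c_true = 0.6, 0.3, 1.0
    p = [10.0, 11.0]
    for _ in range(500):
        p.append(a1_true * p[-1] + a2_true * p[-2] + c_true + rng.normal(0, 0.5))
    p = np.array(p)

    # Least-squares estimation: each row is (p[t-1], p[t-2], 1), target is p[t].
    X = np.column_stack([p[1:-1], p[:-2], np.ones(len(p) - 2)])
    y = p[2:]
    a1, a2, c = np.linalg.lstsq(X, y, rcond=None)[0]

    forecast = a1 * p[-1] + a2 * p[-2] + c  # one-step-ahead price forecast
    print(round(a1, 2), round(a2, 2), round(forecast, 2))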

Disadvantages of traditional technologies

Unfortunately, classical methods are ineffective in many practical problems. The reason is that either it is impossible to adequately describe reality with a model containing a small number of parameters, or building such a model takes too much time and computational resources.

Very often, none of the functions in a real problem is known exactly; only approximate values, such as expected profits, are available. To get rid of this uncertainty we have to fix the functions, thereby lowering the accuracy of the description of the problem. Moreover, the deterministic algorithm for finding the optimal solution (the simplex method) is applicable only when all of the given functions are linear, and in real business problems this condition is rarely satisfied. The functions can be approximated by linear ones, but then the resulting solution will not be truly optimal.
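For the linear case, where the simplex-type approach does apply, here is a minimal Python sketch; the profit coefficients and resource limits are invented for illustration:

    from scipy.optimize import linprog

    # Maximize profit 3*x1 + 5*x2, i.e. minimize its negative.
    c = [-3.0, -5.0]
    # Linear resource constraints: x1 <= 4, 2*x2 <= 12, 3*x1 + 2*x2 <= 18.
    A_ub = [[1.0, 0.0],
            [0.0, 2.0],
            [3.0, 2.0]]
    b_ub = [4.0, 12.0, 18.0]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)  # -> [2. 6.] 36.0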

New technologies

The shortcomings of traditional methods have stimulated the development, with varying success, of a new type of analytical system. At its base lie artificial intelligence technologies that mimic natural processes, such as the activity of brain neurons or natural selection.

The most popular and proven of these technologies are neural networks and genetic algorithms. The first commercial implementations based on them appeared in the 1980s and spread widely across developed countries.

Neural networks are, in a sense, imitations of the brain, so they are successfully used to solve a variety of "fuzzy" problems: recognition of images, speech and handwriting, identification, classification, and prediction of patterns. In such problems traditional technologies are powerless, and neural networks often serve as the only effective method of solution.

Genetic algorithms are a special technique for finding optimal solutions that has been successfully applied in various fields of science and business. These algorithms use the idea of natural selection among living organisms, which is why they are called genetic. Genetic algorithms are often used in conjunction with neural networks, allowing the creation of extremely flexible, fast and efficient data analysis tools.
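The scheme is easy to see in a toy problem. Below is a minimal Python sketch of a genetic algorithm maximizing the invented fitness function f(x) = -(x - 7)^2; the selection, crossover and mutation rules are deliberately simplistic:

    import random

    def fitness(x):
        return -(x - 7.0) ** 2  # the best possible individual is x == 7

    population = [random.uniform(-10, 10) for _ in range(20)]

    for generation in range(100):
        # Selection: the fitter half of the population survives.
        population.sort(key=fitness, reverse=True)
        parents = population[:10]
        # Crossover: each child blends two randomly chosen parents.
        children = [(random.choice(parents) + random.choice(parents)) / 2
                    for _ in range(10)]
        # Mutation: a small random perturbation of each child.
        children = [c + random.gauss(0, 0.1) for c in children]
        population = parents + children

    print(round(max(population, key=fitness), 2))  # -> approximately 7.0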

Neural networks

Biological neural network

The human nervous system and brain consist of neurons interconnected by nerve fibers, which transmit electrical impulses between the neurons. Everything from the transmission of stimuli from our skin, eyes and ears to thinking and the control of actions is realized in a living organism as the transmission of electrical impulses between neurons. Consider the structure of a biological neuron. Each neuron has outgrowths of nerve fiber of two types: dendrites, which receive impulses, and a single axon, along which the neuron transmits impulses. The axon contacts the dendrites of other neurons through special formations, synapses, which affect the strength of the impulse.

We can assume that as an impulse passes through a synapse its strength changes by a certain factor, called the weight of the synapse. Impulses received by a neuron simultaneously through several dendrites are summed. If the total impulse exceeds a certain threshold, the neuron fires, creating its own impulse and transmitting it further along the axon. Importantly, synapse weights can change over time, which changes the behavior of the corresponding neuron.
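This model of a neuron translates directly into code. A minimal Python sketch with invented weights and inputs:

    import numpy as np

    def neuron(inputs, weights, threshold):
        """Fire (return 1) if the weighted sum of impulses exceeds the threshold."""
        total = np.dot(inputs, weights)  # summation of impulses from the dendrites
        return 1 if total > threshold else 0

    inputs = np.array([1.0, 0.0, 1.0])    # impulses arriving on three dendrites
    weights = np.array([0.5, -0.4, 0.8])  # synapse weights
    print(neuron(inputs, weights, 1.0))   # -> 1, since 0.5 + 0.8 = 1.3 > 1.0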

According to some estimates, the human cerebral cortex contains approximately 100 billion neurons, each connected to 1,000-10,000 others, giving about 10^14 to 10^15 interconnections.

Artificial neural networks

Artificial neural networks (ANNs) are mathematical models, as well as their software or hardware implementations, built on the principles of the organization and functioning of biological neural networks, i.e. the networks of nerve cells of a living organism. The concept arose from the study of the processes occurring in the brain and attempts to simulate them; the McCulloch-Pitts neural network was the first such attempt [1]. Later, after learning algorithms were developed, the resulting models began to be used for practical purposes: in forecasting, pattern recognition, control problems, and so on.

An ANN is a system of connected, interacting simple processors (artificial neurons). These processors are usually quite simple, especially in comparison with the processors used in personal computers. Each processor in such a network deals only with the signals it periodically receives from, and sends to, other processors. Nevertheless, connected into a large network with controlled interaction, such locally simple processors can together perform quite complex tasks.

Training

Neural networks are not programmed in the usual sense of the word; they are trained. The ability to learn is one of the main advantages of neural networks over conventional algorithms. Technically, training consists in finding the coefficients of the connections between neurons. During training, a neural network is able to identify complex dependencies between inputs and outputs and to generalize. This means that, if learning succeeds, the network can produce correct results from data absent from the training set, as well as from incomplete and/or partly distorted data.

To train a neural network means to tell it what we expect from it. The process is very similar to teaching a child the alphabet. We show the child a picture of the letter "A" and ask: "What letter is this?" If the answer is wrong, we tell the child the correct answer we want from him: "This is the letter A." The child memorizes the example together with the correct answer, making certain changes in his memory in the right direction. We repeat the presentation of the letters over and over again until all 33 letters are firmly memorized. This process is called "supervised learning".

It turns out that after repeated presentation of examples the weights of the neural network stabilize, and the network starts giving correct answers to all (or nearly all) examples from the database. In this case we say that "the neural network has learned all the examples" or "the neural network is trained". Software implementations show that during training the error (the sum of squared errors over all outputs) gradually decreases. When the error reaches zero or an acceptably low level, training is stopped, and the resulting neural network is considered trained and ready to be applied to new data.
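As an illustration of this loop, here is a minimal Python sketch of supervised learning: a single neuron trained by the classical perceptron rule to reproduce the logical AND function (the learning rate and epoch count are arbitrary):

    import numpy as np

    # Training set: inputs and the answers we expect from the network.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 0, 0, 1], dtype=float)

    w = np.zeros(2)  # connection weights, found during training
    b = 0.0          # bias (threshold)
    lr = 0.1         # learning rate

    for epoch in range(20):
        for xi, target in zip(X, y):
            out = 1.0 if np.dot(w, xi) + b > 0 else 0.0
            err = target - out   # a wrong answer produces a nonzero correction
            w += lr * err * xi   # shift the weights "in the right direction"
            b += lr * err

    print([1.0 if np.dot(w, xi) + b > 0 else 0.0 for xi in X])  # -> [0.0, 0.0, 0.0, 1.0]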

It is important to note that all the information a neural network has about the problem is contained in the set of examples. Therefore, the quality of training depends on the number of examples in the training set and on how completely these examples describe the problem. For example, it is pointless to use a neural network to predict a financial crisis if the training set contains no examples of crises. It is believed that full training of a neural network requires at least a few dozen (or even hundreds of) examples.

Let us stress once more that training a neural network is a complex and knowledge-intensive process. Learning algorithms for neural networks have various settings and options, and managing them requires an understanding of what they do.

Once the neural network is trained, it can be applied to useful problems. The most important property of the human brain is that, once trained on a specific process, it can act correctly in situations it never encountered during learning. For example, we can read almost any handwriting, even when seeing it for the first time. Similarly, a properly trained neural network can be expected to react reasonably to new data it has never seen before. For example, we can draw the letter "A" in a different handwriting and then ask our neural network to classify the new image. A trained neural network stores a lot of information about the similarities and differences between letters, so we can expect a correct answer for the new version of the image.

Application areas

Pattern recognition and classification

Objects of very different natures can serve as patterns: text characters, images, sound samples, and so on. When training the network, we present it with various samples of patterns and indicate the class each belongs to. A sample is typically represented as a vector of feature values, and the full set of features must uniquely determine the class of the sample. If the features are insufficient, the network may assign the same sample to several classes, which is incorrect. Once training is complete, we can present the network with a previously unseen pattern and receive an answer about the class it belongs to.
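A minimal Python sketch of this scheme: feature vectors from three made-up classes train a single softmax layer, which then classifies a previously unseen sample:

    import numpy as np

    rng = np.random.default_rng(1)
    # Three classes, each a cloud of feature vectors around its own center.
    centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
    X = np.vstack([c + rng.normal(0, 0.5, (50, 2)) for c in centers])
    Y = np.eye(3)[np.repeat([0, 1, 2], 50)]  # one-hot class indicators

    W = np.zeros((2, 3))
    b = np.zeros(3)
    for step in range(500):
        scores = X @ W + b
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(scores)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = probs - Y                             # cross-entropy gradient
        W -= 0.1 * X.T @ grad / len(X)
        b -= 0.1 * grad.mean(axis=0)

    new_sample = np.array([3.8, 0.2])     # a previously unseen pattern
    print(np.argmax(new_sample @ W + b))  # -> 1, the class centered at (4, 0)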

Decision-making and management

This problem is close to classification. The situations to be classified are fed to the neural network as sets of characteristics: various criteria describing the state of the controlled system serve as the input signals, while the decision taken must be identifiable at the output of the network.

Clustering

Clustering is the partitioning of a set of input signals into classes when neither the number nor the characteristics of the classes are known in advance. After training, the network can determine which class an input signal belongs to. The network can also signal that an input signal belongs to none of the classes, which indicates new data absent from the training samples. Thus, such a network can detect new, previously unknown classes of signals. The correspondence between the classes identified by the network and the classes existing in the subject domain is established by the user. Clustering is performed, for example, by Kohonen neural networks.
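A minimal Python sketch of Kohonen-style competitive learning, simplified to winner-take-all without the neighborhood function (the data and parameters are invented):

    import numpy as np

    rng = np.random.default_rng(2)
    # Input signals drawn from two clusters whose number is "unknown" to us.
    data = np.vstack([rng.normal([0.0, 0.0], 0.3, (100, 2)),
                      rng.normal([3.0, 3.0], 0.3, (100, 2))])
    rng.shuffle(data)

    nodes = rng.normal(1.5, 1.0, (4, 2))  # 4 competing neurons, random start
    lr = 0.2
    for x in data:
        winner = np.argmin(np.linalg.norm(nodes - x, axis=1))
        nodes[winner] += lr * (x - nodes[winner])  # pull the winner toward the input

    print(nodes.round(1))  # winning nodes settle near (0, 0) and (3, 3)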

Forecasting

The forecasting ability of a neural network follows directly from its ability to generalize and to isolate hidden dependencies between input and output data. After training, the network can predict the future value of a certain sequence from several previous values and/or from currently existing factors. Note that forecasting is possible only when past changes do, to some extent, determine the future. For example, forecasting a stock price from last week's quotes may or may not succeed, whereas predicting tomorrow's lottery results from data for the last 50 years will almost certainly yield nothing.
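A minimal Python sketch of the mechanics: the series is cut into sliding windows of previous values paired with the next value, and a predictor is fitted to these pairs (a simple linear model stands in here for the network; the series and window size are invented):

    import numpy as np

    series = np.sin(np.linspace(0, 20, 200))  # stand-in "historical" data
    window = 3                                # how many previous values to use

    # Build (inputs, target) pairs by sliding the window along the series.
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]

    coeffs = np.linalg.lstsq(X, y, rcond=None)[0]  # fit the predictor
    forecast = series[-window:] @ coeffs           # predict the next value
    print(round(float(forecast), 3))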

Approximation

Neural networks can approximate continuous functions. The complexity of a particular network depends on the choice of the nonlinear activation function, but with any such nonlinearity the network remains a universal approximator: given a correct choice of structure, it can approximate any continuous function to any desired accuracy.
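A minimal Python sketch of this property: a network with one hidden tanh layer, trained by gradient descent, approximates sin(x) on an interval (the layer size, learning rate and step count are arbitrary):

    import numpy as np

    rng = np.random.default_rng(3)
    x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    y = np.sin(x)

    H = 16                                   # hidden neurons
    W1 = rng.normal(0, 1, (1, H)); b1 = np.zeros(H)
    W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)
    lr = 0.05

    for step in range(5000):
        h = np.tanh(x @ W1 + b1)             # hidden layer, the nonlinearity
        out = h @ W2 + b2                    # linear output layer
        err = out - y
        # Backpropagation of the squared error.
        gW2 = h.T @ err / len(x); gb2 = err.mean(axis=0)
        dh = (err @ W2.T) * (1 - h ** 2)
        gW1 = x.T @ dh / len(x); gb1 = dh.mean(axis=0)
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1

    print(float(np.abs(out - y).max()))      # the maximum error shrinks toward zero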

Data compression and Associative Memory

The ability of neural networks to identify relationships between various parameters makes it possible to represent high-dimensional data more compactly when the data are strongly interrelated. The reverse process, restoring the original data set from a piece of it, is called (auto)associative memory. Associative memory can also restore the original signal or image from noisy or corrupted input data.
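A minimal Python sketch of autoassociative memory: a Hopfield-style network stores one pattern by the Hebbian rule and restores it from a corrupted copy (the pattern itself is made up):

    import numpy as np

    pattern = np.array([1, -1, 1, 1, -1, -1, 1, -1])  # stored +1/-1 pattern
    W = np.outer(pattern, pattern).astype(float)      # Hebbian weight matrix
    np.fill_diagonal(W, 0)                            # no self-connections

    noisy = pattern.copy()
    noisy[[1, 4]] *= -1                               # corrupt two components

    state = noisy.astype(float)
    for _ in range(5):                                # synchronous threshold updates
        state = np.where(W @ state >= 0, 1.0, -1.0)

    print(np.array_equal(state, pattern))             # -> True: pattern restored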

The technologies of neural networks and genetic algorithms are applicable in virtually any field. In some applications, such as price forecasting or pattern recognition, neural networks have already become a standard tool. There is no doubt that their widespread penetration into other areas is only a matter of time.