A simple note on Artificial Intelligence, Machine Learning, and stuff.

Date : 7 September, 2018.
Version: 0.0
By: Albert van der Sel.
Status: Just started.
Remark: The most simple note on AI, DL, ML, and stuff, in this Universe (and others too).




I'd like to try a note on these subjects. It's exciting material in itself, but that is of course no guarantee
that I will produce "something useful".
Indeed, it's only "Albert" at work here.....



Artificial Intelligence (AI) is usually regarded as a broader domain (or area) than Machine Learning (ML)
or Deep Learning (DL). Indeed, since the very early times of computing, ideas of AI slumbered around,
which probably condensed into more substance from the second half of the 1950s onwards.

So, AI started out decades ago. There is an immense number of subfields: you may think of sensors, interfaces
from/to sensors, algorithms to discriminate essential information from bulk data, algorithms which
emulate human interaction, Robotics, etc.

It's quite reasonable to say that ML/DL evolved from AI. Hopefully, that will become clear later in this text.
ML/DL is much more recent than AI: interest was probably much boosted from the '90s onwards,
and it seems to have accelerated since 2005 (or so), although some early ideas also originated many decades ago ('50s/'60s).

In general, a consensus exists that the three areas (more or less) relate to each other
as shown in figure 1.

Fig. 1: Highlevel view on how AI, ML, DL relate to each other (or their scopes, so to speak).




Of course, a reasonable description of AI/ML/DL should be formulated right now. However, I will wait a bit.

One main theme in ML is that a "machine" can find a solution for a new situation, based on former "samples",
or former experiences. This can indeed be regarded as "learning".


Example:

Let's start with a very simple example of ML, as it can be implemented in code, or hardware.
Of course, later on, we will see some facts on (un-)supervised modes, vectors, predictive functions, matrices and stuff.
Maybe it surprises you, but matrix calculus can also be one such important component.

Here, the term "machine" must be interpreted in the widest sense. That is: a machine (computer),
or just some code, etc.

Many techniques are used, and under development in ML/DL. One such arena is "neural networks".

By the way, referring to figure 1, you might infer that AI does not equate to a neural network.
A neural network is "just" one technique used in AI, and especially in ML/DL.


One simplified implementation might be a sort of mesh of nodes, organized in layers.
Each node, also called a "neuron", has a number of inputs, and a number of outputs.
Usually, a variable "weight" is associated with each input. A specific weight attributed to a specific input,
determines the relative importance of that particular input, for the determination of the output.
The weights are tunable, and might be properties of the neuron itself, or determined by an external agency.

So, suppose an example neuron has 3 inputs, and one output. Keep in mind that in general a neuron
might have "n" inputs and "m" outputs, like n=3 and m=1, or n=1 and m=4, or n=3 and m=20, etc.
With n=1 and m=1, it seems quite hard to knit neurons together in a mesh.
However, at this moment in this text, there are no constraints in effect.
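The description above can be sketched in a few lines of Python. This is just an illustration, not code from any library; the names (neuron_output, weights) are made up for this sketch.

```python
# A minimal sketch of a single neuron with n=3 inputs and m=1 output.

def neuron_output(inputs, weights):
    # Weighted sum of the inputs: each weight sets the relative
    # importance of its input for the output.
    return sum(w * x for w, x in zip(weights, inputs))

# Three inputs; the first input dominates via its larger weight.
print(neuron_output([1.0, 1.0, 1.0], [0.5, 0.25, 0.25]))  # -> 1.0
```

Note that changing a weight changes how strongly that one input can influence the output, which is exactly what "tuning" will mean later on.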

Usually, the output of a neuron is thus determined by the weights of the inputs (in some way), but apart from that,
it's possible that a "threshold" is defined too. This means that the combined inputs must be equal to, or above,
a certain threshold (a certain real value), before any output is activated.

Such a threshold is often "rearranged", and renamed, into something called a "bias". It's sort of the opposite
of a threshold: in effect, the bias is the negative of the threshold. If a threshold is low, it's very likely that there will be output, given certain inputs.
Using the new convention: if the bias is high, it's very likely that there will be output, given certain inputs.
Just like a threshold, the bias is a measure of the value determining whether output will occur.
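A small sketch may make the threshold-versus-bias convention concrete. The helper names here are hypothetical, and the step ("fires or not") activation is just one simple choice:

```python
# Sketch: threshold form vs. bias form of a neuron's activation.
# With bias = -threshold, the two tests make exactly the same decision.

def fires_threshold(inputs, weights, threshold):
    s = sum(w * x for w, x in zip(weights, inputs))
    return s >= threshold          # activate at or above the threshold

def fires_bias(inputs, weights, bias):
    s = sum(w * x for w, x in zip(weights, inputs))
    return s + bias >= 0.0         # same test, rewritten with a bias

x, w = [1.0, 1.0], [0.5, 0.25]     # weighted sum = 0.75
print(fires_threshold(x, w, 0.5))  # True: 0.75 >= 0.5
print(fires_bias(x, w, -0.5))      # True: same decision, bias = -0.5
```

So nothing deep changed; it's only a rearrangement of the same inequality, which turns out to be more convenient in the formulas later on.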

Now, there usually are several layers in such a neural network, and often you can find an input layer,
one or more "in-between" (or hidden) layer(s), and an output layer.
Again, at this moment in this text, there are no constraints in effect.


Fig. 2: Simple illustration of a neuron mesh. It could be hundreds or thousands of nodes.




The simple figure above suggests a "one way" direction of signals. This is not always true,
depending on the type of neural network.

As it turned out, with the right setup, it is possible to "feed" the network with training examples,
while carefully tuning the weights and biases in the process. If this worked out, it is possible
that the system, from then on, makes correct decisions on its own, when different inputs are provided.

There exist many code examples, e.g. in Python, which demonstrate a neural network
in a way (more or less) as described above.
Of course, large libraries also exist with all sorts of implementations, for various development platforms.
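To make the "layers" idea tangible, here is a tiny feed-forward pass written in plain Python. The layer sizes, weights and biases were chosen by hand for this sketch (they make the net fire exactly when its two inputs are equal); in real ML, finding such values is precisely what the training does.

```python
# A tiny feed-forward network: 2 inputs -> 2 hidden neurons -> 1 output.
# Weights/biases are hand-picked for illustration, not learned.

def step(s, bias):
    # Fire (1.0) when the weighted sum plus bias is non-negative.
    return 1.0 if s + bias >= 0.0 else 0.0

def layer(inputs, weights, biases):
    # One layer: every neuron sees all outputs of the previous layer.
    return [step(sum(w * x for w, x in zip(ws, inputs)), b)
            for ws, b in zip(weights, biases)]

hidden_w = [[1.0, 1.0], [-1.0, -1.0]]   # hidden neuron 1: "AND", neuron 2: "NOR"
hidden_b = [-1.5, 0.5]
out_w    = [[1.0, 1.0]]                 # output neuron: "OR" of the hidden pair
out_b    = [-0.5]

for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    h = layer(x, hidden_w, hidden_b)    # input layer -> hidden layer
    y = layer(h, out_w, out_b)          # hidden layer -> output layer
    print(x, "->", y)                   # fires (1.0) only when the inputs are equal
```

Notice that the signals here flow strictly "one way", input to output, matching the simple picture of figure 2.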

Now: what makes it work? The learning that is...

There exist quite a few neural network variants. This is a topic for a later chapter.
They also differ in their efficiency in "learning" certain tasks, and in how they are implemented.

One implementation uses "backpropagation". We know that the nodes have "weights" defined,
which are tunable (or adjustable) parameters.

If we indeed started out with feeding "training samples" to the system, then we know the inputs and the desired outputs.
This makes it possible to compare the actual outputs to what we hoped to find.
One method can be this: by applying corrective "differentials" to the weighting functions, it is possible
to let the system work more accurately, in an iterative way, as we work through our training samples.

Those differentials are just those from mathematics. A differential (or rather, a derivative) means the ratio of a small variation
of a function, to the small variation of the variable that function depends on, as in "dy/dx".

In other words, by applying those corrective "differentials", we get closer and closer to our goal.
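The iterative "corrective differential" idea can be shown on the simplest possible case: one trainable weight, one training sample. This sketch is my own illustration of the principle (plain gradient descent), not the full backpropagation algorithm:

```python
# Fit y = w * x to one training sample by repeatedly nudging w
# against the derivative dE/dw of the squared error E = (w*x - y)**2.

x, y_target = 2.0, 6.0      # one training sample (so the ideal w is 3.0)
w = 0.0                     # initial guess for the weight
rate = 0.05                 # learning rate: size of each corrective step

for _ in range(50):
    error = w * x - y_target
    dE_dw = 2.0 * error * x  # derivative of the squared error w.r.t. w
    w -= rate * dE_dw        # step against the gradient: the "correction"

print(round(w, 3))           # close to the ideal 3.0 after the iterations
```

Each pass makes the error a bit smaller, so the weight creeps toward its ideal value; backpropagation applies this same recipe to all the weights of all the layers at once.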

Then, hopefully (and already proven in many setups), we have taught the system how to tackle new problems.

All mechanics here need to be explained, of course. But in a nutshell, we have an idea of how an AI system
might work in reality. The example above is counted to be in the realm of ML, which of course is itself a part of AI.
It's true that neural nets/learning capacity is very important, in many fields of AI.
Or, maybe it's more correct to say that machine learning, thus ML, is very important, in many fields of AI.

To cover AI fully is impossible in a simple, short note such as this one. My goal is then only to provide a high-level
overview of a selection of essentials in AI.

About this note:
If you are already quite familiar with AI, then this note is of no relevance to you.
However, if you are "new", this note might provide some relevant pointers (I hope).


1. Short overview, or outline: