Thursday 19 June 2014

On Losses, Pauses and Stuff

This presentation is about the work I did at France Telecom R&D as a postdoctoral researcher. It can be viewed on slideshare.net and on Vimeo; please follow the links below.

On Losses, Pauses and Jumps from PsyopsPrime on Vimeo.

Creative Commons License
Psyops Concepts by Psyops is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Based on a work at http://psyopsconcepts.blogspot.dk/.
Permissions beyond the scope of this license may be available at http://psyopsconcepts.blogspot.dk/.

Tuesday 17 June 2014

A Reflection on My Research Proposals

I have been writing a number of research proposals for some time now. I thought it would be a nice idea to reflect on them through a presentation, which can be found below.


Hashing

Hashing is an incredibly common concept that pops up in the study of data structures and algorithms. Hash tables are popular for problems that require computational efficiency and fast lookups; they are used to implement caches and memory-efficient structures. The attached lecture slides reflect on the various concepts and techniques of hashing. In particular, separate chaining and open addressing receive the most attention. Two interesting applications of hashing are also discussed. The first is how Dropbox prevents the sharing of copyrighted material without actually looking at the content, which is accomplished through hashing. The other comes from the domain of symbolic regression: in this paper Maarten Keijzer proposes a technique for storing GP trees in a subtree cache for faster lookups.
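To make the idea concrete, here is a minimal sketch of a hash table with separate chaining, written in Python. The class and method names are mine and purely illustrative, not taken from any particular library.

class ChainedHashTable:
    """A minimal hash table using separate chaining (illustrative only)."""

    def __init__(self, num_buckets=16):
        # Each bucket holds a list of (key, value) pairs that hash to it.
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # Python's built-in hash() maps the key to a bucket index.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # otherwise chain a new entry

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable()
table.put("colour", "red")
print(table.get("colour"))               # -> red

The Dropbox application mentioned above boils down to the same machinery: hash the contents of each file (for instance with hashlib.sha256) and compare the digest against a list of digests of known copyrighted files, so that the files themselves never have to be inspected.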




Image: "Lots of Hash" by mikecogh on Flickr, Creative Commons Attribution-Share Alike 2.0 Generic License.

Sunday 22 December 2013

Program Space

A Program Space is Really Messed Up
Having conceived the notion of a search space, it may be quite useful to understand what a program space is. If you are familiar with genetic programming (GP), you may have heard the term program space, or the phrase that the search space in GP is composed of all possible computer programs. So what is a program space? Let us try to build some intuition about it.

A program space is also a search space. The search space in numerical optimization may be an n-dimensional real-valued space; whether it is depends on whether the coefficients you are trying to tune assume real values. In integer programming, for instance, the search space is an integer lattice. As discussed elsewhere, the search space in GP corresponds to all the possible computer programs that might be candidate solutions to a problem.

However, it is not possible to form a mental picture of the program space, for the simple reason that the human mind cannot visualize beyond three dimensions. Still, we can try to understand the program space by listing the very reasons why it cannot be visualized.

GP normally exploits the fact that computer programs can be represented by their syntax trees. This makes a lot of things easier for GP practitioners; for instance, it makes the implementation, adaptation, execution and manipulation of GP programs easier when they are represented as syntax trees.
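As an illustration, here is a minimal Python sketch of how a program such as x * (x + 1) can be represented and evaluated as a syntax tree. The Node class and its interface are assumptions of mine for the sake of the example, not the API of any particular GP system.

import operator

class Node:
    """A node in a syntax tree: a function with children, or a terminal."""

    def __init__(self, value, children=()):
        self.value = value               # a function, a variable name, or a constant
        self.children = list(children)

    def evaluate(self, env):
        if callable(self.value):         # internal node: apply the function
            return self.value(*(child.evaluate(env) for child in self.children))
        if isinstance(self.value, str):  # leaf: look up a variable such as "x"
            return env[self.value]
        return self.value                # leaf: a numeric constant

# The syntax tree for x * (x + 1):
tree = Node(operator.mul, [
    Node("x"),
    Node(operator.add, [Node("x"), Node(1)]),
])
print(tree.evaluate({"x": 3}))           # -> 12

Crossover and mutation then become simple tree operations: swap a subtree between two such trees, or replace a subtree with a newly grown one.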

GP also normally restricts the size of a computer program; a commonly used limit is a maximum tree depth of 17. A program in GP is represented as an n-ary tree, meaning that a node may have any number of children, although in practice most functions take one, two or three arguments. Moreover, a tree in GP may have many nodes, each representing a coefficient, a function, an operand or a system variable.

But why do we care about all of this? Because knowing it gives us an idea of how much flexibility GP is allowed in growing a tree that represents a computer program or a mathematical model. Let us call this a degree of freedom of sorts. What it tells us is that a tree in GP is not restricted to forms that can be captured with a few parameters: we have the depth, the variables, the various functions with their various arities, and coefficients that can assume random values. Moreover, the building blocks of a tree are not expected to occupy fixed positions; they may appear anywhere in the tree.

All of these factors complicate the problem of visualizing a GP tree. Put simply, each of these factors, and every combination of them, would require a dimension of its own before we could develop a concrete geometric notion of program space. In other words, a program space is so high-dimensional that it cannot be visualized easily. Still, discussing it is beneficial: we get an idea of how difficult it actually is to conceive a GP program space, and appreciating this difficulty fully is a milestone in itself.


Search Space

Would a Search Space Look Somewhat Like This?
In machine learning (ML) we often hear the term search space. It is used in describing almost any ML algorithm: talk about genetic algorithms, artificial neural networks or bio-inspired models, and the term will somehow turn up in the discourse to haunt the student. A novice in machine learning may find it difficult to understand what the term means, let alone conceive the whole idea it actually refers to.

So what does the term search space actually refer to? In the context of mathematical optimization it refers to a region that contains all the possible solutions of a problem. As such, it is also expected to contain the solutions that may be considered optimal, sub-optimal or near-optimal. But even this much of an explanation is a little vague, so let us develop a more concrete notion of search space with the help of the following example.

Suppose that you are given a mathematical model, and you somehow know that the model is supposed to solve a well known regression problem. For instance, the problem may be to predict the prices of different types of clothes to near perfection. The model that is supposed to find the prices is a function of a few variables; these may be the colour, the texture, the material and so on of the clothes, and you may find their values in a relevant corpus. The model also has a few coefficients. Depending on the model there may be two, three or a lot more of them, and in numerical optimization you are normally expected to find their values.

In the context of numerical optimization, the search space corresponds to all the possible combinations of values of these coefficients. For instance, if your model has six coefficients and they are expected to assume real values, then the search space is six-dimensional real space.

It would help a lot if we had a way to visualize this six-dimensional space, but that is not easy: it is not possible to visualize a space beyond three dimensions. So, for ease of understanding, we can restrict our search space to two unknown coefficients. That is, we assume that our model has two coefficients whose values we have to find.

Visualization now becomes easier, since our search space is the two-dimensional real plane. We can reserve the third dimension for a so-called error term, which tells us how well the model performs as we tweak the values of the coefficients a little here and there. This may be the error between the output of our model and some known reference to which we are trying to fit it.
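To make this concrete, here is a small Python sketch (assuming numpy is available) that sweeps a grid of values for the two coefficients of a toy linear model and records the error at each grid point; the model, the data and all parameter values are illustrative.

import numpy as np

# Reference data produced by a "true" model y = 2.0*x + 5.0; in practice
# these coefficients are exactly what we would not know.
x = np.linspace(-1.0, 1.0, 50)
y_true = 2.0 * x + 5.0

# Sweep a grid of candidate values for the coefficients (a, b) of the
# model y = a*x + b, recording the mean squared error at each point.
a_grid = np.linspace(0.0, 4.0, 100)
b_grid = np.linspace(3.0, 7.0, 100)
error = np.empty((len(a_grid), len(b_grid)))
for i, a in enumerate(a_grid):
    for j, b in enumerate(b_grid):
        error[i, j] = np.mean((a * x + b - y_true) ** 2)

# The error surface is the "mountain range"; its lowest valley sits at
# the true coefficients.
i, j = np.unravel_index(error.argmin(), error.shape)
print(a_grid[i], b_grid[j])              # -> close to 2.0 and 5.0

Plotting the error against the two coefficient axes (for instance with matplotlib's plot_surface) gives exactly the mountain-range picture described in the next paragraph.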

Viewed this way, the search space starts to look like a mountain range, with peaks and valleys here and there, which makes it much easier to conceive. As we choose different combinations of coefficient values, we may land close to a peak or a valley depending on what we have chosen.

Normally the goal in numerical optimization is to hit the bottom of the lowest valley; this is specifically the case when our objective is error minimization, as above. Most numerical optimization algorithms are designed with this view of the search space in mind. Indeed, the goal of the algorithm is to traverse the whole search space efficiently and to reach a globally optimal solution, if one exists at all.


Genetic Programming

A Typical GP Breeding Cycle
Genetic programming (GP) is another very important machine learning (ML) algorithm. It is also a very elegant algorithm, and possibly the most powerful too. It belongs to the class of evolutionary algorithms (EAs): it is a coarse adaptation of Darwinian evolution and tries to manifest in problem solving the themes of natural selection and the survival of the fittest, as they are perceived in natural biological systems.

In traditional ML algorithms the search space comprises all the possible solutions in terms of the possible values of the coefficients of a known, a priori fixed mathematical model. The underlying model may have been chosen to solve any real world problem; for instance, it may be required for weather forecasting, financial bidding or robotic control. A traditional ML algorithm is run on such a model to tune its coefficients so that the model starts performing optimally in some meaningful sense.

GP differs from traditional ML algorithms that simply aim at tuning the coefficients of a known model. In GP the search space comprises all the possible computer programs or mathematical models that may be candidate solutions to the underlying problem. So while other ML algorithms search the so-called coefficient space for possible solutions, GP tries to find an optimal solution in the whole program space, whatever that may be.

Running GP is quite simple, and plenty of open source software is available online. A typical breeding cycle of GP is also quite similar to that of a typical EA. Following is the pseudo code for a typical breeding cycle; it has been taken from a discourse on genetic algorithms, and a code sketch of it is given after the list.

  1. Create an initial population of individuals of size n. 
  2. Evaluate each individual on the given problem and assign it a rank R. 
  3. Randomly choose pairs of individuals as some function of R. 
  4. Apply genetic operators of crossover and mutation on the chosen individuals to form a pair of offspring. 
  5. Repeat step 4 until a new population is formed.  
  6. Go to step 2 and repeat until the desired solution is found.
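Here is the pseudo code fleshed out as a minimal, self-contained Python sketch. For simplicity it evolves fixed-length bit strings, as a plain genetic algorithm would; a GP implementation has exactly the same structure, with syntax trees in place of bit strings. All names and parameter values are illustrative.

import random

TARGET = [1] * 20           # toy problem: evolve an all-ones bit string
POP_SIZE, GENERATIONS, MUTATION_RATE = 30, 100, 0.05

def fitness(ind):
    return sum(ind)         # number of correct bits

def tournament(pop):
    # Rank-based selection: pick two at random, keep the fitter one.
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(ind):
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in ind]

# Step 1: create the initial population.
pop = [[random.randint(0, 1) for _ in TARGET] for _ in range(POP_SIZE)]
for gen in range(GENERATIONS):
    # Steps 2-5: evaluate, select pairs, breed a whole new population.
    nxt = []
    while len(nxt) < POP_SIZE:
        c1, c2 = crossover(tournament(pop), tournament(pop))
        nxt += [mutate(c1), mutate(c2)]
    pop = nxt[:POP_SIZE]
    if max(pop, key=fitness) == TARGET:  # step 6: stop when solved
        break
print(gen, fitness(max(pop, key=fitness)))

On this toy problem the loop usually finds the all-ones string well within the hundred generations.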


Friday 13 December 2013

Simulated Annealing


Simulated annealing is one of the most elegant algorithms in the family of meta-heuristics. It is quite easy to understand, and simple to implement and run. It also has a very small memory footprint, which basically means that it consumes very little memory while running. This is especially notable when we compare it to other compute-intensive algorithms like genetic algorithms and genetic programming.

Like many meta-heuristic algorithms, simulated annealing is drawn as an analogy from materials science. The word annealing means to soften something up. In materials science, and especially in metallurgical engineering, annealing refers to the heat treatment of alloys: a process in which a metal or an alloy, possibly in a molten state, is cooled down from a very high temperature at a very slow rate. Cooling at a slow rate allows the metal to form coarse grains and to settle into states of lower energy. Although such grains can be observed only under a microscope, the result is that the metal becomes soft; hence the term annealing. The opposite of annealing is quenching, in which the metal is cooled at a very fast rate, typically by immersing it in water; hence the name quenching. Cooling that fast does not allow the grains of the metal to grow or to reach lower energy levels, so the metal becomes hard and brittle.

The simulated annealing algorithm tries to find a globally optimal solution to a problem, just like other meta-heuristics. In the context of numerical optimization, the goal is normally to find a global minimum in a search space of candidate solutions. To achieve this, the algorithm picks an initial solution at random and then iteratively moves to better solutions, descending through the search space. The algorithm also allows itself to accept bad solutions at times, in the hope of hitting a trajectory through the search space that leads to a globally optimal solution. How readily it accepts bad solutions depends on a parameter called the annealing temperature.

The annealing temperature is initially set to a high value. As the iterations proceed, it keeps decreasing, so as to restrict the acceptance of bad solutions. The overall working of the algorithm can be explained with the help of the following pseudo-code; a code sketch of it follows the list.

  1. Choose a solution S for the problem randomly.
  2. Set the annealing temperature T to some sensible value.
  3. Pick another candidate solution S'.
  4. If S' is better than S, then set S = S'; that is, choose S' as the new solution.
  5. If S' is not better than S, then still set S = S' with a certain probability that is a function of the annealing temperature T.
  6. Decrement T.
  7. Repeat from step 3 until a desired solution is found, a certain number of iterations have elapsed, or some user-defined objective has been met.
  8. Terminate the algorithm and keep S as the desired solution.
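Here is a minimal, self-contained Python sketch of the pseudo code above. The acceptance rule in step 5 is written out as the standard Metropolis criterion exp(-delta/T), and the temperature is decremented geometrically; all function names and parameter values are illustrative choices of mine.

import math
import random

def simulated_annealing(objective, initial, neighbour,
                        t_start=1.0, t_min=1e-4, cooling=0.995):
    """Minimise `objective`, following the pseudo code above."""
    s = initial                          # step 1: initial solution
    best = s
    t = t_start                          # step 2: set the temperature
    while t > t_min:
        s_new = neighbour(s)             # step 3: pick a candidate S'
        delta = objective(s_new) - objective(s)
        # Steps 4-5: always accept improvements; accept a worse solution
        # with probability exp(-delta / t) (the Metropolis criterion).
        if delta < 0 or random.random() < math.exp(-delta / t):
            s = s_new
        if objective(s) < objective(best):
            best = s
        t *= cooling                     # step 6: decrement the temperature
    return best                          # step 8: keep S as the solution

# Toy usage: minimise f(x) = x^2 starting from x = 10.
result = simulated_annealing(
    objective=lambda x: x * x,
    initial=10.0,
    neighbour=lambda x: x + random.uniform(-1.0, 1.0),
)
print(result)                            # -> a value close to 0

Because worse moves are accepted more freely while T is high, the algorithm can escape local minima early on, and it behaves more and more like plain hill descent as T approaches zero.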


