Counting beans in Octave

Today we are going to see what I consider a neat implementation of count.

The problem, part 1

Suppose you have a polynomial like $P(x) = x^2-x-1$ , how many roots does it have and how are they grouped? This information is important, mostly in linear algebra, as the roots of the determinantal polynomial are the eigenvalues of the matrix. Still with me?

Linear algebra 101

In a vector space elements are vectors, like $(1,0,1)$ , and there are functions that operates on vectors. Some functions behave linearly with respect to vector sum and scalar multiplication, those function are called linear operator and can be identified with matrixes.

Sometimes for a linear operator $f$ there exists a value $\lambda$ and a vector $v$ such that

$f(v) = \lambda v$

The couple $\lambda,v$ is the eigenvalue, eigenvector.

We have algorythms we can use to prove that the eigenvalues exists and to find them, they are extremely useful with analysis of a large spectrum of problems in physics, mathematics, computer science, engineering and whenever there is some linear algebra to work with.

It is important to know that in a n-dimensional space for every linear operator there can be up to n eigenvalues, and some can be repeated, as an example in a 3 dimensional space the eigenvalues of an operator could be $1,1,2$ .

The problem, part 2

I’m working on an assignments where I have to compute some space of solutions to a problem and one of the very first things to do is to find the roots of a polynomial $P$ and find if they are duplicated. As an example:

$P(x) = (x + 3)^2 = x^2 + 6x + 9$

Has 2 roots $-3, -3$ , duplicated.

Working with octave is fairly easy, numbers comes in, numbers comes out and sometimes there is a graph, and it has built-in functions for things like finding roots.

% x^2 + 6 x + 9 => [1, 6, 9]
octave:> roots([1,6,9])
ans =

  -3.0000 + 0.0000i
  -3.0000 - 0.0000i

Wonderful, now we can go home, or not? Unfortunately the roots list is provided but without counting the occurrences of the roots.

Counting things

In octave (and matlab) there is an hack to count the occurences of something in a list

% given x, a list of something that can be ordered
function [count_x, unique_x] = count(x)
  % Return the unique elements of X sorted in ascending order.
  unique_x = unique(x);
  % count the occurrences of every element from unique_x in x
  count_x = histc(x, unique_x);
endfunction

but it does not work on things that can’t be ordered, like complex numbers (╯°□°）╯︵ ┻━┻ .

So here I am reimplementing something to count things

% given x, a vector of something that can be compared for equality
function [count_x, unique_x ] = count (x)
  [unique_x, to_unique, from_unique] = unique(x);

  order = lenght(unique_x);

  count_x = zeros(order,1);

  parfor i = 1:order
    count_x(i) = sum(from_unique == i);
  endparfor
endfunction

The real magic is in the 3 return values from unique.

unique_x is the list of unique elements in x, ‘nuff said
to_unique is a list of indexes that references elements in x
from_unique is a list of indexes that references elements in unique_x

and their meaning is that they are “filters”, getting the elements pointed by to_unique from x gives you the elements in unique_x and getting the elements pointed by from_unique from unique_x gives you x.

Let’s see an example

[unique_x, to_unique, from_unique] = unique([1,10,10,1,100])
unique_x =

     1    10   100

to_unique =

     4   3   5

from_unique =

     1   2   2   1   3

To obtain unique_x from x we select the fourth, third and fifht elment from x and obtain 1, 10, 100.

To obtain x from unique_x we select the first, second, second, first and third elment from unique_x and obtain 1, 10, 10, 1, 100.

Must be magic!

Ok now back to general, x has n elements, unique_x has k elements, how to count things?

There are k elements and hence there will be the numbers from 1 to k in from_unique, I can compare them with the number j and obtain a boolean vector, example:

% j = 1
octave:> [1, 10, 10, 1, 100] == 1
ans =
  [1, 0, 0, 1, 0]
octave:> sum(ans)
ans =
   2

% j = 3
octave:> [1,10,10,1,100] == 3
ans = 
  [0, 0, 0, 0, 0]
octave:> sum(ans)
ans = 
   0

A 1 appear in the position where the equality evaluates true, 0 otherwise, you can see the pattern and the fact that the sum of the 1 and 0 is the number of occurrences of j in x?

This is the last part of our count function

for i = 1:order
  count_x(i) = sum(from_unique == i);
endfor

for every element i from 1 to the number of unique elements we are going to compare it with the from_unique list and set the i-th element in count_x to the sum of the ones and zeros.

But wait, this is a process that can be parallelized so let’s do this in parallel

parfor i = 1:order
  count_x(i) = sum(from_unique == i);
endparfor

Conclusion

So that’s it, mostly, and the only question about this implementation is on using from_unique == i instead of x == e where e is an element of the list unique_x.

That’s an interesting question (have a cookie 🍪) because we have to count things, elements, not their indexes, even if the answer is the same.

The answer is that I am lazy and that this worked on first try but I could say something about comparing int is faster or maybe your comparison costs more than comparing an int.

Think in complex numbers, we have to check real and imaginary part for every number, that’s twice as expensive as one check, maybe you heard of quaternions and their crazy 3 imaginary units? Four times as expensive. This is definitely that time where by being lazy it happens that you do less work and better. Sounds crazy right?.