Friday, August 30, 2013

Proving GCD is a kernel

At MLSS yesterday Bernard Scholkopf presented an interesting question:

"Is gcd(m,n) a valid kernel"

To me this does not seem difficult. Below is a rather simple proof to show it is a kernel.

Step 1: To prove this, we first realize that it is sufficient to prove log(gcd(m,n)) is a kernel since we know that the exponent of a kernel is a kernel.

Step 2: I will prove that it is a kernel by explicitly constructing a vector space under which log(gcd(m,n) is a dot product.

CONSTRUCTION:

FEATURES: There is a single feature for each prime p (e.g. 2,3,5,7 ..) as well as every higher power of p as well. So for prime number p, there is a feature for p, p2, p3 .... (e.g. 5,25,125,....)

FEATURE VALUE: If the feature for prime p or any of its powers (p2, p3 etc..) is "on" (i.e. non-zero) then the value of the feature is always √log(p).

Lastly we have to describe the feature vector for an arbitrary number n

FEATURE VECTOR: For a number n with prime factorization n = p1q1 * p2q2 * ... pkqk. Then the feature vector for this number will only have these features turned on/non-zero: (∀ i) pi1,pi2,...,piqi

---

It is not hard to see that the dot product of two feature vectors in this space is the log of the gcd of the two numbers.

To see this consider a common prime factor p which has multiplicity q1 in the first number and multiplicity q2 (>= q1 w.l.o.g.) in the second number. Now note the two feature vectors will have p1,p2,...pq1 as common non-zero elements. Hence this contributes a factor of q1*log(p)=log(pq1) towards the dot product, which is exactly the contribution of this prime factor p towards the log(gcd).

---

There are other alternate proofs I can think of (including one which proves that the log gcd is a positive definite matrix). However this seems like the simplest and easiest to understand.

If you have an alternate proof that is even simpler please let me know.


No comments:

Post a Comment