Jan 11, 2010

Winter School on Algorithms and Combinatorics

Attending a very interesting winter school organized by KAIST professors.  It's refreshing to brush up on theoretical concepts.

(1) Talk by Heekap Ahn (Postech) on Voronoi Diagram
Computational efficiency analysis assumes that basic operations (+, -, *, assignment, etc.) take constant time O(1) and counts, for an input of size n, how many basic operations are needed.

The convex hull of a set S of points in the plane is the smallest convex set containing S. A naive algorithm takes about n^3 operations (n^2 to pick every pair of points, then a linear check of whether the pair forms a hull edge), but a smarter algorithm sorts the points and scans them in clockwise order (the plane-sweep technique), solving the problem in n log n (sort by x-axis) + 2n (clockwise scan).
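Rough sketch of the sort-then-scan idea (my own illustration in Python, using Andrew's monotone chain variant rather than necessarily the exact version presented in the talk):

    # Sketch of the sort-then-scan convex hull idea (Andrew's monotone chain).
    # Sorting dominates at O(n log n); each scan is linear.
    def convex_hull(points):
        def cross(o, a, b):
            # Positive if o->a->b turns counterclockwise.
            return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

        pts = sorted(set(points))          # sort by x, then y (the n log n step)
        if len(pts) <= 2:
            return pts

        lower, upper = [], []
        for p in pts:                      # lower hull, left to right
            while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
                lower.pop()
            lower.append(p)
        for p in reversed(pts):            # upper hull, right to left
            while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
                upper.pop()
            upper.append(p)
        return lower[:-1] + upper[:-1]     # concatenate, dropping duplicate endpoints

    print(convex_hull([(0, 0), (1, 1), (2, 2), (2, 0), (0, 2), (1, 0)]))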

Line segment intersection has a worst case of n^2 crossings, but an output-sensitive algorithm (whose running time depends on the number of intersections k) runs in about (2n + 2k) log n = O(n log n + k log n).
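For contrast, a brute-force checker over all O(n^2) pairs (my own sketch; assumes general position, i.e., no shared endpoints or collinear overlaps):

    # Brute-force O(n^2) count of intersecting segment pairs, for contrast with
    # the O((n + k) log n) sweep-line algorithm.
    def segments_intersect(p1, p2, p3, p4):
        def orient(a, b, c):
            return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])
        # Proper intersection: endpoints of each segment lie on opposite sides of the other.
        return (orient(p1, p2, p3) * orient(p1, p2, p4) < 0 and
                orient(p3, p4, p1) * orient(p3, p4, p2) < 0)

    def count_intersections(segments):
        k = 0
        for i in range(len(segments)):
            for j in range(i + 1, len(segments)):
                if segments_intersect(*segments[i], *segments[j]):
                    k += 1
        return k

    segs = [((0, 0), (4, 4)), ((0, 4), (4, 0)), ((5, 0), (5, 4))]
    print(count_intersections(segs))   # 1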

(2) Talk by Kyomin Jung (KAIST) on NP-Completeness
Interesting observation about how our brains do arithmetic: we memorize the sums of pairs of digits 0-9, like a Turing machine storing 10^2 combinations, and then extend this knowledge to do more complicated calculations. A Turing machine is a device with a finite amount of read-only "hard" memory (states) and an unbounded amount of read/write tape memory: a fixed, unchanging part of memory (the hardware) plus scratch paper (the software).
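A toy illustration (mine, not from the talk) of the memorized 10x10 table plus scratch-paper idea:

    # The table plays the role of the finite control, the digit lists play the
    # role of the tape.
    ADD_TABLE = {(a, b): a + b for a in range(10) for b in range(10)}  # 10^2 memorized facts

    def add_decimal(x_digits, y_digits):
        """Add two numbers given as lists of decimal digits, least significant first."""
        result, carry = [], 0
        for i in range(max(len(x_digits), len(y_digits))):
            a = x_digits[i] if i < len(x_digits) else 0
            b = y_digits[i] if i < len(y_digits) else 0
            s = ADD_TABLE[(a, b)] + carry      # only table lookups plus a small carry
            result.append(s % 10)
            carry = s // 10
        if carry:
            result.append(carry)
        return result

    print(add_decimal([9, 8, 7], [5, 4]))      # 789 + 45 = 834 -> [4, 3, 8]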

Notes on P and NP: (a) P is the set of problems that can be solved in polynomial time, given an input of size n. (b) NP is the set of problems that can be *verified* in polynomial time: given a candidate solution, one can check in polynomial time whether it is right or wrong. (c) P is a subset of NP. (d) The question of P=NP is related to the meaning of intelligence: if P=NP, then solving a hard NP problem would not be evidence of any special intelligence in the solver. (e) Cryptography relies on P!=NP: public keys are given out, but the secret used to generate a public key is practically hard to recover. (f) Unless we know P=NP, it is important to develop approximation algorithms!
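A minimal sketch of point (b), verifying a claimed Hamiltonian path in polynomial time (graph format and names are my own):

    # NP in one picture: finding a Hamiltonian path is hard, but *verifying* a
    # candidate path takes polynomial time.
    def verify_hamiltonian_path(graph, path):
        """graph: dict vertex -> set of neighbors; path: list of vertices."""
        if set(path) != set(graph) or len(path) != len(graph):
            return False                                   # must visit every vertex exactly once
        return all(path[i + 1] in graph[path[i]]           # consecutive vertices must be adjacent
                   for i in range(len(path) - 1))

    g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    print(verify_hamiltonian_path(g, [1, 2, 3, 4]))  # True
    print(verify_hamiltonian_path(g, [1, 3, 2, 4]))  # False: 2-4 is not an edge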

Hamiltonian Path Problem (HP) - visit each node exactly once
Traveling salesman problem (TSP) - find the shortest path that visits each node exactly once (the cheapest Hamiltonian path)

If P (e.g., HP) reduces in polynomial time to Q (e.g., TSP), then P is no harder to solve than Q.   (a) NP-hard: if every problem R in NP is reducible to P, then P is NP-hard; the hardest problems in NP.  (b) NP-complete: if P is NP-hard and P is in NP, then P is NP-complete (it can be seen as the easiest of the NP-hard problems).  (c) Cook-Levin theorem: the first proven case of NP-completeness (Boolean satisfiability).
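A minimal sketch of the HP-to-TSP reduction (my own illustration, assuming the path version of TSP as defined above): original edges get weight 1, non-edges weight 2, so G has a Hamiltonian path iff the cheapest Hamiltonian path costs n - 1.

    # Polynomial-time reduction from Hamiltonian Path to TSP: keep the vertices,
    # give weight 1 to original edges and weight 2 to non-edges.
    from itertools import permutations

    def hp_to_tsp_weights(graph):
        """graph: dict vertex -> set of neighbors.  Returns a complete weight matrix."""
        V = list(graph)
        return {(u, v): (1 if v in graph[u] else 2)
                for u in V for v in V if u != v}, V

    def cheapest_hamiltonian_path(weights, V):
        # Brute force only to illustrate the reduction; TSP itself stays NP-hard.
        return min(sum(weights[(p[i], p[i + 1])] for i in range(len(p) - 1))
                   for p in permutations(V))

    g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    w, V = hp_to_tsp_weights(g)
    print(cheapest_hamiltonian_path(w, V) == len(V) - 1)   # True: g has a Hamiltonian path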

Combinatorial optimization, max cut; the underlying assumption is again P!=NP:  (a) rho(ALGO, G): given a graph G, how good is the solution found by algorithm ALGO?   (b) rho(ALGO) = min_G ( rho(ALGO, G) ): over all instances G, what is the worst-case guarantee of ALGO?  (c) rho(maxcut) = max_ALGO ( rho(ALGO) ): how well can the best algorithm do, i.e., how hard is the problem itself? (A randomized max-cut sketch follows the PTAS notes below.)

PTAS (polynomial time approximation scheme): for any fixed e > 0, finds a solution within a factor (1 +/- e) of the optimal in time polynomial in n
FPTAS (fully PTAS): the running time is polynomial in both n and 1/e
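As promised above, a minimal max-cut sketch (mine): assigning every vertex to a random side cuts each edge with probability 1/2, so the expected cut size is at least |E|/2 >= OPT/2.

    # Randomized 1/2-approximation for max cut.
    import random

    def random_cut(vertices, edges, trials=100):
        best = 0
        for _ in range(trials):
            side = {v: random.random() < 0.5 for v in vertices}   # random side per vertex
            cut = sum(1 for u, v in edges if side[u] != side[v])  # count cut edges
            best = max(best, cut)
        return best

    V = range(6)
    E = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
    print(random_cut(V, E), "edges cut out of", len(E))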

(3) Talk by Jinwoo Shin (MIT) on Metropolis-Hastings Rule
Suppose we only know local information; how can we sample users uniformly at random? The problem is to design a random walk without bias.  (a) The condition for a uniform stationary distribution is symmetric transition probabilities, p_ij = p_ji.   (b) The walk can be prone to even-odd oscillations (periodicity), but adding a self loop fixes this easily.   (c) A seminal work: the Metropolis-Hastings rule, dating back to the 1950s.
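A minimal sketch of the Metropolis-Hastings walk for uniform node sampling (my own code): from node i, propose a uniform random neighbor j and accept with probability min(1, deg(i)/deg(j)); only local information is needed.

    import random
    from collections import Counter

    def mh_walk(graph, start, steps):
        """graph: dict node -> list of neighbors."""
        visits, current = Counter(), start
        for _ in range(steps):
            j = random.choice(graph[current])                       # propose a neighbor
            if random.random() < min(1.0, len(graph[current]) / len(graph[j])):
                current = j                                          # accept the move
            visits[current] += 1                                     # reject -> self loop
        return visits

    # Star graph: a plain random walk would oversample the hub, MH evens it out.
    g = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(mh_walk(g, 0, 50000))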


(4) Talk by Sang-il Oum (KAIST) on Parameterized Complexity
Concepts: (a) Vertex cover: a set of vertices meeting all edges.  (b) Vertex cover problem (G, k): does G have a vertex cover of size <= k?  (c) Minor of a graph: a graph that can be obtained by contracting edges, or deleting vertices or edges.

FPT (fixed parameter tractable): a problem with a parameter k is FPT if it can be answered in time O(f(k) n^c) for a computable function f and a fixed c; that is, the running time may be exponential in the parameter k, but not in the input size n. (Rod Downey & Michael Fellows.)
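The textbook FPT example, sketched in Python (my own code): vertex cover by a bounded search tree of depth k, giving O(2^k * n) time.

    # Pick any uncovered edge (u, v); some endpoint must be in the cover,
    # so branch on both.  Recursion depth <= k, hence f(k) * poly(n).
    def has_vertex_cover(edges, k):
        uncovered = list(edges)
        if not uncovered:
            return True
        if k == 0:
            return False
        u, v = uncovered[0]
        return (has_vertex_cover([e for e in uncovered if u not in e], k - 1) or
                has_vertex_cover([e for e in uncovered if v not in e], k - 1))

    E = [(1, 2), (1, 3), (2, 3), (3, 4)]
    print(has_vertex_cover(E, 2))   # True, e.g. {1, 3}
    print(has_vertex_cover(E, 1))   # False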

Kernelization: A decidable problem is FPT if and only if it has a kernel,
i.e., when the instance can be reduced, regardless of the input size, to an equivalent instance whose size depends only on k.
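A minimal kernelization sketch for vertex cover (my own code, using Buss's reduction rule, not necessarily the talk's example): a vertex of degree greater than k must be in any size-k cover, and what survives has at most k^2 edges or the answer is no.

    def kernelize(edges, k):
        edges = list(edges)
        changed = True
        while changed and k >= 0:
            changed = False
            degree = {}
            for u, v in edges:
                degree[u] = degree.get(u, 0) + 1
                degree[v] = degree.get(v, 0) + 1
            for v, d in degree.items():
                if d > k:                                   # v is forced into the cover
                    edges = [e for e in edges if v not in e]
                    k, changed = k - 1, True
                    break
        if k < 0 or len(edges) > k * k:
            return None                                     # provably no cover of size <= k
        return edges, k                                     # kernel: at most k^2 edges remain

    print(kernelize([(0, i) for i in range(1, 6)], 1))      # star: hub forced in -> ([], 0)
    print(kernelize([(i, i + 1) for i in range(10)], 2))    # long path -> None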

(5) Talk by Heekap Ahn (Postech) on Geometric Graphs
Notations:  (a) A graph without cycles is a forest; a connected forest is a tree.  (b) A separating set of G is a set of vertices whose removal disconnects the graph.  (c) A graph is k-connected if every separating set has cardinality at least k.  (d) A graph is planar if it can be drawn in the plane without crossing edges; that is, a planar graph has a crossing number of zero.  (e) A geometric graph (edges drawn as straight line segments) is a special case of a topological graph.

Main definition: The crossing number of a graph G, denoted cr(G), is the least possible number of pairs of crossing edges over all drawings of G.  (a) Finding the crossing number is NP-hard.  (b) There exist efficient algorithms to decide whether the crossing number is smaller than a given k; that is, the problem is fixed-parameter tractable.

(6) Talk by Kyomin Jung on Randomized Algorithms
Define a Turing machine that can toss a fair binary coin, called a probabilistic Turing machine.  (a) What is good about it? It smooths the "worst-case input distribution" into "randomness of the algorithm." (b) The law of large numbers means that independent random samples have an average close to the expectation, e.g., the Erdos-Renyi random graph G(n,p) (when does a giant component appear?), voting polls (how do we find a good sample for a poll?).
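A quick simulation of the giant-component question (my own code; parameters chosen arbitrarily): around p = 1/n the largest component jumps from small to linear in n.

    import random

    def largest_component(n, p):
        adj = {v: [] for v in range(n)}
        for u in range(n):                         # sample G(n, p)
            for v in range(u + 1, n):
                if random.random() < p:
                    adj[u].append(v)
                    adj[v].append(u)
        seen, best = set(), 0
        for s in range(n):                         # DFS to measure component sizes
            if s in seen:
                continue
            stack, size = [s], 0
            seen.add(s)
            while stack:
                x = stack.pop()
                size += 1
                for y in adj[x]:
                    if y not in seen:
                        seen.add(y)
                        stack.append(y)
            best = max(best, size)
        return best

    n = 2000
    for c in (0.5, 1.0, 1.5, 2.0):
        print(f"p = {c}/n -> largest component ~ {largest_component(n, c / n)}")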

Chernoff Bound:  (a) Suppose we have a coin that lands heads with probability p; the expected value of a single flip is 1*p + 0*(1-p) = p.  (b) After flipping the coin m times, the probability that the fraction of heads deviates from p by more than lambda is at most exp(-lambda^2 * m), up to constants, where lambda is the error tolerance.  (c) A Las Vegas algorithm is a randomized algorithm that always gives the correct answer (only its running time is random).  (d) A Monte Carlo algorithm has a deterministic running time, but its output is correct only with a certain probability.
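A quick numerical check of the bound in (b), using the Hoeffding form 2*exp(-2*lambda^2*m) with explicit constants (my addition):

    # Pr[ |#heads/m - p| >= lambda ] <= 2 * exp(-2 * lambda**2 * m)
    import math
    import random

    def deviation_probability(p, m, lam, trials=20000):
        bad = 0
        for _ in range(trials):
            heads = sum(random.random() < p for _ in range(m))
            if abs(heads / m - p) >= lam:
                bad += 1
        return bad / trials

    p, m, lam = 0.5, 200, 0.1
    print("empirical:", deviation_probability(p, m, lam))
    print("bound    :", 2 * math.exp(-2 * lam**2 * m))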

BPP (bounded-error, probabilistic, polynomial time): the set of problems that have Monte Carlo algorithms.  (a) Is BPP = P (can the randomness be removed deterministically)?  (b) A pseudo-random generator (PRG) produces numbers that are not purely random, but that are very hard to distinguish from random in polynomial time.
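A toy PRG for illustration only (mine, a linear congruential generator with the classic Numerical Recipes parameters; not cryptographically secure): the whole stream is determined by a short seed, yet looks random to simple tests.

    def lcg(seed, n):
        x, out = seed, []
        for _ in range(n):
            x = (1664525 * x + 1013904223) % 2**32   # deterministic update
            out.append(x / 2**32)                    # rescale to [0, 1)
        return out

    print(lcg(42, 5))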

Randomized min cut algorithm by David Karger: repeat until |V| <= 2: pick a random edge and contract it. The edges surviving between the two remaining super-vertices form the cut.
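A sketch of the contraction algorithm (my own code, using union-find to track contracted super-vertices); one run finds a minimum cut with probability at least 2/(n(n-1)), so it is repeated.

    import random

    def karger_cut(edges, num_vertices, runs=200):
        best = None
        for _ in range(runs):
            parent = list(range(num_vertices))
            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]
                    x = parent[x]
                return x
            remaining, live = num_vertices, list(edges)
            while remaining > 2:
                u, v = random.choice(live)                                 # random edge
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv                                        # contract it
                    remaining -= 1
                live = [e for e in live if find(e[0]) != find(e[1])]       # drop self loops
            cut = len(live)                                                # surviving edges = cut
            best = cut if best is None else min(best, cut)
        return best

    # Two triangles joined by a single bridge edge: the min cut is 1.
    E = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    print(karger_cut(E, 6))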
