2005 International Conference on Analysis of Algorithms
Conrado Martínez (ed.)
DMTCS Conference Volume AD (2005), pp. 267-274
author:  Gahyun Park and Wojciech Szpankowski 

title:  Analysis of biclusters with applications to gene expression data 
keywords:  Random matrix, two-dimensional patterns, bicluster, microarray data, biclique.
abstract: 
For a given matrix of size $n \times m$ over a finite alphabet
$\mathcal{A}$, a bicluster is a submatrix composed of selected
columns and rows satisfying a certain property. In
microarray analysis one searches for the largest biclusters in
which the selected rows constitute the same string (pattern);
in another formulation of the problem one tries to find a
maximally dense submatrix. In a conceptually similar
problem, namely the bipartite clique problem on graphs, one
looks for the largest binary submatrix consisting entirely of
'1's. In this paper, we assume that the original matrix
is generated by a memoryless source over a finite alphabet
$\mathcal{A}$. We first consider the case where the selected
biclusters are square submatrices and prove that with high
probability (whp) the largest (square) bicluster having the
same row-pattern is of size $\log_Q^2 nm$, where $Q^{-1}$
is the (largest) probability of a symbol. We observe,
however, that when we consider arbitrary submatrices
(not just square submatrices), the largest
area of a bicluster jumps to $A n$ (whp), where $A$
is an explicitly computable constant. These findings
complete some recent results concerning maximal biclusters
and maximum balanced bicliques for random bipartite graphs.
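
As an illustration of the definition used in the abstract, the following Python sketch checks whether selected rows read the same string on selected columns and finds the largest square bicluster of that kind by exhaustive search. It is not the authors' method: the function names (`random_matrix`, `is_row_pattern_bicluster`, `largest_square_bicluster`) are hypothetical, and the search is exponential, so it is meant only for very small matrices.

```python
import itertools
import random


def random_matrix(n, m, alphabet, probs, seed=0):
    """Draw an n x m matrix from a memoryless source: every entry is
    sampled independently from `alphabet` with probabilities `probs`.
    (Hypothetical helper, not from the paper.)"""
    rng = random.Random(seed)
    return [rng.choices(alphabet, weights=probs, k=m) for _ in range(n)]


def is_row_pattern_bicluster(matrix, rows, cols):
    """Return True if all selected rows spell the same string when
    restricted to the selected columns."""
    patterns = {tuple(matrix[r][c] for c in cols) for r in rows}
    return len(patterns) <= 1


def largest_square_bicluster(matrix):
    """Brute-force the largest k such that some k rows and k columns
    form a same-row-pattern bicluster. If a k x k bicluster exists,
    dropping one row and one column leaves a (k-1) x (k-1) one, so the
    search can stop at the first k with no hit."""
    n, m = len(matrix), len(matrix[0])
    best = 0
    for k in range(1, min(n, m) + 1):
        found = any(
            is_row_pattern_bicluster(matrix, rows, cols)
            for rows in itertools.combinations(range(n), k)
            for cols in itertools.combinations(range(m), k)
        )
        if not found:
            break
        best = k
    return best
```

For example, `largest_square_bicluster(random_matrix(6, 6, "ab", [0.7, 0.3]))` returns the side length of the largest such square bicluster in one small random sample; a single sample of this size says nothing about the asymptotic result stated in the abstract.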

reference:  Gahyun Park and Wojciech Szpankowski (2005), Analysis of biclusters with applications to gene expression data, in 2005 International Conference on Analysis of Algorithms, Conrado Martínez (ed.), Discrete Mathematics and Theoretical Computer Science Proceedings AD, pp. 267-274
ps.gzsource:  dmAD0124.ps.gz (92 K) 
pssource:  dmAD0124.ps (216 K) 
pdfsource:  dmAD0124.pdf (153 K) 
Automatically produced on Tue Sep 27 10:09:34 CEST 2005 by gustedt