Statistical analysis of large genomic data sets