SAGE (Serial Analysis of Gene Expression) can be used to estimate the number of unique transcripts in a transcriptome. A simple estimator that corrects for sequencing and sampling errors was applied to a SAGE library (137 832 tags) obtained from mouse embryonic stem cells, and also to Monte Carlo simulated libraries generated using assumed distributions of 'true' expression levels consistent with the data.
Laboratory of Cardiovascular Science, Gerontology Research Center, National Institute on Aging, NIH, 5600 Nathan Shock Drive, Baltimore, MD 21224, USA.
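The Monte Carlo libraries mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator or their simulation code. It only shows, under stated assumptions, how sampling and sequencing errors pull the observed number of unique tags away from the true number of transcripts. Everything named here is hypothetical: the function names `simulate_sage_library` and `summarize`, the Zipf-like distribution of 'true' expression levels, the per-tag error rate, and the simplification that every sequencing error yields a novel tag that matches no real transcript.

```python
import random
from collections import Counter


def simulate_sage_library(n_transcripts, n_tags, error_rate, seed=0):
    """Draw a simulated tag library from an assumed expression distribution.

    Assumptions (illustrative only, not taken from the paper):
      * 'true' expression levels follow Zipf-like weights 1/rank;
      * each sampled tag is misread with probability `error_rate`;
      * a misread tag is always a new tag unseen elsewhere in the library.
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) for rank in range(n_transcripts)]
    sampled = rng.choices(range(n_transcripts), weights=weights, k=n_tags)

    counts = Counter()
    next_error_id = n_transcripts  # ids at or above this mark erroneous tags
    for transcript_id in sampled:
        if rng.random() < error_rate:
            counts[("err", next_error_id)] += 1
            next_error_id += 1
        else:
            counts[("true", transcript_id)] += 1
    return counts


def summarize(counts):
    """Return (observed unique tags, true transcripts detected, erroneous tags)."""
    observed_unique = len(counts)
    true_detected = sum(1 for kind, _ in counts if kind == "true")
    return observed_unique, true_detected, observed_unique - true_detected


if __name__ == "__main__":
    # Library size matches the 137 832 tags in the abstract; the other
    # parameter values are arbitrary choices for illustration.
    library = simulate_sage_library(
        n_transcripts=50_000, n_tags=137_832, error_rate=0.05, seed=1
    )
    observed, detected, erroneous = summarize(library)
    print("observed unique tags:", observed)
    print("true transcripts detected:", detected)
    print("erroneous unique tags:", erroneous)
```

In this toy model the raw count of unique tags is inflated by erroneous tags and deflated by rare transcripts that were never sampled, which is why a corrected estimator is needed rather than a direct count.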