Source:http://linkedlifedata.com/resource/pubmed/id/20394581
Predicate | Object |
---|---|
rdf:type | |
lifeskim:mentions | |
pubmed:issue | 7 |
pubmed:dateCreated | 2010-6-30 |
pubmed:abstractText | A transcription factor (TF) is a protein that binds DNA at specific sites to help regulate transcription from DNA to RNA. The mechanism of transcriptional regulation can be much better understood if the category of a transcription factor is known. We introduce a system that automatically categorizes transcription factors using their primary structures. A feature analysis strategy called "mRMR" (Minimum Redundancy, Maximum Relevance) is used to analyze the contribution of the TF properties to the TF classification. mRMR is coupled with forward feature selection to choose an optimized feature subset for the classification. The TF properties comprise the amino acid composition and the physicochemical characteristics of the proteins, which together generate over a hundred features/parameters. We feed all the features/parameters into a classifier, the nearest neighbor algorithm (NNA). The classification accuracy is 93.81%, evaluated by a jackknife test. Feature analysis using the mRMR algorithm shows that secondary structure, amino acid composition and hydrophobicity are the most relevant features for classification. A free online classifier is available at http://app3.biosino.org/132dvc/tf/. |
pubmed:language | eng |
pubmed:journal | |
pubmed:citationSubset | IM |
pubmed:chemical | |
pubmed:status | MEDLINE |
pubmed:month | Jul |
pubmed:issn | 1875-5305 |
pubmed:author | |
pubmed:issnType | Electronic |
pubmed:volume | 17 |
pubmed:owner | NLM |
pubmed:authorsComplete | Y |
pubmed:pagination | 899-908 |
pubmed:dateRevised | 2010-11-18 |
pubmed:meshHeading | pubmed-meshheading:20394581-Algorithms, pubmed-meshheading:20394581-Amino Acid Sequence, pubmed-meshheading:20394581-Amino Acids, pubmed-meshheading:20394581-Cysteine, pubmed-meshheading:20394581-Hydrophobic and Hydrophilic Interactions, pubmed-meshheading:20394581-Molecular Sequence Data, pubmed-meshheading:20394581-Pattern Recognition, Automated, pubmed-meshheading:20394581-Software, pubmed-meshheading:20394581-Transcription Factors, pubmed-meshheading:20394581-Tryptophan |
pubmed:year | 2010 |
pubmed:articleTitle | Classification of transcription factors using protein primary structure. |
pubmed:affiliation | CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China. |
pubmed:publicationType | Journal Article |
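
The abstract describes a nearest neighbor classifier evaluated with a jackknife (leave-one-out) test on features that include amino acid composition. The sketch below illustrates that general idea only, under stated assumptions: the abstract does not give the distance metric, so Euclidean distance is assumed; only the 20 composition features are used, with no mRMR selection or physicochemical properties; and the sequences, labels, and function names (`composition`, `nearest_neighbor_label`, `jackknife_accuracy`) are hypothetical toy values, not data or code from the paper.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Return the 20-dimensional amino acid composition (fractions) of seq."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]

def nearest_neighbor_label(query, samples):
    """Return the label of the sample closest to query.

    Euclidean distance is an assumption; the abstract does not name the metric.
    """
    best_label, best_dist = None, math.inf
    for features, label in samples:
        d = math.dist(query, features)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

def jackknife_accuracy(samples):
    """Leave-one-out test: classify each sample using all the others."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbor_label(features, rest) == label:
            correct += 1
    return correct / len(samples)

# Hypothetical toy sequences and class labels, for illustration only.
toy = [
    ("CCHHCCKKCH", "class_A"),
    ("CHCHKCCHCC", "class_A"),
    ("LLAALLEELL", "class_B"),
    ("LALELLLAEL", "class_B"),
]
samples = [(composition(seq), label) for seq, label in toy]
accuracy = jackknife_accuracy(samples)
print(f"jackknife accuracy on toy data: {accuracy:.2f}")
```

On real data, the feature vector would be extended with the other properties the abstract mentions, and a feature selection step would choose the subset passed to the classifier.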