Anti-RNP sera were used to isolate a cDNA clone for the largest polypeptide of the U1 snRNP, a protein of mol. wt 70 kd designated 70K, from a human liver cDNA library constructed in the expression vector pEX1. The cro-beta-galactosidase-70K fusion protein reacted with various anti-RNP patient sera and a rabbit anti-70K antiserum, as well as with a monoclonal antibody specific for this protein. The sequences of four 70K peptides were determined; they match parts of the amino acid sequence deduced from the 1.3 kb insert of p70.1, indicating that it is a genuine 70K cDNA. Screening of a new cDNA library, constructed from polysomal mRNA of HeLa cells, with the p70.1 clone yielded an overlapping clone, FL70K, which was 2.7 kb long and covered the complete coding and 3'-untranslated sequence of the 70K protein in addition to 680 nucleotides upstream of the putative initiation codon. The predicted mol. wt of the encoded protein is approximately 70 kd. Amino acid analysis of the purified HeLa 70K protein yielded values close or identical to those deduced from the nucleotide sequence of the full-length cDNA. The 70K protein is rich in arginine (20%) and acidic amino acids (18%). Extremely hydrophilic regions containing mixed-charge amino acid clusters have been identified in the carboxyl-terminal half of the protein; these regions may function in RNA binding. A sequence comparison with two recently cloned RNA-binding proteins revealed homology with one region of the U1 RNP 70K protein. This domain may also be responsible for RNA binding.