C. neoformans JEC21 microarray GO annotation


Assigning GO and goslim terms to the JEC21 microarray probes.
UPDATE: The GO association scripts were run on May 27, 2008, using the seqdblite_go_database downloaded in March 2008.

The files from April 2006 are available upon request.

Step 1: Assigning a GO term. The proteins represented on the JEC21 microarray were used as queries in a BLASTP search against the gene ontology (GO) database. The annotations of the top matches (those with an E-value of <10^-20) were transferred to the JEC21 proteins.
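The filtering in Step 1 can be sketched in Python as follows. This is only an illustration, not the original association scripts: it assumes the BLASTP results are in the standard 12-column tabular format (E-value in the 11th column), and the function name transfer_go_terms and the hit_to_go lookup table are hypothetical. It keeps every hit below the cutoff, whereas the original scripts may have kept only the best few.

```python
import csv
from collections import defaultdict

E_VALUE_CUTOFF = 1e-20  # cutoff stated in Step 1


def transfer_go_terms(blast_lines, hit_to_go, cutoff=E_VALUE_CUTOFF):
    """Transfer GO terms from BLASTP hits to the query proteins.

    blast_lines: iterable of tab-delimited lines (assumed 12-column
                 tabular BLAST output; column 11 is the E-value).
    hit_to_go:   hypothetical dict mapping a hit ID to a set of GO terms.
    """
    query_to_go = defaultdict(set)
    for row in csv.reader(blast_lines, delimiter="\t"):
        if len(row) < 11:
            continue  # skip blank or malformed lines
        query, hit, evalue = row[0], row[1], float(row[10])
        terms = hit_to_go.get(hit)
        # Keep only hits under the cutoff that actually carry GO terms.
        if evalue < cutoff and terms:
            query_to_go[query].update(terms)
    return {query: sorted(terms) for query, terms in query_to_go.items()}
```

Proteins with no hit below the cutoff, or whose hits carry no annotation, are simply absent from the result.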

Step 2: Assigning the ontology tree. The GO associations were used as the starting point for assigning the full tree from each of the three ontologies (biological process, molecular function, cellular component) for each GO term assigned in step 1.
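Expanding a GO term into its full tree amounts to collecting every ancestor of that term. A minimal Python sketch, assuming the ontology has already been loaded into a dict that maps each term to its direct parents (the parents table and both function names are hypothetical, not part of the original scripts):

```python
def ancestors(term, parents):
    """Return every ancestor of `term`, following parent links upward.

    parents: dict mapping a term ID to an iterable of its direct parents.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def full_tree(assigned_terms, parents):
    """Return the assigned terms together with all of their ancestors."""
    tree = set(assigned_terms)
    for term in assigned_terms:
        tree |= ancestors(term, parents)
    return tree
```

Because GO is a directed acyclic graph rather than a strict tree, a term can reach the same ancestor by more than one path; the seen set makes sure each ancestor is visited only once.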

Step 3: Assigning a goslim term. Each ontology tree report was parsed for terms that were found in the generic goslim ontology.
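The goslim lookup in Step 3 reduces to intersecting each probe's ontology tree with the set of goslim term IDs. The sketch below assumes the generic goslim file is available in OBO format and that each tree is held as a set of term IDs; the function names are illustrative only.

```python
def read_slim_ids(obo_lines):
    """Collect the term IDs from the [Term] stanzas of an OBO file."""
    slim_ids = set()
    in_term = False
    for line in obo_lines:
        line = line.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas define GO terms.
            in_term = (line == "[Term]")
        elif in_term and line.startswith("id: "):
            slim_ids.add(line[len("id: "):])
    return slim_ids


def slim_terms_for(tree_terms, slim_ids):
    """Return the terms of one probe's tree that are also goslim terms."""
    return sorted(set(tree_terms) & slim_ids)
```

A probe whose tree shares no term with the goslim set gets an empty list, which corresponds to the probes that were not assigned a goslim term.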

The GO assignments were done on May 28, 2008.

MasterID file: This Excel file contains the array index number from CN_oligos_ctrls_stats.xls and a modified ID name that matches the IDs given in the GO and goslim files. The IDs were modified to match the protein IDs in the fasta files. The data in the GO files can be matched to the CN_oligos_ctrls_stats.xls or ssgiles-v1_3_gal.csv files via the Index number. Filename: CryptoMA_masterIDs.xls (2.4 MB)

GO association file. This is an Excel file in which the first column holds the probe ID and the second column holds a matched GO term (there may be more than one match per ID). It can be imported into an analysis package capable of GO analysis, such as Spotfire. It may have to be saved in tab-delimited format before it can be imported into another application. Filename: CryptoMA_GOassociation.xls (2.6 MB)
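Once the association file has been saved as tab-delimited text, it can be read into a probe-to-terms table with a few lines of Python. This sketch assumes one probe ID and one GO term per row, with a probe repeated on several rows when it has several matches; the function name and the skip_header option are hypothetical.

```python
import csv
from collections import defaultdict


def read_associations(lines, skip_header=False):
    """Group GO terms by probe ID from two-column tab-delimited lines."""
    probe_to_terms = defaultdict(list)
    rows = csv.reader(lines, delimiter="\t")
    if skip_header:
        next(rows, None)  # discard a header row if the export has one
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        probe, term = row[0].strip(), row[1].strip()
        if probe and term and term not in probe_to_terms[probe]:
            probe_to_terms[probe].append(term)
    return dict(probe_to_terms)
```

With a real file, the lines argument can be an open file handle, for example one opened with open(path, newline="").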

GO term files. These are text files in GO flat file format. They contain the GO terms annotated to the JEC21 proteins for each of the 3 ontologies.
CryptoMA_component.txt
CryptoMA_process.txt
CryptoMA_function.txt

GOslim files: There are 3 Excel files, representing the 3 ontologies. Not all the probes on the microarray were assigned a goslim term, either because the probe did not have a match in the GO database or because the GO term assigned to the probe did not have a match in the goslim file. The goslim terms were based on the generic_goslim file. Using the yeast_goslim file would probably give different results, but I felt the generic_goslim file represented a larger dataset. The three files are:
CryptoMA_component_goslim.xls (2.1 MB)
CryptoMA_process_goslim.xls (2.1 MB)
CryptoMA_function_goslim.xls (2.1 MB)

Here is a summary of the number of times each GOslim term was observed.
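A tally of this kind could be reproduced from the goslim files with a short Python sketch like the one below. It assumes the goslim assignments have been loaded into a dict mapping each probe ID to its goslim terms (a hypothetical structure, not the layout of the Excel files), and it counts each term at most once per probe.

```python
from collections import Counter


def count_slim_terms(probe_to_slim):
    """Count how many probes were assigned each goslim term."""
    counts = Counter()
    for terms in probe_to_slim.values():
        counts.update(set(terms))  # a term counts once per probe
    return counts
```

Calling most_common() on the returned Counter lists the terms from most to least frequently observed.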


For questions or problems with the GO files, contact:  Maureen Donlin

Last updated: May 29, 2008