The InterProScan_Results_Function app parses InterProScan XML result files and generates readable tables, a gene association file (GAF) that can be used in subsequent GO enrichment analyses, and functional data count files. For this app to work, InterProScan must have been run with the GO annotation and pathway annotation parameters checked (the default setting).
- To use InterProScan_Results_Function, import your XML output file from InterProScan.
All tables are tab-separated, with multiple values in a column separated by semicolons. The tables are txt files that can be opened in a text editor or loaded into Excel.
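If you prefer to process the tables in a script instead of Excel, the layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the app: the function name `read_table` is hypothetical, and it assumes the files have no header row and that no value contains a tab or semicolon of its own.

```python
import csv
import io

def read_table(handle):
    """Yield each row as a list of columns; each column is a list of values.

    Assumes tab-separated columns and semicolon-separated values, as
    described above. Empty columns become empty lists.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        yield [cell.split(";") if cell else [] for cell in row]

# In practice you would pass an open file; a string stands in for it here.
sample = (
    "ENSGALP00000004419\t2\tIPR016135;IPR017986\t"
    "DOMAIN:UBQ-conjugating_enzyme/RWD;DOMAIN:WD40_repeat_dom\n"
)
rows = list(read_table(io.StringIO(sample)))
print(rows[0][2])
```

To read a real table, replace the `io.StringIO(sample)` argument with a file opened via `open(path, newline="")`.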
The tables produced are:
This table includes input accessions, number of InterPro IDs for each accession, InterPro IDs assigned to each sequence and the InterPro ID name.
ENSGALP00000006626 1 IPR006121 DOMAIN:HeavyMe-assoc_HMA
ENSGALP00000004419 2 IPR016135;IPR017986 DOMAIN:UBQ-conjugating_enzyme/RWD;DOMAIN:WD40_repeat_dom
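Because the InterPro IDs and their names are stored as two parallel semicolon-separated lists, it can be handy to pair them up. The sketch below assumes the four-column layout suggested by the example rows (accession, count, IDs, names) and that names never contain a semicolon; `parse_interpro_row` is a hypothetical helper, not something the app provides.

```python
def parse_interpro_row(line):
    """Return (accession, [(interpro_id, name), ...]) for one table row.

    Assumed columns: accession, count, IDs, names (tab-separated).
    """
    accession, count, ids, names = line.rstrip("\n").split("\t")
    id_list = ids.split(";")
    name_list = names.split(";")
    # The count column should agree with both lists; fail loudly if not.
    if not (len(id_list) == len(name_list) == int(count)):
        raise ValueError("count does not match IDs/names for " + accession)
    return accession, list(zip(id_list, name_list))

row = (
    "ENSGALP00000004419\t2\tIPR016135;IPR017986\t"
    "DOMAIN:UBQ-conjugating_enzyme/RWD;DOMAIN:WD40_repeat_dom"
)
accession, pairs = parse_interpro_row(row)
for interpro_id, name in pairs:
    print(accession, interpro_id, name)
```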
This table includes input accessions, the number of GO IDs assigned to each accession and GO ID names. GO IDs are split into BP (Biological Process), MF (Molecular Function) and CC (Cellular Component).
ENSGALP00000043106 1 GO:0008270 zinc ion binding
ENSGALP00000006626 2 GO:0030001 metal ion transport GO:0046872 metal ion binding
ENSGALP00000034620 3 GO:0042773;GO:0055114 ATP synthesis coupled electron transport;oxidation-reduction process GO:0016651 oxidoreductase activity, acting on NAD(P)H
This table includes input accessions, the number of pathway IDs for each accession, the pathway IDs and the pathway names. Multiple values are separated by a semicolon.
ENSGALP00000002985 1 Reactome: REACT_14797 Signaling by GPCR
ENSGALP00000020373 2 KEGG: 00920+220.127.116.11;MetaCyc: PWY-5350 Sulfur metabolism;Thiosulfate disproportionation III (rhodanese)
This table follows the gene association file (GAF) format and can be used in GO enrichment analyses. However, the exact format that enrichment tools expect varies, so please check their requirements prior to use. For more information about the GAF format please see:
This table counts the numbers of sequences assigned to each GO ID so that the user can quickly identify all genes assigned to a particular function.
GO:0000381 regulation of alternative mRNA splicing, via spliceosome Biological_Process 1 ENSGALP00000001460
GO:0006421 asparaginyl-tRNA aminoacylation Biological_Process 2 ENSGALP00000004871;ENSGALP00000027851
This table counts the numbers of sequences assigned to each InterPro ID so that the user can quickly identify all genes with a particular motif.
IPR019495 FAMILY:EXOSC1 1 ENSGALP00000032597
IPR026622 FAMILY:Mxra7 2 ENSGALP00000002786;ENSGALP00000042423
This table counts the numbers of sequences assigned to each Pathway ID so that the user can quickly identify all genes assigned to a pathway.
KEGG: 00232+18.104.22.168 Caffeine metabolism 1 ENSGALP00000014144
MetaCyc: PWY-6369 Inositol pyrophosphates biosynthesis 2 ENSGALP00000013649;ENSGALP00000007450
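Judging from the example rows, the GO, InterPro and pathway count tables all start with the ID and end with the semicolon-separated accessions, so one small lookup can serve all three. The following is a sketch under that assumption (and assuming no header row); `index_count_table` is a hypothetical name.

```python
import io

def index_count_table(handle):
    """Map each ID (first column) to its list of accessions (last column)."""
    index = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        index[fields[0]] = fields[-1].split(";")
    return index

# Rows copied from the GO count example; a real run would pass an open file.
go_rows = (
    "GO:0000381\tregulation of alternative mRNA splicing, via spliceosome"
    "\tBiological_Process\t1\tENSGALP00000001460\n"
    "GO:0006421\tasparaginyl-tRNA aminoacylation"
    "\tBiological_Process\t2\tENSGALP00000004871;ENSGALP00000027851\n"
)
go_index = index_count_table(io.StringIO(go_rows))
print(go_index["GO:0006421"])
```

The same function should work on the InterPro and pathway count tables, since it only relies on the first and last columns.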
This file lists any sequences that InterProScan was not able to analyze. Examples of sequences that will cause an error are sequences with a large run of Xs and sequences >10,000 aa.
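To spot such sequences before running InterProScan, you could screen the input FASTA file yourself. The sketch below is only an illustration: the 10,000 aa limit comes from the text above, but the threshold for a "large run of Xs" is not documented here, so `MAX_X_RUN = 50` is an arbitrary assumption, and both function names are hypothetical.

```python
import re

MAX_LENGTH = 10_000  # from the text: sequences >10,000 aa cause an error
MAX_X_RUN = 50       # ASSUMPTION: the real cutoff for a "large run" is unknown

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def flag_problem_sequences(lines, max_length=MAX_LENGTH, max_x_run=MAX_X_RUN):
    """Return {header: [reasons]} for sequences likely to cause an error."""
    x_run = re.compile(r"[Xx]{%d,}" % max_x_run)
    problems = {}
    for header, seq in read_fasta(lines):
        reasons = []
        if len(seq) > max_length:
            reasons.append("length")
        if x_run.search(seq):
            reasons.append("x_run")
        if reasons:
            problems[header] = reasons
    return problems

example = [">ok", "MKV", ">long", "A" * 10001, ">xs", "M" + "X" * 60]
print(flag_problem_sequences(example))
```

For a real file, pass the open file object in place of `example`.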