Combining learning and constraints for genome-wide protein annotation



This paper presents a framework for accurate, high-quality machine-generated protein annotation at genome scale. It focuses on two tightly connected aspects of cell life: protein function and protein-protein interaction (PPI). Protein function is the characterization of a protein's behavior, as formalized by the Gene Ontology (GO) Consortium. Genome-wide prediction of function and interaction involves inferring the annotations of all proteins in a genome, and prior biological knowledge can be used to reconcile the individual predictions and improve their accuracy. OCELOT instantiates one task for every candidate GO term, i.e., deciding whether a given protein should be annotated with that term, plus a separate task for deciding whether a given protein pair interacts. The overall genome-wide annotations are obtained by imposing consistency across the predictions of all tasks.
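The idea of one binary task per GO term followed by a consistency step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from OCELOT: the toy ontology and its term names are placeholders, the scores are invented, the helper names are hypothetical, and the consistency rule used here (a positive term forces all of its ancestors to be positive) is only one simple example of prior knowledge that could be imposed. The interaction task and any constraints linking it to function are left out.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology (placeholder names, not real GO identifiers):
# each term maps to its direct parents.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "dna_binding": ["binding"],
    "catalysis": ["root"],
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every term reachable from `term` by following parent links."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def predict_independently(
    scores: Dict[str, float], threshold: float = 0.5
) -> Dict[str, bool]:
    """One binary task per term: annotate when the score reaches the threshold."""
    return {term: score >= threshold for term, score in scores.items()}


def impose_consistency(
    labels: Dict[str, bool], parents: Dict[str, List[str]]
) -> Dict[str, bool]:
    """Assumed rule: a positive term makes all of its ancestors positive."""
    consistent = dict(labels)
    for term, positive in labels.items():
        if positive:
            for ancestor in ancestors(term, parents):
                consistent[ancestor] = True
    return consistent


# Invented per-term scores for a single protein.
scores = {"root": 0.9, "binding": 0.4, "dna_binding": 0.8, "catalysis": 0.1}
raw = predict_independently(scores)
fixed = impose_consistency(raw, PARENTS)
```

In this toy run the independent predictions annotate the protein with `dna_binding` but not with its parent `binding`, which contradicts the hierarchy; the consistency step repairs that by switching `binding` on. Propagating positives upward is the simplest possible repair and ignores the scores, so it should be read as a stand-in for the joint reconciliation described above, not as a description of it.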