26 May 2022
Enzyme discovery by in silico data mining has become increasingly attractive in recent years, owing to more affordable computing power, the growing availability of sequencing data and better data analysis software. A pipeline for enzyme discovery has been developed at the Department of Biotechnology and Nanomedicine, SINTEF Industry, and applied successfully in several projects to discover a wide range of novel enzymes for applications including biomass (lignocellulose, xylan) degradation, plastic (PET) degradation and molecular diagnostics (DNA-acting enzymes). The pipeline is built on a variety of open-source tools, such as HMMER, BLAST and eggNOG-mapper, which are integrated into automated workflows using in-house scripts or workflow languages such as Nextflow, and deployed on a parallel high-performance computing (HPC) platform. Examples of the novel enzymes identified will be presented, together with a description of the pipeline and the unique in-house genetic data sources.