Algorithms in Bioinformatics

LOCAS low-coverage short-read assembler

LOCAS information and download site.

LOCAS is a program for assembling short reads from second-generation sequencing technologies. It explicitly handles low-coverage data by allowing mismatches in the overlap alignment of reads.
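To illustrate what allowing mismatches in an overlap means, here is a minimal sketch in Python. It is not the LOCAS implementation: the function name `best_overlap`, its parameters `min_len` and `max_mismatch_rate`, and the simple suffix-prefix scan without gaps are all assumptions made for illustration.

```python
def best_overlap(a, b, min_len=5, max_mismatch_rate=0.1):
    """Find the longest suffix of read `a` that matches a prefix of read `b`
    with at most `max_mismatch_rate` mismatches per overlapping base.

    Returns (overlap_length, mismatches), or None if no overlap qualifies.
    Hypothetical helper for illustration; gaps are not considered.
    """
    # Try the longest possible overlap first and shrink until one qualifies.
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        suffix = a[-length:]
        prefix = b[:length]
        mismatches = sum(1 for x, y in zip(suffix, prefix) if x != y)
        if mismatches <= max_mismatch_rate * length:
            return length, mismatches
    return None


if __name__ == "__main__":
    read_a = "GGGGGACGTTCGA"
    read_b = "ACGTACGAGGTTT"
    # Tolerating mismatches recovers an overlap that exact matching misses.
    print(best_overlap(read_a, read_b, max_mismatch_rate=0.2))
    print(best_overlap(read_a, read_b, max_mismatch_rate=0.0))
```

With low coverage, fewer reads span any given position, so a single sequencing error can hide the only overlap that connects two reads. Tolerating a small mismatch rate keeps such overlaps available to the assembler.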

An additional module, SUPERLOCAS, provides extra features for resequencing projects. In a resequencing project, reads are mapped onto a closely related reference genome, and a consensus of the mapped reads is calculated as an approximation of the new genome sequence. (Highly polymorphic regions and insertion sites are not covered by this consensus.) SUPERLOCAS can incorporate unmapped reads into the assembly of mapped regions and so extend the consensus. It also takes advantage of the known mapping positions of reads. Both tools are written in C++.
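The consensus step of a resequencing project can be sketched as a per-position majority vote over the mapped reads. This is a simplified illustration, not the method used by SUPERLOCAS or by any particular mapping pipeline: the function `consensus`, the `(start, sequence)` read representation, and the `N` placeholder for uncovered positions are assumptions, and base qualities and indels are ignored.

```python
from collections import Counter


def consensus(ref_length, mapped_reads, gap="N"):
    """Build a majority-vote consensus from reads mapped to a reference.

    mapped_reads: list of (start, sequence) pairs, where `start` is the
    0-based reference position of the first base of the read.
    Positions without any read coverage are reported as `gap`.
    Hypothetical helper for illustration only.
    """
    columns = [Counter() for _ in range(ref_length)]
    for start, seq in mapped_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_length:
                columns[pos][base] += 1
    # Pick the most frequent base in each covered column.
    return "".join(
        col.most_common(1)[0][0] if col else gap for col in columns
    )


if __name__ == "__main__":
    reads = [(0, "ACGT"), (2, "GTTA"), (2, "GTAA"), (3, "TTA")]
    print(consensus(8, reads))
```

The uncovered positions in such a consensus correspond to the regions that mapping alone cannot resolve; these are the regions where incorporating unmapped reads, as described above, can extend the sequence.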

We provide binaries for LOCAS and SUPERLOCAS, compiled on a 64-bit Linux system (Red Hat). The source code is also available.




Simulated read set for Arabidopsis thaliana Col-0:  


Simulated read set for an artificial strain of Arabidopsis thaliana:


Real world data set for Arabidopsis thaliana Ler:  



