PASTASpark is a tool that uses the Big Data engine Apache Spark to boost the performance of the alignment phase of PASTA (Practical Alignments using SATé and TrAnsitivity). PASTASpark guarantees scalability and fault tolerance, and makes it possible to obtain multiple sequence alignments (MSAs) from very large datasets in a reasonable time.
Citation: José M. Abuín, Tomás F. Pena and Juan C. Pichel. PASTASpark: multiple sequence alignment meets Big Data.
Bioinformatics, Vol. 33, Issue 18, pp. 2948-2950, 2017.
SparkBWA is a tool that exploits the capabilities of a Big Data technology, Apache Spark, to boost the performance of one of the most widely adopted DNA sequence aligners, the Burrows-Wheeler Aligner (BWA).
Citation: José M. Abuín, Juan C. Pichel, Tomás F. Pena and Jorge Amigo. SparkBWA: Speeding Up the Alignment of High-Throughput DNA Sequencing Data.
PLoS ONE, Vol. 11, Issue 5, pp. 1-21, 2016.
Citation: José M. Abuín, Juan C. Pichel, Tomás F. Pena and Jorge Amigo. BigBWA: Approaching the Burrows-Wheeler Aligner to Big Data Technologies.
Bioinformatics, Vol. 31, Issue 24, pp. 4003-4005, 2015.
Perldoop automatically translates Hadoop-ready Perl scripts into their Java counterparts, which can be executed directly on a Hadoop cluster with significantly better performance.
Citation: José M. Abuín, Juan C. Pichel, Tomás F. Pena, Pablo Gamallo and Marcos García. Perldoop: Efficient Execution of Perl Scripts on Hadoop Clusters. IEEE Int. Conference on Big Data (IEEE Big Data), pp. 766-771, 2014.