[Bio-Linux] Blasting Multiple Fasta Files

Prash prash at bioclues.org
Tue May 5 10:57:28 EDT 2015


Dear Zain

It would still take time. Using a job queue or MPICH should make the task
easier to manage, but above all it depends on how well the system is
configured.
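
As a rough sketch of the split, run, and concatenate workflow discussed in
this thread, something like the following Python could work. It is only an
illustration under assumptions: the function names, the chunk file naming,
and the database name "viral_refseq" are hypothetical, and it expects the
BLAST+ blastx binary on the PATH. Tabular output (-outfmt 6) is chosen here
because it has no header lines, so the per-chunk results concatenate cleanly.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_fasta(fasta_path, out_dir, per_file=1000):
    """Write consecutive groups of `per_file` records to chunk_NNNN.fasta."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks, handle, count = [], None, 0
    with open(fasta_path) as src:
        for line in src:
            if line.startswith(">"):
                # Start a new chunk file every `per_file` records.
                if count % per_file == 0:
                    if handle:
                        handle.close()
                    path = out_dir / f"chunk_{len(chunks):04d}.fasta"
                    chunks.append(path)
                    handle = open(path, "w")
                count += 1
            if handle:  # text before the first header is skipped
                handle.write(line)
    if handle:
        handle.close()
    return chunks


def blastx_command(chunk, db, threads=1):
    """Build (not run) a blastx command line for one chunk file."""
    out = Path(chunk).with_suffix(".tsv")
    cmd = ["blastx", "-query", str(chunk), "-db", db, "-outfmt", "6",
           "-num_threads", str(threads), "-out", str(out)]
    return cmd, out


def run_all(chunks, db, merged, workers=4):
    """Run up to `workers` blastx jobs at a time, then merge the outputs."""
    jobs = [blastx_command(c, db) for c in chunks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda job: subprocess.run(job[0], check=True), jobs))
    with open(merged, "w") as dst:
        for _, out in jobs:  # keep the original chunk order
            dst.write(Path(out).read_text())


if __name__ == "__main__":
    # "contigs.fasta" and "viral_refseq" are placeholder names.
    parts = split_fasta("contigs.fasta", "chunks", per_file=1000)
    run_all(parts, "viral_refseq", "all_hits.tsv", workers=4)
```

Whether this beats a single blastx run with a higher -num_threads value
depends on the machine, so it is worth timing both on a small subset first.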

Regards
Prash

On Tuesday, May 5, 2015, Zain A Alvi <zain.alvi at student.shu.edu> wrote:

>  Hi Marty,
>
> I apologize for the confusion. I am splitting a fasta file that contains
> approximately 100,000 sequences into 100 fasta files that contain
> 1,000 sequences each.  I am hoping this will expedite the BLASTx process.
>
>
>  Kind regards,
>
>
>  Zain
>  ------------------------------
> *From:* Martin Gollery <mgollery at unr.edu>
> *Sent:* Tuesday, May 5, 2015 10:23 AM
> *To:* Bio-Linux help and discussion
> *Subject:* Re: [Bio-Linux] Blasting Multiple Fasta Files
>
>  Running a million BLASTX jobs on one sequence each is not going to save
> you time. It is better to run one BLASTX job on a million sequences.
>
>  -Marty
>
>
> On Tue, May 5, 2015 at 7:00 AM, Zain A Alvi <zain.alvi at student.shu.edu> wrote:
>
>>  Dear Sir or Madam,
>>
>>
>>  I hope everything is well. I have downloaded all the viral protein
>> sequences from the NCBI refseq database using their script from their
>> E-book.  I have de-novo assembled some viral genomes and I know BLASTX
>> takes a long time if the fasta is large.  I have been able to split the
>> large fasta file so that each new fasta file holds a user-specified
>> number of contigs.
>>
>>
>>  I was wondering whether there is a method to run BLASTX automatically
>> on each of the fasta files, one at a time, so that it completes in a
>> "shorter" amount of time than BLASTing the whole large de-novo
>> assembled fasta file.  I was then hoping to concatenate all the results
>> into one file.
>>
>>
>>  Sincerely,
>>
>>
>>  Zain
>>
>> _______________________________________________
>> Bio-Linux mailing list
>> Bio-Linux at nebclists.nerc.ac.uk
>> http://nebclists.nerc.ac.uk/mailman/listinfo/bio-linux
>>
>>
>
>
>  --
> --
> Martin Gollery
> Senior Bioinformatics Scientist
> Tahoe Informatics
> www.bioinformaticist.biz
> www.hiddenmarkovmodels.com
>
>
