Run parallel BLAST for a set of sequences
Usage
parallel_blast(
asvs,
db_path,
out_file,
out_RDS,
num_threads,
blast_type,
total_cores,
perc_id,
perc_qcov_hsp,
num_alignments,
verbose = FALSE,
env_name = "blast-env"
)
Arguments
- asvs
Character vector with sequences
- db_path
Complete path to a formatted BLAST database.
- out_file
Complete path to output .csv file on an existing folder.
- out_RDS
Complete path to output RDS file on an existing folder.
- num_threads
Number of threads to run BLAST on. Passed on to BLAST+ argument -num_threads.
- blast_type
BLAST+ executable to be used on search.
- total_cores
Total available cores to run BLAST in parallel. Check your maximum with future::availableCores().
- perc_id
Lowest identity percentage cutoff. Passed on to BLAST+ -perc_identity.
- perc_qcov_hsp
Lowest query coverage per HSP percentage cutoff. Passed on to BLAST+ -qcov_hsp_perc.
- num_alignments
Number of alignments to retrieve from BLAST. Max = 6.
- verbose
Should the internal condathis::run() command be shown?
- env_name
The name of the conda environment to run BLAST in (e.g. "blast-env").
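The relationship between total_cores and num_threads can be sketched as follows. This is only an illustration: it assumes that the number of searches running at the same time is bounded by total_cores divided by num_threads, which this page does not state explicitly. The variable max_jobs is a hypothetical name, not a BLASTr argument.

```r
# Cores available to this R session, using the function the
# total_cores argument description points to.
total_cores <- as.integer(future::availableCores())

# Threads handed to each BLAST+ process through -num_threads.
num_threads <- 2L

# ASSUMPTION: concurrent searches are bounded by total_cores / num_threads.
# Illustration only, not BLASTr's documented scheduling rule.
max_jobs <- max(1L, total_cores %/% num_threads)
max_jobs
```

Keeping num_threads small and total_cores at or below the value reported by future::availableCores() avoids requesting more cores than the session can use.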
Examples
if (FALSE) { # \dontrun{
blast_res <- BLASTr::parallel_blast(
asvs = ASVs_test, # vector of sequences to be searched
db_path = "/data/databases/nt/nt", # path to a formatted blast database
out_file = NULL, # path to a .csv file to be created with results (on an existing folder)
out_RDS = NULL, # path to a .RDS file to be created with results (on an existing folder)
perc_id = 80, # minimum identity percentage cutoff
perc_qcov_hsp = 80, # minimum percentage coverage of query sequence by subject sequence cutoff
num_threads = 1, # number of threads/cores to run each blast on
total_cores = 8, # total number of threads/cores to allocate across all blast searches
# maximum number of alignments/matches to retrieve results for each query sequence
num_alignments = 3,
blast_type = "blastn" # blast search engine to use
)
} # }
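The example above uses ASVs_test without defining it. A minimal sketch of such an input, assuming only what the asvs argument requires (a character vector with sequences), could look like this. The sequences are arbitrary placeholders invented for illustration, not real ASVs.

```r
# HYPOTHETICAL input: made-up nucleotide strings standing in for real ASVs.
ASVs_test <- c(
  "ACGTACGTACGTACGTACGTACGTACGTACGT",
  "TTGACCGATAGGCTAACGTTAGCCATGGATCC",
  "GGCATCGATCGATTAGCTAGCTAGGATCCGAT"
)

# Basic sanity checks before handing the vector to a search.
stopifnot(is.character(ASVs_test))
stopifnot(all(grepl("^[ACGT]+$", ASVs_test)))
```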