Using the docker image locally without a cloud account. #29

I can use the image to create the relevant directories and download the example database, but I cannot download the query sequence (`queries/P01349.fsa`).
$ docker run --rm ncbi/blast efetch -db protein -format fasta \
    -id P01349 > queries/P01349.fsa
bash: queries/P01349.fsa: Permission denied
If the problem is related to not having a cloud account, why must I have a paid account in order to use the ncbi/blast Docker image?
Thanks
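
A note on the error above: the `> queries/P01349.fsa` redirection is performed by the shell on the host, not inside the container, so the `Permission denied` message comes from the host's bash. That is consistent with the host `queries` directory not being writable by the current user (for example if it was created by root; this is an assumption, not something confirmed here). A minimal sketch that checks this before fetching, where `ensure_writable_dir` and `fetch_query` are hypothetical helper names:

```shell
# Hypothetical helper: succeed only if the directory exists on the host
# and the current user may write to it.
ensure_writable_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Hypothetical wrapper around the command from this issue.
# The redirect target is opened by the host shell, so it is checked first.
fetch_query() {
    id="$1"
    outdir="$2"
    if ! ensure_writable_dir "$outdir"; then
        echo "cannot write to $outdir on the host; check it with: ls -ld $outdir" >&2
        return 1
    fi
    docker run --rm ncbi/blast efetch -db protein -format fasta \
        -id "$id" > "$outdir/$id.fsa"
}
```

Calling `fetch_query P01349 queries` would then either report the unwritable directory or run the original command unchanged. Whether ownership is the actual cause here would need to be confirmed with `ls -ld queries` on the host.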

However, the following works the way I would like (running the same commands from a shell inside an already running container):

docker ps
CONTAINER ID   IMAGE          COMMAND   CREATED         STATUS         PORTS     NAMES
6437c2fc9c4f   4afb1f585a96   "bash"    4 minutes ago   Up 4 minutes             friendly_swartz


docker exec -it friendly_swartz bash
root@6437c2fc9c4f:/blast# ls
bin  blastdb  blastdb_custom  lib
root@6437c2fc9c4f:/blast# update_blastdb.pl --showall pretty --source gcp

Connected to GCP
BLASTDB                                                      DESCRIPTION                                                                                                              SIZE (GB)      LAST_UPDATED
swissprot                                                    Non-redundant UniProtKB/SwissProt sequences                                                                                 0.3573      2023-04-29
nr                                                           All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects        364.0284      2023-04-27
refseq_protein                                               NCBI Protein Reference Sequences                                                                                          144.4754      2023-05-05
landmark                                                     Landmark database for SmartBLAST                                                                                            0.3817      2023-04-25
pdbaa                                                        PDB protein database                                                                                                        0.1951      2023-04-29
nt                                                           Nucleotide collection (nt)                                                                                                303.7546      2023-04-30
pdbnt                                                        PDB nucleotide database                                                                                                     0.0143      2023-04-23
patnt                                                        Nucleotide sequences derived from the Patent division of GenBank                                                           15.7333      2023-04-28
refseq_rna                                                   NCBI Transcript Reference Sequences                                                                                        46.6038      2023-05-01
ref_prok_rep_genomes                                         Refseq prokaryote representative genomes (contains refseq assembly)                                                        19.6809      2023-04-29
ref_viruses_rep_genomes                                      Refseq viruses representative genomes                                                                                       0.1320      2023-04-29
ref_viroids_rep_genomes                                      Refseq viroids representative genomes                                                                                       0.0001      2022-06-25
ref_euk_rep_genomes                                          RefSeq Eukaryotic Representative Genome Database                                                                          350.4509      2023-04-13
split-cdd                                                    CDD split into 32 volumes                                                                                                   4.5709      2022-12-18
cdd                                                          CDD.v3.20                                                                                                                   3.7088      2022-09-21
GCF_000001405.39_top_level                                   Homo sapiens GRCh38.p13 [GCF_000001405.39] chromosomes plus unplaced and unlocalized scaffolds                              1.1572      2021-06-02
GCF_000001635.27_top_level                                   Mus musculus GRCm39 [GCF_000001635.27] chromosomes plus unplaced and unlocalized scaffolds                                  3.6543      2021-06-02
16S_ribosomal_RNA                                            16S ribosomal RNA (Bacteria and Archaea type strains)                                                                       0.0179      2023-04-15
18S_fungal_sequences                                         18S ribosomal RNA sequences (SSU) from Fungi type and reference material                                                    0.0023      2023-05-04
28S_fungal_sequences                                         28S ribosomal RNA sequences (LSU) from Fungi type and reference material                                                    0.0053      2023-05-04
ITS_RefSeq_Fungi                                             Internal transcribed spacer region (ITS) from Fungi type and reference material                                             0.0067      2022-10-28
ITS_eukaryote_sequences                                      ITS eukaryote BLAST                                                                                                         0.0331      2023-05-01
env_nt                                                       environmental samples                                                                                                      48.8039      2023-04-05
Betacoronavirus                                              Betacoronavirus                                                                                                            54.0961      2023-05-06
pataa                                                        Protein sequences derived from the Patent division of GenBank                                                               1.8011      2023-04-30
refseq_select_prot                                           RefSeq Select proteins                                                                                                     34.3461      2023-04-30
refseq_select_rna                                            RefSeq Select RNA sequences                                                                                                 0.0656      2023-04-30
env_nr                                                       Proteins from WGS metagenomic projects (env_nr).                                                                            3.9459      2023-04-30
LSU_eukaryote_rRNA                                           Large subunit ribosomal nucleic acid for Eukaryotes                                                                         0.0053      2022-12-05
LSU_prokaryote_rRNA                                          Large subunit ribosomal nucleic acid for Prokaryotes                                                                        0.0041      2022-12-05
SSU_eukaryote_rRNA                                           Small subunit ribosomal nucleic acid for Eukaryotes                                                                         0.0063      2022-12-05
mito                                                         NCBI Genomic Mitochondrial Reference Sequences                                                                              0.1252      2023-04-20
tsa_nr                                                       Transcriptome Shotgun Assembly (TSA) sequences                                                                              5.1253      2023-04-30
tsa_nt                                                       Transcriptome Shotgun Assembly (TSA) sequences                                                                              6.3491      2023-04-27
nt_euk                                                       Eukaryota nt                                                                                                              197.6799      2023-04-26
nt_prok                                                      Prokaryota (bacteria and archaea) nt                                                                                       51.1781      2023-05-01
nt_viruses                                                   Viruses nt                                                                                                                 51.2685      2023-05-01
nt_others                                                    Artificial and other seqs nt                                                                                                0.7473      2023-05-01
taxdb                                                        Taxonomy database                                                                                                           0.1670      2021-06-07


root@6437c2fc9c4f:/blast# efetch -db protein -format fasta \
    -id P01349 > queries/P01349.fsa
    
root@6437c2fc9c4f:/blast# ls
bin  blastdb  blastdb_custom  fasta  lib  queries  results

root@6437c2fc9c4f:/blast# cd queries

root@6437c2fc9c4f:/blast/queries# ls
P01349.fsa

root@6437c2fc9c4f:/blast/queries# cd ..

root@6437c2fc9c4f:/blast# efetch -db protein -format fasta \
    -id Q90523,P80049,P83981,P83982,P83983,P83977,P83984,P83985,P27950 \
    > fasta/nurse-shark-proteins.fsa
    
root@6437c2fc9c4f:/blast# ls
bin  blastdb  blastdb_custom  fasta  lib  queries  results

root@6437c2fc9c4f:/blast# makeblastdb -in /blast/fasta/nurse-shark-proteins.fsa -dbtype prot \
    -parse_seqids -out nurse-shark-proteins -title "Nurse shark proteins" \
    -taxid 7801 -blastdb_version 5


Building a new DB, current time: 05/07/2023 05:51:28
New DB name:   /blast/nurse-shark-proteins
New DB title:  Nurse shark proteins
Sequence type: Protein
Keep MBits: T
Maximum file size: 3000000000B
Adding sequences from FASTA; added 7 sequences in 0.000834227 seconds.


root@6437c2fc9c4f:/blast# blastdbcmd -entry all -db nurse-shark-proteins -outfmt "%a %l %T"
Q90523.1 106 7801
P80049.1 132 7801
P83981.1 53 7801
P83977.1 95 7801
P83984.1 190 7801
P83985.1 195 7801
P27950.1 151 7801

root@6437c2fc9c4f:/blast# blastdbcmd -list /blast/blastdb -remove_redundant_dbs  ##### this does not work

root@6437c2fc9c4f:/blast# blastp -query /blast/queries/P01349.fsa -db nurse-shark-proteins \
    -out /blast/results/blastp.out
    
root@6437c2fc9c4f:/blast# cd results

root@6437c2fc9c4f:/blast/results# ls
blastp.out


root@6437c2fc9c4f:/blast/results# cat blastp.out
BLASTP 2.14.0+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.

Database: Nurse shark proteins
           7 sequences; 922 total letters

Query= sp|P01349.2|RELX_CARTA RecName: Full=Relaxin; Contains: RecName:
Full=Relaxin B chain; Contains: RecName: Full=Relaxin A chain

Length=44
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

P80049.1 RecName: Full=Fatty acid-binding protein, liver; AltName...  14.2    0.96 


>P80049.1 RecName: Full=Fatty acid-binding protein, liver; AltName: Full=Liver-type 
fatty acid-binding protein; Short=L-FABP
Length=132

 Score = 14.2 bits (25),  Expect = 0.96, Method: Compositional matrix adjust.
 Identities = 3/9 (33%), Positives = 6/9 (67%), Gaps = 0/9 (0%)

Query  2    LCGRGFIRA  10
            +C R ++R 
Sbjct  123  VCTREYVRE  131

Lambda      K        H        a         alpha
   0.334    0.143    0.520    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 22680

  Database: Nurse shark proteins
    Posted date:  May 7, 2023  5:51 AM
  Number of letters in database: 922
  Number of sequences in database:  7

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Neighboring words threshold: 11
Window for multiple hits: 40

root@6437c2fc9c4f:/blast/results# 
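
Regarding the `blastdbcmd -list /blast/blastdb` step marked as not working: `makeblastdb` was run from `/blast` with `-out nurse-shark-proteins`, and its log reports `New DB name:   /blast/nurse-shark-proteins`, so the database files were written to `/blast` rather than `/blast/blastdb`. Listing `/blast/blastdb` would therefore be expected to show nothing, assuming no other databases had been placed there. A small sketch that derives the directory from the `makeblastdb` log line and lists that directory instead, where `db_dir_from_log` is a hypothetical helper:

```shell
# Hypothetical helper: read makeblastdb output on stdin and print the
# directory part of the path on the "New DB name:" line.
db_dir_from_log() {
    path=$(sed -n 's/^New DB name:[[:space:]]*//p' | head -n 1)
    [ -n "$path" ] && dirname "$path"
}

# Example using the log line from the session above.
db_dir=$(printf 'New DB name:   /blast/nurse-shark-proteins\n' | db_dir_from_log)
echo "$db_dir"

# List databases in that directory, only if the BLAST+ tools are installed here.
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd -list "$db_dir" -remove_redundant_dbs || echo "no databases listed" >&2
fi
```

An alternative would be to pass `-out /blast/blastdb/nurse-shark-proteins` (or `/blast/blastdb_custom/...`) to `makeblastdb` so that the original `-list` path matches where the files are written.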





