Error: both alignments contain the same sets of sequences #46

@brmagalis

Hello!

After reshaping my alignments to match the data.frame format of the example dataset ("reference_alignment") as closely as I could, I get an error stating that the alignments do not contain the same sets of sequences, even though they do. Below is the code I used to convert the alignments to data frames, and I have attached the FASTA files (/Data) I would like to compare. Any help would be appreciated!
Data.zip

## R script ######
library("AlignStat")
library("phylotools")
library("stringr")

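fas_dir <- file.path("~/Desktop/test")

# pattern is a regular expression, not a glob; full.names=TRUE keeps the
# directory in each path so read.fasta() works from any working directory
fas_files <- list.files(path=fas_dir, pattern="\\.fasta$", full.names=TRUE)

# One data frame per file: one column per sequence, one row per alignment position.
# List elements are named by bare file name, e.g. "HIVenv_valign_cut.fasta".
list_df <- lapply(setNames(fas_files, basename(fas_files)), function(x) {
  y <- as.data.frame(phylotools::read.fasta(x))
  z <- stringr::str_split(y$seq.text, "")   # one character vector per sequence
  a <- data.frame(matrix(unlist(z), nrow=length(z), byrow=TRUE))
  rownames(a) <- as.character(unlist(y[,1]))   # first column holds the sequence names
  as.data.frame(t(a), row.names=FALSE, stringsAsFactors=TRUE)
})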

PAC.vm<-compare_alignments (list_df[["HIVenv_valign_cut.fasta"]], 
                            list_df[["HIVenv_malign_cut.fasta"]])

plot_similarity_summary (PAC.vm, scale=TRUE, CS=FALSE, cys=FALSE, display=TRUE)
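
One way to narrow this down before calling compare_alignments is to check directly whether the two data frames carry the same sequence names and the same ungapped residues. The following is a minimal base-R sketch, assuming the error is triggered by a mismatch in one of those two things; the exact check inside AlignStat is not shown in this issue. The helper `same_sequence_sets`, its `gap` argument, and the toy data frames are hypothetical and not part of AlignStat.

```r
# Hypothetical helper, not part of AlignStat: reports how two alignment
# data frames (one column per sequence, one row per position) differ.
same_sequence_sets <- function(ref, com, gap = "-") {
  # Collapse each column to its ungapped sequence string
  degap <- function(df) {
    vapply(df, function(col) {
      chars <- as.character(col)   # also handles factor columns
      paste(chars[chars != gap], collapse = "")
    }, character(1))
  }
  shared  <- intersect(names(ref), names(com))
  ref_seq <- degap(ref)
  com_seq <- degap(com)
  list(
    only_in_ref     = setdiff(names(ref), names(com)),  # names missing from com
    only_in_com     = setdiff(names(com), names(ref)),  # names missing from ref
    residues_differ = shared[ref_seq[shared] != com_seq[shared]]
  )
}

# Toy alignments: same names, but seqB differs once gaps are removed
ref <- data.frame(seqA = c("A", "C", "-", "G"), seqB = c("A", "-", "T", "G"))
com <- data.frame(seqA = c("A", "-", "C", "G"), seqB = c("A", "T", "T", "G"))
res <- same_sequence_sets(ref, com)
print(res)
```

With the data frames from the script above, the same call would be same_sequence_sets(list_df[["HIVenv_valign_cut.fasta"]], list_df[["HIVenv_malign_cut.fasta"]]). Three empty results would suggest that names and residues agree and that the cause lies elsewhere, for example in sequence order, letter case, or column types.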
