JSeqArray.seqOpen — Function.
seqOpen(filename, readonly, allow_dup)
Opens a SeqArray GDS file.
Arguments
- filename::String: the file name of a SeqArray file
- readonly::Bool=true: if true, the file is opened read-only; otherwise, data can be written to the file
- allow_dup::Bool=false: if true, a GDS file that is already open in the same session may be opened again in read-only mode
Examples
julia> f = seqOpen(seqExample(:kg))
julia> f
julia> seqClose(f)
JSeqArray.seqClose — Method.
seqClose(file)
Closes an open SeqArray GDS file.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
JSeqArray.seqFilterSet — Method.
seqFilterSet(file; sample_id, variant_id, intersect, verbose)
Sets a filter on samples and/or variants, selected by ID.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- sample_id::Union{Void, Vector}=nothing: sample IDs to be selected, or nothing for no action
- variant_id::Union{Void, Vector}=nothing: variant IDs to be selected, or nothing for no action
- intersect::Bool=false: if false, the candidate samples/variants for selection are all samples/variants; if true, the candidates are the samples/variants selected by the previous call
- verbose::Bool=true: if true, show information
Examples
julia> f = seqOpen(seqExample(:kg));
julia> sid = seqGetData(f, "sample.id");
julia> vid = seqGetData(f, "variant.id");
julia> seqFilterSet(f, sample_id=sid[4:10], variant_id=vid[2:6])
Number of selected samples: 7
Number of selected variants: 5
julia> seqClose(f)
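The effect of intersect can be pictured with plain Boolean masks. The following is a minimal sketch in plain Julia, not JSeqArray's implementation: the helper `select_ids` and the toy IDs are hypothetical and only model the behaviour described in the argument list, under the assumption that a filter is a Boolean vector over all IDs.

```julia
# Toy model of ID-based filtering; `select_ids` is a hypothetical helper,
# not part of JSeqArray.
function select_ids(all_ids::Vector, current::Vector{Bool}, wanted::Vector;
                    intersect::Bool=false)
    wanted_set = Set(wanted)
    # Candidates: every ID (intersect=false) or only the IDs already selected.
    candidates = intersect ? current : fill(true, length(all_ids))
    return [candidates[i] && (all_ids[i] in wanted_set) for i in eachindex(all_ids)]
end

ids = collect(1:10)
first_sel = select_ids(ids, fill(true, 10), [2, 3, 4, 5])
# Without intersect, the new selection replaces the old one.
replaced = select_ids(ids, first_sel, [4, 5, 6])
# With intersect, only IDs that were already selected can stay selected.
narrowed = select_ids(ids, first_sel, [4, 5, 6], intersect=true)
println(ids[replaced], " ", ids[narrowed])
```

In this model, ID 6 survives only when intersect is false, because it was not part of the earlier selection.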
JSeqArray.seqFilterSet2 — Method.
seqFilterSet2(file; sample, variant, intersect, verbose)
Sets a filter on samples and/or variants, selected by index or logical vector.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- sample::Union{Void, Vector{Bool}, Vector{Int}, UnitRange{Int}}=nothing: sample(s) to be selected, or nothing for no action
- variant::Union{Void, Vector{Bool}, Vector{Int}, UnitRange{Int}}=nothing: variant(s) to be selected, or nothing for no action
- intersect::Bool=false: if false, the candidate samples/variants for selection are all samples/variants; if true, the candidates are the samples/variants selected by the previous call
- verbose::Bool=true: if true, show information
Examples
julia> f = seqOpen(seqExample(:kg));
julia> seqFilterSet2(f, sample=4:10, variant=2:6)
Number of selected samples: 7
Number of selected variants: 5
julia> seqClose(f)
JSeqArray.seqFilterSplit — Method.
seqFilterSplit(file, index, count; verbose)
Splits the variants into multiple parts equally and selects the specified part.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- index::Int: selects the index-th part (starting from 1)
- count::Int: the total number of non-overlapping parts
- verbose::Bool=true: if true, show information
Details
Users can define a subset of variants before calling seqFilterSplit() and split the selection of variants into multiple parts.
Examples
julia> f = seqOpen(seqExample(:kg));
julia> seqFilterSplit(f, 2, 5)
Number of selected variants: 3,954
julia> seqClose(f)
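To make the idea of equal, non-overlapping parts concrete, here is a minimal sketch of one possible splitting rule in plain Julia. The helper `split_range` is hypothetical and is not taken from JSeqArray; it assumes contiguous parts whose boundaries are rounded multiples of n / count. For the 19773 variants of the example file split into 5 parts, this rule happens to give 3954 variants for part 2, the same count as in the example above, but whether JSeqArray uses exactly this rule is an assumption.

```julia
# Hypothetical helper, not part of JSeqArray: returns the index range of the
# `index`-th of `count` contiguous, non-overlapping parts of 1:n.
function split_range(n::Int, count::Int, index::Int)
    1 <= index <= count || throw(ArgumentError("index must be in 1:count"))
    scale = n / count
    # Boundaries are rounded, so part sizes differ by at most one.
    start = floor(Int, scale * (index - 1) + 0.5) + 1
    stop = floor(Int, scale * index + 0.5)
    return start:stop
end

parts = [split_range(19773, 5, i) for i in 1:5]
println(length.(parts))
```

Because each part starts right after the previous one ends, the parts cover every variant exactly once, which is what makes this kind of split convenient for distributing work.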
JSeqArray.seqFilterReset — Method.
seqFilterReset(file; sample, variant, verbose)
Resets the sample and variant filters.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- sample::Bool=true: if true, resets the sample filter
- variant::Bool=true: if true, resets the variant filter
- verbose::Bool=true: if true, show information
JSeqArray.seqFilterPush — Function.
seqFilterPush(file, reset)
Pushes the current sample and variant filters onto the stack for later use.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- reset::Bool=false: if true, reset the sample and variant filters
JSeqArray.seqFilterPop — Method.
seqFilterPop(file)
Restores the sample and variant filters most recently saved on the stack, and removes them from the stack.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
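The push/pop pair behaves like a save-and-restore of the current selection. The sketch below models that behaviour in plain Julia; `ToyFile`, `push_filter!` and `pop_filter!` are hypothetical names invented for illustration and are not JSeqArray API. It assumes filters are Boolean vectors and that resetting a filter selects everything.

```julia
# Toy model of a filter stack; all names here are hypothetical.
mutable struct ToyFile
    sample_sel::Vector{Bool}
    variant_sel::Vector{Bool}
    stack::Vector{Tuple{Vector{Bool},Vector{Bool}}}
end
ToyFile(nsamp::Int, nvar::Int) =
    ToyFile(fill(true, nsamp), fill(true, nvar), Tuple{Vector{Bool},Vector{Bool}}[])

# Save copies of the current filters; optionally reset them to "select all".
function push_filter!(f::ToyFile, reset::Bool=false)
    push!(f.stack, (copy(f.sample_sel), copy(f.variant_sel)))
    if reset
        fill!(f.sample_sel, true)
        fill!(f.variant_sel, true)
    end
    return f
end

# Restore the most recently saved filters and drop them from the stack.
function pop_filter!(f::ToyFile)
    f.sample_sel, f.variant_sel = pop!(f.stack)
    return f
end

f = ToyFile(4, 6)
f.variant_sel[4:6] .= false            # keep only variants 1:3 selected
push_filter!(f, true)                  # save the selection, then select everything
n_during = count(f.variant_sel)
pop_filter!(f)                         # bring back the saved selection
n_after = count(f.variant_sel)
println(n_during, " ", n_after)
```

This pattern is useful when a function needs to change the filters temporarily and leave the caller's selection untouched.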
JSeqArray.seqFilterGet — Function.
seqFilterGet(file, sample)
Gets the current filter of samples or variants.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- sample::Bool=true: if true, returns a logical vector for the sample filter; otherwise, returns a logical vector for the variant filter
JSeqArray.seqGetData — Method.
seqGetData(file, name)
Gets data from a SeqArray GDS file.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- name::String: the variable name, see the details
Details
The variable name should be one of:
- "sample.id", "variant.id", "position", "chromosome", "allele"
- "genotype" for 3-dim UInt8 array (ploidy, sample, variant) where 0 is the reference allele, 1 is the first alternative allele, 0xFF is missing value
- "annotation/id", "annotation/qual", "annotation/filter", "annotation/info/VARIABLE_NAME", "annotation/format/VARIABLE_NAME"
- "#dosage" for a dosage matrix (sample, variant) of reference allele (UInt8: 0, 1 and 2 for diploid genotypes, 0xFF for missing values)
- "#num_allele" returns an integer vector with the numbers of distinct alleles
Examples
julia> f = seqOpen(seqExample(:kg));
julia> pos = seqGetData(f, "position"); println(typeof(pos), ", ", length(pos))
Array{Int32,1}, 19773
julia> geno = seqGetData(f, "genotype"); println(typeof(geno), ", ", size(geno))
Array{UInt8,3}, (2,1092,19773)
julia> dosage = seqGetData(f, "#dosage"); println(typeof(dosage), ", ", size(dosage))
Array{UInt8,2}, (1092,19773)
julia> seqClose(f)
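The relationship between the "genotype" and "#dosage" encodings described in the details can be illustrated on a small hand-made array. This is a sketch in plain Julia, not the library's code: `ref_dosage` is a hypothetical helper, and it assumes that a genotype with any missing allele (0xFF) yields a missing dosage.

```julia
# Hypothetical helper, not part of JSeqArray: reference-allele dosage from a
# (ploidy, sample, variant) UInt8 genotype array, with 0xFF as missing.
function ref_dosage(geno::Array{UInt8,3})
    _, nsamp, nvar = size(geno)
    dosage = Matrix{UInt8}(undef, nsamp, nvar)
    for v in 1:nvar, s in 1:nsamp
        alleles = view(geno, :, s, v)
        if any(a -> a == 0xff, alleles)
            # Assumption: any missing allele makes the whole dosage missing.
            dosage[s, v] = 0xff
        else
            # Count the copies of the reference allele (coded as 0).
            dosage[s, v] = UInt8(count(a -> a == 0x00, alleles))
        end
    end
    return dosage
end

# Diploid toy data, 2 samples x 2 variants, filled in column-major order:
# variant 1: sample 1 = 0/0, sample 2 = 0/1
# variant 2: sample 1 = 1/1, sample 2 = missing/0
geno = reshape(UInt8[0, 0, 0, 1, 1, 1, 0xff, 0], 2, 2, 2)
d = ref_dosage(geno)
println(Int.(d))
```

Note the change of layout: the ploidy dimension of the genotype array is collapsed, leaving a (sample, variant) matrix.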
JSeqArray.seqApply — Method.
seqApply(fun, file, name, args...; asis, bsize, verbose, kwargs...)
Applies the user-defined function over array margins.
Arguments
- fun::Function: the user-defined function
- file::TSeqGDSFile: a SeqArray Julia object
- name::Union{String, Vector{String}}: the variable name(s), see the details
- args: the optional arguments passed to the user-defined function
- asis::Symbol=:none: :none (no return), :unlist (returns a vector which contains all the atomic components) or :list (returns a vector according to each block)
- bsize::Int=1024: block size for the number of variants in a block
- verbose::Bool=true: if true, show progress information
- kwargs: the keyword optional arguments passed to the user-defined function
Details
The variable name should be one of:
- "sample.id", "variant.id", "position", "chromosome", "allele"
- "genotype" for 3-dim UInt8 array (ploidy, sample, variant) where 0 is the reference allele, 1 is the first alternative allele, 0xFF is missing value
- "annotation/id", "annotation/qual", "annotation/filter", "annotation/info/VARIABLE_NAME", "annotation/format/VARIABLE_NAME"
- "#dosage" for a dosage matrix (sample, variant) of reference allele (UInt8: 0, 1 and 2 for diploid genotypes, 0xFF for missing values)
- "#num_allele" returns an integer vector with the numbers of distinct alleles
The algorithm is optimized by blocking the computations, so that they run on data held in high-speed memory rather than on disk.
Examples
julia> f = seqOpen(seqExample(:kg));
julia> s = seqApply(f, "genotype", asis=:unlist, verbose=false) do geno
           return sum(geno)
       end;
julia> Int(sum(s))
3083127
julia> seqClose(f)
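The blocking idea and the difference between asis=:list and asis=:unlist can be sketched on an in-memory matrix. The helper `block_apply` below is hypothetical and is not how seqApply is implemented; it assumes variants are the columns of a matrix and simply hands one block of columns at a time to the user function.

```julia
# Toy model of block-wise application; `block_apply` is a hypothetical helper,
# not JSeqArray's seqApply.
function block_apply(fun::Function, data::AbstractMatrix;
                     bsize::Int=1024, asis::Symbol=:unlist)
    nvar = size(data, 2)
    results = Any[]
    for start in 1:bsize:nvar
        stop = min(start + bsize - 1, nvar)
        # Only one block of variants (columns) is handed to `fun` at a time.
        push!(results, fun(view(data, :, start:stop)))
    end
    asis == :list && return results            # one entry per block
    asis == :unlist && return vcat(results...) # flattened into a single vector
    return nothing                             # :none, results are discarded
end

data = reshape(collect(1:20), 2, 10)           # 2 samples x 10 variants
block_sums = block_apply(sum, data, bsize=4, asis=:list)
total = sum(block_apply(data, bsize=4) do block
    return sum(block)
end)
println(block_sums, " ", total)
```

With 10 variants and a block size of 4, the user function is called three times, on blocks of 4, 4 and 2 variants.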
JSeqArray.seqParallel — Method.
seqParallel(fun, file, args...; split, combine, kwargs...)
Applies a user-defined function in parallel.
Arguments
- fun::Function: the user-defined function
- file::TSeqGDSFile: a SeqArray Julia object
- args: the optional arguments passed to the user-defined function
- split::Symbol=:byvariant: :none for no split, :byvariant for splitting the dataset by variant across multiple processes
- combine::Union{Symbol, Function}=:unlist: :none (no return), :unlist (returns a vector which contains all the atomic components) or :list (returns a vector according to each process)
- kwargs: the keyword optional arguments passed to the user-defined function
Details
Examples
JSeqArray.seqAttr — Method.
seqAttr(file, name)
Gets the specified attribute of a SeqArray GDS file.
Arguments
- file::TSeqGDSFile: a SeqArray Julia object
- name::Symbol: the symbol name for a specified attribute
Details
name::Symbol is one of:
- :nsamp, the total number of samples
- :nselsamp, the number of selected samples
- :nvar, the total number of variants
- :nselvar, the number of selected variants
- :ploidy, the number of sets of chromosomes
Examples
julia> f = seqOpen(seqExample(:kg));
julia> seqFilterSet2(f, sample=5:10, variant=31:40)
Number of selected samples: 6
Number of selected variants: 10
julia> seqAttr(f, :nsamp)
1092
julia> seqAttr(f, :nselsamp)
6
julia> seqAttr(f, :nvar)
19773
julia> seqAttr(f, :nselvar)
10
julia> seqAttr(f, :ploidy)
2
julia> seqClose(f)
JSeqArray.seqExample — Method.
seqExample(file)
Returns the example SeqArray file.
Arguments
- file::Symbol: specifies which example SeqArray file to return; it should be :kg for 1KG_phase1_release_v3_chr22.gds
Examples
julia> fn = seqExample(:kg);
julia> basename(fn)
"1KG_phase1_release_v3_chr22.gds"
Base.show — Method.
show(io, file; attr, all)
Displays the contents of a SeqArray GDS file.
Arguments
- io: the I/O stream
- file::TSeqGDSFile: a SeqArray Julia object
- attr::Bool=false: if true, shows all attributes
- all::Bool=false: if true, shows all GDS nodes, including hidden nodes