Package htsjdk.variant.variantcontext
Class GenotypesContext
- java.lang.Object
-
- htsjdk.variant.variantcontext.GenotypesContext
-
- All Implemented Interfaces:
Serializable, Iterable<Genotype>, Collection<Genotype>, List<Genotype>
- Direct Known Subclasses:
LazyGenotypesContext
public class GenotypesContext extends Object implements List<Genotype>, Serializable
Represents an ordered collection of Genotype objects.
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
static GenotypesContext NO_GENOTYPES - Static constant value for an empty GenotypesContext.
protected ArrayList<Genotype> notToBeDirectlyAccessedGenotypes - An ArrayList of genotypes contained in this context. WARNING: TO ENABLE THE LAZY VERSION OF THIS CLASS, NO METHODS SHOULD DIRECTLY ACCESS THIS VARIABLE.
protected List<String> sampleNamesInOrder - A list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
protected Map<String,Integer> sampleNameToOffset - A map optimized for efficient lookup.
static long serialVersionUID
-
Constructor Summary
Constructors (Modifier, Constructor, Description):
protected GenotypesContext() - Create an empty GenotypesContext.
protected GenotypesContext(int n) - Create an empty GenotypesContext, with initial capacity for n elements.
protected GenotypesContext(ArrayList<Genotype> genotypes) - Create a GenotypesContext containing genotypes.
protected GenotypesContext(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder) - Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names.
-
Method Summary
Methods (Modifier and Type, Method, Description):
void add(int i, Genotype genotype)
boolean add(Genotype genotype) - Adds a single genotype to this context.
boolean addAll(int i, Collection<? extends Genotype> genotypes)
boolean addAll(Collection<? extends Genotype> genotypes) - Adds all of the genotypes to this context. See add(Genotype) for important information about this function's constraints and performance costs.
void checkImmutability()
void clear()
boolean contains(Object o)
boolean containsAll(Collection<?> objects)
boolean containsSample(String sample)
boolean containsSamples(Collection<String> samples)
static GenotypesContext copy(GenotypesContext toCopy) - Create a freshly allocated GenotypesContext containing the genotypes in toCopy.
static GenotypesContext copy(Collection<Genotype> toCopy) - Create a GenotypesContext containing the genotypes in toCopy, in iteration order.
static GenotypesContext create() - Basic creation routine.
static GenotypesContext create(int nGenotypes) - Basic creation routine.
static GenotypesContext create(Genotype... genotypes) - Create a fully resolved GenotypesContext containing genotypes.
static GenotypesContext create(ArrayList<Genotype> genotypes) - Create a fully resolved GenotypesContext containing genotypes.
static GenotypesContext create(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder) - Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names.
protected void ensureSampleNameMap()
protected void ensureSampleOrdering()
Genotype get(int i)
Genotype get(String sampleName) - Gets the genotype associated with this sampleName, or null if none is found.
protected ArrayList<Genotype> getGenotypes()
int getMaxPloidy(int defaultPloidy) - What is the max ploidy among all samples? Returns defaultPloidy if no genotypes are present.
Set<String> getSampleNames()
List<String> getSampleNamesOrderedByName()
GenotypesContext immutable()
int indexOf(Object o)
protected void invalidateSampleNameMap()
protected void invalidateSampleOrdering()
boolean isEmpty()
boolean isLazyWithData()
boolean isMutable()
Iterable<Genotype> iterateInSampleNameOrder() - Iterate over the Genotypes in this context in their sample name order (A, B, C) regardless of the underlying order in the vector of genotypes.
Iterable<Genotype> iterateInSampleNameOrder(Iterable<String> sampleNamesInOrder) - Iterate over the Genotypes in this context in the order specified by sampleNamesInOrder.
Iterator<Genotype> iterator()
int lastIndexOf(Object o)
ListIterator<Genotype> listIterator()
ListIterator<Genotype> listIterator(int i)
Genotype remove(int i) - Note that remove requires us to invalidate our sample -> index cache.
boolean remove(Object o) - See remove(int) for an important warning.
boolean removeAll(Collection<?> objects)
Genotype replace(Genotype genotype) - Replaces the genotype in this context. Note that, for efficiency reasons, we do not add the genotype if it's not present.
boolean retainAll(Collection<?> objects)
Genotype set(int i, Genotype genotype)
int size()
List<Genotype> subList(int i, int i1)
GenotypesContext subsetToSamples(Set<String> samples) - Return a freshly allocated subcontext of this context containing only the samples listed in samples.
Object[] toArray()
<T> T[] toArray(T[] ts)
String toString()
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface java.util.Collection
parallelStream, removeIf, stream, toArray
-
Methods inherited from interface java.util.List
equals, hashCode, replaceAll, sort, spliterator
-
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
- See Also:
- Constant Field Values
-
NO_GENOTYPES
public static final GenotypesContext NO_GENOTYPES
Static constant value for an empty GenotypesContext. Useful because so many VariantContexts have no genotypes.
-
sampleNamesInOrder
protected List<String> sampleNamesInOrder
A list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
-
sampleNameToOffset
protected Map<String,Integer> sampleNameToOffset
a map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
-
-
Constructor Detail
-
GenotypesContext
protected GenotypesContext()
Create an empty GenotypesContext
-
GenotypesContext
protected GenotypesContext(int n)
Create an empty GenotypesContext, with initial capacity for n elements
-
GenotypesContext
protected GenotypesContext(ArrayList<Genotype> genotypes)
Create a GenotypesContext containing genotypes
-
GenotypesContext
protected GenotypesContext(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names
- Parameters:
genotypes- our genotypes, in arbitrary order
sampleNameToOffset- map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
sampleNamesInOrder- a list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
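The relationship between the three arguments can be illustrated with a small sketch. It does not use htsjdk: plain sample-name strings stand in for Genotype objects, and the class and method names (SampleLookupSketch, buildSampleNameToOffset, buildSampleNamesInOrder) are hypothetical. It only shows what a consistent lookup table and sorted name list look like for a given genotype order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT htsjdk code: sample names take the place of Genotype objects.
class SampleLookupSketch {
    // Maps each sample name to its offset in the (arbitrarily ordered) input list.
    static Map<String, Integer> buildSampleNameToOffset(List<String> samplesInGenotypeOrder) {
        Map<String, Integer> offsets = new HashMap<>();
        for (int i = 0; i < samplesInGenotypeOrder.size(); i++) {
            offsets.put(samplesInGenotypeOrder.get(i), i);
        }
        return offsets;
    }

    // Returns an alphabetically sorted copy of the names; the input order is left untouched.
    static List<String> buildSampleNamesInOrder(List<String> samplesInGenotypeOrder) {
        List<String> sorted = new ArrayList<>(samplesInGenotypeOrder);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("NA12892", "NA12878", "NA12891");
        System.out.println(buildSampleNameToOffset(samples).get("NA12878"));
        System.out.println(buildSampleNamesInOrder(samples));
    }
}
```

Passing all three structures up front means that neither the map nor the sorted list has to be derived later, which is presumably what "fully resolved" refers to.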
-
-
Method Detail
-
create
public static final GenotypesContext create()
Basic creation routine
- Returns:
- an empty, mutable GenotypesContext
-
create
public static final GenotypesContext create(int nGenotypes)
Basic creation routine
- Returns:
- an empty, mutable GenotypesContext with initial capacity for nGenotypes
-
create
public static final GenotypesContext create(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
Create a fully resolved GenotypesContext containing genotypes, sample lookup table, and sorted sample names
- Parameters:
genotypes- our genotypes, in arbitrary order
sampleNameToOffset- map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
sampleNamesInOrder- a list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
- Returns:
- a mutable GenotypesContext containing genotypes with already present lookup data
-
create
public static final GenotypesContext create(ArrayList<Genotype> genotypes)
Create a fully resolved GenotypesContext containing genotypes
- Parameters:
genotypes- our genotypes, in arbitrary order
- Returns:
- a mutable GenotypesContext containing genotypes
-
create
public static final GenotypesContext create(Genotype... genotypes)
Create a fully resolved GenotypesContext containing genotypes
- Parameters:
genotypes- our genotypes, in arbitrary order
- Returns:
- a mutable GenotypesContext containing genotypes
-
copy
public static final GenotypesContext copy(GenotypesContext toCopy)
Create a freshly allocated GenotypesContext containing the genotypes in toCopy
- Parameters:
toCopy- the GenotypesContext to copy
- Returns:
- a mutable GenotypesContext containing genotypes
-
copy
public static final GenotypesContext copy(Collection<Genotype> toCopy)
Create a GenotypesContext containing the genotypes in toCopy, in iteration order
- Parameters:
toCopy- the collection of genotypes
- Returns:
- a mutable GenotypesContext containing genotypes
-
immutable
public final GenotypesContext immutable()
-
isMutable
public boolean isMutable()
-
checkImmutability
public final void checkImmutability() throws UnsupportedOperationException
- Throws:
UnsupportedOperationException
-
invalidateSampleNameMap
protected void invalidateSampleNameMap()
-
invalidateSampleOrdering
protected void invalidateSampleOrdering()
-
ensureSampleOrdering
protected void ensureSampleOrdering()
-
ensureSampleNameMap
protected void ensureSampleNameMap()
-
isLazyWithData
public boolean isLazyWithData()
-
clear
public void clear()
-
size
public int size()
-
isEmpty
public boolean isEmpty()
-
add
public boolean add(Genotype genotype) throws UnsupportedOperationException
Adds a single genotype to this context. There are many constraints on this input, and important impacts on the performance of other functions provided by this context. First, the sample name of genotype must be unique within this context. This is not enforced in the code itself, though you will invalidate the contract on this context if you add duplicate samples and are running with CoFoJa enabled. Second, adding genotype also updates the sample name -> index map, so add() followed by containsSample and related functions is an efficient series of operations. Third, adding the genotype invalidates the sorted list of sample names, so add() followed by any of the SampleNamesInOrder operations is inefficient, as each SampleNamesInOrder operation must rebuild the sorted list of sample names at an O(n log n) cost.
- Specified by:
add in interface Collection<Genotype>
- Specified by:
add in interface List<Genotype>
- Parameters:
genotype-
- Returns:
- Throws:
UnsupportedOperationException- if the context has been made immutable
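The cost asymmetry described for add can be made concrete with a small model. The sketch below is not the htsjdk implementation: AddCacheSketch is a hypothetical stand-in that stores sample names instead of Genotype objects, and its sortCount field exists only to count how often the sorted name list is rebuilt. It assumes the behavior stated above, namely that add keeps the name -> index map current but invalidates the sorted list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the htsjdk class: sample names replace Genotype objects.
class AddCacheSketch {
    private final List<String> samples = new ArrayList<>();
    private final Map<String, Integer> sampleNameToOffset = new HashMap<>();
    private List<String> sampleNamesInOrder = null; // null means "invalidated"
    int sortCount = 0;                               // number of times the sorted list was rebuilt

    void add(String sample) {
        sampleNameToOffset.put(sample, samples.size()); // the map is updated in place
        samples.add(sample);
        sampleNamesInOrder = null;                       // the sorted list is invalidated
    }

    boolean containsSample(String sample) {
        return sampleNameToOffset.containsKey(sample);   // cheap: no rebuild after add
    }

    List<String> getSampleNamesOrderedByName() {
        if (sampleNamesInOrder == null) {                // lazy O(n log n) rebuild
            sampleNamesInOrder = new ArrayList<>(samples);
            Collections.sort(sampleNamesInOrder);
            sortCount++;
        }
        return sampleNamesInOrder;
    }

    // Interleaves add with a sorted-name query: one sort per add.
    static int sortsWhenInterleaved(int n) {
        AddCacheSketch gc = new AddCacheSketch();
        for (int i = 0; i < n; i++) {
            gc.add("sample" + i);
            gc.getSampleNamesOrderedByName();
        }
        return gc.sortCount;
    }

    // Adds everything first, then queries once: a single sort.
    static int sortsWhenBatched(int n) {
        AddCacheSketch gc = new AddCacheSketch();
        for (int i = 0; i < n; i++) {
            gc.add("sample" + i);
        }
        gc.getSampleNamesOrderedByName();
        return gc.sortCount;
    }

    public static void main(String[] args) {
        System.out.println(sortsWhenInterleaved(100));
        System.out.println(sortsWhenBatched(100));
    }
}
```

In this model, batching all additions before the first sorted-name query reduces the number of sorts from n to 1, which is the usage pattern the description above suggests.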
-
addAll
public boolean addAll(Collection<? extends Genotype> genotypes)
Adds all of the genotypes to this context. See add(Genotype) for important information about this function's constraints and performance costs.
-
addAll
public boolean addAll(int i, Collection<? extends Genotype> genotypes)
-
contains
public boolean contains(Object o)
-
containsAll
public boolean containsAll(Collection<?> objects)
- Specified by:
containsAll in interface Collection<Genotype>
- Specified by:
containsAll in interface List<Genotype>
-
getMaxPloidy
public int getMaxPloidy(int defaultPloidy)
What is the max ploidy among all samples? Returns defaultPloidy if no genotypes are present
- Parameters:
defaultPloidy- the default ploidy, if all samples are no-called
- Returns:
-
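As a rough illustration of the rule described for getMaxPloidy, the sketch below takes the maximum over a list of per-sample ploidies and falls back to defaultPloidy when nothing contributes. It is not htsjdk code: MaxPloidySketch is a hypothetical name, a ploidy of 0 is assumed to represent a no-called sample, and rejecting a negative default is also an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in, NOT htsjdk code: integers replace Genotype objects.
class MaxPloidySketch {
    // ploidies holds one entry per sample; 0 is ASSUMED to mean "no-called".
    static int getMaxPloidy(List<Integer> ploidies, int defaultPloidy) {
        if (defaultPloidy < 0) {
            throw new IllegalArgumentException("defaultPloidy must be >= 0");
        }
        int max = 0;
        for (int ploidy : ploidies) {
            max = Math.max(max, ploidy);
        }
        // No samples, or only no-called samples: fall back to the default.
        return max == 0 ? defaultPloidy : max;
    }

    public static void main(String[] args) {
        System.out.println(getMaxPloidy(Arrays.asList(2, 3, 2), 2));
        System.out.println(getMaxPloidy(Collections.<Integer>emptyList(), 2));
    }
}
```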
get
public Genotype get(String sampleName)
Gets the genotype associated with this sampleName, or null if none is found
- Parameters:
sampleName-
- Returns:
-
lastIndexOf
public int lastIndexOf(Object o)
- Specified by:
lastIndexOf in interface List<Genotype>
-
listIterator
public ListIterator<Genotype> listIterator()
- Specified by:
listIterator in interface List<Genotype>
-
listIterator
public ListIterator<Genotype> listIterator(int i)
- Specified by:
listIterator in interface List<Genotype>
-
remove
public Genotype remove(int i)
Note that remove requires us to invalidate our sample -> index cache. The loop:
  GenotypesContext gc = ...
  for ( sample in samples )
    if ( gc.containsSample(sample) )
      gc.remove(sample)
is extremely inefficient, as each call to remove invalidates the cache and containsSample requires us to rebuild it, an O(n) operation. If you must remove many samples from the GC, use either removeAll or retainAll to avoid this O(n * m) operation.
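The warning above can be checked against a small model that counts cache rebuilds. The sketch is not the htsjdk implementation: RemoveCacheSketch is a hypothetical stand-in holding sample names, its rebuilds counter is added purely for illustration, and it assumes (as the text states) that remove invalidates the name -> index map and that containsSample rebuilds it on demand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the htsjdk class: sample names replace Genotype objects.
class RemoveCacheSketch {
    private final List<String> samples;
    private Map<String, Integer> sampleNameToOffset = null; // null means "invalidated"
    int rebuilds = 0;                                        // number of O(n) map rebuilds

    RemoveCacheSketch(Collection<String> initial) {
        samples = new ArrayList<>(initial);
    }

    private void ensureSampleNameMap() {
        if (sampleNameToOffset == null) {
            sampleNameToOffset = new HashMap<>();
            for (int i = 0; i < samples.size(); i++) {
                sampleNameToOffset.put(samples.get(i), i);
            }
            rebuilds++;
        }
    }

    boolean containsSample(String sample) {
        ensureSampleNameMap();
        return sampleNameToOffset.containsKey(sample);
    }

    void remove(String sample) {
        samples.remove(sample);
        sampleNameToOffset = null; // offsets shift, so the map is invalidated
    }

    void removeAll(Collection<String> toRemove) {
        samples.removeAll(new HashSet<>(toRemove));
        sampleNameToOffset = null; // invalidated once for the whole batch
    }

    // The discouraged pattern: one map rebuild per removed sample.
    static int rebuildsWithLoop(List<String> all, List<String> toRemove) {
        RemoveCacheSketch gc = new RemoveCacheSketch(all);
        for (String sample : toRemove) {
            if (gc.containsSample(sample)) {
                gc.remove(sample);
            }
        }
        return gc.rebuilds;
    }

    // The suggested pattern: one bulk removal, then one rebuild on the next lookup.
    static int rebuildsWithRemoveAll(List<String> all, List<String> toRemove) {
        RemoveCacheSketch gc = new RemoveCacheSketch(all);
        gc.removeAll(toRemove);
        gc.containsSample("any-name"); // the first lookup after removal triggers the only rebuild
        return gc.rebuilds;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("A", "B", "C", "D");
        List<String> toRemove = Arrays.asList("A", "B", "C");
        System.out.println(rebuildsWithLoop(all, toRemove));
        System.out.println(rebuildsWithRemoveAll(all, toRemove));
    }
}
```

With m samples removed from a context of n genotypes, the loop in this model performs m rebuilds of O(n) each, while the bulk removal needs only one.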
-
remove
public boolean remove(Object o)
See remove(int) for an important warning
-
removeAll
public boolean removeAll(Collection<?> objects)
-
retainAll
public boolean retainAll(Collection<?> objects)
-
replace
public Genotype replace(Genotype genotype)
Replaces the genotype in this context. Note that, for efficiency reasons, we do not add the genotype if it's not present; a null return value indicates that this happened. Note that this operation preserves the Sample -> Offset map cache but invalidates the sorted list of samples. Using replace within a loop containing any of the SampleNamesInOrder operations requires an O(n log n) re-sort after each replace operation.
- Parameters:
genotype- a non-null genotype to bind in this context
- Returns:
- null if genotype was not added, otherwise returns the previous genotype
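The return-value contract of replace can be sketched as follows. This is an illustrative model, not htsjdk code: ReplaceSketch and its nested Call class (a sample name plus an allele string) are hypothetical stand-ins for GenotypesContext and Genotype. It shows only the two documented outcomes: the previous entry is returned when the sample is present, and null is returned, with nothing added, when it is not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the htsjdk class.
class ReplaceSketch {
    // Hypothetical stand-in for Genotype: a sample name plus an allele string.
    static final class Call {
        final String sample;
        final String alleles;

        Call(String sample, String alleles) {
            this.sample = sample;
            this.alleles = alleles;
        }
    }

    private final List<Call> calls = new ArrayList<>();
    private final Map<String, Integer> sampleNameToOffset = new HashMap<>();

    void add(Call call) {
        sampleNameToOffset.put(call.sample, calls.size());
        calls.add(call);
    }

    // Returns the previous entry, or null (adding nothing) if the sample is absent.
    Call replace(Call call) {
        Integer offset = sampleNameToOffset.get(call.sample);
        if (offset == null) {
            return null;
        }
        // The offset does not change, so the name -> offset map stays valid.
        return calls.set(offset, call);
    }

    Call get(String sample) {
        Integer offset = sampleNameToOffset.get(sample);
        return offset == null ? null : calls.get(offset);
    }

    int size() {
        return calls.size();
    }

    public static void main(String[] args) {
        ReplaceSketch gc = new ReplaceSketch();
        gc.add(new Call("A", "0/0"));
        System.out.println(gc.replace(new Call("A", "0/1")).alleles); // previous call for A
        System.out.println(gc.replace(new Call("B", "1/1")));         // B is absent
    }
}
```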
-
toArray
public Object[] toArray()
-
toArray
public <T> T[] toArray(T[] ts)
-
iterateInSampleNameOrder
public Iterable<Genotype> iterateInSampleNameOrder(Iterable<String> sampleNamesInOrder)
Iterate over the Genotypes in this context in the order specified by sampleNamesInOrder
- Parameters:
sampleNamesInOrder- an Iterable of String, containing exactly one entry for each Genotype sample name in this context
- Returns:
- an Iterable over the genotypes in this context.
-
iterateInSampleNameOrder
public Iterable<Genotype> iterateInSampleNameOrder()
Iterate over the Genotypes in this context in their sample name order (A, B, C) regardless of the underlying order in the vector of genotypes
- Returns:
- an Iterable over the genotypes in this context.
-
getSampleNames
public Set<String> getSampleNames()
- Returns:
- The set of sample names for all genotypes in this context, in arbitrary order
-
getSampleNamesOrderedByName
public List<String> getSampleNamesOrderedByName()
- Returns:
- The list of sample names for all genotypes in this context, in their natural ordering (A, B, C)
-
containsSample
public boolean containsSample(String sample)
-
containsSamples
public boolean containsSamples(Collection<String> samples)
-
subsetToSamples
public GenotypesContext subsetToSamples(Set<String> samples)
Return a freshly allocated subcontext of this context containing only the samples listed in samples. Note that samples can contain names not in this context; they will just be ignored.
- Parameters:
samples-
- Returns:
-
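The filtering behavior described for subsetToSamples can be sketched with plain collections. This is not htsjdk code: SubsetSketch is a hypothetical name and sample names stand in for Genotype objects. That the subset keeps the original relative order of the genotypes is an assumption of this sketch; the text above only guarantees a freshly allocated result and that unknown names are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in, NOT htsjdk code: sample names replace Genotype objects.
class SubsetSketch {
    // Returns a new list; names in 'wanted' that are absent from the context are ignored.
    static List<String> subsetToSamples(List<String> samplesInContext, Set<String> wanted) {
        List<String> subset = new ArrayList<>();
        for (String sample : samplesInContext) {
            if (wanted.contains(sample)) {
                subset.add(sample); // ASSUMPTION: original relative order is kept
            }
        }
        return subset;
    }

    public static void main(String[] args) {
        List<String> context = Arrays.asList("C", "A", "B");
        Set<String> wanted = new HashSet<>(Arrays.asList("A", "C", "Z"));
        System.out.println(subsetToSamples(context, wanted));
    }
}
```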
-