Descriptor Storage
DescriptorElement Interface
The DescriptorElement interface defines a standard for storing and
retrieving a descriptor vector and its associated UID.
Descriptors, also known as feature vectors, are defined here as
numpy.ndarray instances.
We do not constrain the vector data type at this level.
Descriptor elements are also associated with a UID. No standard for UID generation is imposed here; UID attribution is left to the user or the generating algorithm. Generally a UID, or unique identifier, “is an identifier that is guaranteed to be unique among all identifiers used for those objects and for a specific purpose.”
UIDs are generally constrained to fit the Python Hashable type definition.
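As a rough illustration of the contract described above, the following is a minimal sketch of an in-memory element. `ToyMemoryElement` is a hypothetical class, not part of the library, and a plain list stands in for `numpy.ndarray` to keep the sketch dependency-free. The `uuid()` and `vector()` accessors are assumptions made for the example; only `has_vector` and `set_vector` are taken from the reference below.

```python
from typing import Hashable, List, Optional


class ToyMemoryElement:
    """Hypothetical in-memory stand-in for a DescriptorElement implementation."""

    def __init__(self, uuid: Hashable) -> None:
        self._uuid = uuid
        self._vec: Optional[List[float]] = None

    def uuid(self) -> Hashable:
        # Assumed accessor: returns the UID given at construction.
        return self._uuid

    def has_vector(self) -> bool:
        return self._vec is not None

    def set_vector(self, new_vec: List[float]) -> "ToyMemoryElement":
        # Store a copy so later changes to the caller's list do not leak in;
        # overwrites any previously stored vector.
        self._vec = list(new_vec)
        return self

    def vector(self) -> Optional[List[float]]:
        # Assumed accessor: hand out a copy, or None when nothing is stored.
        return None if self._vec is None else list(self._vec)


elem = ToyMemoryElement("image-0001")
elem.set_vector([0.1, 0.2, 0.3])
```

Any hashable value (a string, an integer, a tuple) could serve as the UID in this sketch.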
Storing Many Elements
We provide an interface for storing groups of descriptor elements called the
DescriptorSet.
This provides an interface for storing and retrieving sets of
DescriptorElement instances, accessing by UID, and iterating over
contained elements.
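The behaviour described above can be sketched with a dictionary keyed by UID. `ToyElement` and `ToyDescriptorSet` are hypothetical names used only for illustration, and the method names mirror those listed in the reference below. This is a sketch under those assumptions, not the library's implementation.

```python
from typing import Dict, Hashable, Iterator, List, Optional, Tuple


class ToyElement:
    """Hypothetical element: a UID plus an optional vector (a plain list here)."""

    def __init__(self, uuid: Hashable, vec: Optional[List[float]] = None) -> None:
        self.uuid = uuid
        self.vec = vec


class ToyDescriptorSet:
    """Hypothetical dict-backed set, keyed by element UID."""

    def __init__(self) -> None:
        self._table: Dict[Hashable, ToyElement] = {}

    def add_descriptor(self, descriptor: ToyElement) -> None:
        # An element with an existing UID replaces the previous entry.
        self._table[descriptor.uuid] = descriptor

    def has_descriptor(self, uuid: Hashable) -> bool:
        return uuid in self._table

    def get_descriptor(self, uuid: Hashable) -> ToyElement:
        # Raises KeyError for an unknown UID.
        return self._table[uuid]

    def keys(self) -> Iterator[Hashable]:
        return iter(self._table.keys())

    def descriptors(self) -> Iterator[ToyElement]:
        return iter(self._table.values())

    def items(self) -> Iterator[Tuple[Hashable, ToyElement]]:
        return iter(self._table.items())


dset = ToyDescriptorSet()
dset.add_descriptor(ToyElement("a", [1.0, 0.0]))
dset.add_descriptor(ToyElement("b", [0.0, 1.0]))
dset.add_descriptor(ToyElement("a", [0.5, 0.5]))  # replaces the first "a"
```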
Reference
- class smqtk_descriptors.interfaces.descriptor_element.DescriptorElement(uuid: Hashable)[source]
Abstract descriptor vector container.
This structure supports implementations that cache descriptor vectors on a per-UUID basis.
UUIDs must maintain uniqueness when transformed into a string.
Descriptor element equality is based on vector equality. Two descriptor vectors that are generated by different types of descriptor generator should not be considered the same (though this may be up for discussion).
Stored vectors should be effectively immutable.
- classmethod from_config(config_dict: Dict, uuid: Hashable, merge_default: bool = True) T[source]
Instantiate a new instance of this class given the desired type, uuid, and JSON-compliant configuration dictionary.
- Parameters:
uuid – Unique ID reference of the descriptor.
config_dict – JSON compliant dictionary encapsulating a configuration.
merge_default – Merge the given configuration on top of the default provided by
get_default_config.
- Returns:
Constructed instance from the provided config.
- classmethod get_default_config() Dict[str, Any][source]
Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.
By default, we observe what this class’s constructor takes as arguments, aside from the first two assumed positional arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.
It is not guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.
- Returns:
Default configuration dictionary for the class.
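The constructor introspection described above could look roughly like the following sketch, built on the standard library's `inspect.signature`. The helper `sketch_default_config`, the class `ToyCachedElement`, and its parameters are hypothetical. Two further assumptions: the skipped leading arguments are taken to be `self` and `uuid`, and arguments without a default are mapped to `None`.

```python
import inspect
from typing import Any, Dict


def sketch_default_config(cls: type, skip: int = 2) -> Dict[str, Any]:
    """Build a config dict from constructor argument names and defaults."""
    params = list(inspect.signature(cls.__init__).parameters.values())
    config: Dict[str, Any] = {}
    # Skip the assumed leading positional arguments (here: self and uuid).
    for p in params[skip:]:
        if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD):
            continue  # *args / **kwargs have no meaningful config key
        # Assumption: arguments without a default are represented as None.
        config[p.name] = None if p.default is p.empty else p.default
    return config


class ToyCachedElement:
    """Hypothetical element type with two configurable constructor arguments."""

    def __init__(self, uuid, cache_dir, pickle_protocol=-1):
        self.uuid = uuid
        self.cache_dir = cache_dir
        self.pickle_protocol = pickle_protocol


default_cfg = sketch_default_config(ToyCachedElement)
```

In this sketch `cache_dir` has no default, so the resulting dictionary is not directly usable for construction, which matches the caveat above.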
- classmethod get_many_vectors(descriptors: Iterable[DescriptorElement]) List[ndarray | None][source]
Get the vectors associated with the given descriptors.
- Note:
Most subclasses should override internal method _get_many_vectors rather than this external wrapper function. If a subclass does override this classmethod, it is responsible for appropriately handling any valid DescriptorElement, regardless of subclass.
- Parameters:
descriptors – Iterable of descriptors to query for.
- Returns:
Iterable of vectors associated with the given descriptors or None if the descriptor has no associated vector. Results are returned in the order that descriptors were given.
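The ordering and `None` behaviour described above can be illustrated with a small sketch. `ToyElement` and `sketch_get_many_vectors` are hypothetical, plain lists stand in for `numpy.ndarray`, and the `vector()` accessor is an assumption made for the example.

```python
from typing import Hashable, Iterable, List, Optional


class ToyElement:
    """Hypothetical element holding a UID and an optional vector."""

    def __init__(self, uuid: Hashable, vec: Optional[List[float]] = None) -> None:
        self._uuid = uuid
        self._vec = vec

    def has_vector(self) -> bool:
        return self._vec is not None

    def vector(self) -> Optional[List[float]]:
        return self._vec


def sketch_get_many_vectors(
    descriptors: Iterable[ToyElement],
) -> List[Optional[List[float]]]:
    # One output slot per input element, in input order;
    # None marks an element with no stored vector.
    return [d.vector() if d.has_vector() else None for d in descriptors]


vectors = sketch_get_many_vectors(
    [ToyElement("a", [1.0, 2.0]), ToyElement("b"), ToyElement("c", [3.0])]
)
```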
- abstract has_vector() bool[source]
- Returns:
Whether or not this container currently has a descriptor vector stored.
- abstract set_vector(new_vec: ndarray) DescriptorElement[source]
Set the contained vector.
If this container already stores a descriptor vector, this will overwrite it.
- Parameters:
new_vec – New vector to contain.
- Returns:
Self.
- class smqtk_descriptors.interfaces.descriptor_set.DescriptorSet[source]
Index of descriptors, keyed and query-able by descriptor UUID.
Note that these indexes do not use the descriptor type strings. Thus, if a set of descriptors has multiple elements with the same UUID but different type strings, they will overwrite each other in these indexes. In such a case, when dealing with descriptors from different generators, it is advisable to use multiple indexes.
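The collision described in the note can be shown with plain dictionaries standing in for indexes. The generator names, the UUID `"img-42"`, and the vectors are made up for illustration.

```python
from typing import Dict, Hashable, List, Tuple

# (generator name, vector) pairs; both describe the same source item "img-42".
from_generator_a: Tuple[str, List[float]] = ("generator_a", [0.1, 0.9])
from_generator_b: Tuple[str, List[float]] = ("generator_b", [7.0, 3.0, 1.0])

# One shared index keyed only by UUID: the second write replaces the first.
shared: Dict[Hashable, Tuple[str, List[float]]] = {}
shared["img-42"] = from_generator_a
shared["img-42"] = from_generator_b

# One index per generator: both descriptors survive under the same UUID.
per_generator: Dict[str, Dict[Hashable, Tuple[str, List[float]]]] = {
    "generator_a": {},
    "generator_b": {},
}
per_generator["generator_a"]["img-42"] = from_generator_a
per_generator["generator_b"]["img-42"] = from_generator_b
```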
- abstract add_descriptor(descriptor: DescriptorElement) None[source]
Add a descriptor to this index.
Adding the same descriptor multiple times should not add multiple copies of the descriptor in the index (based on UUID). Added descriptors overwrite indexed descriptors based on UUID.
- Parameters:
descriptor – Descriptor to index.
- abstract add_many_descriptors(descriptors: Iterable[DescriptorElement]) None[source]
Add multiple descriptors at one time.
Adding the same descriptor multiple times should not add multiple copies of the descriptor in the index (based on UUID). Added descriptors overwrite indexed descriptors based on UUID.
- Parameters:
descriptors – Iterable of descriptor instances to add to this index.
- abstract descriptors() Iterator[DescriptorElement][source]
Return an iterator over indexed descriptor element instances.
- abstract get_descriptor(uuid: Hashable) DescriptorElement[source]
Get the descriptor in this index that is associated with the given UUID.
- Parameters:
uuid – UUID of the DescriptorElement to get.
- Raises:
KeyError – The given UUID doesn’t associate to a DescriptorElement in this index.
- Returns:
DescriptorElement associated with the queried UUID.
- abstract get_many_descriptors(uuids: Iterable[Hashable]) Iterator[DescriptorElement][source]
Get an iterator over descriptors associated to given descriptor UUIDs.
- Parameters:
uuids – Iterable of descriptor UUIDs to query for.
- Raises:
KeyError – A given UUID doesn’t associate with a DescriptorElement in this index.
- Returns:
Iterator of descriptors associated to given uuid values.
- get_many_vectors(uuids: Iterable[Hashable]) List[ndarray | None][source]
Get underlying vectors of descriptors associated with given uuids.
- Parameters:
uuids – Iterable of descriptor UUIDs to query for.
- Raises:
KeyError – There is no descriptor in this set for one or more of the input UUIDs.
- Returns:
List of vectors for descriptors associated with given uuid values.
- abstract has_descriptor(uuid: Hashable) bool[source]
Check if a DescriptorElement with the given UUID exists in this index.
- Parameters:
uuid – UUID to query for
- Returns:
True if a DescriptorElement with the given UUID exists in this index, or False if not.
- abstract items() Iterator[Tuple[Hashable, DescriptorElement]][source]
Return an iterator over indexed descriptor key and instance pairs.
- iterdescriptors() Iterator[DescriptorElement][source]
Deprecated alias for descriptors
- iteritems() Iterator[Tuple[Hashable, DescriptorElement]][source]
Deprecated alias for items
- abstract keys() Iterator[Hashable][source]
Return an iterator over indexed descriptor keys, which are their UUIDs.
- class smqtk_descriptors.descriptor_element_factory.DescriptorElementFactory(d_type: Type[DescriptorElement], type_config: Dict[str, Any])[source]
Factory class for producing DescriptorElement instances of a specified type and configuration.
- classmethod from_config(config_dict: Dict, merge_default: bool = True) T[source]
Instantiate a new instance of this class given the configuration JSON-compliant dictionary encapsulating initialization arguments.
This method should not be called via super unless an instance of the class is desired.
- Parameters:
config_dict – JSON compliant dictionary encapsulating a configuration.
merge_default – Merge the given configuration on top of the default provided by
get_default_config.
- Returns:
Constructed instance from the provided config.
- get_config() Dict[str, Any][source]
Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.
In most cases, this involves naming the keys of the dictionary after the initialization argument names, as if the dictionary were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some constructor parameters as configuration values (i.e. they must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.
- Returns:
JSON type compliant configuration dictionary.
- Return type:
dict
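The round trip between get_config and from_config can be sketched as follows. `ToyConfigurable` and its parameters are hypothetical, and `from_config` is reduced here to plain dictionary expansion, without the default-merging step that `merge_default` refers to.

```python
import json
from typing import Any, Dict


class ToyConfigurable:
    """Hypothetical class whose config keys mirror its constructor arguments."""

    def __init__(self, cache_dir: str = "cache", normalize: bool = False) -> None:
        self.cache_dir = cache_dir
        self.normalize = normalize

    def get_config(self) -> Dict[str, Any]:
        # Keys are named after the constructor arguments.
        return {"cache_dir": self.cache_dir, "normalize": self.normalize}

    @classmethod
    def from_config(cls, config_dict: Dict[str, Any]) -> "ToyConfigurable":
        # Pass the configuration to the constructor via dictionary expansion.
        return cls(**config_dict)


original = ToyConfigurable(cache_dir="descriptors", normalize=True)
serialized = json.dumps(original.get_config())  # values must be JSON-compliant
restored = ToyConfigurable.from_config(json.loads(serialized))
```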
- classmethod get_default_config() Dict[str, Any][source]
Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.
It is not guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.
- Returns:
Default configuration dictionary for the class.
- new_descriptor(uuid: Hashable) DescriptorElement[source]
Create a new DescriptorElement instance of the configured implementation
- Parameters:
uuid – UUID to associate with the descriptor
- Returns:
New DescriptorElement instance
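The factory pattern described above can be sketched as an object that remembers an element type and its configuration, then only needs a UUID to build each new element. `ToyElement`, `ToyElementFactory`, and the `scale` parameter are hypothetical. Constructing via `d_type(uuid, **type_config)` is an assumption of this sketch, not a statement about how the library constructs elements.

```python
from typing import Any, Dict, Hashable, Type


class ToyElement:
    """Hypothetical element type with one configurable constructor argument."""

    def __init__(self, uuid: Hashable, scale: float = 1.0) -> None:
        self.uuid = uuid
        self.scale = scale


class ToyElementFactory:
    """Hypothetical factory bound to one element type and configuration."""

    def __init__(self, d_type: Type[ToyElement], type_config: Dict[str, Any]) -> None:
        self._d_type = d_type
        # Copy so later edits to the caller's dict do not affect the factory.
        self._type_config = dict(type_config)

    def new_descriptor(self, uuid: Hashable) -> ToyElement:
        # Every element shares the stored configuration; only the UUID varies.
        return self._d_type(uuid, **self._type_config)


factory = ToyElementFactory(ToyElement, {"scale": 2.5})
first = factory.new_descriptor("a")
second = factory.new_descriptor("b")
```

Binding the type and configuration once keeps calling code independent of which element implementation is in use.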