src.util.dataclass module

Extensions to dataclasses, for streamlined class definition.

class src.util.dataclass.RegexPatternBase[source]

Bases: object

Dummy parent class for RegexPattern and ChainedRegexPattern.

class src.util.dataclass.RegexPattern(regex, defaults=None, input_field=None, match_error_filter=None)[source]

Bases: collections.UserDict, src.util.dataclass.RegexPatternBase

Wraps re.Pattern with more convenience methods. Extracts values of named fields from a string by parsing it with a regex with named capture groups, and stores those values in a dict.
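The underlying idea can be sketched with the standard library alone. This is a minimal illustration, not the RegexPattern implementation: the pattern, the input string and the helper name parse_fields are all hypothetical, and only re.compile, fullmatch and groupdict are real library calls.

```python
import re

# Hypothetical regex with named capture groups, for illustration only.
pattern = re.compile(r"(?P<model>\w+)_(?P<year>\d{4})\.nc")


def parse_fields(s, defaults=None):
    """Return a dict of named-group values, starting from optional defaults."""
    fields = dict(defaults or {})
    m = pattern.fullmatch(s)
    if m is None:
        raise ValueError(f"no match for {s!r}")
    # Overwrite defaults only with groups that actually took part in the match.
    fields.update({k: v for k, v in m.groupdict().items() if v is not None})
    return fields


result = parse_fields("CESM_1999.nc", defaults={"freq": "mon"})
```

All matched values are strings at this stage; converting them to other types is a separate step.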

clear()[source]

Erase an existing match.

_update_fields()[source]
update_defaults(d)[source]

Update the default values used for the match with the values in d.

match(str_, *args)[source]
_validate_match(match_obj)[source]

Hook for post-processing of match, running after all fields are assigned but before final check that all fields are set.

_abc_impl = <_abc_data object>
class src.util.dataclass.RegexPatternWithTemplate(regex, defaults=None, input_field=None, match_error_filter=None, template=None, log=<Logger src.util.dataclass (WARNING)>)[source]

Bases: src.util.dataclass.RegexPattern

Adds formatted output to RegexPattern.

Parameters
  • template – str, optional. Template string used by the format() method to format the contents of the match. The values of the matched fields are substituted using the {}-syntax of Python string formatting.

Other arguments are the same as for RegexPattern.
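As a rough sketch of what template substitution means here, the snippet below feeds the named groups of a standard-library match into str.format. The regex, input and template text are hypothetical, and this does not use RegexPatternWithTemplate itself.

```python
import re

# Hypothetical regex and input string, for illustration only.
m = re.fullmatch(r"(?P<variable>\w+)-(?P<level>\d+)", "temp-500")

# Field names in braces refer to the named capture groups.
template = "{variable} at {level} hPa"
formatted = template.format(**m.groupdict())
```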

format()[source]
_abc_impl = <_abc_data object>
class src.util.dataclass.ChainedRegexPattern(*string_patterns, defaults=None, input_field=None, match_error_filter=None)[source]

Bases: src.util.dataclass.RegexPatternBase

Class which takes an ‘or’ of multiple RegexPatterns. Matches are attempted on the supplied RegexPatterns in order, with the first one that succeeds determining the returned answer. Public methods work the same as on RegexPattern.
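The "first successful pattern wins" behavior can be illustrated with plain re patterns. This is a sketch under stated assumptions: the patterns and the helper name match_first are hypothetical and are not part of this module.

```python
import re

# Hypothetical patterns, tried in order from most to least specific.
patterns = [
    re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})"),
    re.compile(r"(?P<year>\d{4})"),
]


def match_first(s):
    """Return the named groups of the first pattern that matches s."""
    for p in patterns:
        m = p.fullmatch(s)
        if m is not None:
            return m.groupdict()
    raise ValueError(f"no pattern matched {s!r}")


full = match_first("1999-07")
year_only = match_first("1999")
```

Because the order of the patterns decides the result, more specific patterns generally belong before more permissive ones.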

property is_matched
property data
clear()[source]
_update_fields()[source]
update_defaults(d)[source]
match(str_, *args)[source]
format()[source]
src.util.dataclass._mdtf_dataclass_get_field_types(obj, f)[source]

Common functionality for _mdtf_dataclass_type_coercion() and _mdtf_dataclass_type_check(). Given a dataclasses.Field object f, return either a tuple consisting of the type its value should be coerced to and a tuple of the valid types its value can have, or (None, None) to signal a case we don’t handle.

src.util.dataclass._mdtf_dataclass_type_coercion(self, log)[source]

Do type coercion on all dataclass fields after the auto-generated __init__ method, but before any __post_init__ method.

Warning

The type checking logic used is specific to the typing module in Python 3.7. It may or may not work on newer Python versions, and definitely will not work with 3.5 or 3.6. See https://stackoverflow.com/a/52664522.

src.util.dataclass._mdtf_dataclass_type_check(self, log)[source]

Do type checking on all dataclass fields after __init__ and __post_init__ methods.

Warning

The type checking logic used is specific to the typing module in Python 3.7. It may or may not work on newer Python versions, and definitely will not work with 3.5 or 3.6. See https://stackoverflow.com/a/52664522.

src.util.dataclass.mdtf_dataclass(cls=None, **deco_kwargs)[source]

Wrap the dataclass() class decorator to customize dataclasses to provide (very) rudimentary type checking and conversion. This is hacky, since dataclasses don’t enforce type annotations for their fields. A better solution would be to use a deserialization library like pydantic.

After the auto-generated __init__ and the class’ __post_init__, the following tasks are performed:

  1. Verify that mandatory fields have values specified. We have to work around the usual dataclass() way of doing this, because it leads to errors in the signature of the dataclass-generated __init__ method under inheritance (mandatory fields can’t come after optional fields.) Mandatory fields must be designated by setting their default to MANDATORY, and a DataclassParseError is raised here if mandatory fields are uninitialized.

  2. Check each field’s value to see if it’s consistent with known type info. If not, attempt to coerce it to that type, using a from_struct method if it exists. Raise DataclassParseError if this fails.
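The check-then-coerce step in item 2 might look roughly like the following. This is a simplified sketch, not the module's code: ParseErrorSketch stands in for DataclassParseError, coerce_fields is a hypothetical helper, and only fields annotated with plain classes are handled (typing generics are skipped).

```python
import dataclasses


class ParseErrorSketch(Exception):
    """Stand-in for the module's DataclassParseError."""


def coerce_fields(obj):
    """Coerce each field of obj to its annotated type, if that is a plain class."""
    for f in dataclasses.fields(obj):
        value = getattr(obj, f.name)
        # Skip annotations this sketch can't handle, and values already valid.
        if not isinstance(f.type, type) or isinstance(value, f.type):
            continue
        # Prefer a from_struct method if the target type defines one.
        convert = getattr(f.type, "from_struct", f.type)
        try:
            setattr(obj, f.name, convert(value))
        except (TypeError, ValueError) as exc:
            raise ParseErrorSketch(
                f"can't coerce {f.name}={value!r} to {f.type.__name__}"
            ) from exc


@dataclasses.dataclass
class Level:
    value: int = 0
    label: str = ""

    def __post_init__(self):
        coerce_fields(self)


lev = Level(value="500", label=850)
```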

Warning

Unlike dataclass(), all fields must have a default or default_factory defined. Fields which are mandatory must have their default value set to the sentinel object MANDATORY.
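To show why a sentinel default helps under inheritance, here is a minimal sketch using only the standard dataclasses module. The MANDATORY object, ParseErrorSketch and check_mandatory below are local stand-ins defined for this example, not the module's own objects.

```python
import dataclasses

# Local stand-in sentinel; the module provides its own MANDATORY object.
MANDATORY = object()


class ParseErrorSketch(Exception):
    """Stand-in for the module's DataclassParseError."""


def check_mandatory(obj):
    """Raise if any field still holds the MANDATORY sentinel."""
    for f in dataclasses.fields(obj):
        if getattr(obj, f.name) is MANDATORY:
            raise ParseErrorSketch(f"no value supplied for mandatory field {f.name}")


@dataclasses.dataclass
class Base:
    name: str = MANDATORY
    units: str = "K"


@dataclasses.dataclass
class Child(Base):
    # Without a default, this field would follow the optional 'units' field
    # in the generated __init__ signature, which dataclass() rejects.
    level: int = MANDATORY

    def __post_init__(self):
        check_mandatory(self)


ok = Child(name="ta", level=500)
```

Calling Child(name="ta") without a level raises ParseErrorSketch in this sketch, because the sentinel is still in place when the check runs.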

src.util.dataclass.is_regex_dataclass(obj)[source]
src.util.dataclass._regex_dataclass_preprocess_kwargs(self, kwargs)[source]

Edit kwargs going to the auto-generated __init__ method of this dataclass. If any fields are regex_dataclasses, construct and parse their values first.

Raises a DataclassParseError if different regex_dataclasses (at any level of inheritance) try to assign different values to a field of the same name. We do this by assigning to a ConsistentDict.
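The consistency check can be pictured as a dict that refuses conflicting reassignment. The class below is a hypothetical illustration of that idea, raising ValueError; it is not the ConsistentDict used by this module, whose interface may differ.

```python
class ConsistentDictSketch(dict):
    """Dict that allows re-assigning a key only with an equal value."""

    def __setitem__(self, key, value):
        if key in self and self[key] != value:
            raise ValueError(
                f"conflicting values for {key!r}: {self[key]!r} != {value!r}"
            )
        super().__setitem__(key, value)


# Merge field values parsed from two hypothetical sources. Items are assigned
# one by one, because dict.update() would bypass the overridden __setitem__.
merged = ConsistentDictSketch()
for source in ({"year": "1999", "month": "07"}, {"year": "1999", "day": "04"}):
    for key, value in source.items():
        merged[key] = value
```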

src.util.dataclass.regex_dataclass(pattern, **deco_kwargs)[source]

Decorator for a dataclass that adds a from_string classmethod which creates instances of that dataclass by parsing an input string with a RegexPattern or ChainedRegexPattern. The values of all fields returned by the match() method of the pattern are passed to the __init__ method of the dataclass as kwargs.

Additionally, if the type of one or more fields is set to a class that’s also been decorated with regex_dataclass, the parsing logic for that field’s regex_dataclass will be invoked on that field’s value (i.e., a string obtained by regex matching in this regex_dataclass), and the parsed values of those fields will be supplied to this regex_dataclass constructor. This is our implementation of composition for regex_dataclasses.

Note

Unlike mdtf_dataclass(), type coercion is done after __post_init__ for these dataclasses. This is necessary due to composition: if a regex_dataclass is being instantiated as a field of another regex_dataclass, all values being passed to it will be strings (the regex fields), and type coercion is the job of __post_init__.
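A stripped-down version of the from_string idea is sketched below, assuming a plain regex string in place of a RegexPattern. The decorator name regex_dataclass_sketch and the YearMonth class are hypothetical; composition of nested regex dataclasses is left out entirely.

```python
import dataclasses
import re


def regex_dataclass_sketch(regex):
    """Decorator factory: make cls a dataclass and add a from_string classmethod."""
    pattern = re.compile(regex)

    def decorator(cls):
        cls = dataclasses.dataclass(cls)

        def from_string(klass, s):
            m = pattern.fullmatch(s)
            if m is None:
                raise ValueError(f"{s!r} doesn't match {pattern.pattern!r}")
            # Named groups become keyword arguments to __init__, as strings.
            return klass(**m.groupdict())

        cls.from_string = classmethod(from_string)
        return cls

    return decorator


@regex_dataclass_sketch(r"(?P<year>\d{4})(?P<month>\d{2})")
class YearMonth:
    year: int = 0
    month: int = 0

    def __post_init__(self):
        # Values arrive as strings from the regex; convert them here.
        self.year = int(self.year)
        self.month = int(self.month)


ym = YearMonth.from_string("199907")
```

Here the string-to-int conversion lives in __post_init__, which mirrors the point made in the note: the regex only ever produces strings.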

src.util.dataclass.dataclass_factory(dataclass_decorator, class_name, *parents, **kwargs)[source]

Function that returns a dataclass (i.e., a decorated class) whose fields are the union of the fields specified in its parent classes.

Parameters
  • dataclass_decorator – decorator to apply to the new class.

  • class_name – name of the new class.

  • parents – collection of other mdtf_dataclasses to inherit from. Order in the collection determines the MRO.

  • kwargs – optional; arguments to pass to dataclass_decorator when it’s applied to produce the returned class.
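One plausible way to build such a class is to create an empty subclass with type() and decorate it. The sketch below assumes that approach and uses the standard dataclass decorator; the function name dataclass_factory_sketch and the parent classes are hypothetical, and the real function may do more.

```python
import dataclasses


def dataclass_factory_sketch(decorator, class_name, *parents, **kwargs):
    """Create an empty subclass of parents and apply decorator to it."""
    new_cls = type(class_name, parents, {})
    return decorator(new_cls, **kwargs)


@dataclasses.dataclass
class Named:
    name: str = ""


@dataclasses.dataclass
class WithUnits:
    units: str = "1"


Combined = dataclass_factory_sketch(
    dataclasses.dataclass, "Combined", Named, WithUnits
)
combined_field_names = {f.name for f in dataclasses.fields(Combined)}
instance = Combined(name="ta", units="K")
```

The new class defines no fields of its own, so its fields are exactly those inherited from the parents, and the parent order fixes the MRO.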

src.util.dataclass.filter_dataclass(d, dc, init=False)[source]

Return a dict of the subset of fields or entries in d that correspond to the fields in dataclass dc.

Parameters
  • d – dict, dataclass or dataclass instance.

  • dc – dataclass or dataclass instance.

  • init – bool or ‘all’, default False.

    • If False: Include only the fields of dc (as returned by dataclasses.fields()).

    • If True: Include only the arguments to dc’s constructor (i.e., include any init-only fields and exclude any of dc’s fields with init=False).

    • If ‘all’: Include the union of the above two options.

Returns: dict containing the subset of key:value pairs from d such that the keys are included in the set of dc’s fields specified by the value of init.
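For the simplest case (d is a dict and init=False), the filtering amounts to a key lookup against dataclasses.fields(). The helper filter_fields_sketch and the Coordinate class are hypothetical; the init=True and ‘all’ modes are not covered by this sketch.

```python
import dataclasses


def filter_fields_sketch(d, dc):
    """Keep only the entries of dict d whose keys are field names of dc."""
    names = {f.name for f in dataclasses.fields(dc)}
    return {k: v for k, v in d.items() if k in names}


@dataclasses.dataclass
class Coordinate:
    name: str = ""
    units: str = ""


filtered = filter_fields_sketch(
    {"name": "lat", "units": "degrees_north", "comment": "ignored"}, Coordinate
)
```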

src.util.dataclass.coerce_to_dataclass(d, dc, **kwargs)[source]

Given a dataclass dc (may be the class or an instance of it), and a dict, dataclass or dataclass instance d, return an instance of dc’s class with field values initialized from those in d, along with any extra values passed in kwargs.
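A rough sketch of this conversion follows, assuming that values in kwargs take precedence over those in d and that entries of d not accepted by the constructor are dropped. Both assumptions, and the names coerce_sketch, Source and Target, are illustrative only.

```python
import dataclasses


def coerce_sketch(d, dc, **kwargs):
    """Build an instance of dc's class from the matching entries of d."""
    cls = dc if isinstance(dc, type) else type(dc)
    if dataclasses.is_dataclass(d) and not isinstance(d, type):
        # Shallow conversion of a dataclass instance to a dict.
        d = {f.name: getattr(d, f.name) for f in dataclasses.fields(d)}
    init_names = {f.name for f in dataclasses.fields(cls) if f.init}
    values = {k: v for k, v in d.items() if k in init_names}
    values.update(kwargs)  # assumption: explicit kwargs win over d
    return cls(**values)


@dataclasses.dataclass
class Source:
    name: str = ""
    units: str = ""
    comment: str = ""


@dataclasses.dataclass
class Target:
    name: str = ""
    units: str = ""


converted = coerce_sketch(Source("ta", "K", "air temperature"), Target, units="degC")
```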