caoscrawler.converters.transformer_converters module
Converters for transforming text elements. They provide functionality similar to that of the text transformers.
- class caoscrawler.converters.transformer_converters.SplitTextConverter(definition, *args, **kwargs)
Bases: _BaseTransformTextConverter
Splits the given TextElement into a list of TextElements, based on the separator given in the definition. Valid keys for the separator are “sep”, “separator”, “marker”, and “split_on”. Example for usage:
    - …
    - text_to_split:
        type: SplitTextConverter
        sep: ";"
        match_name: "ALIASES"
        match_value: (?P<text_to_split_value>.*)
        subtree:
          - list_entry:
              type: TextElement
              …
- create_children(generalStore: GeneralStore, element: StructureElement)
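The splitting behavior described for SplitTextConverter can be illustrated with a minimal plain-Python sketch. It does not use caoscrawler and is not the library's implementation: the names SEPARATOR_KEYS, find_separator, and split_text are hypothetical, and the order in which the keys are checked is an assumption.

```python
# Hypothetical illustration only; not part of caoscrawler.
# The four key names are the ones listed in the class description above.
SEPARATOR_KEYS = ("sep", "separator", "marker", "split_on")


def find_separator(definition: dict) -> str:
    """Return the separator stored under the first accepted key that is present."""
    for key in SEPARATOR_KEYS:
        if key in definition:
            return definition[key]
    raise KeyError(f"definition needs one of {SEPARATOR_KEYS}")


def split_text(definition: dict, text: str) -> list:
    """Split the text on the separator named in the definition."""
    return text.split(find_separator(definition))


# A definition mirroring the YAML example, reduced to the relevant keys.
definition = {"type": "SplitTextConverter", "sep": ";"}
parts = split_text(definition, "alpha;beta;gamma")
print(parts)
```

In the real converter each resulting piece would presumably become a child TextElement that the converters under subtree can match. The sketch only shows the string-level split.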
- class caoscrawler.converters.transformer_converters.TransformTextConverter(definition, *args, **kwargs)
Bases: _BaseTransformTextConverter
Applies the specified text transformer to the given TextElement. The transformer name should be given in options as “transformer”. If the transformer needs parameters, these may be supplied in one of “params”, “parameters”, or “arguments”. Example for usage:
    - …
    - text_to_transform:
        type: TransformTextConverter
        transformer: "replace"
        parameters:
          old: ";"
          new: ","
        match_name: ".*"
        match_value: (?P<original_text>.*)
        subtree:
          - transformed_text:
              type: TextElement
              …
- create_children(generalStore: GeneralStore, element: StructureElement)
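The lookup of a transformer by name, together with its optional parameters, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the caoscrawler code: TRANSFORMERS, PARAMETER_KEYS, replace, and apply_transformer are hypothetical names, and the actual converter may resolve transformers and pass parameters differently.

```python
# Hypothetical illustration only; not part of caoscrawler.
# The three key names are the ones listed in the class description above.
PARAMETER_KEYS = ("params", "parameters", "arguments")


def replace(text: str, old: str, new: str) -> str:
    """Stand-in for a "replace" transformer: swap every `old` for `new`."""
    return text.replace(old, new)


# Assumed registry that maps transformer names to callables.
TRANSFORMERS = {"replace": replace}


def apply_transformer(definition: dict, text: str) -> str:
    """Look up the named transformer and call it with the supplied parameters."""
    transformer = TRANSFORMERS[definition["transformer"]]
    params = {}
    for key in PARAMETER_KEYS:
        if key in definition:
            params = definition[key]
            break
    return transformer(text, **params)


# A definition mirroring the YAML example, reduced to the relevant keys.
definition = {
    "transformer": "replace",
    "parameters": {"old": ";", "new": ","},
}
result = apply_transformer(definition, "a;b;c")
print(result)
```

Keeping the parameters as a mapping lets the same definition layout serve transformers with different argument names. Here the keys old and new are simply forwarded as keyword arguments.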