caoscrawler.transformers.substitution_transformers module

Transformer functions for argument substitution.

See https://docs.linkahead.org for more information.

caoscrawler.transformers.substitution_transformers.contentless_list_to_none(in_value: list, params: dict)

Checks the given list for meaningful content. If it contains only None, the empty string, or empty iterables (checked recursively), None is returned; otherwise the original value is returned.

Parameters:
  • in_value (list) – The list to be checked for content.

  • params (dict) – No parameters are expected.

Returns:

result – The original list if it contains anything other than None, the empty string, or empty iterables; otherwise None.

Return type:

list or None
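
The recursive check described above can be sketched as follows. This is a standalone illustration, not the library's implementation: the names `_has_content` and `contentless_list_to_none_sketch` are hypothetical, and treating only lists, tuples, and sets as "iterables" (and treating values such as 0 or False as content) is an assumption.

```python
from typing import Any, Optional


def _has_content(value: Any) -> bool:
    """Return True if `value` holds anything other than None, "", or empty iterables."""
    if value is None:
        return False
    if isinstance(value, str):
        # Only the empty string counts as contentless.
        return value != ""
    if isinstance(value, (list, tuple, set)):
        # Assumption: these container types are the "iterables" checked recursively.
        return any(_has_content(item) for item in value)
    # Assumption: any other value (numbers, booleans, objects) counts as content.
    return True


def contentless_list_to_none_sketch(in_value: list, params: dict) -> Optional[list]:
    """Hypothetical re-implementation of the documented behaviour; `params` is unused."""
    return in_value if _has_content(in_value) else None
```

With this sketch, `[None, "", [], [[None, ""]]]` would be reduced to None, while `[None, "a"]` would be returned unchanged.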

caoscrawler.transformers.substitution_transformers.submatch(in_value: Any, in_parameters: dict)

Alias for substitute_on_match.

caoscrawler.transformers.substitution_transformers.substitute_on_match(in_value: str, params: dict)

If the given string matches params.pattern, returns params.substitute_with. Otherwise, the original input string is returned.

Parameters:
  • in_value (str) – Text to be matched.

  • params (dict) –

    “pattern”:

    The regex pattern to match against. Alternative keys: “match”

    “substitute_with”:

    The value to return if the input value matches the given pattern. Alternative keys: “then”, “new_value”

Returns:

result – The substitute given in params if the input value matches the given pattern, otherwise the original input value.

Return type:

Any
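
A minimal sketch of the match-then-substitute behaviour, including the alternative parameter keys listed above. It is an illustration only: `_first_present` and `substitute_on_match_sketch` are hypothetical names, and the use of `re.match` (anchored at the start of the string) is an assumption, since the documentation does not say which regex matching function the library uses.

```python
import re
from typing import Any, Iterable


def _first_present(params: dict, keys: Iterable[str]) -> Any:
    """Return the value of the first key in `keys` that exists in `params`."""
    keys = tuple(keys)
    for key in keys:
        if key in params:
            return params[key]
    raise KeyError(f"Expected one of the keys {keys} in params.")


def substitute_on_match_sketch(in_value: str, params: dict) -> Any:
    """Hypothetical re-implementation of the documented behaviour."""
    pattern = _first_present(params, ("pattern", "match"))
    substitute = _first_present(params, ("substitute_with", "then", "new_value"))
    # Assumption: a match at the start of the string counts as "matches".
    if re.match(pattern, in_value):
        return substitute
    return in_value
```

Because the substitute can be any value, the return type is not limited to strings. For example, `{"pattern": r"^n/a$", "substitute_with": None}` would turn the input `"n/a"` into None in this sketch and leave other strings untouched.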

caoscrawler.transformers.substitution_transformers.substitute_with_dict(in_value: Any, params: dict)

If the given input is a valid key in params.mapping, returns params.mapping[in_value]. Otherwise, returns in_value.

Parameters:
  • in_value (Any) – The value to potentially replace.

  • params (dict) –

    “mapping”:

    Dict with the keys being possible in_values, and values being their replacements. Alternative keys: “vocabulary”, “dictionary”

Returns:

result – params.mapping[in_value] if in_value is in params.mapping, in_value otherwise.

Return type:

Any
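
The dictionary lookup with fallback can be sketched as below. Again, this is a standalone illustration rather than the library's code: `substitute_with_dict_sketch` is a hypothetical name, and the order in which the alternative keys are tried is an assumption.

```python
from typing import Any


def substitute_with_dict_sketch(in_value: Any, params: dict) -> Any:
    """Hypothetical re-implementation of the documented behaviour."""
    mapping = None
    # Assumption: "mapping" takes precedence over its alternative keys.
    for key in ("mapping", "vocabulary", "dictionary"):
        if key in params:
            mapping = params[key]
            break
    if mapping is None:
        raise KeyError("Expected 'mapping', 'vocabulary' or 'dictionary' in params.")
    # Replace the value only if it is a key of the mapping.
    if in_value in mapping:
        return mapping[in_value]
    return in_value
```

For instance, with `{"mapping": {"yes": True, "no": False}}` the input `"yes"` would become True, while `"maybe"` would be passed through unchanged.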