`onconova.research.compilers`

`DATASET_ROOT_FIELDS` `module-attribute` ¶

`AggregationNode(key, annotation_nodes=list(), nested_aggregation_nodes=list(), aggregated_model=None, aggregated_model_parent_related_name=None)` `dataclass` ¶

Represents an aggregation node with a key and an associated list of annotation nodes and/or nested aggregation nodes.

Attributes:

Name	Type	Description
`key`	`str`	The unique identifier for the aggregation.
`annotation_nodes`	`List[AnnotationNode]`	The annotations associated with the aggregation.
`nested_aggregation_nodes`	`List[AggregationNode]`	The nested aggregations associated with the aggregation.
`aggregated_model`	`Model`	The Django model that the aggregation operates on.
`aggregated_model_parent_related_name`	`str`	The related name linking the aggregated model to its parent model.

`aggregated_model` `class-attribute` `instance-attribute` ¶

`aggregated_model_parent_related_name` `class-attribute` `instance-attribute` ¶

`aggregated_subquery` `property` ¶

`annotation_nodes` `class-attribute` `instance-attribute` ¶

`annotations` `property` ¶

Returns a dictionary of annotations for the aggregation.

The returned dictionary contains the keys of the annotations as specified in the annotation nodes and the values are the corresponding Django ORM expressions.

If the aggregation node has nested aggregation nodes, their annotations are also included in the returned dictionary.
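A minimal sketch of how such a merged dictionary could be assembled, using plain strings in place of Django ORM expressions. The simplified classes, the resource names (`neoplastic_entities`, `stagings`), and the choice to store one placeholder entry per nested node are assumptions made for illustration; the real property may key or build nested entries differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AnnotationNode:
    key: str
    expression: Any  # stand-in for a Django ORM expression


@dataclass
class AggregationNode:
    key: str
    annotation_nodes: List[AnnotationNode] = field(default_factory=list)
    nested_aggregation_nodes: List["AggregationNode"] = field(default_factory=list)

    @property
    def annotations(self) -> Dict[str, Any]:
        # The node's own annotation nodes map key -> expression.
        merged = {node.key: node.expression for node in self.annotation_nodes}
        # Assumption: each non-empty nested node contributes one entry
        # under its own key (a placeholder string in this sketch).
        for nested in self.nested_aggregation_nodes:
            if nested.annotations:
                merged[nested.key] = f"<subquery for {nested.key}>"
        return merged


# Hypothetical resource names, used only for illustration.
staging = AggregationNode("stagings", [AnnotationNode("stage", "F('stage')")])
neoplasm = AggregationNode(
    "neoplastic_entities",
    [AnnotationNode("morphology", "F('morphology')")],
    [staging],
)
result = neoplasm.annotations
```

Here `result` holds one entry for the node's own annotation and one for the nested aggregation.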

`key` `instance-attribute` ¶

`nested_aggregation_nodes` `class-attribute` `instance-attribute` ¶

`subquery` `property` ¶

Returns a subquery that can be used to create a Django ORM expression which will annotate a queryset with the aggregated results of the annotation nodes.

The returned subquery aggregates the annotations of the annotation nodes and returns a single value of type JSONB which contains the aggregated results.

The subquery is constructed by annotating the aggregated model with a JSONB object that contains the aggregated results of the annotation nodes. The subquery is then filtered to include only the related objects specified by `aggregated_model_parent_related_name`.

The subquery is annotated with a single field named related_json_object which contains the JSONB object with the aggregated results.

If the aggregation node has nested aggregation nodes, their annotations are also included in the returned JSONB object.

Raises:

Type	Description
`AttributeError`	If the aggregated model or its parent related name is not set.
`AttributeError`	If the aggregation node has no annotations.
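The two failure modes listed above can be pictured as precondition checks that run before any subquery is built. The sketch below is an assumption about the shape of those checks, not the library's code: `NodeSketch`, `check_subquery_preconditions`, and the error messages are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class NodeSketch:
    """Hypothetical stand-in carrying only the fields the checks need."""

    key: str
    annotations: Dict[str, Any] = field(default_factory=dict)
    aggregated_model: Optional[type] = None
    aggregated_model_parent_related_name: Optional[str] = None

    def check_subquery_preconditions(self) -> None:
        # A subquery needs both the model and the link back to its parent.
        if not (self.aggregated_model and self.aggregated_model_parent_related_name):
            raise AttributeError(
                f"Node '{self.key}' needs an aggregated model and its related name"
            )
        # Without annotations there is nothing to aggregate.
        if not self.annotations:
            raise AttributeError(f"Node '{self.key}' has no annotations")


def error_for(node: NodeSketch) -> Optional[str]:
    """Return the error message raised for a node, or None if it is valid."""
    try:
        node.check_subquery_preconditions()
    except AttributeError as exc:
        return str(exc)
    return None


class FakeModel:  # placeholder for a Django model class
    pass


missing_model = error_for(NodeSketch("therapies", {"intent": "F('intent')"}))
missing_annotations = error_for(NodeSketch("therapies", {}, FakeModel, "case"))
valid = error_for(NodeSketch("therapies", {"intent": "F('intent')"}, FakeModel, "case"))
```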

`add_annotation_node(key, expression)` ¶

Adds an annotation node to the current aggregation node.

The annotation node is constructed from the given key and expression.

The added annotation node is included in the annotations of the current aggregation node.

Parameters:

Name	Type	Description	Default
`key` ¶	`str`	The key to use for the annotation node.	required
`expression` ¶	`Expression`	The expression to use for the annotation node.	required

Returns:

Type	Description
`None`	None

Source code in onconova/research/compilers.py

def add_annotation_node(self, key: str, expression: Expression) -> None:
    """
    Adds an annotation node to the current aggregation node.

    The annotation node is constructed from the given key and expression.

    The added annotation node is included in the annotations of the
    current aggregation node.

    Args:
        key: The key to use for the annotation node.
        expression: The expression to use for the annotation node.

    Returns:
        None
    """
    self.annotation_nodes.append(AnnotationNode(key, expression))

`add_nested_aggregation_node(node)` ¶

Adds a nested aggregation node to the current aggregation node.

The added nested aggregation node is included in the annotations of the current aggregation node.

Parameters:

Name	Type	Description	Default
`node` ¶	`AggregationNode`	The nested aggregation node to add.	required

Returns:

Type	Description
`None`	None

Source code in onconova/research/compilers.py

def add_nested_aggregation_node(self, node: "AggregationNode") -> None:
    """
    Adds a nested aggregation node to the current aggregation node.

    The added nested aggregation node is included in the annotations of the
    current aggregation node.

    Args:
        node: The nested aggregation node to add.

    Returns:
        None
    """
    self.nested_aggregation_nodes.append(node)
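Both methods append to lists owned by the node, so each node needs its own list instances. The signature above renders the defaults as `list()`; in a Python dataclass this is normally written with `field(default_factory=list)`, because a bare mutable default is rejected by `dataclasses`. The sketch below uses simplified stand-in classes and hypothetical resource names to show that two nodes do not share state.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class AnnotationNode:
    key: str
    expression: Any  # stand-in for a Django ORM expression


@dataclass
class AggregationNode:
    key: str
    # default_factory gives every instance a fresh list.
    annotation_nodes: List[AnnotationNode] = field(default_factory=list)
    nested_aggregation_nodes: List["AggregationNode"] = field(default_factory=list)

    def add_annotation_node(self, key: str, expression: Any) -> None:
        self.annotation_nodes.append(AnnotationNode(key, expression))

    def add_nested_aggregation_node(self, node: "AggregationNode") -> None:
        self.nested_aggregation_nodes.append(node)


# Hypothetical resource names, used only for illustration.
parent = AggregationNode("therapies")
sibling = AggregationNode("surgeries")

parent.add_annotation_node("intent", "F('intent')")
parent.add_nested_aggregation_node(AggregationNode("medications"))
```

After these calls only `parent` has changed; `sibling` still has empty lists.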

`AnnotationCompiler(rules)` ¶

Compiles a list of dataset rules into an aggregation tree and generates the corresponding Django ORM annotations.

The tree is built by grouping rules by their resource models and creating an AggregationNode for each group. Annotation nodes are then added to the corresponding AggregationNode. The tree is built recursively by processing child rules for each node.

Parameters:

Name	Type	Description	Default
`rules` ¶	`List[DatasetRule]`	A list of dataset rules	required

Source code in onconova/research/compilers.py

def __init__(self, rules: List[DatasetRule]):
    """
    Initializes the AnnotationCompiler with a list of dataset rules.

    Args:
        rules: A list of dataset rules
    """
    self.rules = [DatasetRuleProcessor(rule) for rule in rules]
    self.aggregation_nodes: List[AggregationNode] = self._build_aggregation_tree(
        self.rules
    )
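The grouping step described for the aggregation tree can be sketched without Django. `_build_aggregation_tree` is not shown on this page, so the following is only an illustration of grouping rules by resource; `RuleSketch`, `group_rules_by_resource`, and the resource and field names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RuleSketch:
    """Hypothetical, reduced stand-in for a dataset rule."""

    resource: str
    field: str


def group_rules_by_resource(rules: Iterable[RuleSketch]) -> Dict[str, List[str]]:
    """Collect the requested fields of each resource, preserving rule order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for rule in rules:
        grouped[rule.resource].append(rule.field)
    return dict(grouped)


rules = [
    RuleSketch("PatientCase", "gender"),
    RuleSketch("NeoplasticEntity", "morphology"),
    RuleSketch("NeoplasticEntity", "topography"),
]
grouped = group_rules_by_resource(rules)
```

Each group would then correspond to one aggregation node, with one annotation node per field.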

`aggregation_nodes` `instance-attribute` ¶

`rules` `instance-attribute` ¶

`generate_annotations()` ¶

Generates the Django ORM annotations for the dataset.

The annotations are generated by traversing the aggregation tree and building a dictionary of annotations. The dictionary contains the annotations in one of three forms.

Case 1: PatientCase properties at root of dataset. The key is the name of the property and the value is the Django ORM expression for the property.

Case 2: Nested resources. The key is the name of the nested resource and the value is the subquery for the nested resource.

Case 3: Simple annotations. The key is the name of the annotation and the value is the Django ORM expression for the annotation.

Returns:

Type	Description
`tuple[dict, list]`	A tuple of two elements. The first element is a dictionary of annotations and the second element is a list of field names.

Source code in onconova/research/compilers.py

def generate_annotations(self) -> Tuple[Dict[str, Expression], List[str]]:
    """
    Generates the Django ORM annotations for the dataset.

    The annotations are generated by traversing the aggregation tree
    and building a dictionary of annotations. The dictionary contains
    the annotations in one of three forms.

    Case 1: PatientCase properties at root of dataset. The key is the name
    of the property and the value is the Django ORM expression for the
    property.

    Case 2: Nested resources. The key is the name of the nested resource and
    the value is the subquery for the nested resource.

    Case 3: Simple annotations. The key is the name of the annotation and the
    value is the Django ORM expression for the annotation.

    Returns:
        (tuple[dict, list]): A tuple of two elements. The first element is a dictionary of annotations and the second element is a list of field names.
    """
    annotations = {}
    queryset_fields = ["pseudoidentifier"]
    for aggregation_node in self.aggregation_nodes:
        # Case 1: PatientCase properties at root of dataset
        if not aggregation_node.key:
            for annotation_node in aggregation_node.annotation_nodes:
                if annotation_node.key not in DATASET_ROOT_FIELDS:
                    annotations[annotation_node.key] = annotation_node.expression

                if annotation_node.key not in queryset_fields:
                    queryset_fields.append(annotation_node.key)
        elif aggregation_node.annotations:
            aggregation_node.key = aggregation_node.key + "_resources"
            annotations[aggregation_node.key] = aggregation_node.aggregated_subquery
            queryset_fields.append(aggregation_node.key)
    # Remove duplicates
    return annotations, queryset_fields
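The traversal above can be traced with plain dictionaries standing in for aggregation nodes. In this sketch `ROOT_FIELDS` is a hypothetical value for `DATASET_ROOT_FIELDS` (its real contents are not shown on this page), and subqueries are replaced by placeholder strings. Note also that the listed method reassigns `aggregation_node.key`, so a second call on the same compiler would appear to append the `_resources` suffix again; the sketch uses a local variable instead.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical contents; the real DATASET_ROOT_FIELDS is not shown here.
ROOT_FIELDS = {"pseudoidentifier", "age"}


def generate(nodes: List[dict]) -> Tuple[Dict[str, Any], List[str]]:
    annotations: Dict[str, Any] = {}
    fields = ["pseudoidentifier"]
    for node in nodes:
        if not node["key"]:
            # Root node: fields already on the case need no annotation,
            # but every requested key is selected exactly once.
            for key, expression in node["annotations"].items():
                if key not in ROOT_FIELDS:
                    annotations[key] = expression
                if key not in fields:
                    fields.append(key)
        elif node["annotations"]:
            # Nested resource: one subquery under a suffixed key.
            output_key = node["key"] + "_resources"
            annotations[output_key] = f"<subquery:{node['key']}>"
            fields.append(output_key)
    return annotations, fields


nodes = [
    {"key": "", "annotations": {"age": "F('age')", "gender": "F('gender')"}},
    {"key": "neoplastic_entities", "annotations": {"morphology": "F('morphology')"}},
    {"key": "surgeries", "annotations": {}},  # skipped: nothing to aggregate
]
annotations, fields = generate(nodes)
```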

`AnnotationNode(key, expression)` `dataclass` ¶

Represents an annotation node with a key and an associated expression.

Attributes:

Name	Type	Description
`key`	`str`	The unique identifier for the annotation.
`expression`	`Expression`	The Django ORM expression associated with the annotation.

`expression` `instance-attribute` ¶

`key` `instance-attribute` ¶

`DatasetRuleProcessingError` ¶

Bases: RuntimeError

`DatasetRuleProcessor(rule)` ¶

Processes individual dataset rules and extracts necessary query information.

Source code in onconova/research/compilers.py

def __init__(self, rule: DatasetRule):
    self.schema_field = rule.field
    # Get the schema specified by the rule
    schema = self._get_schema(rule.resource.value)
    self.resource_model = self._get_orm_model(schema)
    # Resolve the parent model of the resource
    if self.resource_model == PatientCase:
        self.parent_model = None
    elif hasattr(self.resource_model, "case"):
        self.parent_model = PatientCase
    else:
        self.parent_model = next(
            (
                field.related_model
                for field in self.resource_model._meta.get_fields()
                if field.related_model and hasattr(field.related_model, "case")
            )
        )
    # Get other values
    self.model_field_name = self._get_model_field_name(schema)
    self.model_field = self._get_model_field(
        self.resource_model, self.model_field_name
    )
    self.value_transformer = self._get_transformer(rule.transform)
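The parent-model resolution in the constructor follows three branches, which the sketch below reproduces with plain classes. The model classes, the `related_models` argument (standing in for `_meta.get_fields()`), and the function name are hypothetical. One deliberate difference: the listed code calls `next(...)` without a default, which raises `StopIteration` when no related model has a `case` attribute, whereas this sketch returns `None` in that situation.

```python
from typing import Iterable, Optional


class PatientCase:
    pass


class Therapy:
    case = "foreign key to PatientCase"  # placeholder attribute


class Dosage:
    therapy = "foreign key to Therapy"  # no direct link to the case


def resolve_parent(model: type, related_models: Iterable[type]) -> Optional[type]:
    # The root resource has no parent.
    if model is PatientCase:
        return None
    # Resources linked directly to a case are children of PatientCase.
    if hasattr(model, "case"):
        return PatientCase
    # Otherwise use the first related model that is itself linked to a case.
    return next((m for m in related_models if hasattr(m, "case")), None)


root_parent = resolve_parent(PatientCase, [])
direct_parent = resolve_parent(Therapy, [PatientCase])
indirect_parent = resolve_parent(Dosage, [Therapy])
unresolved = resolve_parent(Dosage, [])
```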

`annotation_key` `property` ¶

Returns a unique key used in dataset query annotations.

`field_annotation` `property` ¶

Returns the Django ORM annotation for this dataset field.

`model_field` `instance-attribute` ¶

`model_field_name` `instance-attribute` ¶

`parent_model` `instance-attribute` ¶

`parent_related_name` `property` ¶

`query_lookup_path` `property` ¶

Generates the Django ORM lookup path for querying the dataset field.

`related_model_annotation_key` `property` ¶

Determines the Django ORM lookup for related models.

`resource_model` `instance-attribute` ¶

`schema_field` `instance-attribute` ¶

`value_transformer` `instance-attribute` ¶

`QueryCompiler(cohort, rules)` ¶

Compiles a dataset query based on user-defined rules.

QueryCompiler takes a cohort and a set of rules as input, and returns a QuerySet representing the dataset for that cohort.

Attributes:

Name	Type	Description
`cohort`	`Cohort`	The cohort to generate the dataset for.
`rule_compiler`	`AnnotationCompiler`	The compiler built from the user-defined rules for generating the dataset.

Source code in onconova/research/compilers.py

def __init__(self, cohort, rules: List[DatasetRule]):
    self.cohort = cohort
    self.rule_compiler = AnnotationCompiler(rules)

`cohort` `instance-attribute` ¶

`rule_compiler` `instance-attribute` ¶

`compile()` ¶

Compiles a QuerySet based on the rules provided.

Returns:

Name	Type	Description
`QuerySet`	`QuerySet`	The dataset for the cohort

Source code in onconova/research/compilers.py

def compile(self) -> QuerySet:
    """
    Compiles a QuerySet based on the rules provided

    Returns:
        QuerySet: The dataset for the cohort
    """
    annotations, queryset_fields = self.rule_compiler.generate_annotations()
    return self.cohort.valid_cases.annotate(**annotations).values(*queryset_fields)
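The final line chains `annotate` and `values`: the annotations become computed columns, and the field list selects which columns appear in each result row. The sketch below replaces the Django queryset with a hypothetical `RecordingQuerySet` that only records the calls it receives, so the order and arguments of the chain can be inspected without a database. The annotation and field values are placeholders.

```python
from typing import Any, List, Optional, Tuple


class RecordingQuerySet:
    """Hypothetical stand-in that records chained calls instead of querying."""

    def __init__(self, calls: Optional[List[Tuple[str, Any]]] = None):
        self.calls = calls or []

    def annotate(self, **annotations: Any) -> "RecordingQuerySet":
        # Like a queryset, return a new object rather than mutating self.
        return RecordingQuerySet(self.calls + [("annotate", sorted(annotations))])

    def values(self, *fields: str) -> "RecordingQuerySet":
        return RecordingQuerySet(self.calls + [("values", list(fields))])


# Placeholder output in the shape returned by generate_annotations().
annotations = {"gender": "<expression>", "therapies_resources": "<subquery>"}
queryset_fields = ["pseudoidentifier", "gender", "therapies_resources"]

valid_cases = RecordingQuerySet()
dataset = valid_cases.annotate(**annotations).values(*queryset_fields)
```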

`construct_dataset(cohort, rules)` ¶

Compiles a QuerySet based on the rules provided.

Parameters:

Name	Type	Description	Default
`cohort` ¶	`Cohort`	The cohort to generate the dataset for	required
`rules` ¶	`List[DatasetRule]`	The user-defined rules for generating the dataset	required

Returns:

Name	Type	Description
`QuerySet`	`QuerySet`	The dataset for the cohort

Source code in onconova/research/compilers.py

def construct_dataset(cohort, rules: List[DatasetRule]) -> QuerySet:
    """
    Compiles a QuerySet based on the rules provided

    Args:
        cohort (onconova.cohorts.models.Cohort): The cohort to generate the
            dataset for
        rules (List[DatasetRule]): The user-defined rules for generating the
            dataset

    Returns:
        QuerySet: The dataset for the cohort
    """
    return QueryCompiler(cohort, rules).compile()

2025-10-172025-10-17runner
