onconova.research.models.cohort

Cohort

Bases: BaseModel

Represents a cohort of patient cases within a research project.

Attributes:

Name Type Description
objects QueryablePropertiesManager

Custom manager for queryable properties.

name CharField

Name of the cohort.

cases ManyToManyField[PatientCase]

Patient cases composing the cohort.

include_criteria JSONField

JSON object defining inclusion criteria for cohort membership.

exclude_criteria JSONField

JSON object defining exclusion criteria for cohort membership.

manual_choices ManyToManyField[PatientCase]

Manually added patient cases.

frozen_set ManyToManyField[PatientCase]

Cases that are frozen and not updated by criteria.

population AnnotationProperty

Annotated count of cases in the cohort.

project ForeignKey[Project]

Project to which the cohort is associated.

cases class-attribute instance-attribute

description property

Returns a string describing the cohort, including its name and the number of cases.

Returns:

Type Description
str

A formatted string with the cohort name and case count.

exclude_criteria class-attribute instance-attribute

frozen_set class-attribute instance-attribute

include_criteria class-attribute instance-attribute

manual_choices class-attribute instance-attribute

name class-attribute instance-attribute

objects class-attribute instance-attribute

population class-attribute instance-attribute

project class-attribute instance-attribute

valid_cases property

Returns a queryset of cases with a consent status marked as VALID.

Filters the related cases to include only those where the consent_status is set to PatientCase.ConsentStatus.VALID.

Returns:

Type Description
QuerySet[PatientCase]

A queryset of valid PatientCase instances.

get_cohort_trait_average(cases, trait, **filters) staticmethod

Calculates the average and standard deviation of a specified trait for a given queryset of cases, optionally applying additional filters.

Parameters:

Name Type Description Default

cases

QuerySet[PatientCase]

A Django queryset representing the cohort of cases.

required

trait

str

The name of the trait/field to aggregate.

required

filters

dict

Optional keyword arguments to filter the queryset.

{}

Returns:

Type Description
Tuple[float, float] | None

A tuple containing the average and standard deviation of the trait, or None if the filtered queryset is empty.

Source code in onconova/research/models/cohort.py
@staticmethod
def get_cohort_trait_average(
    cases, trait: str, **filters
) -> Tuple[float, float] | None:
    """
    Calculates the average and standard deviation of a specified trait for a given queryset of cases,
    optionally applying additional filters.

    Args:
        cases (QuerySet[PatientCase]): A Django queryset representing the cohort of cases.
        trait (str): The name of the trait/field to aggregate.
        filters (dict): Optional keyword arguments to filter the queryset.

    Returns:
        Tuple[float, float] | None: A tuple containing the average and standard deviation of the trait,
        or None if the filtered queryset is empty.
    """
    if filters:
        cases = cases.filter(**filters)
    if not cases.exists():
        return None
    queryset = cases.aggregate(Avg(trait), StdDev(trait))
    return queryset[f"{trait}__avg"], queryset[f"{trait}__stddev"]
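To illustrate the shape of the result without a database, here is a minimal pure-Python sketch of the same computation. It assumes Django's default for `StdDev` (`sample=False`, i.e. the population standard deviation); `trait_average` and the sample values are hypothetical and not part of onconova.

```python
from statistics import fmean, pstdev
from typing import Optional, Sequence, Tuple


def trait_average(values: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Hypothetical stand-in for the Avg/StdDev aggregation over a list of values."""
    if not values:
        # Mirrors the early return for an empty (filtered) queryset.
        return None
    # pstdev is the population standard deviation (assumed to match StdDev's default).
    return fmean(values), pstdev(values)


# Hypothetical trait values for eight cases.
result = trait_average([2, 4, 4, 4, 5, 5, 7, 9])
print(result)
```

Callers should check for `None` before unpacking the tuple, since an empty selection yields no statistics at all rather than a pair of zeros.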

get_cohort_trait_counts(cases, trait, anonymization=None, **filters) staticmethod

Calculates the counts and percentage distribution of a specified trait within a cohort of cases.

Parameters:

Name Type Description Default

cases

QuerySet[PatientCase]

A Django QuerySet of case objects (the method calls QuerySet methods such as annotate on it).

required

trait

str

The name of the trait/field to count within the cases.

required

anonymization

callable

A function to anonymize trait values. Defaults to None.

None

filters

dict

Additional keyword arguments to filter the cases QuerySet.

{}

Returns:

Type Description
dict

An OrderedDict mapping trait values (as strings) to a tuple of (count, percentage), where percentage is rounded to 4 decimal places.

Source code in onconova/research/models/cohort.py
@staticmethod
def get_cohort_trait_counts(
    cases, trait: str, anonymization=None, **filters
) -> dict:
    """
    Calculates the counts and percentage distribution of a specified trait within a cohort of cases.

    Args:
        cases (QuerySet[PatientCase]): A Django QuerySet or iterable of case objects.
        trait (str): The name of the trait/field to count within the cases.
        anonymization (callable, optional): A function to anonymize trait values. Defaults to None.
        filters (dict): Additional keyword arguments to filter the cases QuerySet.

    Returns:
        (dict): An OrderedDict mapping trait values (as strings) to a tuple of (count, percentage),
              where percentage is rounded to 4 decimal places.
    """
    if filters:
        cases = cases.filter(**filters)
    if not cases:
        return OrderedDict()
    values = cases.annotate(trait=F(trait)).values_list("trait", flat=True)
    if anonymization:
        values = [anonymization(value) if value else value for value in values]
    return OrderedDict(
        [
            (str(key), (count, round(count / len(values) * 100.0, 4)))
            for key, count in Counter(values).items()
        ]
    )
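The counting and percentage logic above can be sketched on a plain list of trait values. This is an illustrative reimplementation, not the library's API: `trait_counts` and the sample data are hypothetical, and the database access (`annotate`/`values_list`) is replaced by an in-memory list.

```python
from collections import Counter, OrderedDict
from typing import Any, Callable, Iterable, Optional


def trait_counts(
    values: Iterable[Any], anonymization: Optional[Callable[[Any], Any]] = None
) -> OrderedDict:
    """Hypothetical stand-in: count trait values and compute their percentages."""
    values = list(values)
    if not values:
        return OrderedDict()
    if anonymization:
        # Falsy values (None, 0, "") are passed through unchanged, as in the source.
        values = [anonymization(v) if v else v for v in values]
    total = len(values)
    return OrderedDict(
        (str(key), (count, round(count / total * 100.0, 4)))
        for key, count in Counter(values).items()
    )


# Plain counting: keys are stringified, so a missing value becomes "None".
by_gender = trait_counts(["female", "male", "female", None])

# Anonymization by binning a numeric trait (hypothetical binning function).
by_age_bin = trait_counts([34, 61, 72], lambda age: "<50" if age < 50 else ">=50")
```

Because the anonymization function is applied before counting, values that map to the same anonymized label are merged into a single entry.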

get_cohort_trait_median(cases, trait, **filters) staticmethod

Calculates the median and interquartile range (IQR) for a specified trait within a cohort.

Parameters:

Name Type Description Default

cases

QuerySet[PatientCase]

A Django QuerySet representing the cohort of cases.

required

trait

str

The name of the trait/field to compute statistics for.

required

filters

dict

Optional keyword arguments to filter the cases QuerySet.

{}

Returns:

Type Description
Tuple[float, Tuple[float, float]] | None

A tuple containing the median value and a tuple of the 25th and 75th percentiles (the bounds of the IQR) for the trait. Returns None if no cases match the filters.

Source code in onconova/research/models/cohort.py
@staticmethod
def get_cohort_trait_median(
    cases, trait: str, **filters
) -> Tuple[float, Tuple[float, float]] | None:
    """
    Calculates the median and interquartile range (IQR) for a specified trait within a cohort.

    Args:
        cases (QuerySet[PatientCase]): A Django QuerySet representing the cohort of cases.
        trait (str): The name of the trait/field to compute statistics for.
        filters (dict): Optional keyword arguments to filter the cases QuerySet.

    Returns:
        Optional[Tuple[float, Tuple[float, float]]]: 
            A tuple containing the median value and a tuple of the 25th and 75th percentiles (IQR) for the trait.
            Returns None if no cases match the filters.
    """
    if filters:
        cases = cases.filter(**filters)
    if not cases.exists():
        return None
    queryset = cases.aggregate(
        Median(trait), Percentile25(trait), Percentile75(trait)
    )
    median = queryset[f"{trait}__median"]
    iqr = (queryset[f"{trait}__p25"], queryset[f"{trait}__p75"])
    return median, iqr
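The return structure can be illustrated with a small pure-Python sketch. The definitions of `Median`, `Percentile25`, and `Percentile75` are not shown here, so this sketch assumes they interpolate linearly between data points (as a continuous percentile would); `trait_median` and the sample values are hypothetical.

```python
from statistics import median, quantiles
from typing import Optional, Sequence, Tuple


def trait_median(
    values: Sequence[float],
) -> Optional[Tuple[float, Tuple[float, float]]]:
    """Hypothetical stand-in returning (median, (25th percentile, 75th percentile))."""
    if not values:
        # Mirrors the early return when no cases match the filters.
        return None
    if len(values) == 1:
        # A single value is its own median and quartiles.
        only = float(values[0])
        return only, (only, only)
    # method="inclusive" interpolates linearly between the minimum and maximum
    # (assumption: this matches the behaviour of the database aggregates).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return float(median(values)), (q1, q3)


stats = trait_median([1, 2, 3, 4, 5])
print(stats)
```

Note that the second element holds the two quartile bounds rather than their difference; the width of the IQR is `iqr[1] - iqr[0]`.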

update_cohort_cases()

Updates the cohort's cases based on inclusion and exclusion criteria.

  • If a frozen set of cases exists, returns those cases.
  • If neither inclusion nor exclusion criteria are provided, returns an empty list.
  • Otherwise, filters PatientCase objects according to the inclusion criteria, then excludes cases matching the exclusion criteria.
  • Manually selected cases are added to the cohort.
  • The resulting set of cases is assigned to the cohort.

Returns:

Type Description
QuerySet | list

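The frozen set of cases, or an empty list when no criteria are defined. In the source shown below, the criteria-based path assigns the result to the cohort's cases without an explicit return.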

Source code in onconova/research/models/cohort.py
def update_cohort_cases(self):
    """
    Updates the cohort's cases based on inclusion and exclusion criteria.

    - If a frozen set of cases exists, returns those cases.
    - If neither inclusion nor exclusion criteria are provided, returns an empty list.
    - Otherwise, filters PatientCase objects according to the inclusion criteria,
      then excludes cases matching the exclusion criteria.
    - Manually selected cases are added to the cohort.
    - The resulting set of cases is assigned to the cohort.

    Returns:
        (QuerySet | list): The updated set of cohort cases.
    """
    from onconova.research.schemas.cohort import CohortRuleset

    if self.frozen_set.exists():
        return self.frozen_set.all()

    if not self.include_criteria and not self.exclude_criteria:
        return []

    cohort = PatientCase.objects.all()

    if self.include_criteria:
        query = CohortRuleset.model_validate(
            self.include_criteria
        ).convert_to_query()
        cohort = cohort.filter(next(query)).distinct()

    if self.exclude_criteria:
        query = CohortRuleset.model_validate(
            self.exclude_criteria
        ).convert_to_query()
        cohort = cohort.exclude(next(query)).distinct()

    cohort = cohort.union(self.manual_choices.all())
    self.cases.set(cohort)
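The order of operations above (frozen set first, then inclusion, then exclusion, then manual additions) can be sketched with plain Python sets. This is a simplified model under stated assumptions: `resolve_cohort_cases`, the predicate functions, and the case identifiers are hypothetical, criteria are modelled as callables rather than `CohortRuleset` queries, and the sketch returns the resolved set where the method assigns it through `self.cases.set`.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Union

Predicate = Callable[[str], bool]


def resolve_cohort_cases(
    all_cases: Iterable[str],
    include: Optional[Predicate] = None,
    exclude: Optional[Predicate] = None,
    manual: Iterable[str] = (),
    frozen: Iterable[str] = (),
) -> Union[Set[str], List[str]]:
    """Hypothetical model of the cohort resolution order."""
    frozen_set = set(frozen)
    if frozen_set:
        # A frozen cohort short-circuits everything else.
        return frozen_set
    if include is None and exclude is None:
        # No criteria at all: nothing is selected.
        return []
    selected = set(all_cases)
    if include is not None:
        selected = {case for case in selected if include(case)}
    if exclude is not None:
        selected = {case for case in selected if not exclude(case)}
    # Manual choices are added last, so exclusion criteria do not remove them.
    return selected | set(manual)


# Hypothetical case identifiers mapped to patient ages.
ages: Dict[str, int] = {"A": 45, "B": 62, "C": 70, "D": 30}
cohort = resolve_cohort_cases(
    ages,
    include=lambda case: ages[case] >= 40,
    exclude=lambda case: ages[case] >= 65,
    manual=["D"],
)
```

One consequence of this ordering is that a manually chosen case stays in the cohort even when it matches the exclusion criteria.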

Dataset

Bases: BaseModel

Represents a dataset within a research project.

Attributes:

Name Type Description
name CharField

The name of the dataset.

summary TextField

A brief summary of the dataset (optional).

rules JSONField

Composition rules for the dataset, validated as a list.

project ForeignKey[Project]

Reference to the associated Project.

last_export AnnotationProperty

Timestamp of the last export event.

total_exports AnnotationProperty

Total number of export events.

cohorts_ids AnnotationProperty

List of cohort IDs associated with export events.

cohorts_ids class-attribute instance-attribute

description property

Returns a string describing the dataset.

Returns:

Type Description
str

A formatted description of the dataset.

last_export class-attribute instance-attribute

name class-attribute instance-attribute

project class-attribute instance-attribute

rules class-attribute instance-attribute

summary class-attribute instance-attribute

total_exports class-attribute instance-attribute

save(*args, **kwargs)

Saves the current instance after validating its rules.

This method performs the following steps:

  1. Imports the DatasetRule schema for rule validation.
  2. Ensures that the 'rules' attribute is a list; raises ValueError if not.
  3. Validates each rule in the 'rules' list using DatasetRule.model_validate.
  4. Calls the superclass's save method to persist the instance.

Parameters:

Name Type Description Default

args

list

Variable length argument list passed to the superclass save method.

()

kwargs

dict

Arbitrary keyword arguments passed to the superclass save method.

{}

Raises:

Type Description
ValueError

If 'rules' is not a list.

ValidationError

If any rule fails validation via DatasetRule.model_validate.

Source code in onconova/research/models/dataset.py
def save(self, *args, **kwargs):
    """
    Saves the current instance after validating its rules.

    This method performs the following steps:
    1. Imports the DatasetRule schema for rule validation.
    2. Ensures that the 'rules' attribute is a list; raises ValueError if not.
    3. Validates each rule in the 'rules' list using DatasetRule.model_validate.
    4. Calls the superclass's save method to persist the instance.

    Args:
        args (list): Variable length argument list passed to the superclass save method.
        kwargs (dict): Arbitrary keyword arguments passed to the superclass save method.

    Raises:
        ValueError: If 'rules' is not a list.
        ValidationError: If any rule fails validation via DatasetRule.model_validate.
    """
    from onconova.research.schemas.dataset import DatasetRule

    # Validate the rules
    if not isinstance(self.rules, list):
        raise ValueError("Rules must be a valid list")
    for rule in self.rules:
        DatasetRule.model_validate(rule)
    super().save(*args, **kwargs)
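The validate-before-save pattern can be sketched without Django or the real schema. In this minimal example, `validate_rules` and `require_resource` are hypothetical stand-ins (the latter replaces `DatasetRule.model_validate`, whose actual fields are not shown here), and the `"resource"` key is an assumption made purely for illustration.

```python
from typing import Any, Callable


def validate_rules(rules: Any, validate_rule: Callable[[Any], Any]) -> None:
    """Hypothetical stand-in for the checks performed before persisting."""
    if not isinstance(rules, list):
        raise ValueError("Rules must be a valid list")
    for rule in rules:
        # Any exception raised by the validator propagates and aborts the save.
        validate_rule(rule)


def require_resource(rule: Any) -> None:
    """Hypothetical validator: a rule must be a dict with a 'resource' key."""
    if not isinstance(rule, dict) or "resource" not in rule:
        raise ValueError(f"Invalid rule: {rule!r}")


def is_valid(rules: Any) -> bool:
    """Return True when the rules pass validation, False otherwise."""
    try:
        validate_rules(rules, require_resource)
    except ValueError:
        return False
    return True


ok = is_valid([{"resource": "PatientCase"}])
not_a_list = is_valid({"resource": "PatientCase"})
bad_rule = is_valid([{"field": "age"}])
```

Because validation runs before the superclass save, an invalid rule prevents the instance from being written at all.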