onconova.research.schemas.analysis
AnalysisMetadata
Bases: Schema
Schema representing metadata for an analysis performed on a cohort.
Attributes:
Name | Type | Description |
---|---|---|
cohortId | str | The ID of the cohort for which the analysis was performed. |
analyzedAt | datetime | The datetime at which the analysis was performed. |
cohortPopulation | int | The effective number of valid patient cases in the cohort used for the analysis. |
AnalysisMetadataMixin
Mixin class that provides metadata handling for analysis objects.
Attributes:
Name | Type | Description |
---|---|---|
metadata | Optional[AnalysisMetadata] | Metadata for the analysis, including cohort information and analysis timestamp. |
Methods:
Name | Description |
---|---|
add_metadata | add_metadata(cohort: Cohort) -> Self. Populates the metadata attribute with information from the provided cohort, such as cohort ID, analysis time, and population size. |
metadata class-attribute instance-attribute
add_metadata(cohort)
Adds metadata information to the analysis based on the provided cohort.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cohort | Cohort | The cohort object containing data to populate metadata fields. | required |
Returns:
Type | Description |
---|---|
Self | The instance of the analysis with updated metadata. |
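The mixin pattern described here can be sketched roughly as follows. This is a minimal illustration that uses plain dataclasses as a stand-in for the project's Schema base class. `CohortStub`, its `id` and `population` fields, and `DemoAnalysis` are hypothetical names introduced only for this example; the real Cohort model may expose this information differently.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CohortStub:
    """Hypothetical stand-in for the real Cohort model."""
    id: str
    population: int  # assumed: number of valid patient cases


@dataclass
class AnalysisMetadata:
    cohortId: str
    analyzedAt: datetime
    cohortPopulation: int


class AnalysisMetadataMixin:
    metadata: Optional[AnalysisMetadata] = None

    def add_metadata(self, cohort: CohortStub) -> "AnalysisMetadataMixin":
        # Record which cohort was analyzed, when, and how many cases it held.
        self.metadata = AnalysisMetadata(
            cohortId=str(cohort.id),
            analyzedAt=datetime.now(timezone.utc),
            cohortPopulation=cohort.population,
        )
        return self  # returning self allows call chaining


class DemoAnalysis(AnalysisMetadataMixin):
    """Hypothetical analysis class used only for this example."""


analysis = DemoAnalysis().add_metadata(CohortStub(id="cohort-1", population=42))
print(analysis.metadata.cohortId, analysis.metadata.cohortPopulation)
```

Returning the instance lets a caller build an analysis and attach its metadata in a single expression.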
Source code in onconova/research/schemas/analysis.py
CategorizedSurvivals
Bases: Schema, AnalysisMetadataMixin
Schema for categorizing progression-free survival (PFS) data within a cohort based on therapy-related groupings.
Attributes:
Name | Type | Description |
---|---|---|
survivals | Dict[str, List[float]] | A dictionary mapping category names (e.g., drug combinations or therapy classifications) to lists of progression-free survival values. |
survivals instance-attribute
calculate(cohort, therapyLine, categorization) classmethod
Calculates survival statistics for a given cohort based on therapy line and categorization.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cohort | Cohort | The cohort of patients to analyze. | required |
therapyLine | str | The therapy line to consider for the analysis. | required |
categorization | str | The categorization method, either "drugs" or "therapies". | required |
Returns:
Type | Description |
---|---|
CategorizedSurvivals | An instance of the class with calculated survivals based on the specified categorization. |
Notes:
- If categorization is "drugs", survivals are calculated by combination therapy.
- If categorization is "therapies", survivals are calculated by therapy classification.
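The two categorization modes can be illustrated with a small grouping sketch. The flat record layout, the `categorize_survivals` helper, and the sample values below are made up for illustration. The library itself works on a Cohort and its therapy-line data, and its behavior for an unsupported categorization is not documented here, so the `ValueError` is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical flat records: (drug names, therapy classification, PFS in months).
Record = Tuple[Tuple[str, ...], str, float]


def categorize_survivals(records: List[Record], categorization: str) -> Dict[str, List[float]]:
    if categorization not in ("drugs", "therapies"):
        # Assumption: the real method may handle this case differently.
        raise ValueError(f"Unsupported categorization: {categorization!r}")
    survivals: Dict[str, List[float]] = defaultdict(list)
    for drugs, classification, pfs in records:
        # "drugs" groups by the drug combination, "therapies" by classification.
        key = "/".join(sorted(drugs)) if categorization == "drugs" else classification
        survivals[key].append(pfs)
    return dict(survivals)


# Made-up sample data.
records = [
    (("cisplatin", "etoposide"), "chemotherapy", 6.5),
    (("etoposide", "cisplatin"), "chemotherapy", 8.0),
    (("pembrolizumab",), "immunotherapy", 14.2),
]
by_drugs = categorize_survivals(records, "drugs")
by_therapies = categorize_survivals(records, "therapies")
print(by_drugs)
print(by_therapies)
```

Sorting the drug names before joining them makes the same combination map to the same key regardless of the order in which the drugs were recorded.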
Source code in onconova/research/schemas/analysis.py
Distribution
Bases: Schema, AnalysisMetadataMixin
Represents a statistical distribution of trait counts within a cohort.
Attributes:
Name | Type | Description |
---|---|---|
items | List[CohortTraitCounts] | The entries in the distribution, each representing a category and its associated counts and percentage. |
items class-attribute instance-attribute
calculate(cohort, property) classmethod
Calculates the distribution of a specified property within a cohort.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cohort | Cohort | The cohort to analyze. | required |
property | str | The property to calculate the distribution for. Supported properties: "age", "ageAtDiagnosis", "gender", "neoplasticSites", "vitalStatus". | required |
Returns:
Type | Description |
---|---|
Distribution | An instance of Distribution containing items with category, counts, and percentage for each property value. |
Raises:
Type | Description |
---|---|
KeyError | If the specified property is not supported. |
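As a rough illustration of the counting involved, the sketch below computes category counts and percentages over plain dictionaries. The `distribution` helper and the sample cases are hypothetical. Expressing percentages on a 0 to 100 scale, rounding to four decimals, and treating every property as single-valued are assumptions of this sketch rather than documented behavior.

```python
from collections import Counter
from typing import Any, Dict, List

# Property names listed in the documentation above.
SUPPORTED = {"age", "ageAtDiagnosis", "gender", "neoplasticSites", "vitalStatus"}


def distribution(cases: List[Dict[str, Any]], prop: str) -> List[Dict[str, Any]]:
    if prop not in SUPPORTED:
        raise KeyError(prop)  # mirrors the documented error for unsupported properties
    counts = Counter(str(case.get(prop)) for case in cases)
    total = sum(counts.values())
    # most_common() yields categories from most to least frequent.
    return [
        {"category": category, "counts": n, "percentage": round(100 * n / total, 4)}
        for category, n in counts.most_common()
    ]


# Made-up sample data.
cases = [{"gender": "female"}, {"gender": "female"}, {"gender": "male"}, {"gender": "female"}]
items = distribution(cases, "gender")
print(items)
```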
Source code in onconova/research/schemas/analysis.py
KaplanMeierCurve
Bases: Schema, AnalysisMetadataMixin
Schema representing a Kaplan-Meier survival curve, including survival probabilities and confidence intervals.
Attributes:
Name | Type | Description |
---|---|---|
months | List[float] | List of time points (in months) for survival probability estimates. |
probabilities | List[float] | Survival probabilities at each time point. |
lowerConfidenceBand | List[float] | Lower bound of the survival probability confidence interval at each time point. |
upperConfidenceBand | List[float] | Upper bound of the survival probability confidence interval at each time point. |
lowerConfidenceBand class-attribute instance-attribute
months class-attribute instance-attribute
probabilities class-attribute instance-attribute
upperConfidenceBand class-attribute instance-attribute
calculate(survivals, confidence_level=0.95) classmethod
Performs Kaplan-Meier analysis to estimate survival probabilities and confidence intervals (95% by default) and initializes a Kaplan-Meier curve.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
survivals | List[Optional[float]] | Array containing the number of months survived for each patient. | required |
confidence_level | float | Confidence level for the confidence interval. | 0.95 |
Returns:
Name | Type | Description |
---|---|---|
KaplanMeierCurve | KaplanMeierCurve | Instance containing the computed survival curve and confidence bands. |
Raises:
Type | Description |
---|---|
ValueError | If the input survivals list is empty or contains only None values. |
Notes:
Uses the analytical Kaplan-Meier estimator [1] and computes the asymptotic confidence intervals (95% by default) [2] using the log-log approach [3].
References:
[1] https://en.wikipedia.org/wiki/Kaplan-Meier_estimator
[2] Fisher, Ronald (1925). Statistical Methods for Research Workers, Table 1.
[3] Borgan, Liestøl (1990). Scandinavian Journal of Statistics 17, 35-41.
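For orientation, the estimator and the log-log interval can be sketched with the standard library alone. This is not the library's implementation. The `kaplan_meier` function below is a hypothetical helper that assumes every non-None entry is an observed event (no censoring flags are available in the documented signature) and uses Greenwood's variance sum with the log-log transform, S(t)^exp(±z·σ/ln S(t)).

```python
import math
from statistics import NormalDist
from typing import List, Optional, Tuple


def kaplan_meier(
    survivals: List[Optional[float]], confidence_level: float = 0.95
) -> Tuple[List[float], List[float], List[float], List[float]]:
    """Return (months, probabilities, lower band, upper band)."""
    # Assumption: None marks a missing value and every remaining entry
    # is an observed event.
    times = sorted(t for t in survivals if t is not None)
    if not times:
        raise ValueError("survivals is empty or contains only None values")
    # Two-sided standard normal quantile (about 1.96 for a 95% level).
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

    months, probabilities, lower, upper = [], [], [], []
    at_risk, survival, greenwood = len(times), 1.0, 0.0
    for t in sorted(set(times)):
        events = times.count(t)
        survival *= 1 - events / at_risk  # product-limit step
        if survival > 0.0:
            # Greenwood's variance sum, then the log-log transformed interval.
            greenwood += events / (at_risk * (at_risk - events))
            theta = z * math.sqrt(greenwood) / math.log(survival)
            low = survival ** math.exp(-theta)
            high = survival ** math.exp(theta)
        else:
            low = high = 0.0  # the interval collapses once nobody is left at risk
        months.append(float(t))
        probabilities.append(survival)
        lower.append(low)
        upper.append(high)
        at_risk -= events
    return months, probabilities, lower, upper


months, probabilities, lower, upper = kaplan_meier([3.0, 1.0, None, 2.0, 4.0])
print(list(zip(months, probabilities)))
```

Because ln S(t) is negative, the exponent exp(θ) is below one and yields the upper bound, while exp(−θ) yields the lower bound. Both bounds stay within [0, 1], which is the usual motivation for the log-log form.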
Source code in onconova/research/schemas/analysis.py
OncoplotDataset
Bases: Schema, AnalysisMetadataMixin
Schema representing the dataset required for generating an Oncoplot visualization.
Attributes:
Name | Type | Description |
---|---|---|
genes | List[str] | List of the most frequently encountered gene names. |
cases | List[str] | List of patient case identifiers. |
variants | List[OncoplotVariant] | List of variant records included in the Oncoplot. |
cases class-attribute instance-attribute
genes class-attribute instance-attribute
variants class-attribute instance-attribute
calculate(cases) classmethod
Calculates and returns an analysis summary for the given patient cases.
This method performs the following steps:
- Retrieves all GenomicVariant objects associated with the provided cases.
- Identifies the top 25 most frequently occurring genes among these variants.
- Filters variants to include only those associated with the top genes.
- Annotates each variant with relevant fields such as pseudoidentifier, gene name, HGVS expression, and pathogenicity.
- Constructs and returns an instance of the class with:
  - The list of top genes.
  - The pseudoidentifiers of the cases.
  - A validated list of variant data for plotting or further analysis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cases | QuerySet[PatientCase] | A queryset of patient cases to analyze. | required |
Returns:
Type | Description |
---|---|
OncoplotDataset | An instance of the class containing the analysis results. |
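The gene-ranking and filtering steps listed above can be sketched on plain dictionaries. The real method works on a queryset of GenomicVariant objects, so the `top_gene_variants` helper and the sample records here are hypothetical. Counting frequency per variant record rather than per case is an assumption.

```python
from collections import Counter
from typing import Dict, List, Tuple

Variant = Dict[str, str]


def top_gene_variants(variants: List[Variant], limit: int = 25) -> Tuple[List[str], List[Variant]]:
    # Rank genes by how many variant records mention them.
    gene_counts = Counter(v["gene"] for v in variants)
    genes = [gene for gene, _ in gene_counts.most_common(limit)]
    # Keep only the variants that belong to one of the top genes.
    top = set(genes)
    kept = [v for v in variants if v["gene"] in top]
    return genes, kept


# Made-up sample data.
variants = [
    {"gene": "TP53", "caseId": "A"},
    {"gene": "KRAS", "caseId": "A"},
    {"gene": "TP53", "caseId": "B"},
    {"gene": "EGFR", "caseId": "B"},
    {"gene": "TP53", "caseId": "C"},
    {"gene": "KRAS", "caseId": "C"},
]
genes, kept = top_gene_variants(variants, limit=2)
print(genes, len(kept))
```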
Source code in onconova/research/schemas/analysis.py
OncoplotVariant
Bases: Schema
Schema representing a variant entry for an oncoplot analysis.
Attributes:
Name | Type | Description |
---|---|---|
gene | str | The gene symbol associated with the variant. |
caseId | str | Unique identifier for the case; can be provided as 'caseId' or 'pseudoidentifier'. |
hgvsExpression | str | HGVS expression describing the variant; can be provided as 'hgvsExpression' or 'hgvs_expression'. |
isPathogenic | Optional[bool] | Indicates whether the variant is pathogenic; can be provided as 'isPathogenic' or 'is_pathogenic'. |
TherapyLineCasesDistribution
Bases: Distribution
Represents the distribution of cases in a cohort based on inclusion in a specific therapy line.
calculate(cohort, therapyLine) classmethod
Calculates the distribution of cases in a cohort based on inclusion in a specified therapy line.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cohort | Cohort | The cohort containing valid cases to analyze. | required |
therapyLine | str | The label of the therapy line to filter cases by. | required |
Returns:
Type | Description |
---|---|
TherapyLineCasesDistribution | A Distribution object containing counts and percentages for cases included and not included in the specified therapy line. |
Notes:
- The percentages are rounded to four decimal places.
- Assumes `cohort.valid_cases` is a queryset-like object supporting `count()` and `filter()` methods.
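The included versus not-included split can be sketched as below. The case-to-lines mapping, the `therapy_line_cases_distribution` helper, the category labels, and the "first-line" label are all hypothetical. Only the rounding to four decimal places comes from the notes above, and the 0 to 100 percentage scale is an assumption.

```python
from typing import Any, Dict, List, Set


def therapy_line_cases_distribution(
    case_lines: Dict[str, Set[str]], therapy_line: str
) -> List[Dict[str, Any]]:
    total = len(case_lines)
    included = sum(1 for lines in case_lines.values() if therapy_line in lines)

    def pct(n: int) -> float:
        # Rounded to four decimal places; 0.0 for an empty cohort.
        return round(100 * n / total, 4) if total else 0.0

    # Category labels are placeholders chosen for this sketch.
    return [
        {"category": "Included", "counts": included, "percentage": pct(included)},
        {"category": "Not included", "counts": total - included, "percentage": pct(total - included)},
    ]


# Made-up sample data: case identifier -> therapy lines the case took part in.
case_lines = {"A": {"first-line"}, "B": {"first-line", "second-line"}, "C": set()}
items = therapy_line_cases_distribution(case_lines, "first-line")
print(items)
```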
Source code in onconova/research/schemas/analysis.py
TherapyLineResponseDistribution
Bases: Distribution
Represents the distribution of treatment responses for a specific therapy line within a cohort.
calculate(cohort, therapyLine) classmethod
Calculates the distribution of treatment responses for a specified therapy line within a given cohort.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cohort | Cohort | The cohort containing valid cases to analyze. | required |
therapyLine | str | The label of the therapy line to filter cases. | required |
Returns:
Type | Description |
---|---|
TherapyLineResponseDistribution | An object representing the distribution of treatment responses, including counts and percentages for each response category. |
Notes:
- Filters cases in the cohort by the specified therapy line.
- Annotates each case with its most recent treatment response during the therapy line period.
- Aggregates and calculates the percentage distribution of response categories.
- Categories with no response are labeled as "Unknown".
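The steps in these notes can be sketched on in-memory data. The real method annotates a queryset, so the input mapping, the `response_distribution` helper, and the response codes below are made up. The percentage scale and rounding are assumptions, and the sketch interprets the last note as labeling cases without any recorded response as "Unknown".

```python
from collections import Counter
from datetime import date
from typing import Any, Dict, List, Tuple


def response_distribution(
    responses_by_case: Dict[str, List[Tuple[date, str]]]
) -> Dict[str, Dict[str, Any]]:
    latest = []
    for responses in responses_by_case.values():
        if responses:
            # Keep the most recent response recorded during the therapy line.
            latest.append(max(responses, key=lambda r: r[0])[1])
        else:
            latest.append("Unknown")  # no response recorded for this case
    total = len(latest)
    return {
        category: {"counts": n, "percentage": round(100 * n / total, 4)}
        for category, n in Counter(latest).items()
    }


# Made-up sample data: case identifier -> (assessment date, response code).
responses_by_case = {
    "A": [(date(2023, 1, 10), "PR"), (date(2023, 4, 2), "CR")],
    "B": [(date(2023, 2, 1), "SD")],
    "C": [],
    "D": [(date(2023, 3, 15), "CR")],
}
result = response_distribution(responses_by_case)
print(result)
```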