
onconova.analytics.schemas

This module defines Pydantic schemas for representing key statistics, data completeness, and entity information within the Onconova data platform. The schemas are used for API serialization and validation of analytics-related data, including platform-wide statistics, monthly counts, entity-level statistics, and data completion metrics.

CountsPerMonth

Bases: Schema

Schema representing the cumulative count of entries per month.

Attributes:

month (date): The month (as a date object) representing the period of data aggregation.

cumulativeCount (int): Total number of entries accumulated up to and including the given month.

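The relationship between `month` and `cumulativeCount` can be illustrated with a minimal sketch. It uses plain dictionaries rather than the actual schema class, and the helper name `counts_per_month` as well as the choice to represent each month by its first day are assumptions made for this example, not part of the documented API.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def counts_per_month(entry_dates: list[date]) -> list[dict]:
    """Return one record per month with a running total of entries."""
    # Bucket every entry under the first day of its month (an assumption).
    per_month = Counter(d.replace(day=1) for d in entry_dates)
    months = sorted(per_month)
    # Running total: each month includes the counts of all earlier months.
    totals = accumulate(per_month[m] for m in months)
    return [
        {"month": m, "cumulativeCount": total}
        for m, total in zip(months, totals)
    ]


records = counts_per_month(
    [date(2024, 1, 5), date(2024, 1, 20), date(2024, 3, 2)]
)
```

In this sketch, months without any entries (February in the sample input) produce no record; whether the platform fills such gaps is not stated here.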

DataCompletionStatistics

Bases: Schema

Schema representing statistics on data completion for patient cases.

Attributes:

totalCases (int): Total number of patient cases analyzed for data completeness.

overallCompletion (float): Overall percentage of data categories completed across all cases.

mostIncompleteCategories (List[IncompleteCategory]): List of the most common categories with missing data.

completionOverTime (List[CountsPerMonth]): Historical trend of cumulative data completeness by month.

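One plausible reading of `overallCompletion` is the share of (case, category) slots that are filled, expressed as a percentage. The sketch below works under that assumption; the helper `overall_completion`, the case identifiers, and the category names are hypothetical and are not taken from the platform.

```python
def overall_completion(
    completed_by_case: dict[str, set[str]], categories: list[str]
) -> float:
    """Percentage of (case, category) slots that are completed."""
    total_slots = len(completed_by_case) * len(categories)
    if total_slots == 0:
        return 0.0  # no cases or no categories: nothing to complete
    expected = set(categories)
    # Count only categories that are actually expected for each case.
    filled = sum(len(done & expected) for done in completed_by_case.values())
    return round(100 * filled / total_slots, 2)


# Hypothetical input: which categories are complete for each case.
cases = {
    "case-1": {"staging", "surgery"},
    "case-2": {"staging"},
}
stats = {
    "totalCases": len(cases),
    "overallCompletion": overall_completion(
        cases, ["staging", "surgery", "genomics"]
    ),
}
```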

DataPlatformStatistics

Bases: Schema

Schema representing key statistics of the data platform.

Attributes:

cases (int): Total number of unique patient cases in the data platform.

primarySites (int): Number of distinct primary anatomical sites represented.

projects (int): Total number of research projects.

cohorts (int): Number of defined cohorts in the platform.

entries (int): Total number of individual data entries recorded.

mutations (int): Total number of genetic mutations documented across all cases.

clinicalCenters (int): Number of clinical centers contributing data.

contributors (int): Total number of individual data contributors.

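To show the shape a serialized `DataPlatformStatistics` payload might take, the following sketch mirrors the documented attribute names with a standard-library dataclass. The class `PlatformStatisticsSketch` is a stand-in invented for this example (the real schema derives from `Schema`), and all numbers are placeholders rather than real platform figures.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PlatformStatisticsSketch:
    """Stdlib stand-in mirroring the documented attribute names."""

    cases: int
    primarySites: int
    projects: int
    cohorts: int
    entries: int
    mutations: int
    clinicalCenters: int
    contributors: int


# Placeholder values for illustration only; not real platform figures.
stats = PlatformStatisticsSketch(
    cases=120,
    primarySites=8,
    projects=3,
    cohorts=5,
    entries=4500,
    mutations=310,
    clinicalCenters=2,
    contributors=14,
)
# asdict() keeps the field order, so the JSON keys follow the list above.
payload = json.dumps(asdict(stats))
```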

EntityStatistics

Bases: Schema

Schema representing statistical data for a medical entity.

Attributes:

population (Optional[int]): Number of cases in the population.

dataCompletionMedian (Optional[float]): Median percentage of case completion.

topographyCode (Optional[str]): ICD-O-3 topography code of the entity.

topographyGroup (Optional[str]): ICD-O-3 topography code of the entity group.

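Because every attribute of `EntityStatistics` is optional, a consumer has to handle missing values. The sketch below assumes that `None` is used when an entity has no cases; that convention, the helper `completion_median`, and the dictionary layout are assumptions for illustration. The codes `C50.9` and `C50` are ICD-O-3 topography codes for the breast, chosen only as an example.

```python
from statistics import median
from typing import Optional


def completion_median(percentages: list[float]) -> Optional[float]:
    """Median completion percentage, or None when there are no cases."""
    return median(percentages) if percentages else None


# Entity with three cases and their per-case completion percentages.
entity = {
    "population": 3,
    "dataCompletionMedian": completion_median([40.0, 75.0, 90.0]),
    "topographyCode": "C50.9",
    "topographyGroup": "C50",
}

# Entity without cases: the median is left unset (assumed convention).
empty_entity = {
    "population": 0,
    "dataCompletionMedian": completion_median([]),
    "topographyCode": None,
    "topographyGroup": None,
}
```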

IncompleteCategory

Bases: Schema

Schema representing a category of data incompleteness.

Attributes:

category (str): The data category that is incomplete in one or more cases.

cases (int): Number of cases in which this data category is incomplete.

affectedSites (List[CodedConcept]): List of anatomical sites affected by this data incompleteness.

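A ranking of incomplete categories could be assembled roughly as follows. This is a sketch under stated assumptions: the helper `most_incomplete_categories` and the input layout are hypothetical, and anatomical sites are represented as plain ICD-O-3 code strings rather than `CodedConcept` objects.

```python
from collections import defaultdict


def most_incomplete_categories(
    cases: list[tuple[str, set[str]]], limit: int = 2
) -> list[dict]:
    """Rank categories by how many cases are missing them."""
    case_counts: dict[str, int] = defaultdict(int)
    sites: dict[str, set[str]] = defaultdict(set)
    for site_code, missing in cases:
        for category in missing:
            case_counts[category] += 1
            sites[category].add(site_code)
    # Most affected first; ties are broken alphabetically for stable output.
    ranked = sorted(case_counts, key=lambda c: (-case_counts[c], c))
    return [
        {
            "category": c,
            "cases": case_counts[c],
            "affectedSites": sorted(sites[c]),
        }
        for c in ranked[:limit]
    ]


# Hypothetical input: (primary site code, categories missing for that case).
ranking = most_incomplete_categories(
    [
        ("C50", {"genomics", "staging"}),
        ("C34", {"genomics"}),
        ("C50", {"genomics"}),
    ]
)
```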
