Data Protection and Security in Onconova

Why Data Protection Matters

Onconova handles sensitive clinical and research data, which may include Protected Health Information (PHI). Ensuring the privacy and security of this data is both an ethical obligation and a legal requirement under various regulations, such as:

  • GDPR (General Data Protection Regulation) in the European Union
  • HIPAA (Health Insurance Portability and Accountability Act) in the United States
  • Local and institutional data protection policies and ethics board guidelines

Failing to properly manage sensitive data can result in:

  • Legal penalties and regulatory sanctions
  • Data breaches and loss of patient trust
  • Irreparable harm to research integrity

Note

Each site is accountable for maintaining a compliant environment. No software is inherently compliant; compliance depends on the environment in which it’s implemented. This includes the policies, procedures, and infrastructure managed by each study team and institution. Decisions about server settings, security, data backups, and user access are the responsibility of each organization, not defined by Onconova itself.


Pseudonymized vs. Anonymized Data

Understanding the difference between pseudonymized and anonymized data is crucial in data protection workflows:

| Concept | Description | Re-identification Possible |
|---|---|---|
| Pseudonymized Data | Direct identifiers (names, IDs) are replaced with a code, but a separate key exists that can link data back to the individual. | Yes |
| Anonymized Data | All identifying information is removed or irreversibly transformed, making it impossible to link data back to a person. | No |

Onconova supports both modes of data handling, depending on the operational context and user permissions.

At the database level, all data are considered pseudonymized: the clinical identifier is the primary re-identification mechanism, provided the source EHR system is accessible. At the API level (and, by extension, for all end users), data are presented anonymized by default. Users with elevated rights can access the pseudonymized data under certain conditions.
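The distinction can be illustrated with a minimal Python sketch. It is not Onconova's implementation: the field name `clinical_identifier` and both helper functions are hypothetical, and dropping a single field stands in for full anonymization only for the purpose of the example (real anonymization must also treat indirect identifiers such as dates and ages).

```python
import secrets


def pseudonymize(records, id_field="clinical_identifier"):
    """Replace the identifier with a random code and keep a separate key table.

    Returns the coded records and a mapping code -> original identifier.
    Whoever holds the key table can re-identify the records.
    """
    code_for = {}    # original identifier -> code
    key_table = {}   # code -> original identifier (stored separately)
    coded = []
    for record in records:
        original = record[id_field]
        if original not in code_for:
            code = secrets.token_hex(8)
            code_for[original] = code
            key_table[code] = original
        coded.append({**record, id_field: code_for[original]})
    return coded, key_table


def anonymize(records, id_field="clinical_identifier"):
    """Drop the identifier and keep no key, so no link back exists."""
    return [{k: v for k, v in r.items() if k != id_field} for r in records]


records = [
    {"clinical_identifier": "MRN-001", "diagnosis": "C34.1"},
    {"clinical_identifier": "MRN-001", "diagnosis": "C50.9"},
]
coded, key_table = pseudonymize(records)
stripped = anonymize(records)
```

In the pseudonymized output, both records still share one code, so they can be linked to each other and, through `key_table`, back to the patient. In the anonymized output that link no longer exists.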


Anonymization Methods in Onconova

To protect sensitive healthcare information during data management and analysis, Onconova implements anonymization strategies aligned with the HIPAA Safe Harbor De-identification Standard, which outlines specific identifiers that must be removed or altered to consider data de-identified. The following table describes each HIPAA criterion and how Onconova handles it:

| HIPAA Identifier | Description | Collected in Onconova | Onconova Anonymization Implementation |
|---|---|---|---|
| Names | Full names, initials | No | |
| Geographic subdivisions smaller than a state | Street address, city, ZIP codes (except first 3 digits if population > 20,000) | No | |
| All elements of dates (except year) directly related to an individual | Birth date, admission date, discharge date, death date, etc. | Month and year of birth and death | Not exposed to end-users |
| All other dates | Lab result date, therapy period, etc. | Yes | Random Date Shifts |
| Telephone numbers | Any phone or fax numbers | No | |
| Fax numbers | Same as above | No | |
| Email addresses | All email addresses | No | |
| Social Security numbers | Government-issued personal numbers | No | |
| Medical record numbers | Patient record or study IDs | Patient record ID, Clinical center name | String Redaction |
| Health plan beneficiary numbers | Insurance or benefit IDs | No | |
| Account numbers | Billing or financial identifiers | No | |
| Certificate/license numbers | Professional or legal license IDs | No | |
| Vehicle identifiers and serial numbers | License plate, VIN | No | |
| Device identifiers and serial numbers | Implanted medical device IDs | No | |
| Web URLs | Personal or institution-specific links | No | |
| IP addresses | Device or network addresses | No | |
| Biometric identifiers | Fingerprints, voice prints | No | |
| Full-face photographs and images | Identifiable images | No | |
| Any other unique identifying number, characteristic, or code | Any indirect identifier | Age, Age at diagnosis | Age Binning |

  • Random Date Shifts:
    All date fields (e.g., diagnosis dates, treatment dates) are shifted by a randomly generated number of days unique to each patient, while preserving internal date intervals and sequences.

  • Age Binning:
    Patient age is converted into predefined age ranges or bins (e.g., <20, 30-34, 90+) instead of displaying exact ages or birthdates.

  • String Redaction:
    Free-text fields and identifiable string data (e.g., clinical identifiers) are automatically redacted or replaced with placeholder values like ****** where appropriate.
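
The three methods above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Onconova's code: the function names, the 90-day shift window, the 5-year bin width, and the bin edges at 20 and 90 are all hypothetical choices made for the example.

```python
import random
from datetime import date, timedelta


def make_offset(rng, max_days=90):
    """Draw one non-zero shift (in days) to be stored and reused per patient."""
    offset = 0
    while offset == 0:
        offset = rng.randint(-max_days, max_days)
    return offset


def shift_dates(dates, offset_days):
    """Shift every date of one patient by the same offset.

    Because the offset is constant within a patient, intervals between
    dates and their ordering are preserved.
    """
    return [d + timedelta(days=offset_days) for d in dates]


def bin_age(age, width=5, floor=20, cap=90):
    """Map an exact age to a range label such as '<20', '30-34' or '90+'."""
    if age < floor:
        return f"<{floor}"
    if age >= cap:
        return f"{cap}+"
    lower = floor + (age - floor) // width * width
    return f"{lower}-{lower + width - 1}"


def redact(value, placeholder="******"):
    """Replace an identifying string with a fixed placeholder."""
    return placeholder


rng = random.Random()            # assumed source of randomness
offset = make_offset(rng)        # one offset per patient
visits = [date(2021, 1, 10), date(2021, 3, 1)]
shifted = shift_dates(visits, offset)
```

Here `shifted[1] - shifted[0]` equals `visits[1] - visits[0]`, so durations such as time on therapy survive the shift, while `bin_age(32)` yields `"30-34"` and `redact("MRN-001")` yields `"******"`.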


Access Levels and Data Protection Policies

Onconova enforces role-based access control (RBAC) to manage who can access pseudonymized or anonymized data:

| Access Level | User Role | Pseudonymized Data | Anonymized Data |
|---|---|---|---|
| 6 | Administrator | | |
| 5 | Platform Manager | | |
| 4 | Project Manager | | |
| 1 | Member | | |
| 0 | External | | |

Members who belong to an ongoing research project can be granted temporary rights by their project leaders to manage (add, update, and delete) data and to view patients' pseudonymized data.

Exporting data from Onconova is restricted to high-level users, i.e. Project Managers or higher. These are users who could otherwise have access to the data (e.g. clinicians, IT administrators).
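
The access rules described in this section can be expressed as simple level comparisons. The sketch below is hypothetical and does not reflect Onconova's actual API: the enum and function names are invented for illustration, and the threshold used for viewing pseudonymized data (Project Manager and above) is an assumption, since the text only states that elevated rights are required.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Access levels as listed in the table above."""
    EXTERNAL = 0
    MEMBER = 1
    PROJECT_MANAGER = 4
    PLATFORM_MANAGER = 5
    ADMINISTRATOR = 6


def can_export(level):
    """Export is restricted to Project Managers or higher."""
    return level >= AccessLevel.PROJECT_MANAGER


def can_view_pseudonymized(level, has_temporary_project_rights=False,
                           elevated_threshold=AccessLevel.PROJECT_MANAGER):
    """Decide whether pseudonymized data may be shown.

    ASSUMPTION: 'elevated rights' means elevated_threshold or above.
    Members may additionally be granted temporary rights by a project leader.
    """
    if level >= elevated_threshold:
        return True
    return level >= AccessLevel.MEMBER and has_temporary_project_rights


member_default = can_view_pseudonymized(AccessLevel.MEMBER)
member_granted = can_view_pseudonymized(AccessLevel.MEMBER, has_temporary_project_rights=True)
```

Under these assumptions `member_default` is `False` and `member_granted` is `True`, while `can_export(AccessLevel.MEMBER)` stays `False` in both cases: temporary project rights affect viewing and data management, not export.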

When handling sensitive data through Onconova:

  • Ensure all data use complies with the applicable data protection regulations (GDPR, HIPAA).
  • Secure appropriate ethics board approvals and data sharing agreements before accessing pseudonymized datasets or using any data for research purposes.
  • Periodically review anonymization and data protection configurations to remain compliant with evolving regulations.
