From public repositories to specialized datasets.
Chemical-Target, Side Effects, and Indications.
A meticulous workflow from download to final metrics.
The Master Workflow
Download
Automated & manual data retrieval.
Map
Standardize chemicals & diseases.
Process & Filter
Apply source-specific curation rules.
Summarize
Generate final data tables.
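As a rough illustration, the four stages above can be read as a chain of functions. This is a minimal sketch, not the project's actual code: the function names, the record fields, and the lowercase-and-trim "mapping" are all assumptions made for the example, and a real mapping step would resolve chemicals and diseases to standard identifiers.

```python
from collections import Counter


def download() -> list[dict]:
    # Hypothetical stand-in for automated and manual retrieval.
    return [
        {"chemical": " Aspirin ", "disease": "Headache"},
        {"chemical": "aspirin", "disease": "headache"},
        {"chemical": "", "disease": "nausea"},
    ]


def map_names(records: list[dict]) -> list[dict]:
    # Assumption: standardization is reduced to trimming and lowercasing.
    return [
        {"chemical": r["chemical"].strip().lower(),
         "disease": r["disease"].strip().lower()}
        for r in records
    ]


def process_and_filter(records: list[dict]) -> list[dict]:
    # Placeholder curation rule: drop records with no chemical name.
    return [r for r in records if r["chemical"]]


def summarize(records: list[dict]) -> list[dict]:
    # Count how many records support each chemical-disease pair.
    counts = Counter((r["chemical"], r["disease"]) for r in records)
    return [
        {"chemical": chem, "disease": dis, "n_records": n}
        for (chem, dis), n in sorted(counts.items())
    ]


def run_pipeline() -> list[dict]:
    return summarize(process_and_filter(map_names(download())))
```

Keeping each stage a plain function over lists of records makes it easy to swap in a source-specific filter without touching the other stages.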
Our Data Sources
Integrating diverse information for a holistic view.
FAERS
The FDA Adverse Event Reporting System (FAERS) is a database that contains information on adverse event and medication error reports submitted to FDA.
OFFSIDES & SIDER
SIDER is a database of drug side effects extracted from package inserts. OFFSIDES is a database of off-label side effects mined statistically from FAERS reports.
ECHA
The European Chemicals Agency (ECHA) database contains information on chemicals registered in the EU and their hazardous properties.
ChEMBL
ChEMBL is a manually curated database of bioactive molecules with drug-like properties.
Papyrus
A large-scale curated dataset for bio- and chem-informatics, containing information on bioactive molecules and their targets.
And more...
We also integrate data from ToxRefDB, IRIS, PPRTV, RAIS, EWAS Atlas, and EWAS Catalog.
Data Processing Deep Dive
Each data source has a unique set of rigorous filtering and preprocessing rules to ensure data quality and consistency.
ChEMBL
Chemical-Target Associations
- ✓ Confidence score ≥ 7
- ✓ pChEMBL value ≥ 5
- ✓ Human, rat, or mouse models only
- ✓ Exclude "Inconclusive" or "Not active"
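The four ChEMBL rules above could be expressed as a single predicate over activity records. The sketch below is illustrative only: the dictionary keys (`confidence_score`, `pchembl_value`, `organism`, `activity_comment`) are assumed field names, and dropping records with a missing pChEMBL value is an assumption rather than a documented rule. Because pChEMBL is the negative base-10 logarithm of a molar potency value, the threshold of 5 corresponds to a potency of 10 µM or better.

```python
import math

# Assumption: organisms are recorded by their scientific names.
ALLOWED_ORGANISMS = {"Homo sapiens", "Rattus norvegicus", "Mus musculus"}
EXCLUDED_COMMENTS = {"inconclusive", "not active"}


def pchembl_from_nm(value_nm: float) -> float:
    # pChEMBL = -log10(molar value); 1 nM = 1e-9 M, hence 9 - log10(nM).
    return 9.0 - math.log10(value_nm)


def keep_chembl_record(rec: dict) -> bool:
    pchembl = rec.get("pchembl_value")
    if pchembl is None or pchembl < 5:  # assumption: missing values are dropped
        return False
    if rec.get("confidence_score", 0) < 7:
        return False
    if rec.get("organism") not in ALLOWED_ORGANISMS:
        return False
    # Compare comments case-insensitively.
    comment = (rec.get("activity_comment") or "").strip().lower()
    return comment not in EXCLUDED_COMMENTS
```

For example, a record with confidence score 9, pChEMBL 6.2, and organism "Homo sapiens" passes, while the same record with the comment "Not Active" is rejected.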
FAERS
Side Effect & Indication Data
- ✓ "Primary Suspect Drug" (PS) only
- ✓ Reported by a physician, pharmacist, or other health professional (HP)
- ✓ Disease names mapped to CUIs
- ✓ Requires secure tunnel for disease manager
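A minimal sketch of the FAERS filters, under stated assumptions: the field names `role_cod` and `occp_cod` and the reporter codes `MD`, `PH`, and `HP` are assumed encodings, the CUI values are placeholders rather than real identifiers, and discarding terms with no CUI mapping is a guess at the behavior. The disease-manager connection is not modeled; a plain dictionary stands in for it.

```python
# Assumed reporter codes: physician, pharmacist, health professional.
REPORTER_CODES = {"MD", "PH", "HP"}


def curate_faers(records: list[dict], cui_lookup: dict) -> list[dict]:
    kept = []
    for rec in records:
        if rec["role_cod"] != "PS":  # keep primary suspect drugs only
            continue
        if rec["occp_cod"] not in REPORTER_CODES:
            continue
        cui = cui_lookup.get(rec["term"].strip().lower())
        if cui is None:  # assumption: unmapped disease names are dropped
            continue
        kept.append({"drug": rec["drug"], "cui": cui})
    return kept


# Placeholder lookup; a real one would come from the disease-mapping service.
demo_lookup = {"headache": "CUI-PLACEHOLDER-1"}
demo_records = [
    {"drug": "drug_a", "role_cod": "PS", "occp_cod": "MD", "term": "Headache"},
    {"drug": "drug_b", "role_cod": "SS", "occp_cod": "MD", "term": "Headache"},
    {"drug": "drug_c", "role_cod": "PS", "occp_cod": "CN", "term": "Headache"},
]
demo_kept = curate_faers(demo_records, demo_lookup)
```

Only `drug_a` survives here: `drug_b` is not a primary suspect and `drug_c` was reported by a consumer.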
IRIS, PPRTV, RAIS
Toxicity & Contaminant Data
- ✓ Discard "Low Confidence" entries
- ✓ InChIKeys recovered via DTXSID/CAS
- ✓ Disease names harmonized via NLP
- ✓ Separate processing for carcinogenic data
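The toxicity rules above might look like the following. This is a sketch under assumptions, not the actual implementation: the field names, the two lookup tables, the order of fallback (DTXSID first, then CAS), and the `is_carcinogenic` flag are all hypothetical, the identifiers are placeholders, and the NLP-based disease harmonization is left out.

```python
from typing import Optional


def recover_inchikey(rec: dict, by_dtxsid: dict, by_cas: dict) -> Optional[str]:
    # Prefer an existing InChIKey, then fall back to DTXSID, then CAS.
    if rec.get("inchikey"):
        return rec["inchikey"]
    return by_dtxsid.get(rec.get("dtxsid")) or by_cas.get(rec.get("cas"))


def curate_tox(records: list[dict], by_dtxsid: dict, by_cas: dict):
    carcinogenic, other = [], []
    for rec in records:
        if (rec.get("confidence") or "").strip().lower() == "low confidence":
            continue  # discard low-confidence entries
        key = recover_inchikey(rec, by_dtxsid, by_cas)
        if key is None:  # assumption: unresolved chemicals are dropped
            continue
        row = {**rec, "inchikey": key}
        # Carcinogenic data is routed to its own list for separate processing.
        (carcinogenic if rec.get("is_carcinogenic") else other).append(row)
    return carcinogenic, other
```

Returning two lists keeps the carcinogenic records apart from the start, so each branch can apply its own downstream rules.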
Final Output: Summary Tables
The workflow culminates in a set of clean, integrated summary tables, ready for analysis and downstream applications.
chemical_summary
chemical_reac_summary
chemical_indi_summary
chemical_target_summary
Key Metrics
A glimpse into the scale of our data.