Healthcare graph

Linked, enriched and analytics-ready, Compile’s smart data layer transforms messy and disparate datasets into an intuitive graph of healthcare providers and all their activities.

Limitations with healthcare datasets today

Legacy solutions aren’t built for today's commercial analytics


Every large dataset, such as patient-level data (PLD), sales data and pharmacy claims, has gaps due to partial or erroneous data fields. These gaps need to be identified and cleansed before the data can be used.


Users of these datasets face a steep learning curve to understand the data structure and build queries. Additionally, multiple tables and a proliferation of fields require complex queries.


The presence of treating or associated providers (HCP/HCO) is critical when using claims-based data for any analytics. Yet these important fields are left unfilled in many instances, limiting provider and patient counts.


Joining multiple data assets isn’t straightforward. In the absence of a consistent key across all data, users have to build custom matching rules. This is time-consuming and error-prone.
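
To illustrate why custom matching rules are fragile, here is a minimal sketch in Python. The field names, sample records and the `match_key` helper are hypothetical and are not taken from any real dataset or from Compile’s product. The rule joins two assets on a normalized provider name plus 5-digit ZIP code:

```python
import re

def match_key(name: str, zip_code: str) -> tuple[str, str]:
    """Build a hypothetical matching key from a provider name and ZIP code."""
    # Lowercase, drop punctuation, and collapse repeated whitespace.
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    cleaned = " ".join(cleaned.split())
    # Compare on the 5-digit ZIP only, ignoring any ZIP+4 suffix.
    return cleaned, zip_code.strip()[:5]

# Illustrative records only; no shared identifier exists between the two assets.
claims = [
    {"claim_id": "C1", "provider": "St. Mary's Hospital", "zip": "10001-1234"},
    {"claim_id": "C2", "provider": "Saint Mary Hospital", "zip": "10001"},
]
sales = [
    {"account": "ST MARYS HOSPITAL", "zip": "10001", "units": 40},
]

# Index the sales records by matching key, then look up each claim.
sales_by_key = {match_key(r["account"], r["zip"]): r for r in sales}

linked = []
for claim in claims:
    key = match_key(claim["provider"], claim["zip"])
    if key in sales_by_key:
        linked.append({**claim, "units": sales_by_key[key]["units"]})

print(linked)
```

In this sketch only claim C1 links to the sales record. Claim C2 is silently dropped because “Saint Mary” does not normalize to “st marys”, which is the kind of gap hand-written rules tend to leave.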

Compile’s Smart Data layer

Unified, linked and analytics-ready data that gives a complete and accurate view of providers


Unified layout
  • Simplified claims data structure into one single comprehensive table
  • Easy to understand and navigate
  • Stitched valuable information like affiliations and provider metadata
Cleansed & enriched data
  • Cleaned, standardized and normalized data fields
  • Robust process to remove any duplicate or invalid claims
  • Advanced algorithms to standardize “free-form” fields like payer names
Enhanced data
  • Missing provider details and payer channel are backfilled based on historical claims patterns
  • Pre-computed data fields for faster analytics and simpler queries
  • Custom-built ML algorithms to identify precise brand usage
Analytics ready
  • Build, integrate and automate your analysis/dashboards/applications
  • Pre-computed fields and features for faster insights
  • Reduced complexity in building and structuring queries and code
  • 60%+ lower query time
  • Only the important fields, those used in 95% of use cases, are retained
  • One table each for medical and pharmacy claims
  • Parent affiliations and metadata for providers and prescribers are included
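
As a rough illustration of backfilling missing provider details from historical claims patterns, the sketch below fills an empty provider field with the provider seen most often in the same patient’s other claims. This rule, the field names and the `backfill_provider` helper are assumptions made for illustration; the source does not describe Compile’s actual method.

```python
from collections import Counter, defaultdict

def backfill_provider(claims):
    """Fill a missing provider_id with the patient's most frequent known provider.

    Hypothetical rule for illustration; not a description of any vendor's logic.
    """
    # Count how often each provider appears per patient in the known claims.
    history = defaultdict(Counter)
    for claim in claims:
        if claim["provider_id"]:
            history[claim["patient_id"]][claim["provider_id"]] += 1

    filled = []
    for claim in claims:
        row = dict(claim)  # copy so the input is not mutated
        seen = history[row["patient_id"]]
        if not row["provider_id"] and seen:
            row["provider_id"] = seen.most_common(1)[0][0]
            row["provider_backfilled"] = True  # flag inferred values
        filled.append(row)
    return filled

# Illustrative claims; None marks a missing provider field.
claims = [
    {"patient_id": "P1", "provider_id": "NPI_A"},
    {"patient_id": "P1", "provider_id": "NPI_A"},
    {"patient_id": "P1", "provider_id": "NPI_B"},
    {"patient_id": "P1", "provider_id": None},
    {"patient_id": "P2", "provider_id": None},
]

result = backfill_provider(claims)
print(result[3])  # P1's missing provider is filled from history
print(result[4])  # P2 has no history, so the field stays empty
```

Flagging backfilled rows, as the sketch does with `provider_backfilled`, keeps inferred values distinguishable from reported ones in downstream analysis.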

Entities profiled

30B+ Medical & pharmacy claims
300M+ Patients
55% Medical claims capture
7yr Longitudinal claims
6M+ Affiliations
100+ Data sources
75K+ Clinical trials
1.1M+ Publications
$80B+ Company payments
30K+ Twitter profiles
600K+ Facilities
1,000+ IDNs