Patient-Physician Conversation Dataset

Professionally curated to the highest quality standard.

No regulatory or PHI issues:

Our patient-physician conversation data is curated to the highest standard by our team of medical professionals in an ultra-realistic clinic environment, which avoids regulatory and PHI (protected health information) issues.

High-definition audio data:

Each conversation comes with a single-channel (mono) recording at a minimum audio sample rate of 44 kHz, which is essential for training your ASR models.
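
To confirm that a delivered recording matches this audio spec, a quick check along the following lines may help. This is a minimal sketch that assumes the audio arrives as PCM WAV files, which is not stated above; the function name check_recording and the constant MIN_SAMPLE_RATE_HZ are hypothetical and used only for illustration.

```python
import wave

# Stated minimum sample rate; adjust if your delivery spec differs.
MIN_SAMPLE_RATE_HZ = 44_000


def check_recording(path, min_rate_hz=MIN_SAMPLE_RATE_HZ):
    """Return (ok, details) for a PCM WAV file.

    Assumption: recordings are delivered as WAV. Other containers
    (FLAC, MP3, ...) would need a different reader.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate_hz = wav.getframerate()
        frames = wav.getnframes()

    details = {
        "channels": channels,
        "sample_rate_hz": rate_hz,
        "duration_s": frames / rate_hz if rate_hz else 0.0,
    }
    # The spec is met when the file is mono and at or above the minimum rate.
    ok = channels == 1 and rate_hz >= min_rate_hz
    return ok, details
```

Calling check_recording on each file in a sample batch returns a pass/fail flag plus the measured channel count, sample rate, and duration, so any out-of-spec file can be flagged before training.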

Meticulously curated transcripts:

Our patient-physician conversation data is bundled with human-validated transcripts with a WER (Word Error Rate) of less than 2%, marked regions of cross-talk, turn accuracy within 100 ms, and standardized representation of punctuation, symbols, verbal spelling, initialisms, acronyms, abbreviations, brand and product names, hesitations, and disfluencies.
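
For readers who want to see how a WER figure is computed, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that standard definition, not a description of our internal validation pipeline; the function name word_error_rate is hypothetical, and it splits on whitespace without normalizing case or punctuation, which a real evaluation would usually do first.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count.

    Assumption: both inputs are plain strings tokenized on whitespace,
    with no case or punctuation normalization applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

As a usage example, comparing the reference "the patient reports mild chest pain" against the hypothesis "the patient report mild pain" involves one substitution and one deletion over six reference words, so the function returns 2/6, or roughly 33%. A WER under 2% corresponds to fewer than two such errors per hundred reference words.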

Distinct voices and speaker diversity:

Our patient-physician conversation datasets capture many different North American accents for your ASR model to handle: Southern, Mid-Atlantic, New England, New York, Philadelphia, and Pacific Northwest. You name it, we have it.

Test drive our patient-physician conversation data for your ASR, LLM, and NLP models today!