Constructing and optimizing the pain classification algorithm requires a significant amount of technology that is not deployed with the commercial system.
Curated Dataset from Multiple Sites
PainQx, in part through an exclusive license with the NYU School of Medicine, has established a dataset of over 600 EEG recordings with a specific focus on pain assessment. The dataset spans five collection sites and three EEG acquisition devices, with all data collected under a protocol whose sole purpose was to support pain assessment research. To PainQx's knowledge, this is the largest dataset of its kind, and it continues to grow. As the development dataset grows to provide a more complete representation of chronic pain, PainQx expects the performance of its pain assessment algorithm to improve. In addition to raw EEG files, the PainQx algorithm development database contains, for each case, a complete set of qEEG features derived using the artifacting, epoch selection, and feature extraction software modules described above, along with additional clinical data collected from the subjects. All data in the PainQx Development Dataset is de-identified, satisfying HIPAA requirements.
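To make the structure of such a development dataset concrete, the following is a minimal sketch of what one de-identified case record might contain. The field names and types are illustrative assumptions, not PainQx's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CaseRecord:
    """Hypothetical sketch of one de-identified case in a development dataset."""
    case_id: str                      # de-identified subject code (no PHI)
    site: str                         # one of the five collection sites
    acquisition_device: str           # one of the three EEG acquisition devices
    raw_eeg_path: str                 # location of the raw EEG recording
    qeeg_features: dict[str, float]   # qEEG features from the artifacting,
                                      # epoch selection, and feature
                                      # extraction modules
    clinical_data: dict[str, str]     # additional de-identified clinical data
    pain_label: Optional[int] = None  # clinical pain assessment label
```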
Machine Learning Technology
The role of Machine Learning (ML) in PainQx product development is to computationally analyze the set of cases contained in the PainQx algorithm development dataset, searching for patterns in the data and identifying the features that, in combination, provide the most accurate classification of subjects. PainQx has utilized a variety of ML tools, both to determine which are the best fit to the problem space and to facilitate looking across tools to identify the most powerful qEEG features regardless of the specific ML tool being used.
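As an illustration of looking across tools, one common approach is to rank features under several model families and keep those that rank highly under all of them. The sketch below uses scikit-learn on synthetic data as a stand-in for a qEEG feature matrix; the model choices and feature names are assumptions, not PainQx's actual pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: rows are subjects, columns are qEEG features,
# y is the pain classification label.
X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           random_state=0)
feature_names = [f"qeeg_{i}" for i in range(X.shape[1])]

# Fit two different model families and rank features under each.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
logreg = make_pipeline(StandardScaler(),
                       LogisticRegression(max_iter=1000)).fit(X, y)

forest_order = np.argsort(-forest.feature_importances_)
coef = np.abs(logreg.named_steps["logisticregression"].coef_.ravel())
logreg_order = np.argsort(-coef)

# Average each feature's rank across tools; features that rank highly
# under both models are "powerful regardless of the specific ML tool".
avg_rank = (np.argsort(forest_order) + np.argsort(logreg_order)) / 2
for idx in np.argsort(avg_rank)[:5]:
    print(feature_names[idx], avg_rank[idx])
```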
PainQx is focused on a form of machine learning referred to as “supervised learning.” A prediction model is constructed using a subset of the collected data, referred to as the train/test set. The remaining subset of the data is held out for validating the model established using the train/test set. The machine learning tools are run on the train/test dataset, and control parameters are adjusted to optimize performance as assessed via cross-validation. This step is carried out carefully to emphasize features supported by domain knowledge while minimizing the potential for overtraining. Following algorithm optimization on the train/test dataset, the performance of the resulting prediction model is measured on the hold-out dataset. This yields a ‘blinded’ performance measure that can be used to predict performance on future datasets.
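The workflow described above can be sketched as follows, again using scikit-learn on synthetic data as a stand-in for the development dataset. The model family, parameter grid, split proportion, and AUC metric are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the development dataset.
X, y = make_classification(n_samples=500, n_features=40, n_informative=8,
                           random_state=0)

# Split once into a train/test (development) set and a hold-out set;
# the hold-out set is never touched during model optimization.
X_dev, X_holdout, y_dev, y_holdout = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Adjust control parameters via cross-validation on the train/test set only.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"max_depth": [2, 3], "learning_rate": [0.05, 0.1]},
    cv=5, scoring="roc_auc")
search.fit(X_dev, y_dev)

# Measure the optimized model exactly once on the hold-out set to obtain
# a 'blinded' estimate of performance on future data.
holdout_auc = roc_auc_score(
    y_holdout, search.best_estimator_.predict_proba(X_holdout)[:, 1])
print(f"cross-validated AUC: {search.best_score_:.3f}")
print(f"hold-out AUC:        {holdout_auc:.3f}")
```

Keeping the hold-out evaluation to a single measurement is what makes it ‘blinded’: if the hold-out set were consulted repeatedly during tuning, it would effectively become part of the train/test set and lose its value as a predictor of performance on future datasets.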