The present invention discloses methods of data analysis directed to diagnostic development, and in particular to the development of signatures for classifying chemogenomic data. The invention provides methods for identifying and functionally characterizing a “necessary” set of information-rich variables. The invention also discloses methods for identifying a plurality of “sufficient” classifiers. The necessary set of variables may be incorporated into a single diagnostic device so that a classification measurement is confirmed simultaneously by a plurality of independent classifiers. In the field of biological diagnostics, the invention may be used to provide a plurality of short lists of genes, referred to as “signatures,” each of which is “sufficient” to carry out a specific classification task, such as predicting the activity and side effects of a compound in vivo.
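The relationship between “sufficient” signatures and the “necessary” set can be illustrated with a minimal sketch. The abstract does not state how the signatures are derived, so everything procedural here is an assumption made for illustration: genes are ranked by the difference of class means, a nearest-centroid rule stands in for the classifier, training accuracy stands in for a proper cross-validated performance measure, and a signature's genes are removed from the pool before the next signature is sought. The gene names `g1` to `g6`, the toy expression values, and the function names are hypothetical.

```python
from statistics import mean

def make_classifier(samples, labels, genes):
    """Nearest-centroid classifier restricted to the given genes (a stand-in model)."""
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    c_pos = [mean(s[g] for s in pos) for g in genes]
    c_neg = [mean(s[g] for s in neg) for g in genes]

    def predict(sample):
        d_pos = sum((sample[g] - c) ** 2 for g, c in zip(genes, c_pos))
        d_neg = sum((sample[g] - c) ** 2 for g, c in zip(genes, c_neg))
        return 1 if d_pos < d_neg else 0

    return predict

def rank_genes(samples, labels, genes):
    """Order genes by absolute difference of class means, largest first."""
    def separation(g):
        pos = [s[g] for s, y in zip(samples, labels) if y == 1]
        neg = [s[g] for s, y in zip(samples, labels) if y == 0]
        return abs(mean(pos) - mean(neg))
    return sorted(genes, key=separation, reverse=True)

def strip_signatures(samples, labels, size=2, min_accuracy=0.9):
    """Collect gene-disjoint signatures until one is no longer 'sufficient'."""
    remaining = sorted(samples[0])
    signatures = []
    while len(remaining) >= size:
        genes = rank_genes(samples, labels, remaining)[:size]
        predict = make_classifier(samples, labels, genes)
        hits = sum(predict(s) == y for s, y in zip(samples, labels))
        if hits / len(samples) < min_accuracy:
            break  # the remaining genes no longer support a sufficient classifier
        signatures.append((genes, predict))
        remaining = [g for g in remaining if g not in genes]
    # The 'necessary' set is taken here as the union of all sufficient signatures.
    necessary = sorted({g for genes, _ in signatures for g in genes})
    return signatures, necessary

def classify_with_all(signatures, sample):
    """Apply every independent signature to one sample and return the votes."""
    return [predict(sample) for _, predict in signatures]

# Hypothetical toy data: three treated (1) and three control (0) samples.
samples = [
    {"g1": 2.0, "g2": 1.8, "g3": 1.5, "g4": 1.4, "g5": 0.1, "g6": 0.3},
    {"g1": 2.2, "g2": 2.0, "g3": 1.6, "g4": 1.2, "g5": -0.2, "g6": -0.1},
    {"g1": 1.9, "g2": 2.1, "g3": 1.4, "g4": 1.5, "g5": 0.0, "g6": 0.0},
    {"g1": -2.0, "g2": -1.9, "g3": -1.5, "g4": -1.3, "g5": 0.0, "g6": 0.1},
    {"g1": -2.1, "g2": -2.0, "g3": -1.4, "g4": -1.5, "g5": 0.2, "g6": -0.2},
    {"g1": -1.8, "g2": -2.2, "g3": -1.6, "g4": -1.2, "g5": -0.1, "g6": 0.2},
]
labels = [1, 1, 1, 0, 0, 0]

signatures, necessary = strip_signatures(samples, labels)
new_sample = {"g1": 1.5, "g2": 1.7, "g3": 1.2, "g4": 1.0, "g5": 0.0, "g6": 0.0}
votes = classify_with_all(signatures, new_sample)
```

In this sketch a device carrying every gene in `necessary` could evaluate all signatures on one sample at once, and agreement among the entries of `votes` plays the role of the simultaneous confirmation described in the abstract.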