Motivation: There has been considerable interest in developing computational techniques for inferring genetic regulatory networks from whole-genome expression profiles. When expression time-series data sets are available, dynamic models can, in principle, be used to infer correlative relationships between gene expression levels, which may be causal. However, because of the range of detectable expression levels and the current quality of the data, the predictive value of such inferred quantitative models is questionable. Network models derived from simple rate laws offer an intermediate level of analysis, going beyond simple statistical analysis but falling short of a fully quantitative description. This work shows how such network models can be constructed and describes the global properties of the networks derived from such a model. These global properties are statistically robust and provide insights into the design of the underlying network.
Results: Several whole-genome expression time series data sets from yeast microarray experiments were analyzed using a Markov-modeling method (Dewey and Galas, Func. Integr. Genomics, 1, 269–278, 2001) to infer an approximation to the underlying genetic network. We found that the global statistical properties of all the resulting networks are similar. The overall structure of these biological networks is distinctly different from that of other recently studied networks such as the Internet or social networks. These biological networks show hierarchical, hub-like structures that have some properties similar to a class of graphs known as small world graphs. Small world networks exhibit local cliquishness while exhibiting strong global connectivity. In addition to the small world properties, the biological networks show a power law or scale free distribution of connectivities. An inverse power law, N(k) ∼ k^{-3/2}, for the number of vertices (genes) with k connections was observed for three different data sets from yeast. We propose network growth models based on gene duplication events. Simulations of these models yield networks with the same combination of global graphical properties that we inferred from the expression data.
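As an illustration of the kind of growth model mentioned in the Results, the following is a minimal sketch of a generic duplication-with-edge-loss process, together with a tally of N(k), the number of vertices with k connections. The abstract does not specify the rules of the authors' models, so the function names and the parameters `keep_prob` and `link_parent_prob` are assumptions made for this example only, and the sketch is not claimed to reproduce the reported exponent.

```python
import random
from collections import Counter


def grow_by_duplication(n_genes, keep_prob=0.5, link_parent_prob=0.1, seed=0):
    """Grow an undirected graph by repeated gene (vertex) duplication.

    Hypothetical rules, not taken from the paper: each new gene copies
    every edge of a randomly chosen parent with probability keep_prob and
    is linked to the parent itself with probability link_parent_prob.
    """
    rng = random.Random(seed)
    # Seed network: two connected genes, stored as adjacency sets.
    neighbors = {0: {1}, 1: {0}}
    while len(neighbors) < n_genes:
        parent = rng.randrange(len(neighbors))  # vertices are labeled 0..n-1
        child = len(neighbors)
        # Inherit a random subset of the parent's connections.
        inherited = {v for v in neighbors[parent] if rng.random() < keep_prob}
        if rng.random() < link_parent_prob:
            inherited.add(parent)
        neighbors[child] = inherited
        for v in inherited:
            neighbors[v].add(child)  # keep the graph symmetric
    return neighbors


def degree_counts(neighbors):
    """Return N(k): the number of vertices with exactly k connections."""
    return Counter(len(adj) for adj in neighbors.values())


if __name__ == "__main__":
    network = grow_by_duplication(2000)
    counts = degree_counts(network)
    for k in sorted(counts)[:10]:
        print(k, counts[k])
```

Plotting log N(k) against log k for the simulated network is one way to compare its connectivity distribution with an inverse power law; the slope obtained depends on the assumed parameters.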
Contact: Ashish_Bhan@kgi.edu; David_Galas@kgi.edu; Greg_Dewey@kgi.edu
Supplementary Information: http://www.kgi.edu/html/noncore/faculty/dewey/bioinf.pdf
* To whom correspondence should be addressed.