Page 8 - Cancer Systems Biology: Methods and Protocols

Hua Tan and Xiaobo Zhou

[Fig. 1 image not recoverable from the source; surviving label: "Tumor Samples"]

Fig. 1 Schematic representation of two combinatorial mutational patterns studied in this protocol: the
co-mutational pattern (upper panel) refers to the scenario in which a set of genes tends to mutate simultaneously
in a tumor sample, whereas the mutually exclusive pattern (lower panel) represents the opposite scenario:
genes in a given set tend to avoid mutating simultaneously in any one tumor sample


[5]. Previous experimental and statistical analyses have consistently
revealed two combinatorial mutational patterns for a given set of
genes, termed the co-mutational and mutually exclusive patterns
[5, 8-10]. As shown in Fig. 1, the co-mutational pattern occurs
when a set of genes tends to mutate simultaneously in a single
tumor, while the mutually exclusive pattern refers to the scenario
in which one and only one of a set of genes is likely to be altered in a
tumor.
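
To make the two definitions concrete, the following minimal Python sketch tallies, for a candidate gene set, how many samples carry none, exactly one, or all of its genes in mutated form. The sample and gene names, the toy data, and the helper `pattern_counts` are hypothetical and serve only as an illustration; they are not part of the protocol.

```python
# Hypothetical toy data: each tumor sample maps to the set of genes mutated in it.
samples = {
    "S1": {"GENE_A", "GENE_B"},
    "S2": {"GENE_A", "GENE_B"},
    "S3": {"GENE_C"},
    "S4": {"GENE_D"},
    "S5": {"GENE_A", "GENE_B", "GENE_C"},
    "S6": {"GENE_D"},
}


def pattern_counts(samples, gene_set):
    """Classify each sample by how many genes of gene_set (size >= 2) are mutated in it."""
    gene_set = set(gene_set)
    counts = {"none": 0, "exactly_one": 0, "all": 0, "other": 0}
    for mutated in samples.values():
        k = len(mutated & gene_set)  # number of genes from the set mutated in this sample
        if k == 0:
            counts["none"] += 1
        elif k == len(gene_set):
            counts["all"] += 1
        elif k == 1:
            counts["exactly_one"] += 1
        else:
            counts["other"] += 1
    return counts


# A co-mutational set shows many "all" samples and few "exactly_one" samples.
co_mut = pattern_counts(samples, ["GENE_A", "GENE_B"])
# A mutually exclusive set shows many "exactly_one" samples and few "all" samples.
mut_excl = pattern_counts(samples, ["GENE_C", "GENE_D"])
print(co_mut)
print(mut_excl)
```

Raw counts like these only describe the data; whether a pattern is significant requires a statistical test against the mutation frequencies of the individual genes.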
   Mutually exclusive genes are likely to function in the same
signaling pathway, whereas co-mutational genes are likely to act
in different pathways [11]. Combinatorial patterns of genes
can be leveraged to infer signaling networks implicated in human
cancer development and progression. Indeed, many efforts have
been devoted to discovering novel driver pathways de novo based on
mutual exclusivity of gene mutations [11-13]. Therefore, identifying
gene pairs or gene sets with significant combinatorial mutational
patterns is of essential biological relevance.
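
As a rough illustration of what "significant" can mean for a gene pair, the sketch below computes one-sided hypergeometric tail probabilities for the number of samples in which both genes are mutated. This is a generic textbook test chosen for illustration, not the procedure of this protocol and not the method of the earlier work it discusses; the function name `cooccurrence_pvalues` and the example counts are hypothetical.

```python
from math import comb


def cooccurrence_pvalues(n_samples, n_a, n_b, n_both):
    """Hypergeometric tail probabilities for the overlap of two mutated genes.

    n_samples: total number of tumor samples
    n_a, n_b:  number of samples with gene A (resp. gene B) mutated
    n_both:    number of samples with both genes mutated
    Returns (p_co, p_me): the probability of an overlap at least as large
    (co-mutation) and at most as large (mutual exclusivity) as n_both,
    assuming the two genes mutate independently across samples.
    """
    total = comb(n_samples, n_b)

    def pmf(k):
        # Probability that exactly k of the n_b B-mutated samples are also A-mutated.
        return comb(n_a, k) * comb(n_samples - n_a, n_b - k) / total

    k_min = max(0, n_a + n_b - n_samples)  # smallest feasible overlap
    k_max = min(n_a, n_b)                  # largest feasible overlap
    p_co = sum(pmf(k) for k in range(n_both, k_max + 1))
    p_me = sum(pmf(k) for k in range(k_min, n_both + 1))
    return p_co, p_me


# Hypothetical counts: 6 samples, each gene mutated in 3, both mutated in 3.
p_co, p_me = cooccurrence_pvalues(6, 3, 3, 3)
print(p_co, p_me)
```

A small `p_co` points toward co-mutation and a small `p_me` toward mutual exclusivity. With real data, the p-values from a scan over many gene pairs would also need correction for multiple testing.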
   Previous work proposed a statistical method to address this
question and nominated a number of gene sets with significant
combinatorial patterns [10]. However, that analysis was performed
on a very limited set of cell line data. It therefore lacks a
detailed procedure for preprocessing data from large mutation
databases that comprise many clinical samples of
various cancer types (e.g., the recently released Catalog of Somatic
Mutations in Cancer, COSMIC [14], and the Cancer Genome
Atlas, TCGA, https://tcga-data.nci.nih.gov/tcga/). In addition,
the analysis by Yeang et al. adopted different hypothesis tests to
estimate the significance levels of the two combinatorial mutational
patterns, an approach that tends to yield an overly conservative p-value for the
co-mutational pattern [10].
   To address these issues, we describe a systematic and reliable
pipeline to identify both combinatorial mutational patterns in cancer
genomes. Here, somatic mutations exclude the synonymous point