Regulatory interactions for iron homeostasis in Aspergillus fumigatus inferred by a Systems Biology approach

RESEARCH ARTICLE (Open Access)

Jörg Linde 1,4*, Peter Hortschansky 2, Eugen Fazius 1, Axel A Brakhage 2,4, Reinhard Guthke 1,4 and Hubertus Haas 3

Abstract

Background: In Systems Biology, iterations of wet-lab experiments followed by modelling approaches and model-inspired experiments describe a cyclic workflow. This approach is especially useful for the inference of gene regulatory networks based on high-throughput gene expression data. Experiments can verify or falsify the predicted interactions, allowing further refinement of the network model. Aspergillus fumigatus is a major human fungal pathogen. One important virulence trait is its ability to gain sufficient amounts of iron during the infection process. Even though some regulatory interactions are known, we are still far from a complete understanding of the way iron homeostasis is regulated.

Results: In this study, we make use of a reverse engineering strategy to infer a regulatory network controlling iron homeostasis in A. fumigatus. The inference approach utilizes the temporal change in expression data after a change from iron depleted to iron replete conditions. The modelling strategy is based on a set of linear differential equations and offers the possibility to integrate known regulatory interactions as prior knowledge. Moreover, it makes use of important selection criteria, such as sparseness and robustness. By compiling a list of known regulatory interactions for iron homeostasis in A. fumigatus and softly integrating them during network inference, we are able to predict new interactions between transcription factors and target genes. The proposed activation of the gene expression of hapX by the transcriptional regulator SrbA constitutes a so far unknown way of regulating iron homeostasis based on the amount of metabolically available iron. This interaction has been verified by Northern blots in a recent experimental study. In order to improve the reliability of the predicted network, the results of this experimental study have been added to the set of prior knowledge. The final network includes three SrbA target genes. Based on motif searching within the regulatory regions of these genes, we identify potential DNA-binding sites for SrbA. Our wet-lab experiments demonstrate high-affinity binding capacity of SrbA to the promoters of hapX, hemA and srbA.

Conclusions: This study presents an application of the typical Systems Biology circle and is based on cooperation between wet-lab experimentalists and in silico modellers. The results underline that using prior knowledge during network inference helps to predict biologically important interactions. Together with the experimental results, we point to a novel iron homeostasis regulating system that senses the amount of metabolically available iron, and we identify the binding site of iron-related SrbA target genes. It will be of high interest to study whether these regulatory interactions are also important for close relatives of A. fumigatus and other pathogenic fungi, such as Candida albicans.

* Correspondence: joerg.linde@hki-jena.de
1 Research Group Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Beutenbergstraße 11a, 07745 Jena, Germany
Full list of author information is available at the end of the article

Linde et al. BMC Systems Biology 2012, 6:6
http://www.biomedcentral.com/1752-0509/6/6

© 2012 Linde et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

A major workflow in Systems Biology is an interlocking circle between experimental and theoretical work [1]. Experimentalists perform high-throughput experiments in order to monitor the response of a biological system to an external stimulus. These data are then used to construct spatio-temporal models from which reasonable hypotheses are generated. These hypotheses are experimentally verified or falsified. Using the results of these experiments, scientists are able to refine the model and thus generate new knowledge [2].

One way of describing biological systems is as networks. Networks are graphical representations in which the nodes represent the objects of interest and the edges represent relations between these objects [3]. Network models help to explain, understand and describe the functioning of a cell [4]. In many cases we do not know the underlying interaction networks within the system of interest. Network inference aims at the deduction of these networks utilizing high-throughput data and prior knowledge. The inference of gene regulatory networks consists of three parts: the identification of potential regulators, the prediction of target genes, and the inference of the mode of interaction (e.g. activation or repression). A number of approaches are established to perform this task, such as setting up Bayesian Networks [5], information theoretical approaches [6-8], regression based inference [9-11], and differential equation models [12-17]. A number of studies successfully applied these methods for different biological purposes, e.g. modelling of immune diseases [10,13], full genomic models of Escherichia coli [8] and Saccharomyces cerevisiae [18], and models of pathogenic fungi [16].

It has been shown that the integration of different data sources improves the reverse engineering approach [10,19-21]. Since different data sources might be contradictory, it is advantageous to softly integrate them during the modelling procedure.
That means proposed interactions can be scored by the confidence of the prior knowledge source and might be removed if they contradict the measured data too strongly. A recent study shows how the Systems Biology circle supports network inference [22]. Due to the large amount of available data and knowledge, E. coli is best suited as a model organism for network inference. However, this task is more difficult for pathogenic fungi by virtue of the small amount of data and the small number of known interactions.

Aspergillus fumigatus is an airborne saprophytic fungus [23]. Humans constantly inhale numerous conidia of A. fumigatus, which are usually eliminated by the immune system. However, in immunocompromised individuals the fungus can cause life-threatening infections [23]. In fact, the number of infections has increased dramatically due to the growing number of immunocompromised individuals [24-26].

The human host evolved a number of strategies to prevent microbial infection. One important strategy is to keep iron away from the pathogen [27]. Iron is an essential metal required as a cofactor for several proteins, as well as for a number of biochemical processes. However, within the human host, iron is bound to proteins such as haemoglobin, ferritin, transferrin, and lactoferrin. Consequently, there is almost no free iron available [28]. Thus, the acquisition of iron is an important virulence attribute of most pathogens. During co-evolution, A. fumigatus has developed a number of efficient iron acquisition pathways: 1) reductive iron uptake, 2) uptake via siderophores, and 3) low-affinity uptake (for a more detailed description see [29]). Since an excess of iron is toxic for a cell, iron homeostasis needs to be tightly regulated in A. fumigatus. The knowledge about the molecular interactions underlying these regulations is still fragmentary. The transcription factors SreA and HapX have been identified as a counteracting pair [30-32].
Under iron replete conditions, SreA is activated and represses iron uptake. Under these conditions, SreA also represses hapX transcription. Since HapX is a repressor of iron consumption pathways, SreA indirectly activates iron consumption. Moreover, HapX also acts as an activator of iron acquisition. A number of target genes are known for both regulators; however, we are still far from a complete understanding of iron homeostasis in A. fumigatus.

Recently, we proposed a model predicting regulatory interactions for iron uptake of another fungal pathogen, Candida albicans, when the fungus is adhering to and invading into human epithelial cells [17]. The model is based on time series expression data during experimental infection of reconstituted human oral epithelium. The usefulness of these data lies in the fact that they resemble important parts of a real infection scenario. On the other hand, in the previous modelling approach a number of environmental parameters, such as pH and nutrient availability, were not constant during infection. This may have caused side effects and made it difficult to decide whether the proposed interactions are purely based on changes in iron availability or on other environmental parameters. Such environmental variations ultimately hamper experimental verification of the proposed interactions. The use of in vitro time series expression data after a change from iron depleted to iron replete conditions will help to decide which interactions C. albicans uses to regulate iron homeostasis. For A. fumigatus, such time series expression data are already available and are utilized in this study.

In the present work, we propose the first computational model of the regulation of iron homeostasis genes in A. fumigatus using high-throughput gene expression time series data after a shift from iron starvation to iron replete conditions [31]. It is based on a set of linear differential equations and utilizes selection criteria such as sparseness and robustness [17,21,33]. Since the soft integration of prior knowledge has been shown to improve the reliability of the predicted networks [10,19-21], our modelling approach softly integrates three kinds of prior knowledge: Northern blot analysis under limited iron [31,32], microarray expression analysis of transcription factor knock-out mutants [31,32], as well as the occurrence of transcription factor binding motifs in regulatory regions of genes [31,34-36]. The inferred model predicts new transcription factor to target gene interactions. A recent study utilizes Northern blots and experimentally verifies two of these interactions [37], while another predicted interaction is falsified and one remains unevaluated. Using the results of the recent experiments as additional prior knowledge, we are able to refine our model. The final network model predicts a number of SrbA targets. To study whether the transcriptional regulator binds directly to these target genes, we performed motif searching, which led to the identification of potential SrbA binding sites in the promoters of the predicted target genes. Indeed, wet-lab experiments demonstrate high-affinity binding capacity of SrbA to the promoters of hapX, hemA and srbA.

Methods

Data and imputation

Schrettl et al. performed full-genomic transcriptional profiling of A. fumigatus in response to the change from iron depleted growth to iron replete growth [31]. They monitored gene expression at five time points after adding iron to the culture medium: 10 min, 30 min, 60 min, 120 min, 240 min. We used the preprocessed (i.e. normalised and logarithmised) data of Schrettl et al. [31]. Figure 1 gives an overview of the applied methods.
Since clustering and network inference need complete data, we imputed missing values using the Bayesian Principal Component Analysis (BPCA) imputation from the R package 'pcaMethods' [38]. This method performed best among a set of different imputation methods (for more information see additional file 1, table S1).

Figure 1. Overview of applied workflow. The green lines illustrate how the cyclic workflow of Systems Biology was applied in this study.

Clustering

Schrettl et al. identified 1147 genes to be differentially expressed within the wild-type strain when comparing iron depleted and iron replete conditions [31]. We added srbA to this set (see candidate genes for regulatory network model) and collected the (imputed) expression values of these genes. We applied fuzzy c-means clustering [39] to this expression matrix. The optimal number of clusters was estimated as previously described [13,17]. In short, 42 cluster validity indices (Dunn's index and the Davies-Bouldin index with 18 generalizations each, as well as the silhouette width and five other indices as described in [13]) capturing different aspects of a clustering structure were used to assess the partitions based on 2 up to 20 clusters. The number of clusters that was ranked best by the most validity indices was chosen.

Overrepresented gene ontology terms

In order to identify key biological processes/functions most significantly enriched with genes within each of the clusters, we performed functional categorization and identified significantly overrepresented categories using the tool FungiFun [40]. We applied both FunCat [41] (all four hierarchical levels) and Gene Ontology [42] (Biological Process and Molecular Function) categorization.

Network prediction

Network inference was performed similarly to the procedure previously described [17], applying the NetGenerator tool [33]. This tool is available upon request.
In short, the network inference approach has the following features:

1. It is based on a set of linear differential equations and models the temporal change of the expression intensity $x_i(t)$ of gene $i$ ($i = 1, \dots, n$) at time $t$ as the weighted sum of the expression intensities of all other genes and an external stimulus $u(t)$ at time $t$ (see equation 1). The external stimulus $u(t)$ is modelled as a stepwise constant function representing the change from iron depletion to iron repletion.

$$\dot{x}_i(t) = \sum_{j=1}^{n} w_{i,j}\, x_j(t) + b_i\, u(t) \qquad (1)$$

2. Based on the given time series data, the tool calculates the gene regulatory matrix $W$ and the perturbation vector $B$. The parameter $w_{i,j}$ (component of $W$) represents the influence of gene $j$ on the expression of gene $i$, while the parameter $b_i$ (component of $B$) represents the impact of the external stimulus given by the function $u(t)$. Non-zero parameters define the edges of the regulatory network. A positive parameter $w_{i,j}$ denotes an activation and a negative parameter denotes a repression of gene $i$ by gene $j$.

3. The approach follows the selection criterion of sparseness. Using a heuristic search strategy, it tries to minimise the number of non-zero parameters (interactions) that are necessary to fit the measured data points.

4. The approach follows the selection criterion of robustness, i.e. technical noise in measured mRNA concentrations caused by the microarray technology should not alter the inferred regulatory interactions. This is achieved by iterating the network inference procedure 1000 times using randomly perturbed input time series data (Gaussian noise with mean 0 and standard deviation 0.05 added to the measured and pre-processed data) [13,16]. Only edges which are confirmed by more than 50% of the iterations are considered to be robust.

5. The inference approach uses prior knowledge (i.e. putative regulatory interactions based on data other than the time series expression data). Based on the confidence of the prior knowledge source, it is possible to score each proposed interaction. Since different data sources might be contradictory, it is advantageous to softly integrate them during the modelling procedure. If a proposed interaction contradicts the measured data too strongly, it might be removed. If necessary, the tool adds new interactions not covered by the prior knowledge in order to fit the measured data.

6. Interactions included in the regulatory model might mainly be based on their occurrence in the set of prior knowledge, rather than on the expression data. Thus, we tested whether or not the predicted interactions are robust against changes in the set of prior knowledge by iterating the modelling approach 1000 times while randomly skipping 10% of all interactions in the set of prior knowledge in each run. Again, only edges which are confirmed by more than 50% of the iterations are considered to be robust.

Three different sources are used to compile prior knowledge for the prediction of gene regulatory networks:

Source 1: Evidence of transcription factor - target gene interactions based on single experiments (e.g. Northern blots). Confidence score = 0.5
Source 2: Gene expression studies under limited iron conditions and expression analysis of transcriptional regulator knock-out mutants. Confidence score = 0.25
Source 3: Occurrence of the respective transcription factor binding motif in the upstream intergenic regions of iron homeostasis genes. Confidence score = 0.125

The score is additive, i.e. if an interaction is predicted by several sources, the score used equals the sum over all confidence scores for the respective sources.
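
The model structure and the selection rules described above can be illustrated with a small sketch. It is not the NetGenerator code, which is not reproduced in this article; it only shows equation 1, the reading of non-zero parameters as signed edges, the 50% robustness rule and the additive prior-knowledge score. The function names, the explicit Euler integration and the encoding of $u(t)$ as 0 before and 1 after iron addition are assumptions made for illustration.

```python
from collections import Counter

# Confidence score per prior-knowledge source, as listed in the text.
SOURCE_SCORES = {1: 0.5, 2: 0.25, 3: 0.125}


def prior_score(sources):
    """Additive confidence of one proposed interaction (each source counted once)."""
    return sum(SOURCE_SCORES[s] for s in set(sources))


def step_stimulus(t):
    """Assumed encoding of u(t): 0 before iron is added (t < 0), 1 afterwards."""
    return 1.0 if t >= 0 else 0.0


def simulate(W, b, x0, t_end, dt=0.01, u=step_stimulus):
    """Integrate dx_i/dt = sum_j W[i][j] * x_j + b[i] * u(t) with explicit Euler."""
    x = list(x0)
    for k in range(int(round(t_end / dt))):
        t = k * dt
        dx = [sum(w * xj for w, xj in zip(row, x)) + bi * u(t)
              for row, bi in zip(W, b)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


def edges(W, b, tol=1e-12):
    """Map non-zero parameters to signed edges: (regulator, target) -> mode."""
    found = {}
    for i, row in enumerate(W):
        for j, w in enumerate(row):
            if abs(w) > tol:  # w[i][j]: influence of gene j on gene i
                found[(j, i)] = "activation" if w > 0 else "repression"
    for i, bi in enumerate(b):
        if abs(bi) > tol:  # b[i]: impact of the external stimulus on gene i
            found[("u", i)] = "activation" if bi > 0 else "repression"
    return found


def robust_edges(runs, threshold=0.5):
    """Keep edges found in more than `threshold` of the repeated inference runs."""
    counts = Counter(edge for run in runs for edge in set(run))
    return {edge for edge, c in counts.items() if c / len(runs) > threshold}
```

In this sketch, an interaction supported by Source 1 and Source 3 receives `prior_score([1, 3])`, i.e. 0.5 + 0.125, and `robust_edges` would be applied to the edge sets obtained from the 1000 perturbed runs.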
SrbA binding site

Three DNA sequences with high binding affinity to the transcriptional regulator Sre1 were identified in Schizosaccharomyces pombe [43]. Sre1 and SrbA show high sequence similarity. Furthermore, the SrbA protein contains a basic helix-loop-helix/leucine zipper (bHLHZ) domain. This domain has been shown to specifically bind DNA in S. pombe [43]. The three high-affinity Sre1 binding sites are characterised by a conserved ATC at the 5' end and a conserved AT at the 3' end, while the remaining parts are highly variable (5'-ATCNNNNNAT-3'). For the human ortholog of Sre1 and SrbA, the adenosine and thymidine enable the contact with the protein [44]. To predict an SrbA binding site in A. fumigatus, we first downloaded the intergenic regions of genes predicted to be SrbA targets by our network model. Next, these intergenic regions were scanned for the occurrence of the three high-affinity binding sites of Sre1, allowing at most two mismatches [45]. Finally, we only considered those sites which contain the conserved 5' and 3' AT.

To determine whether A. fumigatus SrbA recognizes the identified putative binding sites, the bHLHZ domain of SrbA (amino acids 161-267, "SrbA161-267") was produced in E. coli and purified. The protein domain was analysed by real-time in vitro surface plasmon resonance (SPR) binding assays. Immobilized DNA duplexes (see additional file 2 for experimental details) were used to test whether or not the protein domain can bind to the predicted DNA sites.

Results

Clustering and overrepresented GO categories

The BPCA (Bayesian Principal Component Analysis) method gave the best results for imputation (see additional file 1, table S1) and was thus chosen to impute missing values in the original gene expression data set.

The optimal number of clusters for partitioning the expression data was found to be four (see additional file 3, figure S3).
Scaled time series profiles are visualised in figure 2. Additional file 4 shows to which cluster each

Figure 2. Cluster analysis results. The best partition consists of four clusters (panels: Cluster 1 to Cluster 4; x-axis: time, with ticks at 0, 30, 60, 120 and 240; y-axis: relative gene expression, from −2 to 2). Points: mean (logarithmised, scaled and centred) expression values of all genes in the cluster; lines: standard deviation.
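
The per-cluster summary shown in Figure 2 (mean and standard deviation of scaled, centred profiles) could be computed along the following lines. This is a minimal sketch rather than the authors' code: the function names are hypothetical, each gene is assumed to be assigned to the cluster with its highest fuzzy membership, and the population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev


def scale_gene(profile):
    """Centre one gene's time series to mean 0 and scale it to unit standard deviation."""
    m, s = mean(profile), pstdev(profile)
    # A flat profile has no variation; return zeros instead of dividing by zero.
    return [(v - m) / s if s > 0 else 0.0 for v in profile]


def cluster_profiles(expression, memberships):
    """Return {cluster: (means, sds)} with one value per time point.

    expression:  {gene: [log-expression per time point]}
    memberships: {gene: [fuzzy membership per cluster]}
    Assumption: each gene counts only for its highest-membership cluster.
    """
    groups = {}
    for gene, profile in expression.items():
        m = memberships[gene]
        best = max(range(len(m)), key=m.__getitem__)
        groups.setdefault(best, []).append(scale_gene(profile))
    summary = {}
    for cluster, rows in groups.items():
        columns = list(zip(*rows))  # one column per time point
        summary[cluster] = ([mean(c) for c in columns],
                            [pstdev(c) for c in columns])
    return summary
```

Each entry of the returned dictionary holds the points (means) and the spread (standard deviations) for one panel of such a plot.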