6 Pluripotent Stem Cell Example

This section contains instructions to reproduce the results of simulating perturbations on FVS control nodes in a pluripotent stem cell signaling. The goal of these simulations is to identify targets that can reprogram cells from the Epiblast stem cell (EpiSC) fate towards the Embryonic Stem Cell (ESC) fate. You can read the original report here: link

The original study of reversion of primed pluripotency in mouse epiblast stem cells was performed by Yachie-Kinoshita et al. in “Modeling signaling-dependent pluripotency with Boolean logic to predict cell fate transitions (Molecular Systems Biology (2018)14:e7952)”. The network and internal-marker nodes were extracted from the publication and its supplementary material. The normalized expression data used for initial activities in NETISCE can be found at GSE62155.

The input data, nextflow pipeline, and results of this simulation can be found in the ipsc folder in the NETISCE github repository

6.1 Input Data

network.sif contains the network structure for pluripotent stem cell signaling

expression.csv contains the initial activities for ESC cells (3 replicates), and EpiSC cells (3 replicates)

internal-marker-kinoshita.txt contains the 4 internal marker nodes that were originally used in Yachie-Kinoshita et al., to evaluate simulations.

internal-marker-kinoshita-expanded.txt contains the 4 internal marker nodes that were originally used in Yachie-Kinoshita et al., plus the additional marker ndoes identified from the data used to evaluate simulations.

samples.txt contains they key for NETISCE to associate certain samples to the phenotypes of Embryonic Stem Cells (ESCs) or Epiblast Stem Cells (EpiSCs)

6.2 Run the simulation

These simulations were run on a high performance cluster that uses a SLURM executor. If your hpc uses a different executor, please update those specifications in the nextflow.config file in the directory. Please see https://www.nextflow.io/docs/latest/config.html for more information regarding your executor.

For ease of reproduction, we have included all files necessary to reproduce the reported results directly in the directory. We do recommend you run this simulation on an hpc. We have included the bash file we used on our SLURM executor.

Note: within the NETISCE.nf configuration file, we have included two lines for specifying the internal-marker nodes:

#!/usr/bin/env nextflow

params.expressions = "$baseDir/input_data/expression.csv"
params.network = "$baseDir/input_data/network.sif"
params.samples = "$baseDir/input_data/samples.txt"
params.internal_control="$baseDir/input_data/internal-marker-kinoshita.txt"
// params.internal_control="$baseDir/input_data/internal-marker-kinoshita-expanded.txt"
params.alpha = 0.9
params.undesired = 'EpiSC'
params.desired = 'ESC'
params.filter ="strict"

As discussed in our paper, we filtered the perturbations using the original 4 internal-marker nodes for pluripotency (Oct4, Sox2, Nanog, EpiTFs), and then again using 3 additional internal-marker nodes. Therefore, to run either analysis, comment/uncomment the internal-marker node file you are interested in. If you want to run NETISCE first with the original internal-marker nodes, make sure to change the results file names for exp_internalmarkers.txt,successful_controlnode_perturubations.txt, and original-experimental_internalmarkers.pdf as to not overwrite them (or move them into a separate folder). Additionally, when you run the nextflow command, please be sure to use the -resume flag so that you use the cached computations that do not need to be re-computed

You can also run NETISCE directly using the following command: ./nextflow run NETISCE.nf -resume

6.3 Results

Herein, we will focus on the results that are deposited in the results folder by NETISCE. However, each step of the nextflow pipeline produces its corresponding raw results (for example, the entire attractor state for network simulations initialized with experimental data). If you are interested in looking at those raw results, they can be found within the work folder. We provide workfiles.txt which is a guide to which folders/subfolders contain the relevant results of each step.

6.3.1 General Results

First, let’s take a look at the results that do not depend on the internal-marker node set.

FVS finding

The FVS solving algorithm identified one FVS, containing 6 nodes.

name
Sox2
Nanog
Gata6
Tbx3
Oct4
Klf4

Attractor landscape estimation via k-means analysis

Now, let’s look at the results of k-means analysis. First, NETISCE determines the optimal number of k clusters by computing the elbow and silhouette metrics.

Figure 6.1: optimal k as identified by a) the elbow and b) silhouette metrics

We see that the optimal k assessed by the elbow metric was k=4, while the optimal k identified by the silhouette metric was k=2. NETISCE automatically chooses the smaller k value, after checking that the attractors generated from the ESC samples and EpiSC samples do not appear in the same cluster.

k=2 was selected for k-means optimal k, and we can see that the attractors generated from the ESC samples and EpiSC samples do not appear in separate clusters.

name	clusters
ESC_1	0
ESC_2	0
ESC_3	0
EpiSC_1	1
EpiSC_2	1
EpiSC_3	1

Pertrubations on FVS control nodes that pass criterion 1

With 6 FVS control nodes, NETISCE performed 729 simulations of combinations of perturbations on the FVS control nodes. The resulting attractors were classified to the clusters produced from the k-means analysis using Naive Bayes, Support Vector Machine, and Random Forest Machine Learning classification algorithms. Then, the perturbations are filtered by which of their corresponding attractors were classified to the ESC cluster by at least 2 of the 3 methods. These results can be found in crit1_perts.txt. Here we show the first 10 rows.

## [1] "number of perturbations that pass filtering criteria 1: 375"

x
pert_8
pert_17
pert_26
pert_35
pert_44
pert_53
pert_88
pert_89
pert_97
pert_98

6.3.2 Results using 4 internal-marker nodes

The relevant files have the prefix ‘original’ in the github repository

Our second perturbation filtering criterion identifies perturbations where, in their corresponding attractors, 90% of the steady state values for internal-marker nodes that are within the steady state expression ranges in the attractors generated from the ESC experimental data.

First, let’s take a look at the steady state values of the internal-marker nodes Oct4, Sox2, and Nanog in the attractors generated from the ESC and EpiSC experimental data. The values can be found in the original-exp_internalmarkers.txt and are plotted in original-experimental_internalmarkers.pdf:

Figure 6.2: histograms of 4 internal-marker node values

We see that the values of the internal-markers for the pluripotent state (Oct4, Sox2, Nanog) are higher in the attractors generated from the experimental data of the ESCs than the attractors generated from the experimental data of the EpiSCs, and the makrer of the epiblast stem state is higher in the attractors generated from the experimental data of the EpiSCs than in the attractors generated from the experimental data of the ESC.

Now, we can take a look at the attractors that passed filtering criterion 2. We show the first 10 rows here, but you can view the entire set in original-successful_controlnode_perturbations.txt

crit2_4<-read.delim('ipsc/results/original-successful_controlnode_perturbations.txt',sep=" ",as.is = T)
print(paste0('number of perturbations that pass filtering criteria 2: ',nrow(crit2_4)))

## [1] "number of perturbations that pass filtering criteria 2: 132"

knitr::kable(crit2_4[1:10,])  %>% column_spec(8, bold = F, border_left = T) %>% scroll_box(width = "100%", box_css = "border: 0px;")

	Sox2	Nanog	Gata6	Tbx3	Oct4	Klf4	up	down	total
pert_445	nochange	up	nochange	nochange	nochange	nochange	1	0	1
pert_607	up	nochange	nochange	nochange	nochange	nochange	1	0	1
pert_368	nochange	nochange	nochange	nochange	up	up	2	0	2
pert_418	nochange	up	down	nochange	nochange	nochange	1	1	2
pert_436	nochange	up	nochange	down	nochange	nochange	1	1	2
pert_446	nochange	up	nochange	nochange	nochange	up	2	0	2
pert_448	nochange	up	nochange	nochange	up	nochange	2	0	2
pert_454	nochange	up	nochange	up	nochange	nochange	2	0	2
pert_580	up	nochange	down	nochange	nochange	nochange	1	1	2
pert_598	up	nochange	nochange	down	nochange	nochange	1	1	2

We can look to see if there are any trends in the orientation of the perturbations on FVS control nodes across the perturbations that passed both filtering criteria.

library(data.table)
library(ggplot2)
d3<-crit2_4 [,c(1:6)] %>% transpose() %>% as.matrix()
row.names(d3)<-colnames(crit2_4[1:6])
colnames(d3)<-row.names(crit2_4)
d3r<-reshape2::melt(d3) %>% select(-Var2)

#From Paul Tol: https://personal.sron.nl/~pault/
Tol_muted <- c('#88CCEE', '#44AA99', '#117733', '#332288', '#DDCC77', '#999933','#CC6677', '#882255', '#AA4499', '#DDDDDD')

ggplot(d3r, aes(x=value)) +
  facet_wrap(~Var1,scales = "free_x",shrink=FALSE) +
  geom_bar(aes(y = (..count..)/ncol(d3),fill=value))  + scale_y_continuous(labels=scales::percent) + scale_fill_manual(values=Tol_muted)+ ylab("% of specified perturbation to an FVS control node across all \n combinations of perturbations that passed filtering criteria ") + xlab("node perturbation")

Here, we see that in the majority of perturbations, there is overexpression of the FVS nodes Sox2, Nanog and Oct4. This aligns with their role as maintainers of pluripotency.

The steady-state values of the internal-marker nodes for these perturbations can be found in pert1_internal_markers.txt.

A perturbation on the FVS control node, Nanog overexpression (pert_445), was also identified in Yachie-Kinoshita et al. and experimentally verified to shift cells from the EpiSC state towards the ESC state. We can plot the internal-marker node values using a radar plot.

We can also plot Klf4 overexpression. This was a perturbation identified by Yachie-Kinoshita et al. to shift cells from the EpiSC state towards the ESC state. However, in our analyses, this perturbation (pert_365), did pass filtering criterion 1, but did not pass filtering criterion 2.

library(fmsb)
attr_pert<-read.delim("ipsc/results/pert1_internal_markers.txt",sep=" ",row.names = 1)
attr_pert <-attr_pert[c('pert_445','pert_365'),c('Oct4',"Sox2","Nanog","EpiTFs")]
exp<-read.delim("ipsc/results/original-exp_internalmarkers.txt",sep=" ",row.names = 1)
EpiSC_avg<-colMeans(exp[4:6,])
ESC_avg<-colMeans(exp[1:3,])

d1<-rbind(EpiSC_avg,ESC_avg,attr_pert)
rownames(d1)[1:4]<- c("EpiSC", "ESC","Nanog overexpression","Klf4 overexpression")
maxcol<-apply(d1, 2, max)
mincol<-apply(d1, 2, min)

d2<-rbind(maxcol,mincol, d1)
rownames(d2)[1:2]<- c("Max", "Min")

par(mar = c(4, 0.1, 4, 0.1))


for (i in 5:nrow(d2)) {
  create_beautiful_radarchart(d2[c(1:4, i), ],color = c("#00AFBB", "#E7B800","#FC4E07"),caxislabels = seq(round(min(d2[c(1:4, i),]),1),round(max(d2[c(1:4, i),]),1),.2),title =row.names(d2)[i]);legend(x=0.4, y=1.4, legend =c("average EpiSC","average ESC",row.names(d2)[i]) , bty = "n", pch=20 , col=c("#00AFBB", "#E7B800","#FC4E07") , text.col = "black", cex=1, pt.cex=2)
}

Figure 6.3: radar charts of the steady-state values of 4 internal-marker nodes for Nanog overexpression and Klf4 overexpression perturbation

We see here, indeed that the values for Oct4, Sox2, Nanog, (when considered these three genes as internal-marker nodes) and EpiTFs are within the expected range for Nanog overexpression, but the values of Oct4, Sox2, Nanog in the attractor generated from Klf4 overexpression do not reach the values of the ESC state.

6.3.3 Results using 7 internal-marker nodes

The relevant files have the prefix ‘expanded’ in the github repository.

We explored filtering the 132 perturbations that passed the second filtering criterion by adding additional internal-marker nodes associated with pluripotency. We added 3 nodes, Lefty1, Pitx2 (transcription factors active in EpiSCs), and Esrrb (transcription factor active in ESCs) to the 4 previously used internal-marker nodes.

Let’s look at the steady state values of the internal-marker nodes in the attractors generated from the ESC and EpiSC experimental data. The values can be found in the expanded-exp_internalmarkers.txt and are plotted in expanded-experimental_internalmarkers.pdf:

Figure 6.4: histograms of 7 internal-marker node values

Again, the internal-marker nodes have the correct expression patterns within the attractors generated from the ESC and EpiSC states.

Now, we can take a look at the attractors that passed filtering criterion 2. These are in file expanded-successful_controlnode_perturbations.txt

crit2_7<-read.delim('ipsc/results/expanded-successful_controlnode_perturbations.txt',sep=" ")
print(paste0('number of perturbations that pass filtering criteria 2: ',nrow(crit2_7)))

## [1] "number of perturbations that pass filtering criteria 2: 15"

knitr::kable(crit2_7)  %>% column_spec(8, bold = F, border_left = T) %>% scroll_box(width = "100%", box_css = "border: 0px;")

	Sox2	Nanog	Gata6	Tbx3	Oct4	Klf4	up	down	total
pert_445	nochange	up	nochange	nochange	nochange	nochange	1	0	1
pert_418	nochange	up	down	nochange	nochange	nochange	1	1	2
pert_436	nochange	up	nochange	down	nochange	nochange	1	1	2
pert_446	nochange	up	nochange	nochange	nochange	up	2	0	2
pert_454	nochange	up	nochange	up	nochange	nochange	2	0	2
pert_409	nochange	up	down	down	nochange	nochange	1	2	3
pert_419	nochange	up	down	nochange	nochange	up	2	1	3
pert_427	nochange	up	down	up	nochange	nochange	2	1	3
pert_437	nochange	up	nochange	down	nochange	up	2	1	3
pert_455	nochange	up	nochange	up	nochange	up	3	0	3
pert_473	nochange	up	up	nochange	nochange	up	3	0	3
pert_410	nochange	up	down	down	nochange	up	2	2	4
pert_428	nochange	up	down	up	nochange	up	3	1	4
pert_464	nochange	up	up	down	nochange	up	3	1	4
pert_482	nochange	up	up	up	nochange	up	4	0	4

We can plot the trends of the orientation of perturbations for each FVS control node across the 15 perturbations.

d4<-crit2_7 [,c(1:6)] %>% transpose() %>% as.matrix()
row.names(d4)<-colnames(crit2_7[1:6])
colnames(d4)<-row.names(crit2_7)
d4r<-reshape2::melt(d4) %>% select(-Var2)



ggplot(d4r, aes(x=value)) +
  facet_wrap(~Var1,scales = "free_x",shrink=FALSE) +
  geom_bar(aes(y = (..count..)/ncol(d4),fill=value))  + scale_y_continuous(labels=scales::percent) + scale_fill_manual(values=Tol_muted) + ylab("% of specified perturbation to an FVS control node across all \n combinations of perturbations that passed filtering criteria ") + xlab("node perturbation")

Interestingly, we see among these 15 perturbations, Nanog overexpression, but no change to Oct4 or Sox2, is indicated. We also see that Klf4 is overexpressed in the majority of these 15 perturbations.

The steady-state values of the internal-marker nodes for these perturbations can be found in pert1_internal_markers.txt.

We can also generate radar plots for these 15 perturbations.

library(fmsb)
attr_pert<-read.delim("ipsc/results/pert1_internal_markers.txt",sep=" ",row.names = 1)
attr_pert <-attr_pert[row.names(crit2_7),]
exp<-read.delim("ipsc/results/expanded-exp_internalmarkers.txt",sep=" ",row.names = 1)
EpiSC_avg<-colMeans(exp[4:6,])
ESC_avg<-colMeans(exp[1:3,])

d1<-rbind(EpiSC_avg,ESC_avg,attr_pert)
rownames(d1)[1:2]<- c("EpiSC", "ESC")
maxcol<-apply(d1, 2, max)
mincol<-apply(d1, 2, min)

d2<-rbind(maxcol,mincol, d1)
rownames(d2)[1:2]<- c("Max", "Min")

par(mar = c(4, 0.1, 4, 0.1))

for (i in 5:nrow(d2)) {
  create_beautiful_radarchart(d2[c(1:4, i), ],color = c("#00AFBB", "#E7B800","#FC4E07"),caxislabels = seq(round(min(d2[c(1:4, i),]),1),round(max(d2[c(1:4, i),]),1),.2),title =row.names(d2)[i]);legend(x=0.4, y=1.4, legend =c("average EpiSC","average ESC",row.names(d2)[i]) , bty = "n", pch=20 , col=c("#00AFBB", "#E7B800","#FC4E07") , text.col = "black", cex=1, pt.cex=2)
}

radar charts of the steady-state values of 7 internal-marker nodes perturbation

Figure 6.5: radar charts of the steady-state values of 7 internal-marker nodes perturbation

These radar plots show that for all 15 perturbations on FVS control nodes, the steady-state values of the internal-marker nodes are within the expression range of the attractors generated from the ESC experimental data.