library(dplyr);library(stringr)
#sample and estimated
n=1000
sample=c(rpois(n,3),rpois(n,8),rpois(n,15),rpois(n,30))
N=length(sample);
class=4
x=sample
#initial conditions
mu=sample(c(-20:20),class);sig=sample(c(1:50),class)
pi=c(0.2,0.6,0.1,0.1)
times=1000
for(j in 1:times){
s_piN=array(0,dim=c(class,N))
for(i in 1:class){
s_piN[i,]=pi[i]*dnorm(x,mu[i],sig[i])
}
qn=c()
for(i in 1:class){
qn=c(qn,sum(s_piN[i,]/apply(s_piN,2,sum))/N)
}
pi=qn
mu=c()
sig=c()
for(i in 1:class){
mu_val=sum(x*s_piN[i,]/apply(s_piN,2,sum))/(N*pi[i])
mu=c(mu,mu_val)
sig=c(sig,sqrt(sum((s_piN[i,]/apply(s_piN,2,sum))*((x-mu_val)*(x-mu_val)))/(N*pi[i])))
}
}
mu=round(mu)
#random mating
genotype=c("AB","Ab","aB","ab")
s=length(genotype)
g=mu
g=g/sum(g)
mat=t(t(g))%*%t(g)
colnames(mat)=genotype;rownames(mat)=genotype
G=mat
#generations
times=100
t(eigen(G)$vectors)%*%(diag(eigen(G)$values)^times)%*%eigen(G)$vectors
genotype_prob=apply(G,1,sum)
female=male=c("AB","Ab","aB","ab")
r=array("0",dim=c(length(female),length(male)))
for(j in 1:nrow(r)){
for(i in 1:ncol(r)){
r[i,j]=paste0(male[i],female[j])
}
}
genotype=female
type=genotype[1]
C=array(0,dim=dim(r))
colnames(C)=colnames(G);rownames(C)=rownames(G)
for(j in 1:nrow(r)){
for(i in 1:ncol(r)){
word=r[i,j]
lan1=substr(type,1,1);lan2=substr(type,2,2)
if(str_count(word,lan1)>0 & str_count(word,lan2)>0){
value=min(str_count(word,lan1),str_count(word,lan2))
C[i,j]=C[i,j]+value
}
}
}
cor_mat=array(0,dim=c(length(g),length(g)))
for(j in 1:nrow(cor_mat)){
for(i in 1:ncol(cor_mat)){
cor_mat[i,j]=-g[i]*g[j]/(sqrt(g[i]*(1-g[i]))*sqrt(g[j]*(1-g[j])))
}
}
prob=cor_mat/sum(cor_mat)
homo_prob=sum(diag(prob))
f=homo_prob
I=array(0,dim=c(length(g),length(g)))
h=0.01
r=sample(1:100,s);r=sum(mu)*(r/sum(r))
r=round(r);n=sum(mu)
I=array(0,dim=c(s,s))
for(j in 1:nrow(I)){
for(i in 1:ncol(I)){
if(i==j){
#I[i,j]=n*dpois(r[i],lambda=mu[i])*(r[i]/mu[i]-1)^2
I[i,j]=r[i]*(r[i]/mu[i]-1)^2
}
}
}
kai2=sum((r-n*(mu/sum(mu)))^2/(n*mu/sum(mu)))
alpha=0.01
qchisq(1-alpha,s-1)
#生物配列の統計
#p.123
n=50;m=100
times=100
d_vec=c()
for(k in 1:times){
X=array(0,dim=c(n,m))
for(j in 1:n){
val=rpois(4,sample(10:20,1))
X[j,]=sample(1:4,m,replace=T,prob=val/sum(val))
}
for(j in 1:m){
val2=rpois(2,sample(10:20,1))
if(sample(0:1,1,prob=val2/sum(val2))>0){
X[,j]=rep(sample(1:4,1,replace=T),n)
}
}
S=c()
for(j in 1:m){
mat=X;vec=mat[,j];
S=c(S,ifelse(length(unique(vec))==1,0,1))
}
pis=c()
for(j in 1:m){
vec=X[,j]
for(i in 1:length(vec)){
vec_sub=vec;val=vec_sub[i];vec_sub=vec_sub[-i]
vec_sub=vec_sub-val;
pis=c(pis,sum(vec_sub!=0))
}
}
S_hat=sum(S)/sum(1/c(1:(n-1)));pis_hat=sum(pis)/(n*(n-1))
d=pis_hat-S_hat;d_vec=c(d_vec,d)
}
d_vec/sqrt(var(d_vec))
#生物配列の統計
#p.132
library(dplyr)
n=100
u=rpois(n,sample(1:10,1))
u=cumsum(u);u=sort(u)
sita=c()
for(j in 2:n){
sita=c(sita,j*(j-1)*u[j])
}
sita=sita/(n-1)
sita_MLE=sum(sita)
N=1000000
mu=sita_MLE/(2*N)
#全枝の長さ
S_hat=sum(c(2:n)*u[2:n])
#多型性
sita_S=S_hat/sum(1/c(1:(n-1)))
v_vec=cumsum(u)
cost=function(v,s){
val=c()
for(i in 2:n){
func=function(x){
y<-exp(-s[i-1]*x)
return(y)
}
Y=integrate(func,v[i-1],v[i])$value
val=c(val,log(i)+log(i-1)-(s[i-1]*v[i-1])-Y)
}
return(sum(val))
}
ite=10000;h=0.001;eta=10^(-9)
sita=rep(10,n-1)
for(l in 1:ite){
sita_pre=sita
vec=sita
for(j in 1:(n-1)){
vec_sub=vec;vec_sub[j]=vec_sub[j]+h
if((sita[j]+eta*(cost(v_vec,vec_sub)-cost(v_vec,vec))/h)>0){
sita[j]=sita[j]+eta*(cost(v_vec,vec_sub)-cost(v_vec,vec))/h
}
}
print(cost(v_vec,sita))
#print(sum(abs(sita_pre-sita)))
}