1
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

BWT for Sorted Rank Coding

1
Posted at

Move to front(MTF)以上に記号を小さい値に偏らせるよう考案されたのがSorted Rank Coding(SRC)です(知名度は低い)。

入力文字列
That that is is that that is not is not is that it it is
BWT+MTF(情報量13.4282)
116,0,0,0,116,1,0,0,1,0,1,0,1,0,0,106,0,0,0,0,2,0,0,87,1,36,0,0,0,0,0,0,0,0,0,112,0,108,0,0,0,0,0,102,0,113,0,2,0,2,0,0,4,0,0,0
BWT+SRC(情報量11.6964)
4,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,1,0,0,1,0,2,0,0,1,6,0,0,0,0,0,2,0,1,1,0,1,0,0,7,0,2,0,0,2,0,0,0,0,5,0,8,0,3

上記のように情報量がMTFよりほのかに小さくなる傾向があります。ただし、SRCでは記号頻度情報を保持しておく必要があります(復号で必須)。そのため入力文字列はかなり大きめでないと真価を発揮できません。

実装編

JavaScriptで書き散らかしております。

function SRCp(C,A){
	for(var a=-1,b,c=0,h=4,t;a<255;)if(C[++a])A[c++]=a;
	for(h=4;h<c;)h=h*3+1;
	do for(a=h=h/3|0;a<c;A[b+h]=t)
		for(b=a-h,t=A[a++];0<=b&&(C[A[b]]<C[t]||C[t]===C[A[b]]&&t<A[b]);b-=h)
			A[b+h]=A[b];
	while(h^1);
	return c
}
/*符号化
SRCe(T,RCV,C)
@T: 変換元配列{n|0~255}
@RCV: 変換先配列{n|0~255}
@C: 空配列.度数を格納する
@return: 変換された配列{n|0~255}
*/
function SRCe(T,RCV,C){
	if(!RCV)RCV=[];
	for(var a=0,b=0,c,i=256,j,n=0,p=0,z=T.length,A=[],P=[],R=[],S=[];i;)C[--i]=0;
	for(;n<z;C[c]+=n-j){
		for(c=T[j=n];T[++n]===c;);
		if(!C[c])S[R[b]=c]=b++
	}
	for(b=SRCp(C,A);a<b;p+=C[c])P[c=A[a++]]=p;
	for(;i<n;P[c]=++p){
		if(a=RCV[p=P[c=T[i]]]=S[c]){
			while((S[R[a]=R[a-1]]=a--)>1);
			S[R[0]=c]=0
		}
		while(T[++i]===c)RCV[++p]=0
	}return RCV
}
/*復号
@RCV: 変換元配列{n|0~255}
@T: 変換先配列{n|0~255}
@C: 度数配列{n|0~0xffffffff}
返値: 変換された配列{n|0~255}
*/
function SRCd(RCV,T,C){
	if(!T)T=[];
	for(var a=0,A=[],b=SRCp(C,A),c,i=0,n=RCV.length,s,P=[],R=[],Q=[];a<b;Q[c]=i+=C[c])
		P[c=R[RCV[i]]=A[a++]]=i+1;
	for(c=R[i=0];i<n;)
		if(P[T[i++]=c]<Q[c]){
			if(a=RCV[P[c]++]){
				for(s=0;R[s]=R[++s],s<a;);
				R[a]=c,c=R[0]}
		}else if(--b){
			for(s=0;R[s]=R[++s],s<b;);
			c=R[0]}
	return T
}
test
function BWT(T,U,n){
	for(var a=n=n||T.length,i,S=U||new Uint8Array(n),I=new Uint32Array(n);a;)I[--a]=a;
	I.sort(function(a,b,c,d){for(c=n;c--;)if(d=T[a++%n]-T[b++%n])return d});
	for(;a<n;S[a++]=T[--i])(i=I[a])||(S.i=a,i=n);
	return S
}
let txt=Array.from("That that is is that that is not is not is that it it is",a=>a.charCodeAt()),
	bwt=BWT(txt), enc=[], dec=[], count=[];

SRCe(bwt,enc,count);
SRCd(enc,dec,count);
document.write("enc ",enc,"<hr>dec ",dec,"<hr>bwt ",bwt)

などといった具合です。ちなみにRank Coding(Sorted無し)という実装もあります

1
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
1
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?