Static dictionary methods of text compression
Dictionary methods fall into two categories: static dictionary methods and automatic (dynamic) dictionary methods.
Here I briefly describe the first kind, with a small worked example.
If we know a lot about the structure of a text in advance, a static dictionary method is a good fit. There are many ways to implement one, depending on the situation, but a scheme that codes pairs of letters (double-letter coding, also known as digram coding) is popular with programmers.
To make this concrete, here is a simple example.
Suppose the source alphabet consists of five letters: 'a', 'b', 'c', 'd' and 'r'. Based on what we know about the source, we build the following dictionary:
code    letter
000     a
001     b
010     c
011     d
100     r
101     ab
110     ac
111     ad
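The codes in the dictionary are fixed-length: eight entries fit exactly into 3 bits. As a minimal sketch, such a dictionary could be generated rather than typed by hand. The helper name `build_code_dict` and the rule of numbering entries in listing order are assumptions made for illustration; they are not part of the original routine.

```python
def build_code_dict(entries):
    """Assign each entry a fixed-width binary code, in listing order."""
    # Smallest width that gives every entry a distinct code (at least 1 bit).
    width = max(1, (len(entries) - 1).bit_length())
    return {entry: format(index, f"0{width}b")
            for index, entry in enumerate(entries)}


# Single letters first, then the pairs chosen from prior knowledge of the source.
entries = ['a', 'b', 'c', 'd', 'r', 'ab', 'ac', 'ad']
code_dict = build_code_dict(entries)
for entry, code in code_dict.items():
    print(code, entry)
```

Listing the single letters first guarantees that every letter of the alphabet can still be coded when no pair matches.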
Now let us encode the sequence 'abracadabra'.
The encoder first reads the first two letters, 'ab', and checks whether that pair is in the dictionary. If it is, the encoder outputs the pair's code and moves on to the next two letters; otherwise it outputs the code of the first letter alone and advances by one letter. Here 'ab' is in the dictionary, so the encoder outputs '101'. Next it reads 'ra', which is not a key in the dictionary, so it outputs the code of 'r', which is '100', and advances by one letter. The next pair is 'ac', which is in the dictionary, so the encoder outputs '110'. Then it reads 'ad' and outputs '111'. The rest of the sequence is handled the same way: 'ab' gives '101', 'r' gives '100', and the final 'a' gives '000'.
The output is '101100110111101100000'.
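As a quick check of the walk-through, the short sketch below joins the codes of the tokens in the order they were emitted and compares the size with a plain 3-bit-per-letter encoding. That baseline is an assumption chosen for comparison (it reuses the single-letter codes from the dictionary), and the names `codes` and `tokens` are illustrative only.

```python
codes = {
    'a': '000', 'b': '001', 'c': '010', 'd': '011', 'r': '100',
    'ab': '101', 'ac': '110', 'ad': '111',
}

# Tokens in the order the encoder emits them for 'abracadabra'.
tokens = ['ab', 'r', 'ac', 'ad', 'ab', 'r', 'a']
encoded = ''.join(codes[token] for token in tokens)

# Baseline: every letter coded on its own with a 3-bit code.
plain_bits = 3 * len('abracadabra')

print(encoded)                           # 101100110111101100000
print(len(encoded), 'bits with pairs')   # 21 bits with pairs
print(plain_bits, 'bits without pairs')  # 33 bits without pairs
```

Seven 3-bit codes replace eleven, so in this example the pair entries save 12 bits.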
A Python routine that implements this is as follows.
def getCodeDict():
    # Static dictionary: five single letters plus three letter pairs.
    codeDict = {}
    codeDict['a'] = '000'
    codeDict['b'] = '001'
    codeDict['c'] = '010'
    codeDict['d'] = '011'
    codeDict['r'] = '100'
    codeDict['ab'] = '101'
    codeDict['ac'] = '110'
    codeDict['ad'] = '111'
    return codeDict

def compress(code):
    print('start to compress')
    result = ''
    codeDict = getCodeDict()
    offset = 2
    unCodedCode = code
    while unCodedCode != '':
        # Look at the next two letters (or the single last letter).
        targetCode = unCodedCode[0:2]
        if targetCode in codeDict:
            # Found in the dictionary: emit its code and move two steps.
            result = result + codeDict[targetCode]
            offset = 2
        else:
            # Pair not found: emit the first letter's code and move one step.
            result = result + codeDict[targetCode[0]]
            offset = 1
        unCodedCode = unCodedCode[offset:]
    print('complete to compress')
    return result

if __name__ == '__main__':
    signals = 'abracadabra'
    result = compress(signals)
    print(result)
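The routine above covers only the encoder. Because every code in the dictionary is exactly 3 bits long, decoding can be sketched as cutting the bit string into 3-bit chunks and looking each one up in the inverted dictionary. The names `decompress`, `DECODE_DICT` and `CODE_WIDTH` are hypothetical and do not appear in the original routine.

```python
CODE_WIDTH = 3  # every code in the static dictionary is 3 bits long

# Inverse of the encoder's dictionary: code -> letter or letter pair.
DECODE_DICT = {
    '000': 'a', '001': 'b', '010': 'c', '011': 'd', '100': 'r',
    '101': 'ab', '110': 'ac', '111': 'ad',
}


def decompress(bits):
    """Decode a bit string produced with the static dictionary."""
    if len(bits) % CODE_WIDTH != 0:
        raise ValueError('input length must be a multiple of the code width')
    pieces = []
    for start in range(0, len(bits), CODE_WIDTH):
        chunk = bits[start:start + CODE_WIDTH]
        pieces.append(DECODE_DICT[chunk])  # KeyError on an unknown code
    return ''.join(pieces)


if __name__ == '__main__':
    print(decompress('101100110111101100000'))  # abracadabra
```

No delimiter between codes is needed: the fixed width tells the decoder where each code ends, and the dictionary entry tells it whether to emit one letter or two.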