Genomic Data Processing 48: ART Usage Examples
1. Example 1:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -f 20 -o G38L100F20Nhs20
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang @gmail.com>
-------------------------------------------
Single-end Simulation
Total CPU time used: 1162.71
The random seed for the run: 1464879720
Parameters used during run
Read Length: 100
Genome masking 'N' cutoff frequency: 1 in 100
Fold Coverage: 20X
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2000 Length 100 R1 (built-in profile)
Output files
FASTQ Sequence File:
G38L100F20Nhs20.fq
ALN Alignment File:
G38L100F20Nhs20.aln
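The invocation above can also be assembled programmatically, which makes it easier to rerun a simulation with the seed that ART reported. The following is a minimal sketch: the helper name `art_single_end_cmd` is hypothetical, and it only builds the argument list from flags that appear in this post (`-ss`, `-i`, `-l`, `-f`, `-o`, and `-rs`, which shows up in the `@CM` line of the `.aln` header). It does not run ART.

```python
def art_single_end_cmd(ref, out_prefix, read_len, fold, profile="HS20", seed=None):
    """Build an art_illumina argument list for a single-end run (hypothetical helper)."""
    cmd = ["art_illumina", "-ss", profile, "-i", ref,
           "-l", str(read_len), "-f", str(fold), "-o", out_prefix]
    if seed is not None:
        # -rs sets the random seed, so the same seed can be passed to a later run
        cmd += ["-rs", str(seed)]
    return cmd

cmd = art_single_end_cmd("GRCH38chr1L3556522.fna", "G38L100F20Nhs20", 100, 20,
                         seed=1464879720)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a machine where `art_illumina` is installed.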
2. Example 2:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS25 -sam -i GRCH38chr1L3556522.fna -p -l 150 -f 20 -m 200 -s 10 -o paired_dat
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang @gmail.com>
-------------------------------------------
Paired-end sequencing simulation
Total CPU time used: 1070.33
The random seed for the run: 1464880583
Parameters used during run
Read Length: 150
Genome masking 'N' cutoff frequency: 1 in 150
Fold Coverage: 20X
Mean Fragment Length: 200
Standard Deviation: 10
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2500 Length 150 R1 (built-in profile)
First Read: HiSeq 2500 Length 150 R2 (built-in profile)
Output files
FASTQ Sequence Files:
the 1st reads: paired_dat1.fq
the 2nd reads: paired_dat2.fq
ALN Alignment Files:
the 1st reads: paired_dat1.aln
the 2nd reads: paired_dat2.aln
SAM Alignment File:
paired_dat.sam
List the generated files:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ll -h
total 50G
drwxrwxr-x 2 hadoop hadoop 4.0K 6 2 23:16 ./
drwxrwxr-x 6 hadoop hadoop 4.0K 6 2 22:59 ../
-rw-rw-r-- 1 hadoop hadoop 11G 6 2 23:29 G38L100F20Nhs20.aln
-rw-rw-r-- 1 hadoop hadoop 9.4G 6 2 23:29 G38L100F20Nhs20.fq
-rw-r--r-- 1 hadoop hadoop 241M 6 2 23:00 GRCH38chr1L3556522.fna
-rw-rw-r-- 1 hadoop hadoop 2.5K 6 2 23:09 GRCH38chr1L3556522.fna.amb
-rw-rw-r-- 1 hadoop hadoop 144 6 2 23:09 GRCH38chr1L3556522.fna.ann
-rw-rw-r-- 1 hadoop hadoop 238M 6 2 23:09 GRCH38chr1L3556522.fna.bwt
-rw-rw-r-- 1 hadoop hadoop 60M 6 2 23:09 GRCH38chr1L3556522.fna.pac
-rw-rw-r-- 1 hadoop hadoop 119M 6 2 23:10 GRCH38chr1L3556522.fna.sa
-rw-rw-r-- 1 hadoop hadoop 4.9G 6 2 23:42 paired_dat1.aln
-rw-rw-r-- 1 hadoop hadoop 4.6G 6 2 23:42 paired_dat1.fq
-rw-rw-r-- 1 hadoop hadoop 4.8G 6 2 23:42 paired_dat2.aln
-rw-rw-r-- 1 hadoop hadoop 4.6G 6 2 23:42 paired_dat2.fq
-rw-rw-r-- 1 hadoop hadoop 11G 6 2 23:42 paired_dat.sam
The generated files are too large.
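The sizes follow from the coverage setting. As a rough sketch, assuming that the read count is about coverage × reference length / read length and that a FASTQ file needs about two bytes per sequenced base (one for the base, one for its quality), the numbers can be estimated from the chr1 length 248956422 given in the SAM `@SQ` header later in this post. The function names are illustrative, and read headers and `N` masking are ignored.

```python
CHR1_LEN = 248956422  # LN value from the @SQ header shown later in this post

def expected_reads(ref_len, read_len, fold):
    """Approximate read count for a given fold coverage (ignores N-masked regions)."""
    return fold * ref_len // read_len

def approx_fastq_gib(ref_len, fold):
    """Rough FASTQ size: one base byte plus one quality byte per sequenced base."""
    return 2 * fold * ref_len / 2**30

print(expected_reads(CHR1_LEN, 100, 20))          # single-end reads, Example 1
print(expected_reads(CHR1_LEN, 150, 20) // 2)     # read pairs, Example 2
print(round(approx_fastq_gib(CHR1_LEN, 20), 1))   # GiB summed over the FASTQ files
```

The estimate of roughly 9.3 GiB is close to the 9.4G single-end file and to the two 4.6G paired-end files combined. Read headers add to that figure, while `N` regions that produce no reads subtract from it.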
3. Set the number of reads per sequence with -c (this produces much less data):
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 50 -o G38L100c50Nhs20
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang @gmail.com>
-------------------------------------------
Single-end Simulation
Total CPU time used: 15.96
The random seed for the run: 1464918709
Parameters used during run
Read Length: 100
Genome masking 'N' cutoff frequency: 1 in 100
Fold Coverage: 0X
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2000 Length 100 R1 (built-in profile)
Output files
FASTQ Sequence File:
G38L100c50Nhs20.fq
ALN Alignment File:
G38L100c50Nhs20.aln
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ls
G38L100c50Nhs20.aln G38L100F20Nhs20.aln GRCH38chr1L3556522.fna GRCH38chr1L3556522.fna.ann GRCH38chr1L3556522.fna.pac paired_dat1.aln paired_dat2.aln paired_dat.sam
G38L100c50Nhs20.fq G38L100F20Nhs20.fq GRCH38chr1L3556522.fna.amb GRCH38chr1L3556522.fna.bwt GRCH38chr1L3556522.fna.sa paired_dat1.fq paired_dat2.fq
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ll
total 51506772
drwxrwxr-x 2 hadoop hadoop 4096 6 3 09:51 ./
drwxrwxr-x 6 hadoop hadoop 4096 6 2 22:59 ../
-rw-rw-r-- 1 hadoop hadoop 11400 6 3 09:52 G38L100c50Nhs20.aln
-rw-rw-r-- 1 hadoop hadoop 10428 6 3 09:52 G38L100c50Nhs20.fq
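The 10428-byte FASTQ file can be checked by hand. The `.aln` listing in the appendix contains no `chr1-49` record, so the file appears to hold 49 reads rather than 50. A short sketch, assuming 4-line records with a bare `+` separator line and Unix newlines, reproduces the size:

```python
def fastq_record_bytes(read_id, read_len):
    """Size of one FASTQ record: '@id', sequence, '+', qualities, each ending in a newline."""
    return len("@" + read_id + "\n") + (read_len + 1) + len("+\n") + (read_len + 1)

# Read IDs chr1-1 .. chr1-50, minus chr1-49, which is absent from the .aln listing
read_ids = [f"chr1-{i}" for i in range(1, 51) if i != 49]
total = sum(fastq_record_bytes(rid, 100) for rid in read_ids)
print(len(read_ids), total)  # 49 10428
```

The same arithmetic works as a quick sanity check on any simulated FASTQ file whose read IDs and read length are known.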
4. Generate a single read (-c 1):
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 1 -o G38L100c1Nhs20
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang @gmail.com>
-------------------------------------------
Single-end Simulation
Total CPU time used: 15.82
The random seed for the run: 1464918910
Parameters used during run
Read Length: 100
Genome masking 'N' cutoff frequency: 1 in 100
Fold Coverage: 0X
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2000 Length 100 R1 (built-in profile)
Output files
FASTQ Sequence File:
G38L100c1Nhs20.fq
ALN Alignment File:
G38L100c1Nhs20.aln
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.
cat: G38L100c1Nhs20.: No such file or directory
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.fq
@chr1-1
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
+
@C@D@FFDFHHHHIJ.JBIJJGJGIJ:G47JHJ@IJJ91BJJIGHHHEIJDGD=IJJJBJJ'DG=3D)
>chr1 chr1-1 225496693 +
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGAAAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
5. Verify with bwa:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.sam
@SQ SN:chr1 LN:248956422
@PG ID:bwa PN:bwa VN:0.7.13-r1126 CL:bwa samse GRCH38chr1L3556522.fna G38L100c1Nhs20.sai G38L100c1Nhs20.fq
chr1-1 0 chr1 225496694 37 100M * 0 0 CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT @C@D@FFDFHHHHIJ.JBIJJGJGIJ:G47JHJ@IJJ91BJJIGHHHEIJDGD=IJJJBJJ'DG=3D)
>chr1 chr1-1 225496693 +
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGAAAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
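This shows that ART reports 0-based start positions (225496693 in the .aln record), consistent with ADAM, whereas bwa reports 1-based positions (225496694 in the SAM record). How can the accuracy of bwa and other alignment algorithms be evaluated automatically?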
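One way to automate the check is to convert each ART `.aln` start coordinate to the 1-based SAM `POS` an aligner should report, and then compare the two. The sketch below assumes, based on the records in this post, that `+` strand coordinates are 0-based offsets on the forward strand and that `-` strand coordinates are counted from the start of the reverse-complemented reference. The function names are hypothetical.

```python
def expected_sam_pos(aln_start, strand, read_len, ref_len):
    """Convert an ART .aln start coordinate to the expected 1-based SAM POS."""
    if strand == "+":
        return aln_start + 1
    # '-' strand: the coordinate is measured on the reverse-complemented reference
    return ref_len - aln_start - read_len + 1

def accuracy(truth, reported, read_len, ref_len):
    """Fraction of reads whose reported POS equals the simulated position.

    truth:    {read_id: (aln_start, strand)} taken from the .aln file
    reported: {read_id: pos} taken from the aligner's SAM output
    """
    hits = sum(
        1 for rid, (start, strand) in truth.items()
        if reported.get(rid) == expected_sam_pos(start, strand, read_len, ref_len)
    )
    return hits / len(truth)

CHR1_LEN = 248956422
print(expected_sam_pos(225496693, "+", 100, CHR1_LEN))  # 225496694, as bwa reports for chr1-1
print(expected_sam_pos(211481565, "-", 100, CHR1_LEN))  # 37474758, as bwa reports for chr1-46
```

Under this exact-position criterion, reads such as chr1-34 and chr1-25 in the appendix, which bwa places elsewhere with mapping quality 0, would count as misses.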
6. Verify with SNAP:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.snap.sam
@HD VN:1.4 SO:unsorted
@RG ID:FASTQ PL:Illumina PU:pu LB:lb SM:sm
@PG ID:SNAP PN:SNAP CL:single index G38L100c1Nhs20.fq -o G38L100c1Nhs20.snap.sam VN:1.0beta.23
@SQ SN:chr1__AC:CM000663.2__gi:568336023__LN:248956422__rl:Chromosome__M5:6aef897c3d6ff0c78aff06ac189178dd__AS:GRCh38 LN:248956422
chr1-1 0 chr1__AC:CM000663.2__gi:568336023__LN:248956422__rl:Chromosome__M5:6aef897c3d6ff0c78aff06ac189178dd__AS:GRCh38 225496694 70 100M * 0 0 CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT @C@D@FFDFHHHHIJ.JBIJJGJGIJ:G47JHJ@IJJ91BJJIGHHHEIJDGD=IJJJBJJ'DG=3D)
>chr1 chr1-1 225496693 +
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGAAAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
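Comparing the two aligners programmatically needs one extra step, because SNAP writes what looks like the full FASTA header (joined with `__`) as the reference name, while bwa writes only `chr1`. A minimal sketch, assuming tab-separated SAM records and a hypothetical helper `parse_sam_line`:

```python
def parse_sam_line(line):
    """Extract the core alignment fields from one tab-separated SAM record."""
    f = line.rstrip("\n").split("\t")
    return {
        "qname": f[0],
        "flag": int(f[1]),
        "rname": f[2].split("__")[0],  # keep only the part before the first '__'
        "pos": int(f[3]),              # 1-based leftmost position
        "mapq": int(f[4]),
        "cigar": f[5],
    }

# Shortened records modelled on the bwa and SNAP output above (sequence fields abbreviated)
bwa = parse_sam_line("chr1-1\t0\tchr1\t225496694\t37\t100M\t*\t0\t0\tCATA\tIIII")
snap = parse_sam_line(
    "chr1-1\t0\tchr1__AC:CM000663.2__gi:568336023\t225496694\t70\t100M\t*\t0\t0\tCATA\tIIII"
)
same_place = (bwa["rname"], bwa["pos"]) == (snap["rname"], snap["pos"])
print(same_place)  # True
```

Both aligners agree on the position. Only the mapping quality differs (37 from bwa, 70 from SNAP), which likely reflects the different MAPQ scales of the two tools.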
Appendix (1): bwa comparison for the 50-read dataset:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c50Nhs20.sam
@SQ SN:chr1 LN:248956422
@PG ID:bwa PN:bwa VN:0.7.13-r1126 CL:bwa samse GRCH38chr1L3556522.fna G38L100c50Nhs20.sai G38L100c50Nhs20.fq
chr1-50 0 chr1 93465785 37 100M * 0 0 TTCCACAATAGTTGAACTAATTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTTGTTTCCTGACTTTTTAA @@CDFDFDHFHGHIJH:IJJJ(JJE?JDIDEJIB@FGJIGBHJ()HG8(CIICGFFHEH=GI3@&@DD58FADDACHDDHFCD8D,DCCXT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-48 0 chr1 228133746 37 100M * 0 0 ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGACACATTATTATTAGGCACTCTCACCAGATCTTTACCCATGGCCATTTAAAGTGTGG @>CFFFFFH<@D(:EFDDC@;DDAC95(D?BD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:44G55
chr1-47 0 chr1 13772988 37 100M * 0 0 TTCAGTAATTCAGAATAACACATGAGGGAATGAATGAATGAATAAATAAAAAAAAACTGAATGAATAAATTACAAAAAATTGTGTTTCAGGGAAGAAAAA CC@F(FFFDFH.HDHIGI(JIIIGGIEEJIIIHJJHHH3IJJIIJ3=EI>JDIGH((IBJCIEHGD>;J@HF+DC)CCCADBDBD+BDDDD5B5DDDE(C XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:56A15A27
chr1-46 16 chr1 37474758 37 100M * 0 0 GGGTCGGGGTCCTGTTCCCCGGTCCGCCGGGCCTCAGGACCCCTCCAACTTTGCCCAAGTTGGGAGAGCCGGGGAAGAGCACCAGGTTCCTGATCGGGAT (5CBACDDD>FBDDDDDEC:CE(CBDFDDHEFH;FGEFHGHDGJJJDIGI:JEHJ=JJJJJH8CI?JJJG9JIII>IJIIGJ=EIJGAHHHHFFDFDCC? XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:22C77
chr1-45 0 chr1 29056657 37 100M * 0 0 CTGGGATTACAGGTGCCCGCCACCATGCCCAGCTAATTTTTGTATTTTTGGTAGAGACAAGGTTTCACCATGTTGGCCGGGATTGTCTCGAACTCCTGAT B@@FFFFFHHG)HIJJJJBJIJCJHGJIBFJJI3IIHDF@JIAJ9JJJIJJBIJJ?BJID8F:HFHA(+D>J>CG>7D=DDFF@EDC3D@BD@B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-44 0 chr1 49993893 37 100M * 0 0 CAATTTAGCCAAAACTGGCTAATCGTTTTACCAGAATCATTCCCATTGTTCAAGACCTATTTTAAGCTCCACTATCACCATAAAACTTTCCCGATCAGTT C@CFFFFDHHHHH@JJJHI)IDIBIJA:HJHFJJJIJGGJJIIIIHGGJJGHFDDED:CD:>DD5C&DDCD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:24C75
chr1-43 16 chr1 194714506 37 100M * 0 0 AATATGTTTTAATAATATCATATTTAAATTTGATGATACTTTAAAAATGGTTCCATGTGTGTTCTCTTGGGTTATTTCACAATCAATAAAAGGTCTGCAA CCCCDDC@E>CDCDDC>D9CD=C)CGC>E@7.HF)DIBJBJJ.JEJEJ@JJIIIIGD?@B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-42 0 chr1 35706203 37 100M * 0 0 CAGGTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGCACGTGCCACCATGCCTGGCAATTTTTGTATTTTTAGTACAGATGGGG CC@FDFFAHHGFHJIHJFJJII=@JEHIJIIJIJEJIJJHHGIJBBFJG6JJHJJGXT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-41 16 chr1 156482338 37 100M * 0 0 GTGTGTGCATAGGCAGGTCTGCGTGTACATGCAACGTGGGCACGTGTCCATGTGGATGCAGGCGGGGGTATATCCTGGTGCCTGTGTGTATGGGCCCACC D;CCDDCDCDDDCD:EDA@@CJJI,=FHJJIGJ7GEC?IGJJIFBBICHJEIJJHHAIJIJI.IJGJJGJJHHGHHFFFFFB=B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-40 16 chr1 221779284 37 100M * 0 0 CATGGCACATAGCACTTTGGTGATGGGGACTGCTTTGCTAATGTCAGGGTCAAGGGGTGCATGGACCATGGGCAGAGTGCTGGGCTCAGCCAAATGGTTC DDBCDCDDDDDDD25F?DD@4I5HED?CAHGA?JJIIJB)IHFJJFCJII?@XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:39G60
chr1-39 16 chr1 3895605 37 100M * 0 0 GTCCTCTCCGGATTGACAGGAGTCAAAACATGAGATCGGCTTAGCTTCAGTTTCGTCATGGATTAACCACCTCCAAGGTGTCAACTCCAAAATGTCAAGA DD5CCAD&8DAD>D&FDDDCDBDD?6DD.FHDDIFE?@IDEGIBCGD?JFJ>JGBI,IJIF.JJIHJJJEIEGFJ=JJHJHHJFHIJHHHHHFFAFFC@B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-38 16 chr1 33174926 37 100M * 0 0 CACACATACATATATGTGTGTATATATATATATATATATATATACACACATATACATATATATGCACACACACATGTATGTATATGTATATGTATATGTG CDDC(FBDC(AACBDDCBDDECEC5@H;HFDJFH>=FCHAHJFJ'H3JG9JFEHIJFDJJ9IJHEJIGJIJJJJJC;J?AJFJGEHFHDC>D?DDBDFB)DDDDDC5(9>F;G)FB84/AJE3JJIJIGIGJBBIGCJCJGJGHJIDJ>IB7IGJGEGCIIGFJJJEFHIIJHHF=HFF8=F??= XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:14C85
chr1-34 16 chr1 12934213 0 100M * 0 0 TTTTGATACTTTTGATGTGGCCAAAGGTTCTCCAATAAAGATACCATATATAAATATATGTATTTCTAATGTCTGAAACAGATTAAAACCTTCCCTGTAT D@CB?DEDCEDDD(DC>F>DEHE>[email protected]' 5I8IBFJHDI=CIIJ8JFHIBJJI0IJGFFJGIIJJABHXT:A:R NM:i:0 X0:i:2 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100 XA:Z:chr1,+13267477,100M,0;
chr1-33 0 chr1 48968233 37 100M * 0 0 TAATAGTAGGCAATAAACAAAGAGAGCAACTTAGGAGCCAGATCACATGTGGCCGCTCGAGCAATATGGTAAAAGTTCTGGACTTCATTCTAGGTGAATG 1CCB=FFFHHHHHEDHJIIAFG4JIFJIJB)JJI?(&JJIJJEE)HIJJBJ?HJ(=B(I@?I?8DC8C>JHJH>@EDFDD5DDDDDDDDCFD:=DCC(DD XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:0G53T45
chr1-32 0 chr1 88980623 37 100M * 0 0 TAGTTCAGTAAACTATTTATCAAACAGGTGTCAGGTCATTTTAACATACTCCTTGCTTTGAACAATATTCATTCATACTTGGTACAAACTCTATATCCTA B?CFDFFFHHFH3JIJJJIGJJJJJFJDEJGJ(EHFI>E=JIJ(GGJDFCH>>GJ=IHDJEHHDI>GEBJE@DD@HH'AA@ECC@BDEDDD@CDDADBDD XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:32T44T22
chr1-31 16 chr1 227005594 37 100M * 0 0 TCACCAGGCATCTTTACTGACTCACACCAATAGTAGTACTGGGATTAGAAATAAGACGCTGCAATACTCACAACCTAGGTGAAGTTAGTTAATTTGGGAA D@D5B=DACDDDDDBEFECBFDC5BCDDDCDFIDC8ICEIJ=DHIGHIJIJJB0HJJCDJHJGJIJI9GGHGGJ3@IJJAIJGGBGJ7HFHHDEBFF@CC XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-30 0 chr1 9852129 37 100M * 0 0 TGTGAAATGGAGTCAGCAGAGTGAGCCGGCCTCCACTCAGTGAGCCGGGTCTCCCCCACAGCCGGCATGTGCTGACCTCCTTCCAACTGCTCTACCAAGA CBCDDFFDHGHHHIEJ+JD+?ECDDDDB XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-29 16 chr1 156397431 37 100M * 0 0 TCAGCCTCCCGAGTAGCTGGGATTACAGGAACCTGCCACCACGCCCGGCTAATTTTTGTATTTTCAGTTGAGACGGGGTTTCACCATGTTGCCCAGGCTG D1D(@9DDDC@D0C3=CDDJ;FDHDD@H2BDHIDAGDDDCDIFJ9GIFGIG@?)JJHJGFGJIB7JG>' IJIJJGJ+JJGIIHFIJIDHHHFFFFFFC@B XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:68A31
chr1-28 16 chr1 56986638 37 100M * 0 0 ACTCAGAACAGGTCTCCTTGTGGAACCATGGCCTTCCTTTTGGATCCTGGCCATGAGAGCCCATTCTTAGGAACCATGTTTCAATTCCAGTAGGTGATGT DD)DC@CJ@A)GIFJJJHHFFHFDDFFCCC XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-27 16 chr1 172015198 37 100M * 0 0 AGGTGTCAGTCCTCCAGCTTTGTTCTTCTTTTATATTGTGTTGGCTATCCTGGGCTCTTTGCTTCTCCATACAAAACTTAGAATCAGTTTGTTGATATCC B8BD>/D@CCEBBEBCH,F?CCD.E;HGJBJ)IGD7HED5@6JJJCHIGHJIJFDJCIJJHGJIJJJIEF:FEJHBJ.JJJIGHBHCF2DDFFCC@ XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-26 0 chr1 233336763 37 100M * 0 0 AGATATACAGCAAAGTTTGAAAGCTACAGTTCTGAGGACCATATTTATGGATTCCTTCTTATATGTTATCTGGGTTGATATAGAAATTCTTCCATGGCTA CBCFDFF;HHDDHHB?C9DE?DCE@D?B&5E>DDD7DD?D XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:41G58
chr1-25 0 chr1 105787069 0 100M * 0 0 GCCATTCTAACTGGTGTGAGATGGTATCTCATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTGGCTG CCCDFFFFBGHHHHHCIGFJ:JAIGIJIJG)HCIIJGIHHJJJGEDHIHJHIII3J>JHJ?GDD?:;EFE(EDIJD?DDEAHCEDCDD?CDCF6D=>DDD XT:A:R NM:i:0 X0:i:52 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-24 0 chr1 235841969 37 100M * 0 0 GTTGGCTACTAGCTTAGCAGAGGTGCAAAACCATGAATTTCTGGTGGTATGGATTTTTTCAGCTATTTCAGATTCACCAGCAGGATCCAGCTGCTTGGGT CCCFF?FFFHHHFGI,JEJIIG@IIBDJJIFDIEAIJB;JGADHJD,CBD@DEC;?DDHD@DCDEDDDAD? XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:25G60T13
chr1-23 0 chr1 96545358 37 100M * 0 0 AGTGAAAAAGGCTGGCTGCCCTTCAATATCATCTTCAAATGTTAACAACACTGAATATTAATAAATTTCCTTTAGCGAATAATGAATCCAGCCTTCCTTA C@CF+FFFGGDHGJIBDJI2JGJIHHJJII?GJJJJGIJIJJGFJJG)IJ0HD0JIFJDJDFC;D7JGFFCEDFHADCDCCDEDDEAHDDD+9?<CA2:D XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-22 0 chr1 80270679 37 100M * 0 0 TTGTACACCCTATTTCTGACCAGAAGAAGGAGCATTTTGCTTTTTGCCAAATGAGAAGTGCATTCTGGAAACACTTGATGCCTGCACCACACCTCGAGTT ?@CFDDFFHFHHHJJJGC7J(GI8IJJJE?HHI>BJG*IJFJIDJHD0IEJIHDI>@H=EHGIAHJ33(EJCDEDA?FDG@DDBD XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-21 0 chr1 35923261 37 100M * 0 0 CTAAGCAGCAGTGTTTTTGGATACTTTTTTTTTCTGTTTGTGAATAAGGCCAGCACTCAAGATGGGCAGCCAAGGGTGCACTGACTATTAGCTGGCCCAT =@@DFDFEHGHHH8JIJGJH1JJHHJIHJGH?IIFEJIIG87JI=IAJJJBJIJD(IIFI8JIHF=JDHEJHEHDDCEDCDEACDDCCADXT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-20 16 chr1 112489190 37 100M * 0 0 AGGGAATGAACTATGCACATCTATATAGTAACAGGGACAGATTTTTTTTTAACATGAGAGTGTAAAAAAAAGAAAAAGAAAAAAAAAGGCCAGGCACAGT DACDABD@DDDDDA7DDDC8GHI@EI(DC?FG'+8.FBDJIHIEGG=IIG=I@*DFIJJIBIIJIJIIHJCHBGFJJJI@F>HJIIIHHAHFAFDFFC@1 XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:33A66
chr1-19 0 chr1 160371244 37 100M * 0 0 AGGCCCTGGGCACAGGCAGAGAGCCCACCGGCTGGTCATGAGGGCCTCTTCCTTTCTCTGACCCAGGCACCTCGAGGGCTCTTCTCCTGGGTTCCTTCCG @@:FDDFFCHHAHI:GEJFJGF@JJJFIC9JIIJJJ?IIEFHGJ'G?BFFBIIDIG,J)AJIHEGFBHCI&ECCD@EDD?)DED(D>3C?ABEEDDD4BD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:80A19
chr1-18 16 chr1 179855835 37 100M * 0 0 AGCAATTAAAATAAATTAGGGTATCTTTAAAAGTTGTAAAATTATAGCAGTGAAGTACTGTTGACCAGGCACAGTGGCTCACACCTGTAATACCAGCACT DCEDBBDD/DD9DDD@DDFB(DDDHCHDF;C?;FJGC/IJ8DHEJ:DFGGIGHBIGIJDI(JDHGJJGJIHJII@HJJJ3JIJDIJBBHHFHFDFFF@C@ XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:20T39C39
chr1-17 16 chr1 207455995 37 100M * 0 0 GGTTCTTATGATTGGAAAGGTTAAAGAGTGACCTATAGGTCACTTTCCAATTATGAAAACAAAAAATTAAGAAATATATATATTTTCATTATTTCACTCC CBDDDCDD:&DDCFCFDDHDEJEDCFDJ;;EHGCD;CG?DIHGGCIJJJJ-GIJ7GIFHHHCGI)JJJJIJEGJIGJJJIH@FDCC1 XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-16 0 chr1 114154603 37 100M * 0 0 TGTATCTTTCTGCTAAGCATAACAAGAAAGACAGAAAGCTCAACGGGAGGATTGAGGCTAGACTTAAAGTAGAGATCCCCTCAGAAACTGTGGAGTGAGG CCCF8FFDHHHH4JIJIGIJIIFJHJJ?JEDI9BG?I>GHJ7FJJJIF67EIIHD2C>?>DDHDE8E7@JEJ(IFDDC;EDCC:FD>@DBC>D5D>=XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-15 16 chr1 169767580 37 100M * 0 0 GGTGGGGGAGAGGAAAGGAAACGAGGGAGGAAAGGCCCTAATAGGGAGGATTTTGGAGTTTAGATTTTAAAATGATAAAGGTTGTTTGACACTCTAGGCA DEDD9DDD@DD4DDDAEDDDC@D7=D;DA)7;IIJFD(J?JJDGI(IDGD7D'3JIE;H?AC@EHJJE?JJHDFJIIIECG)GGJJECHFHHFDFDFC@C XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:45A35A18
chr1-14 0 chr1 117644126 37 100M * 0 0 GCATTTCATTGTGGACTAATTTTCCCCCACTATTGAGGGAAGACCCTTTTGAGTACTCTATCTGATGCCCCATGAATGATAAAGTTTTATACTCTGGCTG C?CFBDFFGHHHJI5CJ=CD9DC-HGIJDCJHHDBEDDCC&DDDBD39DBCDDDDDCD XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-13 0 chr1 104996994 37 100M * 0 0 TTCTTGCTGGAACACATGTTTTCACCTTTACCTTCACCCACAGCCCAATGTGCATCAATATGGAGATAATGCAGTTCCATTTATACCTCTTTGTGGTTCA =@?FFFDDHBHHHJHJIJ)JIJHBHIJ*G:CJCJJJI?G)>GJI;JD3FJ8FJFGD;DDDDFBED7C7&A(ABC9CD+DCC&DDCA XT:A:U NM:i:3 X0:i:1 X1:i:0 XM:i:3 XO:i:0 XG:i:0 MD:Z:18G63G12A4
chr1-12 16 chr1 108617705 37 100M * 0 0 AGGTCGGGGAGATTGGGAAGAAGAATGAGCAAAGAAACCACCAGTGTGATCAGAGGAGGAAAGCAAAGCAGAGTCCTGTCCTGAAAACCAAATGAAGAAA :=>+D(DCEC=@GHB(CDDDDHABDD+HBJJ9F?A35DDIE?JJHIHJJIEE?JFJ?7JBGJJI>JJGJBJIIBIJJJIIIIJGJGJHHDFHF3FFFCB@ XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-11 16 chr1 72085324 37 100M * 0 0 TACTAGCCTTGAAAATGTTTAAAATAATATTCCAGAGTTAATATTGTTGTCCCTGGTATGTTAAAGAGTATTTGTTATCATAGCCAATTCTTGAGTCTGC 8@DDCD4D>D?C3DF(DCCHDDDA;HDEIBFCHGHHHFFIFEG1JHIJIJCGEJIHJG)IH(IJ)BDJ??FHHJHCJJIFJHJJJGIGH)2HFFFA=HFG1JJFIIJJDJJIJDIGICHEFD@D3:0A/(BECDDDDCBE>BD8DDDDDC8C XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-8 0 chr1 79478960 37 100M * 0 0 GGCAACACTTGAGAACACAAAGTGAGTTCTCACTTTGGGCGGTGGTTTCAGGCTTCAGGGTGGAGTTTTGTCAGGAACCCAACCTTTTCTGCCTAGAATT @CCFDFDADHHAHFIH@ICJJHI5?JIJ)GCFEIJHG=II)HIGI9JJIJGHEJHFI8EIDG)GCI4FJF?I8HCDH;DD0&3CFDDDD@C4DCD6ADD> XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:81C18
chr1-7 0 chr1 178190761 37 100M * 0 0 GTAGCCGGAATAAACAGTCACTGTGAGTTGTCCATTTTAGAGCATAGGTTTTCAGGTGGTGAAGACCTGTCCTTAGTTGAATTTGTATGTGAATTAAACT B?D :DAFG7)&DJID9J)FCD/HHJEDFIJ@D@DDADF?C@A@ADCDD@CDDD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:52T47
chr1-6 0 chr1 42572411 37 100M * 0 0 AACCCTTTATCAGGTATGTATTATAAACATCGACTCTGTGGCTTGCATTTTCATTCTCCTTATATATCTTTTGATGAATCAAAGTTTTTAATTTGAATAT BCCFFFFDAH)HHJJHIGG,HFH2JIJJ4IDI93IJJ<=JJ>IH7IJIJBIBG)CFH7DHHFAHFHEDIFBEFH;EBICA?3DD5D(DDBACDC(BADD: XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:94T5
chr1-5 0 chr1 153186635 37 100M * 0 0 GTCTTGACTCTTTATCCACTTTGCCAGTCTGTGTCTTGTAATTGGGGCATTTAGCCTATTTACATTTAAGGTTAATATTGTTATGTGTGAATTTGATCCT C1@?FF=FHHHHHI?JJEJFIIIHG:.?>EEJEI(JG9J'IIHIJIHIJGJGJFJ9FJAG4EEC:DADE8DAEJFCCBBCDAEDDDD-DDDD@+DBC8D+ XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-4 0 chr1 145038405 0 100M * 0 0 AGTGGAAATAATACTCGTCAACATATGCCTTTCAAAAAAATTTTTTTTCATATTTTAAATTTACCTTTACTACCTATTTATTTGGTTCAAGGCTCCATTT C:CFFFDDHFHDHJIIJJJJ29CJ+JJJIIJIIFIG?JI08?CJJIFIFDEFDGBD>JAIDJDJ>JCBG(CG=DE5?(EDB3HDD>ED2:CCHDBJA2CCHECDEFD,DDFBCHXT :A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:92C5A1
chr1-2 0 chr1 62477842 37 100M * 0 0 TAGGAAAATGGAGAAACTTTAATATGAAATCTTCCTGTTTTTCACATTATGTTTAGATTGTTACAGCATAAAATTTCAGAAACATTGCAAAAAGTTTTAA @C=FFDFDHHHH>GJ@IEJGJIIJJJJF@JHHIIGGJICIAJFJIH)H7E?GHEDI>HHFAHC@)(D>DDDEDXT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:77C0A21
chr1-1 0 chr1 11355150 37 100M * 0 0 ATTTATTGGCTGTCTTTCAGGCACATTTTAGCTGTCATCCAACATTCTCAACCTTAGTCCCCTTCTCTGGGCTAAGGGGAGAATGATGGTCCTACCCCAG BC?DFFFFHH7FJJJJJFDHID)JCH=3DIJ5JGDI8@@I@A=>3<:idca9ddffi> XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c50Nhs20.aln
##ART_Illumina read_length 100
@CM art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 50 -o G38L100c50Nhs20 -rs 1464918709
@SQ chr1 AC:CM000663.2 gi:568336023 LN:248956422 rl:Chromosome M5:6aef897c3d6ff0c78aff06ac189178dd AS:GRCh38 248956422
##Header End
>chr1 chr1-50 93465784 +
TTCCACAATAGTTGAACTAATTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTTGTTTCCTGACTTTTTAA
TTCCACAATAGTTGAACTAATTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTTGTTTCCTGACTTTTTAA
>chr1 chr1-48 228133745 +
ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGAGACATTATTATTAGGCACTCTCACCAGATCTTTACCCATGGCCATTTAAAGTGTGG
ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGACACATTATTATTAGGCACTCTCACCAGATCTTTACCCATGGCCATTTAAAGTGTGG
>chr1 chr1-47 13772987 +
TTCAGTAATTCAGAATAACACATGAGGGAATGAATGAATGAATAAATAAAAAAAAAATGAATGAATAAATTAAAAAAAATTGTGTTTCAGGGAAGAAAAA
TTCAGTAATTCAGAATAACACATGAGGGAATGAATGAATGAATAAATAAAAAAAAACTGAATGAATAAATTACAAAAAATTGTGTTTCAGGGAAGAAAAA
>chr1 chr1-46 211481565 -
ATCCCGATCAGGAACCTGGTGCTCTTCCCCGGCTCTCCCAACTTGGGCAAAGTTGGAGGGGTCCTGAGGCCCGGCGGGCCGGGGAACAGGACCCCGACCC
ATCCCGATCAGGAACCTGGTGCTCTTCCCCGGCTCTCCCAACTTGGGCAAAGTTGGAGGGGTCCTGAGGCCCGGCGGACCGGGGAACAGGACCCCGACCC
>chr1 chr1-45 29056656 +
CTGGGATTACAGGTGCCCGCCACCATGCCCAGCTAATTTTTGTATTTTTGGTAGAGACAAGGTTTCACCATGTTGGCCGGGATTGTCTCGAACTCCTGAT
CTGGGATTACAGGTGCCCGCCACCATGCCCAGCTAATTTTTGTATTTTTGGTAGAGACAAGGTTTCACCATGTTGGCCGGGATTGTCTCGAACTCCTGAT
>chr1 chr1-44 49993892 +
CAATTTAGCCAAAACTGGCTAATCCTTTTACCAGAATCATTCCCATTGTTCAAGACCTATTTTAAGCTCCACTATCACCATAAAACTTTCCCGATCAGTT
CAATTTAGCCAAAACTGGCTAATCGTTTTACCAGAATCATTCCCATTGTTCAAGACCTATTTTAAGCTCCACTATCACCATAAAACTTTCCCGATCAGTT
>chr1 chr1-43 54241817 -
TTGCAGACCTTTTATTGATTGTGAAATAACCCAAGAGAACACACATGGAACCATTTTTAAAGTATCATCAAATTTAAATATGATATTATTAAAACATATT
TTGCAGACCTTTTATTGATTGTGAAATAACCCAAGAGAACACACATGGAACCATTTTTAAAGTATCATCAAATTTAAATATGATATTATTAAAACATATT
>chr1 chr1-42 35706202 +
CAGGTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGCACGTGCCACCATGCCTGGCAATTTTTGTATTTTTAGTACAGATGGGG
CAGGTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGCACGTGCCACCATGCCTGGCAATTTTTGTATTTTTAGTACAGATGGGG
>chr1 chr1-41 92473985 -
GGTGGGCCCATACACACAGGCACCAGGATATACCCCCGCCTGCATCCACATGGACACGTGCCCACGTTGCATGTACACGCAGACCTGCCTATGCACACAC
GGTGGGCCCATACACACAGGCACCAGGATATACCCCCGCCTGCATCCACATGGACACGTGCCCACGTTGCATGTACACGCAGACCTGCCTATGCACACAC
>chr1 chr1-40 27177039 -
GAACCATTTGGCTGAGCCCAGCACTCTGCCCATGGTCCATGCACCCCTTGACCCTGACATCAGCAAAGCAGTCCCCATCACCAAAGTGCTATGTGCCATG
GAACCATTTGGCTGAGCCCAGCACTCTGCCCATGGTCCATGCACCCCTTGACCCTGACATTAGCAAAGCAGTCCCCATCACCAAAGTGCTATGTGCCATG
>chr1 chr1-39 245060718 -
TCTTGACATTTTGGAGTTGACACCTTGGAGGTGGTTAATCCATGACGAAACTGAAGCTAAGCCGATCTCATGTTTTGACTCCTGTCAATCCGGAGAGGAC
TCTTGACATTTTGGAGTTGACACCTTGGAGGTGGTTAATCCATGACGAAACTGAAGCTAAGCCGATCTCATGTTTTGACTCCTGTCAATCCGGAGAGGAC
>chr1 chr1-38 215781397 -
TACATATACATATACATATACATACATGTGTGTGTGCATATATATGTATATGTGTGTATATATATATATATATATATATACACACATATATGTATGTGTG
CACATATACATATACATATACATACATGTGTGTGTGCATATATATGTATATGTGTGTATATATATATATATATATATATACACACATATATGTATGTGTG
>chr1 chr1-37 42831546 -
AACAACGTATGTCCACACAAAAACTTGCACATGAATGTTCACTAGCAGCATTATTTGTAACCTGCCCAAGGTGGAAACAACCCAAATGTCTATTGACTGA
AACAACGTATGTCCACACAAAAACTTGCACATGAATGTTCACTAGCAGCATTATTTGTAACCTGCCCAAGGTGGAAACAACCCAAATGTCTATTGACTGA
>chr1 chr1-36 181673625 +
TCCACTGCCCAGAAAGAGGACATCCCTTATAGGGCCAGCGGATGGAAGCCATGGGCTGGGCAGGACATTCCTGTCCCAACCCACATGGCAGCTAGAGTCC
TCCACTGCCCAGAAAGAGGACATCCCTTATAGGACCAGCGGATGGAAGCCATGGGCTGGGCAGGACATTCCTGTCCCAACCCACATGGCAGCTAGAGTCC
>chr1 chr1-35 96851543 -
GTTGTGACCTCCCAACCCCCACAGAGGTTCACGTGTTGAAGTCTTAACCCTCAGTACCTCAGAATGTAATCATATTTGAAGATATGGTATTTATAGAGGT
GTTGTGACCTCCCAACCCCCACAGAGGTTCACGTGTTGAAGTCTTAACCCTCAGTACCTCAGAATGTAATCATATTTGAAGATATTGTATTTATAGAGGT
>chr1 chr1-34 13267476 +
ATACAGGGAAGGTTTTAATCTGTTTCAGACATTAGAAATACATATATTTATATATGGTATCTTTATTGGAGAACCTTTGGCCACATCAAAAGTATCAAAA
ATACAGGGAAGGTTTTAATCTGTTTCAGACATTAGAAATACATATATTTATATATGGTATCTTTATTGGAGAACCTTTGGCCACATCAAAAGTATCAAAA
>chr1 chr1-33 48968232 +
GAATAGTAGGCAATAAACAAAGAGAGCAACTTAGGAGCCAGATCACATGTGGCCTCTCGAGCAATATGGTAAAAGTTCTGGACTTCATTCTAGGTGAATG
TAATAGTAGGCAATAAACAAAGAGAGCAACTTAGGAGCCAGATCACATGTGGCCGCTCGAGCAATATGGTAAAAGTTCTGGACTTCATTCTAGGTGAATG
>chr1 chr1-32 88980622 +
TAGTTCAGTAAACTATTTATCAAACAGGTGTCTGGTCATTTTAACATACTCCTTGCTTTGAACAATATTCATTCATATTTGGTACAAACTCTATATCCTA
TAGTTCAGTAAACTATTTATCAAACAGGTGTCAGGTCATTTTAACATACTCCTTGCTTTGAACAATATTCATTCATACTTGGTACAAACTCTATATCCTA
>chr1 chr1-31 21950729 -
TTCCCAAATTAACTAACTTCACCTAGGTTGTGAGTATTGCAGCGTCTTATTTCTAATCCCAGTACTACTATTGGTGTGAGTCAGTAAAGATGCCTGGTGA
TTCCCAAATTAACTAACTTCACCTAGGTTGTGAGTATTGCAGCGTCTTATTTCTAATCCCAGTACTACTATTGGTGTGAGTCAGTAAAGATGCCTGGTGA
>chr1 chr1-30 9852128 +
TGTGAAATGGAGTCAGCAGAGTGAGCCGGCCTCCACTCAGTGAGCCGGGTCTCCCCCACAGCCGGCATGTGCTGACCTCCTTCCAACTGCTCTACCAAGA
TGTGAAATGGAGTCAGCAGAGTGAGCCGGCCTCCACTCAGTGAGCCGGGTCTCCCCCACAGCCGGCATGTGCTGACCTCCTTCCAACTGCTCTACCAAGA
>chr1 chr1-29 92558892 -
CAGCCTGGGCAACATGGTGAAACCCCGTCTCTACTGAAAATACAAAAATTAGCCGGGCGTGGTGGCAGGTTCCTGTAATCCCAGCTACTCGGGAGGCTGA
CAGCCTGGGCAACATGGTGAAACCCCGTCTCAACTGAAAATACAAAAATTAGCCGGGCGTGGTGGCAGGTTCCTGTAATCCCAGCTACTCGGGAGGCTGA
>chr1 chr1-28 191969685 -
ACATCACCTACTGGAATTGAAACATGGTTCCTAAGAATGGGCTCTCATGGCCAGGATCCAAAAGGAAGGCCATGGTTCCACAAGGAGACCTGTTCTGAGT
ACATCACCTACTGGAATTGAAACATGGTTCCTAAGAATGGGCTCTCATGGCCAGGATCCAAAAGGAAGGCCATGGTTCCACAAGGAGACCTGTTCTGAGT
>chr1 chr1-27 76941125 -
GGATATCAACAAACTGATTCTAAGTTTTGTATGGAGAAGCAAAGAGCCCAGGATAGCCAACACAATATAAAAGAAGAACAAAGCTGGAGGACTGACACCT
GGATATCAACAAACTGATTCTAAGTTTTGTATGGAGAAGCAAAGAGCCCAGGATAGCCAACACAATATAAAAGAAGAACAAAGCTGGAGGACTGACACCT
>chr1 chr1-26 233336762 +
AGATATACAGCAAAGTTTGAAAGCTACAGTTCTGAGGACCAGATTTATGGATTCCTTCTTATATGTTATCTGGGTTGATATAGAAATTCTTCCATGGCTA
AGATATACAGCAAAGTTTGAAAGCTACAGTTCTGAGGACCATATTTATGGATTCCTTCTTATATGTTATCTGGGTTGATATAGAAATTCTTCCATGGCTA
>chr1 chr1-25 96853884 +
GCCATTCTAACTGGTGTGAGATGGTATCTCATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTGGCTG
GCCATTCTAACTGGTGTGAGATGGTATCTCATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTGGCTG
>chr1 chr1-24 235841968 +
GTTGGCTACTAGCTTAGCAGAGGTGGAAAACCATGAATTTCTGGTGGTATGGATTTTTTCAGCTATTTCAGATTCACCAGCAGGATTCAGCTGCTTGGGT
GTTGGCTACTAGCTTAGCAGAGGTGCAAAACCATGAATTTCTGGTGGTATGGATTTTTTCAGCTATTTCAGATTCACCAGCAGGATCCAGCTGCTTGGGT
>chr1 chr1-23 96545357 +
AGTGAAAAAGGCTGGCTGCCCTTCAATATCATCTTCAAATGTTAACAACACTGAATATTAATAAATTTCCTTTAGCGAATAATGAATCCAGCCTTCCTTA
AGTGAAAAAGGCTGGCTGCCCTTCAATATCATCTTCAAATGTTAACAACACTGAATATTAATAAATTTCCTTTAGCGAATAATGAATCCAGCCTTCCTTA
>chr1 chr1-22 80270678 +
TTGTACACCCTATTTCTGACCAGAAGAAGGAGCATTTTGCTTTTTGCCAAATGAGAAGTGCATTCTGGAAACACTTGATGCCTGCACCACACCTCGAGTT
TTGTACACCCTATTTCTGACCAGAAGAAGGAGCATTTTGCTTTTTGCCAAATGAGAAGTGCATTCTGGAAACACTTGATGCCTGCACCACACCTCGAGTT
>chr1 chr1-21 35923260 +
CTAAGCAGCAGTGTTTTTGGATACTTTTTTTTTCTGTTTGTGAATAAGGCCAGCACTCAAGATGGGCAGCCAAGGGTGCACTGACTATTAGCTGGCCCAT
CTAAGCAGCAGTGTTTTTGGATACTTTTTTTTTCTGTTTGTGAATAAGGCCAGCACTCAAGATGGGCAGCCAAGGGTGCACTGACTATTAGCTGGCCCAT
>chr1 chr1-20 136467133 -
ACTGTGCCTGGCCTTTTTTTTTCTTTTTCTTTTTTTTACACTCTCATGTTAAAAAAAAATCTGTCCTTGTTACTATATAGATGTGCATAGTTCATTCCCT
ACTGTGCCTGGCCTTTTTTTTTCTTTTTCTTTTTTTTACACTCTCATGTTAAAAAAAAATCTGTCCCTGTTACTATATAGATGTGCATAGTTCATTCCCT
>chr1 chr1-19 160371243 +
AGGCCCTGGGCACAGGCAGAGAGCCCACCGGCTGGTCATGAGGGCCTCTTCCTTTCTCTGACCCAGGCACCTCGAGGGCTATTCTCCTGGGTTCCTTCCG
AGGCCCTGGGCACAGGCAGAGAGCCCACCGGCTGGTCATGAGGGCCTCTTCCTTTCTCTGACCCAGGCACCTCGAGGGCTCTTCTCCTGGGTTCCTTCCG
>chr1 chr1-18 69100488 -
AGTGCTGGTATTACAGGTGTGAGCCACTGTGCCTGGTCAGCAGTACTTCACTGCTATAATTTTACAACTTTTAAAGATAACCTAATTTATTTTAATTGCT
AGTGCTGGTATTACAGGTGTGAGCCACTGTGCCTGGTCAACAGTACTTCACTGCTATAATTTTACAACTTTTAAAGATACCCTAATTTATTTTAATTGCT
>chr1 chr1-17 41500328 -
GGAGTGAAATAATGAAAATATATATATTTCTTAATTTTTTGTTTTCATAATTGGAAAGTGACCTATAGGTCACTCTTTAACCTTTCCAATCATAAGAACC
GGAGTGAAATAATGAAAATATATATATTTCTTAATTTTTTGTTTTCATAATTGGAAAGTGACCTATAGGTCACTCTTTAACCTTTCCAATCATAAGAACC
>chr1 chr1-16 114154602 +
TGTATCTTTCTGCTAAGCATAACAAGAAAGACAGAAAGCTCAACGGGAGGATTGAGGCTAGACTTAAAGTAGAGATCCCCTCAGAAACTGTGGAGTGAGG
TGTATCTTTCTGCTAAGCATAACAAGAAAGACAGAAAGCTCAACGGGAGGATTGAGGCTAGACTTAAAGTAGAGATCCCCTCAGAAACTGTGGAGTGAGG
>chr1 chr1-15 79188743 -
TGCCTAGAGTGTCAAACATCCTTTATCATTTTAAAATCTAAACTCCAAAATCCTTCCTATTAGGGCCTTTCCTCCCTCGTTTCCTTTCCTCTCCCCCACC
TGCCTAGAGTGTCAAACAACCTTTATCATTTTAAAATCTAAACTCCAAAATCCTCCCTATTAGGGCCTTTCCTCCCTCGTTTCCTTTCCTCTCCCCCACC
>chr1 chr1-14 117644125 +
GCATTTCATTGTGGACTAATTTTCCCCCACTATTGAGGGAAGACCCTTTTGAGTACTCTATCTGATGCCCCATGAATGATAAAGTTTTATACTCTGGCTG
GCATTTCATTGTGGACTAATTTTCCCCCACTATTGAGGGAAGACCCTTTTGAGTACTCTATCTGATGCCCCATGAATGATAAAGTTTTATACTCTGGCTG
>chr1 chr1-13 104996993 +
TTCTTGCTGGAACACATGGTTTCACCTTTACCTTCACCCACAGCCCAATGTGCATCAATATGGAGATAATGCAGTTCCATTTGTACCTCTTTGTGATTCA
TTCTTGCTGGAACACATGTTTTCACCTTTACCTTCACCCACAGCCCAATGTGCATCAATATGGAGATAATGCAGTTCCATTTATACCTCTTTGTGGTTCA
>chr1 chr1-12 140338618 -
TTTCTTCATTTGGTTTTCAGGACAGGACTCTGCTTTGCTTTCCTCCTCTGATCACACTGGTGGTTTCTTTGCTCATTCTTCTTCCCAATCTCCCCGACCT
TTTCTTCATTTGGTTTTCAGGACAGGACTCTGCTTTGCTTTCCTCCTCTGATCACACTGGTGGTTTCTTTGCTCATTCTTCTTCCCAATCTCCCCGACCT
>chr1 chr1-11 176870999 -
GCAGACTCAATAATTGGCTATGATAACAAATACTCTTTCACATACCAGGGACAACAATATTAACTCTGGAATATTATTTTAAACATTTTCAAGGCTAGTA
GCAGACTCAAGAATTGGCTATGATAACAAATACTCTTTAACATACCAGGGACAACAATATTAACTCTGGAATATTATTTTAAACATTTTCAAGGCTAGTA
>chr1 chr1-10 34644993 -
AGTCCAGAATTCTGAGTCTGTGAGTTTACACCTTCCAACAGTGATAATCAGATATCAAGCTTGAAGACTACCAACAAAAGTGGACCAAATAGGGATCATC
AGTCCAGAATTCTGAGTCTGTGAGTTTACACCTTCCAACAGTGATAATCAGATATCAAGCTTGAAGACTACCAACAAAAGTGGACCAAATAGGGATCACC
>chr1 chr1-9 152012628 +
AAGCAATTCTCTTGCTTTAGCCTCCCGAGAAGCTCGGATTACAGGCATGTCCACCACACCCAGCTAATTCTTTTGTATTTTTAGTAGACATGGGGTTTTG
AAGCAATTCTCTTGCTTTAGCCTCCCGAGAAGCTCGGATTACAGGCATGTCCACCACACCCAGCTAATTCTTTTGTATTTTTAGTAGACATGGGGTTTTG
>chr1 chr1-8 79478959 +
GGCAACACTTGAGAACACAAAGTGAGTTCTCACTTTGGGCGGTGGTTTCAGGCTTCAGGGTGGAGTTTTGTCAGGAACCCACCCTTTTCTGCCTAGAATT
GGCAACACTTGAGAACACAAAGTGAGTTCTCACTTTGGGCGGTGGTTTCAGGCTTCAGGGTGGAGTTTTGTCAGGAACCCAACCTTTTCTGCCTAGAATT
>chr1 chr1-7 178190760 +
GTAGCCGGAATAAACAGTCACTGTGAGTTGTCCATTTTAGAGCATAGGTTTTTAGGTGGTGAAGACCTGTCCTTAGTTGAATTTGTATGTGAATTAAACT
GTAGCCGGAATAAACAGTCACTGTGAGTTGTCCATTTTAGAGCATAGGTTTTCAGGTGGTGAAGACCTGTCCTTAGTTGAATTTGTATGTGAATTAAACT
>chr1 chr1-6 42572410 +
AACCCTTTATCAGGTATGTATTATAAACATCGACTCTGTGGCTTGCATTTTCATTCTCCTTATATATCTTTTGATGAATCAAAGTTTTTAATTTTAATAT
AACCCTTTATCAGGTATGTATTATAAACATCGACTCTGTGGCTTGCATTTTCATTCTCCTTATATATCTTTTGATGAATCAAAGTTTTTAATTTGAATAT
>chr1 chr1-5 153186634 +
GTCTTGACTCTTTATCCACTTTGCCAGTCTGTGTCTTGTAATTGGGGCATTTAGCCTATTTACATTTAAGGTTAATATTGTTATGTGTGAATTTGATCCT
GTCTTGACTCTTTATCCACTTTGCCAGTCTGTGTCTTGTAATTGGGGCATTTAGCCTATTTACATTTAAGGTTAATATTGTTATGTGTGAATTTGATCCT
>chr1 chr1-4 127714516 -
AGTGGAAATAATACTCGTCAACATATGCCTTTCAAAAAAATTTTTTTTCATATTTTAAATTTACCTTTACTACCTATTTATTTGGTTCAAGGCTCCATTT
AGTGGAAATAATACTCGTCAACATATGCCTTTCAAAAAAATTTTTTTTCATATTTTAAATTTACCTTTACTACCTATTTATTTGGTTCAAGGCTCCATTT
>chr1 chr1-3 84685020 +
TATTATTAAAACTATAAATGGACCAATTAAACAAACGTGTCATGAGCCAAGGAATATAAACTAATTCTTTACACCTGAAGTCCTTTAAAATGCTTTAAAT
TATTATTAAAACTATAAATGGACCAATTAAACAAACGTGTCATGAGCCAAGGAATATAAACTAATTCTTTACACCTGAAGTCCTTTAAAATGATTTAATT
>chr1 chr1-2 62477841 +
TAGGAAAATGGAGAAACTTTAATATGAAATCTTCCTGTTTTTCACATTATGTTTAGATTGTTACAGCATAAAATTTCCAAAACATTGCAAAAAGTTTTAA
TAGGAAAATGGAGAAACTTTAATATGAAATCTTCCTGTTTTTCACATTATGTTTAGATTGTTACAGCATAAAATTTCAGAAACATTGCAAAAAGTTTTAA
>chr1 chr1-1 11355149 +
ATTTATTGGCTGTCTTTCAGGCACATTTTAGCTGTCATCCAACATTCTCAACCTTAGTCCCCTTCTCTGGGCTAAGGGGAGAATGATGGTCCTACCCCAG
ATTTATTGGCTGTCTTTCAGGCACATTTTAGCTGTCATCCAACATTCTCAACCTTAGTCCCCTTCTCTGGGCTAAGGGGAGAATGATGGTCCTACCCCAG
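Each `.aln` record pairs the reference segment (first sequence line) with the simulated read (second sequence line), so the simulated substitutions can be counted directly and compared with the `NM` tag in the bwa output. The sketch below uses the first 50 bases of the chr1-48 record. The function name is illustrative, and it assumes the two lines have equal length (no indels).

```python
def mismatch_positions(ref_seg, read_seg):
    """Return 0-based offsets where the read differs from the reference segment."""
    if len(ref_seg) != len(read_seg):
        raise ValueError("segments differ in length; gaps are not handled here")
    return [i for i, (r, q) in enumerate(zip(ref_seg, read_seg)) if r != q]

# First 50 bases of the two sequence lines of record chr1-48
ref_seg = "ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGAGACATT"
read_seg = "ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGACACATT"
print(mismatch_positions(ref_seg, read_seg))  # [44]
```

The single mismatch at offset 44 agrees with `NM:i:1` and `MD:Z:44G55` in the bwa record for chr1-48.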