版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
-
PAGE
1
-
ASimplifiedParallelDecodingSchemeforTurboCodes
WangLe,HuangYu,YangHongwen
WirelessCommunicationCenter,BeijingUniversityofPostsandTelecommunications,Beijing,China(100876)
Abstract
ParalleldecodingofTurbocodesisvitaltotheapplicationswithveryhighdatarates.Intheoverlappingparalleldecodingscheme,thedecodingthroughputcannotincreaselinearlywiththenumberofsegments.Aschemewherethethroughputcanincreaselinearlywasproposed,wheretheboundaryinformationofeachiterationisstoredtobetheinitialconditioninnextiteration.Inthispaper,wehavesimplifiedtheschemebyonlystoringtheindexofthemostprobablestateoftheboundary.Withthecostofsomesmallperformanceloss,theschemecanreducelargelytheextramemoryrequiredtostoretheboundaryinformation.
Keywords: Turbocodes;MAPalgorithm;paralleldecoding
Introduction
Withtheever-increasingdemandofthedatarate,turbocodeisfacingstringentchallenges.ITUsuggeststhatthedatarateof4Gsystembeatleast100Mbps,inmobileconditions.Asthelocaloperatingclockfrequencyinhardwaresystemistypicallywithin100MHznowadays,itisreported[1]thatthethroughputofonly6.17Mbpscanbeachievedforturbocodeswithconventionalserialdecoding[2].Obviously,thedecodingalgorithmofturbocodehastobeimprovedsignificantlytosurviveinthefuturecommunicationsystems.
Thebottleneckofthethroughputliesinitsdecodingscheme.Theconvolutionalcodesistypicallyemployedastheconstituentcodes,leadingtothedecodingalgorithmemployssomerecursiveschemesbasedontrellis,suchassymbol-by-symbolMaximumAPosterioriprobability(MAP)algorithm[3].Itrequireslotsofrecursivecomputationsinbothcomponentdecoders.Moreover,thedecodingprocedureisiterative.Hence,thedecodingdelayisquitelarge.
Toreducethedecodingdelayandimprovethedecodingthroughput,theparalleldecodingmethodwasproposed[4]bydividingthewholeframeintomultiplesub-blocksanddecodingthesegmentswithabankofidenticalsub-blockMAPprocessors.Withtheparalleldecodingscheme,thethroughputcanbeimprovedbyafactorofaboutK,thenumberofsub-blockssegmented.Underthesameconditionsasmentionedpreviously,thethroughputof100MbpscanbeachievedwithK=16.
However,asaresultofthesegmentation,eachsub-blocksuffersthelostofboundaryinformation(initialconditions),whichisvitalfortheperformance.Togettheboundaryinformation,amethodwithsegmentsoverlappingisproposed[4],bycomputingsomemoreredundanttrellisstages.However,suchoverlappingsufferssomeextracomputationaldelay,andleadstolowerdecodingspeed.TheactualspeedupfactorissmallerthanK,especiallywhenKislargeandtheframelengthissmall.Anotherwaytogettheboundarydistributioninformationwasproposedin[5].Itsuggeststousetheforwardandbackwardvariablescomputedinthepreviousiterationtoprovideanapproximationtotheboundary.Simulationresultsshowthatitsperformanceisveryclosetotheserialdecodingschemein[2].Thecostofthismethodissomeadditionalmemoryneededtostoretheintermediateboundaryinformation.
Toreducethesizeoftheadditionalmemoryin[5],wesuggestthatinsteadofthewholeboundarydistributioninformation,onlytheindexofthemostprobablestatebestored.Withthissimplifiedstoragescheme,wecansavealotofmemoryatthecostofsomeacceptableperformanceloss.
Therestofthepaperisorganizedasfollows.First,section2givesabriefdescriptionofthestandardserialdecodingtoshowitsweakpointindecodingspeed.Section3showsthetwoparalleldecodingmethodsproposedin[4-5].Ourproposedschemeisgiveninsection4andsection5isthesimulationresults.Finally,section6concludesthepaper.
SerialdecodingschemeofTurbocodes
TurbocodesintroducedbyBerrouetal[2]haveanear-Shannoncapability,thusbeingveryattractivetovariousapplications.However,thelargedecodingdelayprohibitsitsapplications,especiallyforthosetimeconstraintorhighdatarateservices.
ThedecoderofTurbocodesemploysaseriallyconcatenatedstructure,whichconsistsoftwocomponentdecoders,asshowninFig.1.Thedecodingproceedsinaniterativemanner.Ineachiteration,thefirstcomponentdecoder,correspondingtothefirstcomponentencoder,deliversitscalculatedextrinsicinformation(Le1)tothe2ndcomponentdecoderafterinterleavingtoserveastheapriori
information(La2).Thenthesecondcomponentdecoderfeedbacksitsextrinsicinformation(Le2)tothefirstoneafterde-interleaving,workingasaprioriinformation(La1).xandxinFig.1are
receivedinformationbitsanditsinterleavedversion,respectively.The
y1and
y2arereceived
redundantbitsproducedbythecomponentRecursiveSystematicConvolutional(RSC)encoders,respectively.
La1x
y1
Le1
La2
x
y2
Le2
LLRs
Figure1.DecoderofTurboCodes
ThefunctionofthecomponentdecoderistocalculatetheLLRs(LogarithmofLikelihoodRatio)fortheinformationbits.ThecalculationofLLRsgenerallyemploystheMAPalgorithm[3]wheretheLLRsfortheithinformationbitofalengthNframecanbeexpressedas
i(s)i(s)i(s,1)
LLRiLog
1
i(s)i(s)i(s,0)
0
(1)
wherethesummationinnumeratoranddenominatorisoverthesetofbranchesintrellisdiagram,inwhichtheinformationbitis1or0,respectively.NotethatwearenotgoingtoexplainthedetailsoftheMAPalgorithmoritsvariants;rather,wesimplyshowtherecursivenatureinthealgorithmsthat
prohibitthedecodingspeed.Therearethreekindsofvariablesin(1).
i(s)
and
i(s)
denotethe
forwardandbackwardvariablesintheithtrellisstage,
is,d
isthebranchmetriccorresponding
thebranchwhichstartsfromstatesandtheinformationbitisd.
Thei
in(1)canbedirectlycalculatedwithchannelinputsandtheaprioriinputswithinoneclock
cycle,providedthatalltheinputsnecessaryarereadytouse.Itistherequiresrecursivecalculationasshownin(2)and(3)forbinaryRSC:
i(s)
and
i(s)
which
i(s)i1(s1)i(s1,1)i1(s0)i(s0,0),
i1,2...N1
(2)
1
1
0
0
i(s)i1(s)i1(s,1)i1(s)i1(s,0),
iN1,N2...1
(3)
wheresd
isthepreviousstatecomingtostateswiththeinputinformationbitsasd.
isthe
s
d
arrivingstateviathecurrentstateswithinputbitasd.Theinitialconditionfortherecursivecalculationis
0
0
(s)1
(s)1
s0
else
s0
(4)
(5)
0
N else
Notethattheinitialcondition(5)onlyappliesfortheRSCwhichisterminated[6].IfRSCisopen[2],theinitialconditionshouldbe
1
N(s)2m, s
(6)
where2misthenumberofthetrellisstates.ThewholeprocessisshowninFig.2.
LLRs
Figure2.theoperationflowchartforMAPalgorithm
Apparently,tocalculatei,weshouldcalculate
i1
first.Andtocalculate
N1,wehavetowait
forallpastibeenobtainedandthesituationissimilartoi.Thus,theprocessingforone
componentdecoderinoneiterationcannotbefinishedwithinNclockcycles(someextraclock
cyclesmayberequiredtoexecutesomeotheroperations).Iftheclockfrequencyis
fc,thenthedelay
forprocessingoftheMAPismorethan
Tmap
N
f
.Supposethatthemaximumiterationis
c
Imax,then
thedelayfordecodingonecodewordofturbocodeismorethanT
2T I
2NI
,leadsto
dec map
max
max
c
ainformationthroughputlessthan
RNTdec
fc
2Imax
f
.Forexample,when
fc100MHz
and
Imax8,thethroughputcannotbelargerthan6.25Mbps.Notethatwithsomeearly-stopiterationstrategy[7],theactualnumberofiterationstosuccessfullydecodeonecodewordmaybemuchless
than
Imax,butthehardwarehavetobedesignedwith
Imax.
Paralleldecodingmethods
Toreducethedecodingdelayandimproveitsthroughput,wecandividethewholecodewordintomultiplesub-blocksanddecodethemwithmultipleMAPprocessorsinparallel.However,thesegmentsinthemiddlehavenoinitialcondition[4-5].Withrandomorarbitraryinitializationas(6)willleadtoaperformancelossthatisunacceptable.Sodirectlyworkingateachsegmentfacesaproblemofhowto
initializetheforwardandbackwardvariablesattheboundaryofeachsegment.Asshownin(2),
ks
dependsonallpast
js,jk.Notethatthefarawaythejleavesk,thelesseffect
will
j(s)
haveon
k(s).Thus,tohaveagoodaccuracyforthe
k(s)
atthebeginningofeach
sub-block,weshouldstarttheprocessingof(2)earlieratjkL
withLlargeenough[4].This
isshowninFig.3.Foracertainsub-blockoflengthMlocatingfrompositionktokM1,we
startthecalculationfromkL
with
kLs
beingarbitrarilyinitializedas(6).Theposition
kL
tok1
belongstotheprevioussub-block,hencetheparallelprocessersarepartially
overlapped.Thecalculationof
k(s)
isexactlythesame.Tohaveagoodaccuracyforthe
kM1(s)
attheright-handbeginningpointofthesegment,weshouldstartfromjkM1L.
overlappingarea
kM(s)
…
kL(s)
…
k1(s)
k(s)
…
kM1(s)
Figure3.theoverlappingmethodin[4]
Therefore,toproducealltheforwardvariables
i(s)
inonecomponentdecoderwithinoneiteration,
K
thewholeprocessingofthecodewordwithKsegmentsandLoverlappingrequiresNL
clock
cyclesinsteadofN cycles,andthedecodingspeedwillbesloweddownbyafactorof N .
K NLK
LargertheLandKis,slowerthedecodingspeedwillbe.Forinstance,assumingthattheblocklengthis2298[8],ifwesegmentitinto50sub-blocks,andtheoverlappinglengthis30,thenitrequires76clockcyclestoproducealltheforwardvariables,insteadof46cycles.Hence,theMAPdecodingspeedcanonlybeimprovedby30times,farlessthanK50,whichisoriginallysupposedtobe.
Tosolvesuchproblem,amethodwithstoragefeaturewasproposedin[5].Insteadofoverlappingtechnique[4],itutilizestheforwardandbackwardvariablescomputedinthepreviousiterationtogettheboundaryinformationforeachsub-block.Forexample,forthep+1thiterationandthesame
sub-blocklocatingfromkto
kM1,thecalculationof(2)willstartatkwithinitialcondition
k1
p1s
beapproximatedwith
ps,thefinalresultsinlastiteration.ThisisshowninFig.4.
(s)
(p)
k1
k1
Thebackwardvariableisprocessedinthesameway.
(p)(s)
kM
(p)(s)
k1
(p1)(s)
k
(p1)(s)
kM1
Figure4.themethodin[5]
Comparedwiththeoverlappingmethod,suchmethodrequiresnoredundantcomputations.Hence,thedecodingspeedcanincreaselinearlywiththenumberofsegments.However,itrequiresanextra
memoryofsize2m(K1)v
tostorethefinalresultsof
ps
inlastiteration,wherevisthe
k1
k1
numberofbitsusedtoquantizetheps.
Proposedmethod
k1
Forsomeapplicationssuchassmallmobileterminals,itispreferredthatthememoryrequiredshouldbeassmallaspossible.Tokeepthefeatureofthemethodsin[5]butsavethememory,inthispaperweproposedtostoretheindexofthemostprobablestateinsteadofalltherealvaluesofintermediate
boundary(e.g.
(p)(s)inFig.4),computedinthepreviousiteration.Withthestatenumberof2m,
weonlyneedtostorembits,ratherthan2mvbits.Thismaybealargesavingifmandvislarge.
Atp+1thiteration,forthesamesub-blocklocatingfromkto
kM1,thecalculationof(2)will
startatkandtheinitialconditionissetas(inlogarithmdomain)
logp1s 0
ss*
(7)
k1
p1
else
whereisapredeterminedparameterwhichonlydependsonthenumberofiterationsandcanbe
k1
optimizedviasimulation,s*argmaxlogpsistheindexofthemostprobablestateinthe
s
previousiteration.Withthismethod,theonlythingweneedtostoredfornextiterationisrequiresmbits.
Simulationresults
s*,which
Inthissectionwewillshowtheperformanceoftheproposedmethod.Thesimulationadoptstheturbocodedefinedincdma2000[9],theinterleaverlengthischosenasN=2298,themaximumiteration
numberis
Imax8.Thenumberofsub-blockis
K20
implyingthatthehardwareMAPprocesser
consistsof20parallelprocessorsandeachprocesses
2298/20115
bits.Bothlogmapand
max-logmapdecodingalgorithmsareconsideredandFER(frameerrorrate)andBER(biterrorrate)areobserved.
Figure5.FERperformanceofcoderate=1/3
Fig.5includesfourFERperformancecurvesforcoderate1/3,usingmax-logmaporlogmapdecodingalgorithmrespectively.AtFER=0.01,formax-logmapalgorithm,theperformancelossoftheproposedmethodisabout0.07dB,comparedwiththemethodin[5].Andforlogmap,theperformancelossatFER=0.01isabout0.1dB.
Figure6.FERperformanceofcoderate=3/4
InFig.6,wecanseethatforcoderate3/4,theperformancelossislargerthanthatofcoderate1/3.AtFFR=0.01,theperformancelossisabout0.3dBforbothlogmapandmax-logmapdecodingalgorithm.WehavealsothesimulatedBERperformancewhicharenotshownhereduetolimitedspace.TheconclusionforBERperformanceisroughlythesame.
Conclusion
Inthispaper,wesimplifiedthemethodin[5],byonlystoringtheindexofthemostprobablestate,insteadofallvaluesofintermediateboundary.Theproposedschemecanreducethesizeofmemorylargely,withsomeacceptableFERorBERperformanceloss.
References
“Throughput,latencyandcomplexityestimationfortheturbodecoder,”3GPPTSGRANWG1#46,R1-062155,
Tallinn,Estonia,Aug
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 医疗新技术结题
- 职场心态培训
- 年产xxx生物破坏性塑料项目建议书
- 年产xx数字程控交换设备项目建议书
- 年产xxx松木实木门项目可行性研究报告(项目规划)
- 年产xxx回转缸项目可行性研究报告(立项备案)
- 年产xx再生海绵扶手项目建议书
- 2023年耐高温超轻硅酸钙隔热保湿材料资金筹措计划书
- 高三地理一轮复习 课件 世界的气温和降水
- 地质构造与地貌、地质剖面图判读课件高三地理一轮复习
- 核安全工程师-核安全综合知识-辐射防护基础-辐射防护剂量限值
- 音乐治疗学基础理论
- 小学二年级期中家长会课件
- 第六届大学生化学实验技能竞赛初赛笔试试题
- 质量通病防治施工措施及质量通病防治措施
- 英语作业纸打印版
- 静脉留置针操作常见并发症预防及处理课件
- LTE与5G移动通信技术PPT完整全套教学课件
- 军事理论(中北大学版)学习通超星课后章节答案期末考试题库2023年
- 高中文言文整理使动和意动用法-课件
- 多维自我体像关系调查问卷(MBSRQ)中文修订版及评分方法
评论
0/150
提交评论