




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
-
PAGE
1
-
ASimplifiedParallelDecodingSchemeforTurboCodes
WangLe,HuangYu,YangHongwen
WirelessCommunicationCenter,BeijingUniversityofPostsandTelecommunications,Beijing,China(100876)
Abstract
ParalleldecodingofTurbocodesisvitaltotheapplicationswithveryhighdatarates.Intheoverlappingparalleldecodingscheme,thedecodingthroughputcannotincreaselinearlywiththenumberofsegments.Aschemewherethethroughputcanincreaselinearlywasproposed,wheretheboundaryinformationofeachiterationisstoredtobetheinitialconditioninnextiteration.Inthispaper,wehavesimplifiedtheschemebyonlystoringtheindexofthemostprobablestateoftheboundary.Withthecostofsomesmallperformanceloss,theschemecanreducelargelytheextramemoryrequiredtostoretheboundaryinformation.
Keywords: Turbocodes;MAPalgorithm;paralleldecoding
Introduction
Withtheever-increasingdemandofthedatarate,turbocodeisfacingstringentchallenges.ITUsuggeststhatthedatarateof4Gsystembeatleast100Mbps,inmobileconditions.Asthelocaloperatingclockfrequencyinhardwaresystemistypicallywithin100MHznowadays,itisreported[1]thatthethroughputofonly6.17Mbpscanbeachievedforturbocodeswithconventionalserialdecoding[2].Obviously,thedecodingalgorithmofturbocodehastobeimprovedsignificantlytosurviveinthefuturecommunicationsystems.
Thebottleneckofthethroughputliesinitsdecodingscheme.Theconvolutionalcodesistypicallyemployedastheconstituentcodes,leadingtothedecodingalgorithmemployssomerecursiveschemesbasedontrellis,suchassymbol-by-symbolMaximumAPosterioriprobability(MAP)algorithm[3].Itrequireslotsofrecursivecomputationsinbothcomponentdecoders.Moreover,thedecodingprocedureisiterative.Hence,thedecodingdelayisquitelarge.
Toreducethedecodingdelayandimprovethedecodingthroughput,theparalleldecodingmethodwasproposed[4]bydividingthewholeframeintomultiplesub-blocksanddecodingthesegmentswithabankofidenticalsub-blockMAPprocessors.Withtheparalleldecodingscheme,thethroughputcanbeimprovedbyafactorofaboutK,thenumberofsub-blockssegmented.Underthesameconditionsasmentionedpreviously,thethroughputof100MbpscanbeachievedwithK=16.
However,asaresultofthesegmentation,eachsub-blocksuffersthelostofboundaryinformation(initialconditions),whichisvitalfortheperformance.Togettheboundaryinformation,amethodwithsegmentsoverlappingisproposed[4],bycomputingsomemoreredundanttrellisstages.However,suchoverlappingsufferssomeextracomputationaldelay,andleadstolowerdecodingspeed.TheactualspeedupfactorissmallerthanK,especiallywhenKislargeandtheframelengthissmall.Anotherwaytogettheboundarydistributioninformationwasproposedin[5].Itsuggeststousetheforwardandbackwardvariablescomputedinthepreviousiterationtoprovideanapproximationtotheboundary.Simulationresultsshowthatitsperformanceisveryclosetotheserialdecodingschemein[2].Thecostofthismethodissomeadditionalmemoryneededtostoretheintermediateboundaryinformation.
Toreducethesizeoftheadditionalmemoryin[5],wesuggestthatinsteadofthewholeboundarydistributioninformation,onlytheindexofthemostprobablestatebestored.Withthissimplifiedstoragescheme,wecansavealotofmemoryatthecostofsomeacceptableperformanceloss.
Therestofthepaperisorganizedasfollows.First,section2givesabriefdescriptionofthestandardserialdecodingtoshowitsweakpointindecodingspeed.Section3showsthetwoparalleldecodingmethodsproposedin[4-5].Ourproposedschemeisgiveninsection4andsection5isthesimulationresults.Finally,section6concludesthepaper.
SerialdecodingschemeofTurbocodes
TurbocodesintroducedbyBerrouetal[2]haveanear-Shannoncapability,thusbeingveryattractivetovariousapplications.However,thelargedecodingdelayprohibitsitsapplications,especiallyforthosetimeconstraintorhighdatarateservices.
ThedecoderofTurbocodesemploysaseriallyconcatenatedstructure,whichconsistsoftwocomponentdecoders,asshowninFig.1.Thedecodingproceedsinaniterativemanner.Ineachiteration,thefirstcomponentdecoder,correspondingtothefirstcomponentencoder,deliversitscalculatedextrinsicinformation(Le1)tothe2ndcomponentdecoderafterinterleavingtoserveastheapriori
information(La2).Thenthesecondcomponentdecoderfeedbacksitsextrinsicinformation(Le2)tothefirstoneafterde-interleaving,workingasaprioriinformation(La1).xandxinFig.1are
receivedinformationbitsanditsinterleavedversion,respectively.The
y1and
y2arereceived
redundantbitsproducedbythecomponentRecursiveSystematicConvolutional(RSC)encoders,respectively.
La1x
y1
Le1
La2
x
y2
Le2
LLRs
Figure1.DecoderofTurboCodes
ThefunctionofthecomponentdecoderistocalculatetheLLRs(LogarithmofLikelihoodRatio)fortheinformationbits.ThecalculationofLLRsgenerallyemploystheMAPalgorithm[3]wheretheLLRsfortheithinformationbitofalengthNframecanbeexpressedas
i(s)i(s)i(s,1)
LLRiLog
1
i(s)i(s)i(s,0)
0
(1)
wherethesummationinnumeratoranddenominatorisoverthesetofbranchesintrellisdiagram,inwhichtheinformationbitis1or0,respectively.NotethatwearenotgoingtoexplainthedetailsoftheMAPalgorithmoritsvariants;rather,wesimplyshowtherecursivenatureinthealgorithmsthat
prohibitthedecodingspeed.Therearethreekindsofvariablesin(1).
i(s)
and
i(s)
denotethe
forwardandbackwardvariablesintheithtrellisstage,
is,d
isthebranchmetriccorresponding
thebranchwhichstartsfromstatesandtheinformationbitisd.
Thei
in(1)canbedirectlycalculatedwithchannelinputsandtheaprioriinputswithinoneclock
cycle,providedthatalltheinputsnecessaryarereadytouse.Itistherequiresrecursivecalculationasshownin(2)and(3)forbinaryRSC:
i(s)
and
i(s)
which
i(s)i1(s1)i(s1,1)i1(s0)i(s0,0),
i1,2...N1
(2)
1
1
0
0
i(s)i1(s)i1(s,1)i1(s)i1(s,0),
iN1,N2...1
(3)
wheresd
isthepreviousstatecomingtostateswiththeinputinformationbitsasd.
isthe
s
d
arrivingstateviathecurrentstateswithinputbitasd.Theinitialconditionfortherecursivecalculationis
0
0
(s)1
(s)1
s0
else
s0
(4)
(5)
0
N else
Notethattheinitialcondition(5)onlyappliesfortheRSCwhichisterminated[6].IfRSCisopen[2],theinitialconditionshouldbe
1
N(s)2m, s
(6)
where2misthenumberofthetrellisstates.ThewholeprocessisshowninFig.2.
LLRs
Figure2.theoperationflowchartforMAPalgorithm
Apparently,tocalculatei,weshouldcalculate
i1
first.Andtocalculate
N1,wehavetowait
forallpastibeenobtainedandthesituationissimilartoi.Thus,theprocessingforone
componentdecoderinoneiterationcannotbefinishedwithinNclockcycles(someextraclock
cyclesmayberequiredtoexecutesomeotheroperations).Iftheclockfrequencyis
fc,thenthedelay
forprocessingoftheMAPismorethan
Tmap
N
f
.Supposethatthemaximumiterationis
c
Imax,then
thedelayfordecodingonecodewordofturbocodeismorethanT
2T I
2NI
,leadsto
dec map
max
max
c
ainformationthroughputlessthan
RNTdec
fc
2Imax
f
.Forexample,when
fc100MHz
and
Imax8,thethroughputcannotbelargerthan6.25Mbps.Notethatwithsomeearly-stopiterationstrategy[7],theactualnumberofiterationstosuccessfullydecodeonecodewordmaybemuchless
than
Imax,butthehardwarehavetobedesignedwith
Imax.
Paralleldecodingmethods
Toreducethedecodingdelayandimproveitsthroughput,wecandividethewholecodewordintomultiplesub-blocksanddecodethemwithmultipleMAPprocessorsinparallel.However,thesegmentsinthemiddlehavenoinitialcondition[4-5].Withrandomorarbitraryinitializationas(6)willleadtoaperformancelossthatisunacceptable.Sodirectlyworkingateachsegmentfacesaproblemofhowto
initializetheforwardandbackwardvariablesattheboundaryofeachsegment.Asshownin(2),
ks
dependsonallpast
js,jk.Notethatthefarawaythejleavesk,thelesseffect
will
j(s)
haveon
k(s).Thus,tohaveagoodaccuracyforthe
k(s)
atthebeginningofeach
sub-block,weshouldstarttheprocessingof(2)earlieratjkL
withLlargeenough[4].This
isshowninFig.3.Foracertainsub-blockoflengthMlocatingfrompositionktokM1,we
startthecalculationfromkL
with
kLs
beingarbitrarilyinitializedas(6).Theposition
kL
tok1
belongstotheprevioussub-block,hencetheparallelprocessersarepartially
overlapped.Thecalculationof
k(s)
isexactlythesame.Tohaveagoodaccuracyforthe
kM1(s)
attheright-handbeginningpointofthesegment,weshouldstartfromjkM1L.
overlappingarea
kM(s)
…
kL(s)
…
k1(s)
k(s)
…
kM1(s)
Figure3.theoverlappingmethodin[4]
Therefore,toproducealltheforwardvariables
i(s)
inonecomponentdecoderwithinoneiteration,
K
thewholeprocessingofthecodewordwithKsegmentsandLoverlappingrequiresNL
clock
cyclesinsteadofN cycles,andthedecodingspeedwillbesloweddownbyafactorof N .
K NLK
LargertheLandKis,slowerthedecodingspeedwillbe.Forinstance,assumingthattheblocklengthis2298[8],ifwesegmentitinto50sub-blocks,andtheoverlappinglengthis30,thenitrequires76clockcyclestoproducealltheforwardvariables,insteadof46cycles.Hence,theMAPdecodingspeedcanonlybeimprovedby30times,farlessthanK50,whichisoriginallysupposedtobe.
Tosolvesuchproblem,amethodwithstoragefeaturewasproposedin[5].Insteadofoverlappingtechnique[4],itutilizestheforwardandbackwardvariablescomputedinthepreviousiterationtogettheboundaryinformationforeachsub-block.Forexample,forthep+1thiterationandthesame
sub-blocklocatingfromkto
kM1,thecalculationof(2)willstartatkwithinitialcondition
k1
p1s
beapproximatedwith
ps,thefinalresultsinlastiteration.ThisisshowninFig.4.
(s)
(p)
k1
k1
Thebackwardvariableisprocessedinthesameway.
(p)(s)
kM
(p)(s)
k1
(p1)(s)
k
(p1)(s)
kM1
Figure4.themethodin[5]
Comparedwiththeoverlappingmethod,suchmethodrequiresnoredundantcomputations.Hence,thedecodingspeedcanincreaselinearlywiththenumberofsegments.However,itrequiresanextra
memoryofsize2m(K1)v
tostorethefinalresultsof
ps
inlastiteration,wherevisthe
k1
k1
numberofbitsusedtoquantizetheps.
Proposedmethod
k1
Forsomeapplicationssuchassmallmobileterminals,itispreferredthatthememoryrequiredshouldbeassmallaspossible.Tokeepthefeatureofthemethodsin[5]butsavethememory,inthispaperweproposedtostoretheindexofthemostprobablestateinsteadofalltherealvaluesofintermediate
boundary(e.g.
(p)(s)inFig.4),computedinthepreviousiteration.Withthestatenumberof2m,
weonlyneedtostorembits,ratherthan2mvbits.Thismaybealargesavingifmandvislarge.
Atp+1thiteration,forthesamesub-blocklocatingfromkto
kM1,thecalculationof(2)will
startatkandtheinitialconditionissetas(inlogarithmdomain)
logp1s 0
ss*
(7)
k1
p1
else
whereisapredeterminedparameterwhichonlydependsonthenumberofiterationsandcanbe
k1
optimizedviasimulation,s*argmaxlogpsistheindexofthemostprobablestateinthe
s
previousiteration.Withthismethod,theonlythingweneedtostoredfornextiterationisrequiresmbits.
Simulationresults
s*,which
Inthissectionwewillshowtheperformanceoftheproposedmethod.Thesimulationadoptstheturbocodedefinedincdma2000[9],theinterleaverlengthischosenasN=2298,themaximumiteration
numberis
Imax8.Thenumberofsub-blockis
K20
implyingthatthehardwareMAPprocesser
consistsof20parallelprocessorsandeachprocesses
2298/20115
bits.Bothlogmapand
max-logmapdecodingalgorithmsareconsideredandFER(frameerrorrate)andBER(biterrorrate)areobserved.
Figure5.FERperformanceofcoderate=1/3
Fig.5includesfourFERperformancecurvesforcoderate1/3,usingmax-logmaporlogmapdecodingalgorithmrespectively.AtFER=0.01,formax-logmapalgorithm,theperformancelossoftheproposedmethodisabout0.07dB,comparedwiththemethodin[5].Andforlogmap,theperformancelossatFER=0.01isabout0.1dB.
Figure6.FERperformanceofcoderate=3/4
InFig.6,wecanseethatforcoderate3/4,theperformancelossislargerthanthatofcoderate1/3.AtFFR=0.01,theperformancelossisabout0.3dBforbothlogmapandmax-logmapdecodingalgorithm.WehavealsothesimulatedBERperformancewhicharenotshownhereduetolimitedspace.TheconclusionforBERperformanceisroughlythesame.
Conclusion
Inthispaper,wesimplifiedthemethodin[5],byonlystoringtheindexofthemostprobablestate,insteadofallvaluesofintermediateboundary.Theproposedschemecanreducethesizeofmemorylargely,withsomeacceptableFERorBERperformanceloss.
References
“Throughput,latencyandcomplexityestimationfortheturbodecoder,”3GPPTSGRANWG1#46,R1-062155,
Tallinn,Estonia,Aug
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 开学典礼工作总结
- 学校科研研工作计划
- 乡镇个人工作计划
- 2025年度金融产品预约解除与投资者风险控制合同
- 二零二五年度花店店面转让与特色花卉品种引进合同
- 主题公园房产居间代理合同
- 磁性材料在环保科技领域的应用
- 旗舰店装修合同保密条款
- 2025年中国金银花咽清胶囊市场调查研究报告
- 2025至2030年快速高温试样机项目投资价值分析报告
- 水幕喷淋系统的工作原理与应用
- 2024年山东海洋集团有限公司社会招聘考试真题
- 小学生拗九节课件
- 《感冒中医治疗》课件
- 研发费用管理制度内容
- 压力容器设计委托书
- 《眉毛的基本技法》课件
- 人教版PEP小学五年级英语下册全册教案(含计划)
- 2025年幼儿园膳食工作计划
- 药剂学第9版课件:第一章-绪论
- 2023年中考英语话题复习课件 健康与饮食
评论
0/150
提交评论