




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
-
PAGE
1
-
ASimplifiedParallelDecodingSchemeforTurboCodes
WangLe,HuangYu,YangHongwen
WirelessCommunicationCenter,BeijingUniversityofPostsandTelecommunications,Beijing,China(100876)
Abstract
ParalleldecodingofTurbocodesisvitaltotheapplicationswithveryhighdatarates.Intheoverlappingparalleldecodingscheme,thedecodingthroughputcannotincreaselinearlywiththenumberofsegments.Aschemewherethethroughputcanincreaselinearlywasproposed,wheretheboundaryinformationofeachiterationisstoredtobetheinitialconditioninnextiteration.Inthispaper,wehavesimplifiedtheschemebyonlystoringtheindexofthemostprobablestateoftheboundary.Withthecostofsomesmallperformanceloss,theschemecanreducelargelytheextramemoryrequiredtostoretheboundaryinformation.
Keywords: Turbocodes;MAPalgorithm;paralleldecoding
Introduction
Withtheever-increasingdemandofthedatarate,turbocodeisfacingstringentchallenges.ITUsuggeststhatthedatarateof4Gsystembeatleast100Mbps,inmobileconditions.Asthelocaloperatingclockfrequencyinhardwaresystemistypicallywithin100MHznowadays,itisreported[1]thatthethroughputofonly6.17Mbpscanbeachievedforturbocodeswithconventionalserialdecoding[2].Obviously,thedecodingalgorithmofturbocodehastobeimprovedsignificantlytosurviveinthefuturecommunicationsystems.
Thebottleneckofthethroughputliesinitsdecodingscheme.Theconvolutionalcodesistypicallyemployedastheconstituentcodes,leadingtothedecodingalgorithmemployssomerecursiveschemesbasedontrellis,suchassymbol-by-symbolMaximumAPosterioriprobability(MAP)algorithm[3].Itrequireslotsofrecursivecomputationsinbothcomponentdecoders.Moreover,thedecodingprocedureisiterative.Hence,thedecodingdelayisquitelarge.
Toreducethedecodingdelayandimprovethedecodingthroughput,theparalleldecodingmethodwasproposed[4]bydividingthewholeframeintomultiplesub-blocksanddecodingthesegmentswithabankofidenticalsub-blockMAPprocessors.Withtheparalleldecodingscheme,thethroughputcanbeimprovedbyafactorofaboutK,thenumberofsub-blockssegmented.Underthesameconditionsasmentionedpreviously,thethroughputof100MbpscanbeachievedwithK=16.
However,asaresultofthesegmentation,eachsub-blocksuffersthelostofboundaryinformation(initialconditions),whichisvitalfortheperformance.Togettheboundaryinformation,amethodwithsegmentsoverlappingisproposed[4],bycomputingsomemoreredundanttrellisstages.However,suchoverlappingsufferssomeextracomputationaldelay,andleadstolowerdecodingspeed.TheactualspeedupfactorissmallerthanK,especiallywhenKislargeandtheframelengthissmall.Anotherwaytogettheboundarydistributioninformationwasproposedin[5].Itsuggeststousetheforwardandbackwardvariablescomputedinthepreviousiterationtoprovideanapproximationtotheboundary.Simulationresultsshowthatitsperformanceisveryclosetotheserialdecodingschemein[2].Thecostofthismethodissomeadditionalmemoryneededtostoretheintermediateboundaryinformation.
Toreducethesizeoftheadditionalmemoryin[5],wesuggestthatinsteadofthewholeboundarydistributioninformation,onlytheindexofthemostprobablestatebestored.Withthissimplifiedstoragescheme,wecansavealotofmemoryatthecostofsomeacceptableperformanceloss.
Therestofthepaperisorganizedasfollows.First,section2givesabriefdescriptionofthestandardserialdecodingtoshowitsweakpointindecodingspeed.Section3showsthetwoparalleldecodingmethodsproposedin[4-5].Ourproposedschemeisgiveninsection4andsection5isthesimulationresults.Finally,section6concludesthepaper.
SerialdecodingschemeofTurbocodes
TurbocodesintroducedbyBerrouetal[2]haveanear-Shannoncapability,thusbeingveryattractivetovariousapplications.However,thelargedecodingdelayprohibitsitsapplications,especiallyforthosetimeconstraintorhighdatarateservices.
ThedecoderofTurbocodesemploysaseriallyconcatenatedstructure,whichconsistsoftwocomponentdecoders,asshowninFig.1.Thedecodingproceedsinaniterativemanner.Ineachiteration,thefirstcomponentdecoder,correspondingtothefirstcomponentencoder,deliversitscalculatedextrinsicinformation(Le1)tothe2ndcomponentdecoderafterinterleavingtoserveastheapriori
information(La2).Thenthesecondcomponentdecoderfeedbacksitsextrinsicinformation(Le2)tothefirstoneafterde-interleaving,workingasaprioriinformation(La1).xandxinFig.1are
receivedinformationbitsanditsinterleavedversion,respectively.The
y1and
y2arereceived
redundantbitsproducedbythecomponentRecursiveSystematicConvolutional(RSC)encoders,respectively.
La1x
y1
Le1
La2
x
y2
Le2
LLRs
Figure1.DecoderofTurboCodes
ThefunctionofthecomponentdecoderistocalculatetheLLRs(LogarithmofLikelihoodRatio)fortheinformationbits.ThecalculationofLLRsgenerallyemploystheMAPalgorithm[3]wheretheLLRsfortheithinformationbitofalengthNframecanbeexpressedas
i(s)i(s)i(s,1)
LLRiLog
1
i(s)i(s)i(s,0)
0
(1)
wherethesummationinnumeratoranddenominatorisoverthesetofbranchesintrellisdiagram,inwhichtheinformationbitis1or0,respectively.NotethatwearenotgoingtoexplainthedetailsoftheMAPalgorithmoritsvariants;rather,wesimplyshowtherecursivenatureinthealgorithmsthat
prohibitthedecodingspeed.Therearethreekindsofvariablesin(1).
i(s)
and
i(s)
denotethe
forwardandbackwardvariablesintheithtrellisstage,
is,d
isthebranchmetriccorresponding
thebranchwhichstartsfromstatesandtheinformationbitisd.
Thei
in(1)canbedirectlycalculatedwithchannelinputsandtheaprioriinputswithinoneclock
cycle,providedthatalltheinputsnecessaryarereadytouse.Itistherequiresrecursivecalculationasshownin(2)and(3)forbinaryRSC:
i(s)
and
i(s)
which
i(s)i1(s1)i(s1,1)i1(s0)i(s0,0),
i1,2...N1
(2)
1
1
0
0
i(s)i1(s)i1(s,1)i1(s)i1(s,0),
iN1,N2...1
(3)
wheresd
isthepreviousstatecomingtostateswiththeinputinformationbitsasd.
isthe
s
d
arrivingstateviathecurrentstateswithinputbitasd.Theinitialconditionfortherecursivecalculationis
0
0
(s)1
(s)1
s0
else
s0
(4)
(5)
0
N else
Notethattheinitialcondition(5)onlyappliesfortheRSCwhichisterminated[6].IfRSCisopen[2],theinitialconditionshouldbe
1
N(s)2m, s
(6)
where2misthenumberofthetrellisstates.ThewholeprocessisshowninFig.2.
LLRs
Figure2.theoperationflowchartforMAPalgorithm
Apparently,tocalculatei,weshouldcalculate
i1
first.Andtocalculate
N1,wehavetowait
forallpastibeenobtainedandthesituationissimilartoi.Thus,theprocessingforone
componentdecoderinoneiterationcannotbefinishedwithinNclockcycles(someextraclock
cyclesmayberequiredtoexecutesomeotheroperations).Iftheclockfrequencyis
fc,thenthedelay
forprocessingoftheMAPismorethan
Tmap
N
f
.Supposethatthemaximumiterationis
c
Imax,then
thedelayfordecodingonecodewordofturbocodeismorethanT
2T I
2NI
,leadsto
dec map
max
max
c
ainformationthroughputlessthan
RNTdec
fc
2Imax
f
.Forexample,when
fc100MHz
and
Imax8,thethroughputcannotbelargerthan6.25Mbps.Notethatwithsomeearly-stopiterationstrategy[7],theactualnumberofiterationstosuccessfullydecodeonecodewordmaybemuchless
than
Imax,butthehardwarehavetobedesignedwith
Imax.
Paralleldecodingmethods
Toreducethedecodingdelayandimproveitsthroughput,wecandividethewholecodewordintomultiplesub-blocksanddecodethemwithmultipleMAPprocessorsinparallel.However,thesegmentsinthemiddlehavenoinitialcondition[4-5].Withrandomorarbitraryinitializationas(6)willleadtoaperformancelossthatisunacceptable.Sodirectlyworkingateachsegmentfacesaproblemofhowto
initializetheforwardandbackwardvariablesattheboundaryofeachsegment.Asshownin(2),
ks
dependsonallpast
js,jk.Notethatthefarawaythejleavesk,thelesseffect
will
j(s)
haveon
k(s).Thus,tohaveagoodaccuracyforthe
k(s)
atthebeginningofeach
sub-block,weshouldstarttheprocessingof(2)earlieratjkL
withLlargeenough[4].This
isshowninFig.3.Foracertainsub-blockoflengthMlocatingfrompositionktokM1,we
startthecalculationfromkL
with
kLs
beingarbitrarilyinitializedas(6).Theposition
kL
tok1
belongstotheprevioussub-block,hencetheparallelprocessersarepartially
overlapped.Thecalculationof
k(s)
isexactlythesame.Tohaveagoodaccuracyforthe
kM1(s)
attheright-handbeginningpointofthesegment,weshouldstartfromjkM1L.
overlappingarea
kM(s)
…
kL(s)
…
k1(s)
k(s)
…
kM1(s)
Figure3.theoverlappingmethodin[4]
Therefore,toproducealltheforwardvariables
i(s)
inonecomponentdecoderwithinoneiteration,
K
thewholeprocessingofthecodewordwithKsegmentsandLoverlappingrequiresNL
clock
cyclesinsteadofN cycles,andthedecodingspeedwillbesloweddownbyafactorof N .
K NLK
LargertheLandKis,slowerthedecodingspeedwillbe.Forinstance,assumingthattheblocklengthis2298[8],ifwesegmentitinto50sub-blocks,andtheoverlappinglengthis30,thenitrequires76clockcyclestoproducealltheforwardvariables,insteadof46cycles.Hence,theMAPdecodingspeedcanonlybeimprovedby30times,farlessthanK50,whichisoriginallysupposedtobe.
Tosolvesuchproblem,amethodwithstoragefeaturewasproposedin[5].Insteadofoverlappingtechnique[4],itutilizestheforwardandbackwardvariablescomputedinthepreviousiterationtogettheboundaryinformationforeachsub-block.Forexample,forthep+1thiterationandthesame
sub-blocklocatingfromkto
kM1,thecalculationof(2)willstartatkwithinitialcondition
k1
p1s
beapproximatedwith
ps,thefinalresultsinlastiteration.ThisisshowninFig.4.
(s)
(p)
k1
k1
Thebackwardvariableisprocessedinthesameway.
(p)(s)
kM
(p)(s)
k1
(p1)(s)
k
(p1)(s)
kM1
Figure4.themethodin[5]
Comparedwiththeoverlappingmethod,suchmethodrequiresnoredundantcomputations.Hence,thedecodingspeedcanincreaselinearlywiththenumberofsegments.However,itrequiresanextra
memoryofsize2m(K1)v
tostorethefinalresultsof
ps
inlastiteration,wherevisthe
k1
k1
numberofbitsusedtoquantizetheps.
Proposedmethod
k1
Forsomeapplicationssuchassmallmobileterminals,itispreferredthatthememoryrequiredshouldbeassmallaspossible.Tokeepthefeatureofthemethodsin[5]butsavethememory,inthispaperweproposedtostoretheindexofthemostprobablestateinsteadofalltherealvaluesofintermediate
boundary(e.g.
(p)(s)inFig.4),computedinthepreviousiteration.Withthestatenumberof2m,
weonlyneedtostorembits,ratherthan2mvbits.Thismaybealargesavingifmandvislarge.
Atp+1thiteration,forthesamesub-blocklocatingfromkto
kM1,thecalculationof(2)will
startatkandtheinitialconditionissetas(inlogarithmdomain)
logp1s 0
ss*
(7)
k1
p1
else
whereisapredeterminedparameterwhichonlydependsonthenumberofiterationsandcanbe
k1
optimizedviasimulation,s*argmaxlogpsistheindexofthemostprobablestateinthe
s
previousiteration.Withthismethod,theonlythingweneedtostoredfornextiterationisrequiresmbits.
Simulationresults
s*,which
Inthissectionwewillshowtheperformanceoftheproposedmethod.Thesimulationadoptstheturbocodedefinedincdma2000[9],theinterleaverlengthischosenasN=2298,themaximumiteration
numberis
Imax8.Thenumberofsub-blockis
K20
implyingthatthehardwareMAPprocesser
consistsof20parallelprocessorsandeachprocesses
2298/20115
bits.Bothlogmapand
max-logmapdecodingalgorithmsareconsideredandFER(frameerrorrate)andBER(biterrorrate)areobserved.
Figure5.FERperformanceofcoderate=1/3
Fig.5includesfourFERperformancecurvesforcoderate1/3,usingmax-logmaporlogmapdecodingalgorithmrespectively.AtFER=0.01,formax-logmapalgorithm,theperformancelossoftheproposedmethodisabout0.07dB,comparedwiththemethodin[5].Andforlogmap,theperformancelossatFER=0.01isabout0.1dB.
Figure6.FERperformanceofcoderate=3/4
InFig.6,wecanseethatforcoderate3/4,theperformancelossislargerthanthatofcoderate1/3.AtFFR=0.01,theperformancelossisabout0.3dBforbothlogmapandmax-logmapdecodingalgorithm.WehavealsothesimulatedBERperformancewhicharenotshownhereduetolimitedspace.TheconclusionforBERperformanceisroughlythesame.
Conclusion
Inthispaper,wesimplifiedthemethodin[5],byonlystoringtheindexofthemostprobablestate,insteadofallvaluesofintermediateboundary.Theproposedschemecanreducethesizeofmemorylargely,withsomeacceptableFERorBERperformanceloss.
References
“Throughput,latencyandcomplexityestimationfortheturbodecoder,”3GPPTSGRANWG1#46,R1-062155,
Tallinn,Estonia,Aug
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 快递公司客户服务中心操作手册
- 钱学森的科学精神读后感
- 软件开发公司与用户软件许可合同
- 压纹机相关项目投资计划书
- 粘合衬相关行业投资方案范本
- 离子交换树脂相关行业投资规划报告范本
- 微波集成电路AL2O3基片相关项目投资计划书
- 银行基金销售培训
- 农业生产与管理知识竞赛试题
- 会议纪要详细版与行动方案计划表
- (二模)长春市2025届高三质量监测(二)地理试卷(含答案)
- 2025天津市建筑安全员-C证考试题库
- 2025年河南省高职单招计算机类职业技能测试题(附答案)
- GB/T 18936-2025禽流感诊断技术
- 《主题四 鸡蛋撞地球》教学设计-2023-2024学年六年级下册综合实践活动辽师大版
- 2025年国航机务系统AMECO工程师岗位校园招聘笔试参考题库附带答案详解
- 巨量千川中级营销师认证考试题(附答案)
- 2025中智集团招聘高频重点提升(共500题)附带答案详解
- 金融公司早会内容
- 药剂学第9版课件:第一章-绪论
- 《下载-综合布线》课件
评论
0/150
提交评论