版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
-
PAGE
1
-
ASimplifiedParallelDecodingSchemeforTurboCodes
WangLe,HuangYu,YangHongwen
WirelessCommunicationCenter,BeijingUniversityofPostsandTelecommunications,Beijing,China(100876)
Abstract
ParalleldecodingofTurbocodesisvitaltotheapplicationswithveryhighdatarates.Intheoverlappingparalleldecodingscheme,thedecodingthroughputcannotincreaselinearlywiththenumberofsegments.Aschemewherethethroughputcanincreaselinearlywasproposed,wheretheboundaryinformationofeachiterationisstoredtobetheinitialconditioninnextiteration.Inthispaper,wehavesimplifiedtheschemebyonlystoringtheindexofthemostprobablestateoftheboundary.Withthecostofsomesmallperformanceloss,theschemecanreducelargelytheextramemoryrequiredtostoretheboundaryinformation.
Keywords: Turbocodes;MAPalgorithm;paralleldecoding
Introduction
Withtheever-increasingdemandofthedatarate,turbocodeisfacingstringentchallenges.ITUsuggeststhatthedatarateof4Gsystembeatleast100Mbps,inmobileconditions.Asthelocaloperatingclockfrequencyinhardwaresystemistypicallywithin100MHznowadays,itisreported[1]thatthethroughputofonly6.17Mbpscanbeachievedforturbocodeswithconventionalserialdecoding[2].Obviously,thedecodingalgorithmofturbocodehastobeimprovedsignificantlytosurviveinthefuturecommunicationsystems.
Thebottleneckofthethroughputliesinitsdecodingscheme.Theconvolutionalcodesistypicallyemployedastheconstituentcodes,leadingtothedecodingalgorithmemployssomerecursiveschemesbasedontrellis,suchassymbol-by-symbolMaximumAPosterioriprobability(MAP)algorithm[3].Itrequireslotsofrecursivecomputationsinbothcomponentdecoders.Moreover,thedecodingprocedureisiterative.Hence,thedecodingdelayisquitelarge.
Toreducethedecodingdelayandimprovethedecodingthroughput,theparalleldecodingmethodwasproposed[4]bydividingthewholeframeintomultiplesub-blocksanddecodingthesegmentswithabankofidenticalsub-blockMAPprocessors.Withtheparalleldecodingscheme,thethroughputcanbeimprovedbyafactorofaboutK,thenumberofsub-blockssegmented.Underthesameconditionsasmentionedpreviously,thethroughputof100MbpscanbeachievedwithK=16.
However,asaresultofthesegmentation,eachsub-blocksuffersthelostofboundaryinformation(initialconditions),whichisvitalfortheperformance.Togettheboundaryinformation,amethodwithsegmentsoverlappingisproposed[4],bycomputingsomemoreredundanttrellisstages.However,suchoverlappingsufferssomeextracomputationaldelay,andleadstolowerdecodingspeed.TheactualspeedupfactorissmallerthanK,especiallywhenKislargeandtheframelengthissmall.Anotherwaytogettheboundarydistributioninformationwasproposedin[5].Itsuggeststousetheforwardandbackwardvariablescomputedinthepreviousiterationtoprovideanapproximationtotheboundary.Simulationresultsshowthatitsperformanceisveryclosetotheserialdecodingschemein[2].Thecostofthismethodissomeadditionalmemoryneededtostoretheintermediateboundaryinformation.
Toreducethesizeoftheadditionalmemoryin[5],wesuggestthatinsteadofthewholeboundarydistributioninformation,onlytheindexofthemostprobablestatebestored.Withthissimplifiedstoragescheme,wecansavealotofmemoryatthecostofsomeacceptableperformanceloss.
Therestofthepaperisorganizedasfollows.First,section2givesabriefdescriptionofthestandardserialdecodingtoshowitsweakpointindecodingspeed.Section3showsthetwoparalleldecodingmethodsproposedin[4-5].Ourproposedschemeisgiveninsection4andsection5isthesimulationresults.Finally,section6concludesthepaper.
SerialdecodingschemeofTurbocodes
TurbocodesintroducedbyBerrouetal[2]haveanear-Shannoncapability,thusbeingveryattractivetovariousapplications.However,thelargedecodingdelayprohibitsitsapplications,especiallyforthosetimeconstraintorhighdatarateservices.
ThedecoderofTurbocodesemploysaseriallyconcatenatedstructure,whichconsistsoftwocomponentdecoders,asshowninFig.1.Thedecodingproceedsinaniterativemanner.Ineachiteration,thefirstcomponentdecoder,correspondingtothefirstcomponentencoder,deliversitscalculatedextrinsicinformation(Le1)tothe2ndcomponentdecoderafterinterleavingtoserveastheapriori
information(La2).Thenthesecondcomponentdecoderfeedbacksitsextrinsicinformation(Le2)tothefirstoneafterde-interleaving,workingasaprioriinformation(La1).xandxinFig.1are
receivedinformationbitsanditsinterleavedversion,respectively.The
y1and
y2arereceived
redundantbitsproducedbythecomponentRecursiveSystematicConvolutional(RSC)encoders,respectively.
La1x
y1
Le1
La2
x
y2
Le2
LLRs
Figure1.DecoderofTurboCodes
ThefunctionofthecomponentdecoderistocalculatetheLLRs(LogarithmofLikelihoodRatio)fortheinformationbits.ThecalculationofLLRsgenerallyemploystheMAPalgorithm[3]wheretheLLRsfortheithinformationbitofalengthNframecanbeexpressedas
i(s)i(s)i(s,1)
LLRiLog
1
i(s)i(s)i(s,0)
0
(1)
wherethesummationinnumeratoranddenominatorisoverthesetofbranchesintrellisdiagram,inwhichtheinformationbitis1or0,respectively.NotethatwearenotgoingtoexplainthedetailsoftheMAPalgorithmoritsvariants;rather,wesimplyshowtherecursivenatureinthealgorithmsthat
prohibitthedecodingspeed.Therearethreekindsofvariablesin(1).
i(s)
and
i(s)
denotethe
forwardandbackwardvariablesintheithtrellisstage,
is,d
isthebranchmetriccorresponding
thebranchwhichstartsfromstatesandtheinformationbitisd.
Thei
in(1)canbedirectlycalculatedwithchannelinputsandtheaprioriinputswithinoneclock
cycle,providedthatalltheinputsnecessaryarereadytouse.Itistherequiresrecursivecalculationasshownin(2)and(3)forbinaryRSC:
i(s)
and
i(s)
which
i(s)i1(s1)i(s1,1)i1(s0)i(s0,0),
i1,2...N1
(2)
1
1
0
0
i(s)i1(s)i1(s,1)i1(s)i1(s,0),
iN1,N2...1
(3)
wheresd
isthepreviousstatecomingtostateswiththeinputinformationbitsasd.
isthe
s
d
arrivingstateviathecurrentstateswithinputbitasd.Theinitialconditionfortherecursivecalculationis
0
0
(s)1
(s)1
s0
else
s0
(4)
(5)
0
N else
Notethattheinitialcondition(5)onlyappliesfortheRSCwhichisterminated[6].IfRSCisopen[2],theinitialconditionshouldbe
1
N(s)2m, s
(6)
where2misthenumberofthetrellisstates.ThewholeprocessisshowninFig.2.
LLRs
Figure2.theoperationflowchartforMAPalgorithm
Apparently,tocalculatei,weshouldcalculate
i1
first.Andtocalculate
N1,wehavetowait
forallpastibeenobtainedandthesituationissimilartoi.Thus,theprocessingforone
componentdecoderinoneiterationcannotbefinishedwithinNclockcycles(someextraclock
cyclesmayberequiredtoexecutesomeotheroperations).Iftheclockfrequencyis
fc,thenthedelay
forprocessingoftheMAPismorethan
Tmap
N
f
.Supposethatthemaximumiterationis
c
Imax,then
thedelayfordecodingonecodewordofturbocodeismorethanT
2T I
2NI
,leadsto
dec map
max
max
c
ainformationthroughputlessthan
RNTdec
fc
2Imax
f
.Forexample,when
fc100MHz
and
Imax8,thethroughputcannotbelargerthan6.25Mbps.Notethatwithsomeearly-stopiterationstrategy[7],theactualnumberofiterationstosuccessfullydecodeonecodewordmaybemuchless
than
Imax,butthehardwarehavetobedesignedwith
Imax.
Paralleldecodingmethods
Toreducethedecodingdelayandimproveitsthroughput,wecandividethewholecodewordintomultiplesub-blocksanddecodethemwithmultipleMAPprocessorsinparallel.However,thesegmentsinthemiddlehavenoinitialcondition[4-5].Withrandomorarbitraryinitializationas(6)willleadtoaperformancelossthatisunacceptable.Sodirectlyworkingateachsegmentfacesaproblemofhowto
initializetheforwardandbackwardvariablesattheboundaryofeachsegment.Asshownin(2),
ks
dependsonallpast
js,jk.Notethatthefarawaythejleavesk,thelesseffect
will
j(s)
haveon
k(s).Thus,tohaveagoodaccuracyforthe
k(s)
atthebeginningofeach
sub-block,weshouldstarttheprocessingof(2)earlieratjkL
withLlargeenough[4].This
isshowninFig.3.Foracertainsub-blockoflengthMlocatingfrompositionktokM1,we
startthecalculationfromkL
with
kLs
beingarbitrarilyinitializedas(6).Theposition
kL
tok1
belongstotheprevioussub-block,hencetheparallelprocessersarepartially
overlapped.Thecalculationof
k(s)
isexactlythesame.Tohaveagoodaccuracyforthe
kM1(s)
attheright-handbeginningpointofthesegment,weshouldstartfromjkM1L.
overlappingarea
kM(s)
…
kL(s)
…
k1(s)
k(s)
…
kM1(s)
Figure3.theoverlappingmethodin[4]
Therefore,toproducealltheforwardvariables
i(s)
inonecomponentdecoderwithinoneiteration,
K
thewholeprocessingofthecodewordwithKsegmentsandLoverlappingrequiresNL
clock
cyclesinsteadofN cycles,andthedecodingspeedwillbesloweddownbyafactorof N .
K NLK
LargertheLandKis,slowerthedecodingspeedwillbe.Forinstance,assumingthattheblocklengthis2298[8],ifwesegmentitinto50sub-blocks,andtheoverlappinglengthis30,thenitrequires76clockcyclestoproducealltheforwardvariables,insteadof46cycles.Hence,theMAPdecodingspeedcanonlybeimprovedby30times,farlessthanK50,whichisoriginallysupposedtobe.
Tosolvesuchproblem,amethodwithstoragefeaturewasproposedin[5].Insteadofoverlappingtechnique[4],itutilizestheforwardandbackwardvariablescomputedinthepreviousiterationtogettheboundaryinformationforeachsub-block.Forexample,forthep+1thiterationandthesame
sub-blocklocatingfromkto
kM1,thecalculationof(2)willstartatkwithinitialcondition
k1
p1s
beapproximatedwith
ps,thefinalresultsinlastiteration.ThisisshowninFig.4.
(s)
(p)
k1
k1
Thebackwardvariableisprocessedinthesameway.
(p)(s)
kM
(p)(s)
k1
(p1)(s)
k
(p1)(s)
kM1
Figure4.themethodin[5]
Comparedwiththeoverlappingmethod,suchmethodrequiresnoredundantcomputations.Hence,thedecodingspeedcanincreaselinearlywiththenumberofsegments.However,itrequiresanextra
memoryofsize2m(K1)v
tostorethefinalresultsof
ps
inlastiteration,wherevisthe
k1
k1
numberofbitsusedtoquantizetheps.
Proposedmethod
k1
Forsomeapplicationssuchassmallmobileterminals,itispreferredthatthememoryrequiredshouldbeassmallaspossible.Tokeepthefeatureofthemethodsin[5]butsavethememory,inthispaperweproposedtostoretheindexofthemostprobablestateinsteadofalltherealvaluesofintermediate
boundary(e.g.
(p)(s)inFig.4),computedinthepreviousiteration.Withthestatenumberof2m,
weonlyneedtostorembits,ratherthan2mvbits.Thismaybealargesavingifmandvislarge.
Atp+1thiteration,forthesamesub-blocklocatingfromkto
kM1,thecalculationof(2)will
startatkandtheinitialconditionissetas(inlogarithmdomain)
logp1s 0
ss*
(7)
k1
p1
else
whereisapredeterminedparameterwhichonlydependsonthenumberofiterationsandcanbe
k1
optimizedviasimulation,s*argmaxlogpsistheindexofthemostprobablestateinthe
s
previousiteration.Withthismethod,theonlythingweneedtostoredfornextiterationisrequiresmbits.
Simulationresults
s*,which
Inthissectionwewillshowtheperformanceoftheproposedmethod.Thesimulationadoptstheturbocodedefinedincdma2000[9],theinterleaverlengthischosenasN=2298,themaximumiteration
numberis
Imax8.Thenumberofsub-blockis
K20
implyingthatthehardwareMAPprocesser
consistsof20parallelprocessorsandeachprocesses
2298/20115
bits.Bothlogmapand
max-logmapdecodingalgorithmsareconsideredandFER(frameerrorrate)andBER(biterrorrate)areobserved.
Figure5.FERperformanceofcoderate=1/3
Fig.5includesfourFERperformancecurvesforcoderate1/3,usingmax-logmaporlogmapdecodingalgorithmrespectively.AtFER=0.01,formax-logmapalgorithm,theperformancelossoftheproposedmethodisabout0.07dB,comparedwiththemethodin[5].Andforlogmap,theperformancelossatFER=0.01isabout0.1dB.
Figure6.FERperformanceofcoderate=3/4
InFig.6,wecanseethatforcoderate3/4,theperformancelossislargerthanthatofcoderate1/3.AtFFR=0.01,theperformancelossisabout0.3dBforbothlogmapandmax-logmapdecodingalgorithm.WehavealsothesimulatedBERperformancewhicharenotshownhereduetolimitedspace.TheconclusionforBERperformanceisroughlythesame.
Conclusion
Inthispaper,wesimplifiedthemethodin[5],byonlystoringtheindexofthemostprobablestate,insteadofallvaluesofintermediateboundary.Theproposedschemecanreducethesizeofmemorylargely,withsomeacceptableFERorBERperformanceloss.
References
“Throughput,latencyandcomplexityestimationfortheturbodecoder,”3GPPTSGRANWG1#46,R1-062155,
Tallinn,Estonia,Aug
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 财经数据分析方法指南
- 2026年孩子见人不打招呼家庭教育策略
- 2026年地铁运营公司员工应急疏散培训方案
- 2026年光伏结构工程师项目结构计算报告
- 2026二建《水利水电工程管理与实务》冲刺课程讲义
- 签了协议书能上学高中
- 宏的概念新版
- 心理健康 五年级 第十五课 《合作创造奇迹》
- 学生会礼仪方案模板
- 2026年度全镇食品药品安全工作会议暨专题培训会讲话
- 课程与教学论知到智慧树期末考试答案题库2025年浙江师范大学
- 2026年《必背60题》 马克思主义理论26届考研复试高频面试题包含详细解答
- 安徽2021-2025真题及答案
- 临床护理实践指南2024版
- (高清版)TDT 1055-2019 第三次全国国土调查技术规程
- 超市盘点盈亏分析报告
- 超全的中国大陆风投公司名单和联系方式
- 旧水泥路面改造方案
- 产前筛查筛查机构创建资料汇编产前筛查工作制度及岗位职责(产前筛查中心专职管理人员职责)
- 风机基础知识专题培训
- 中建某公司技术质量管理制度-secret
评论
0/150
提交评论