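BitTorrent Protocol Explained

BitTorrent (BT for short) is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when multiple downloads of the same file happen concurrently, the downloaders upload to each other, making it possible for the file source to support very large numbers of downloaders with only a modest increase in its load.

A BitTorrent file distribution consists of these entities:

- An ordinary web server
- A static metainfo file
- A BitTorrent tracker
- An "original" downloader
- The end user web browsers
- The end user downloaders

Ideally there are many end users for a single file.

To set up a BitTorrent server, a host goes through the following steps:

1. Start running a tracker (skip this step if one is already running).
2. Start running an ordinary web server, such as Apache (skip this step if one is already running).
3. Associate the extension .torrent with the MIME type application/x-bittorrent on the web server (skip this step if already done).
4. Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.
5. Put the metainfo file on the web server.
6. Link to the metainfo (.torrent) file from a web page.
7. Have the original downloader serve the complete file (the origin).

To download with BitTorrent, a user does the following:

1. Install a BitTorrent client (skip this step if already installed).
2. Surf the web.
3. Click on a link to a .torrent file.
4. Select where to save the file locally, and select which files to download (for clients that support selective download).
5. Wait for the download to complete.
6. Tell the downloader to exit (it keeps uploading until this happens).

The connectivity is as follows:

- The web site serves static files as normal, and kicks off the BitTorrent program on the clients.
- The tracker receives information from all downloaders and gives each of them a random list of peers. This is done over HTTP or HTTPS.
- Downloaders periodically check in with the tracker to keep it informed of their progress, and upload to and download from the peers they are directly connected to. These connections use the BitTorrent peer protocol, which operates over TCP.
- The origin uploads but does not download at all, since it has the entire file. The origin is necessary to get every part of the file into the network. For popular downloads the origin can often be taken down after a fairly short time, because other downloaders that already have the whole file keep uploading.

Metainfo files and tracker responses are both sent in a simple, efficient, and extensible format called bencoding. Bencoded messages are nested dictionaries and lists (as in Python), which can contain strings and integers. Extensibility is supported by ignoring unexpected dictionary keys, so new features can be added later.

Bencoding is done as follows:

- Strings are encoded as the string length in base ten, followed by a colon and the string. For example, the string spam is encoded as 4:spam, where 4 is the length of the string.
- Integers are represented by an i followed by the number in base ten followed by an e. For example, i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limitation. i-0e is invalid. All encodings with a leading zero, such as i03e, are invalid, other than i0e, which corresponds to 0.
- Lists are encoded as an l followed by their elements (also bencoded) followed by an e. For example, l4:spam4:eggse corresponds to ['spam', 'eggs'].
- Dictionaries are encoded as a d followed by a list of alternating keys and their corresponding values followed by an e. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).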
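The bencoding rules above are small enough to sketch in a few lines. The following is a minimal, illustrative encoder and decoder, not a reference implementation. The function names `bencode` and `bdecode` are hypothetical, and encoding Python text strings as UTF-8 is an assumption, since the rules above only speak of raw strings.

```python
def bencode(value):
    """Encode ints, strings, lists and dicts following the rules above."""
    if isinstance(value, bool):
        raise TypeError("booleans are not part of bencoding")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys must be strings and are written in raw byte order.
        pairs = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        pairs.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def bdecode(data):
    """Decode one complete bencoded value; strings come back as bytes."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode(data, pos):
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        digits = data[pos + 1:end]
        unsigned = digits.lstrip(b"-")
        # Reject i-0e and leading zeros such as i03e.
        if digits == b"-0" or (len(unsigned) > 1 and unsigned.startswith(b"0")):
            raise ValueError("invalid integer encoding")
        return int(digits), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dictionary keys must be strings")
            result[key], pos = _decode(data, pos)
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("unexpected byte at offset %d" % pos)
```

For instance, `bencode(["spam", "eggs"])` produces `b"l4:spam4:eggse"`, matching the list example above. The decoder returns byte strings rather than text because values such as piece hashes are binary.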
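Metainfo files are bencoded dictionaries with the following keys:

announce: The URL of the tracker.

info: This maps to a dictionary, with keys described below.

The name key maps to a string which is the suggested name to save the file (or directory) as. It is purely advisory.

piece length maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length, except for the last one, which is usually shorter. Piece length is almost always a power of two, most commonly 256 K (2^18).

pieces maps to a string whose length is a multiple of 20. It is subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.

There is also a key length or a key files, but not both or neither. If length is present then the download represents a single file, otherwise it represents a set of files which go in a directory structure.

In the single-file case, length maps to the length of the file in bytes.

The multi-file case is treated as a single file formed by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys:

length: The length of the file, in bytes.

path: A list of strings corresponding to subdirectory names, the last of which is the actual file name (a zero-length list is an error case).

In the single-file case, the name key is the name of a file; in the multi-file case, it is the name of a directory.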
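As an illustration of the metainfo keys above, here is a minimal sketch that builds a single-file info dictionary, derives the 20-byte SHA1 hash of its bencoded form (the value announced to the tracker as info_hash), and checks a downloaded piece against the pieces string. The helper names (`hash_pieces`, `build_info`, `info_hash`, `piece_is_valid`) are hypothetical, and the compact `bencode` here only handles the types this example needs.

```python
import hashlib


def bencode(value):
    """Compact bencoder for ints, bytes, lists and dicts with bytes keys."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(map(bencode, value)) + b"e"
    if isinstance(value, dict):
        return b"d" + b"".join(
            bencode(k) + bencode(value[k]) for k in sorted(value)) + b"e"
    raise TypeError(type(value).__name__)


def hash_pieces(data, piece_length):
    """Concatenate the 20-byte SHA1 digest of each fixed-size piece."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length))


def build_info(name, data, piece_length=2 ** 18):
    """Build a single-file info dictionary (it has length, not files)."""
    return {
        b"name": name,
        b"length": len(data),
        b"piece length": piece_length,
        b"pieces": hash_pieces(data, piece_length),
    }


def info_hash(info):
    """SHA1 of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).digest()


def piece_is_valid(info, index, piece):
    """Compare a downloaded piece with its 20-byte entry in pieces."""
    expected = info[b"pieces"][index * 20:(index + 1) * 20]
    return hashlib.sha1(piece).digest() == expected
```

A file of 2^18 + 100 bytes with the default piece length yields two pieces, so its pieces string is 40 bytes long and the final piece holds only the last 100 bytes.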
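Tracker queries are two way. The tracker receives information via HTTP GET parameters and returns a bencoded message. Note that although the tracker has to run on the server side, it could run very nicely as, for example, an Apache module.

Tracker GET requests have the following keys:

info_hash: The 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file. Note that this is a substring of the metainfo file. This value will almost certainly have to be escaped.

peer_id: A string of length 20 which each downloader generates at random at the start of a new download and uses as its ID. This value will also almost certainly have to be escaped.

ip: An optional parameter giving the IP (or DNS name) which this peer is at. Generally used for the origin if it is on the same machine as the tracker.

port: The port number this peer is listening on. Common behavior is for a downloader to try to listen on port 6881 and, if that port is taken, try the following ports one by one, giving up after 6889.

uploaded: The total amount uploaded so far, encoded in base ten ASCII.

downloaded: The total amount downloaded so far, encoded in base ten ASCII.

left: The number of bytes this peer still has to download, encoded in base ten ASCII. Note that this cannot be computed from downloaded and the file length, since the download might be a resume, and some of the downloaded data may have failed an integrity check and had to be downloaded again.

event: This is an optional key which maps to started, completed, or stopped (or empty, which is the same as not being present). If not present, this is one of the announcements done at regular intervals. An announcement using started is sent when a download first begins, and one using completed is sent when the download is complete. No completed is sent if the file was complete when started. Downloaders send an announcement using stopped when they cease downloading.

Tracker responses are bencoded dictionaries. If a tracker response has a key failure reason, then that maps to a human-readable string which explains why the query failed, and no other keys are required. Otherwise, it must have two keys: interval, which maps to the number of seconds the downloader should wait between regular rerequests, and peers. peers maps to a list of dictionaries corresponding to peers, each of which contains the keys peer id, ip, and port, which map to the peer's self-selected ID, IP address or DNS name as a string, and port number, respectively. Note that if an event happens or a downloader needs more peers,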
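it may rerequest at times other than the scheduled interval.

If you want to make any extensions to metainfo files or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are done compatibly.

BitTorrent's peer protocol operates over TCP. It performs efficiently without setting any socket options.

Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.

The peer protocol refers to pieces of the file by index as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and checks that the hash matches, it announces that it has that piece to all of its peers.

Connections contain two bits of state on either end: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later.

Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times: whenever a downloader has nothing it would currently ask a peer for if unchoked, it must express lack of interest, even while it is being choked. Implementing this properly is tricky, but makes it possible for downloaders to know which peers will start downloading immediately if unchoked.

Connections start out choked and not interested.

When data is being transferred, downloaders should keep several piece requests queued up at once in order to get good TCP performance (this is called pipelining). On the other side, requests which cannot be written out to the TCP buffer immediately should be queued up in memory rather than kept in an application-level network buffer, so they can all be thrown out when a choke happens.

The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal) followed by the string BitTorrent protocol. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be easy to tell apart.

All later integers sent in the protocol are encoded as four bytes big-endian.

After the fixed headers come eight reserved bytes, which are all zero in current implementations. If you wish to extend the protocol using these bytes, please coordinate with Bram Cohen to make sure all extensions are done compatibly.

Next comes the 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file. (This is the same value which is announced as info_hash to the tracker, only here it is raw instead of quoted.) If both sides do not send the same value, they sever the connection. The one possible exception is a downloader that wants to do multiple downloads over a single port: it may wait for an incoming connection to give a download hash first, and respond with the same one if it is in its list.

After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id does not match the one the initiating side expects, it severs the connection.

That is it for handshaking. Next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives, and ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be done much more quickly when data is expected.

All non-keepalive messages start with a single byte which gives their type. The possible values are:

0 - choke
1 - unchoke
2 - interested
3 - not interested
4 - have
5 - bitfield
6 - request
7 - piece
8 - cancel

choke, unchoke, interested, and not interested have no payload.

bitfield is only ever sent as the first message. Its payload is a bitfield with each index that the downloader has set to one and the rest set to zero. Downloaders which do not have anything yet may skip the bitfield message. The first byte of the bitfield corresponds to indices 0-7 from high bit to low bit, the next one to 8-15, and so on. Spare bits at the end are set to zero.

The have message's payload is a single number, the index which that downloader just completed and checked the hash of.

request messages contain an index, begin, and length. The last two are byte offsets. Length is generally a power of two unless it gets truncated by the end of the file. Current implementations generally use 2^15, and close connections which request an amount greater than 2^17.

cancel messages have the same payload as request messages. They are generally only sent towards the end of a download, during what is called endgame mode. When a download is almost complete, the last few pieces tend to all be downloaded from a single slow connection, which takes a very long time. To make sure the last few pieces come in quickly, once requests for all the pieces a downloader still lacks are pending, it sends requests for everything to everyone it is downloading from. To keep this from becoming inefficient, it sends cancels to everyone else every time a piece arrives.

piece messages contain an index, begin, and piece. Note that they are correlated with request messages implicitly. An unexpected piece may arrive if choke and unchoke messages are sent in quick succession, if transfer is going very slowly, or both.

Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers.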
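The message layout described above (a four-byte big-endian length prefix, a one-byte type, then the payload) can be sketched as follows. This is only an illustration under the stated rules. All function and constant names are hypothetical, and error handling such as rejecting oversized requests is left out.

```python
import struct

# Message type bytes, in the order listed above.
(CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED,
 HAVE, BITFIELD, REQUEST, PIECE, CANCEL) = range(9)


def frame(msg_type=None, payload=b""):
    """Length-prefix a message; with no type it is a zero-length keepalive."""
    body = b"" if msg_type is None else bytes([msg_type]) + payload
    return struct.pack(">I", len(body)) + body


def have(index):
    return frame(HAVE, struct.pack(">I", index))


def request(index, begin, length=2 ** 15):
    return frame(REQUEST, struct.pack(">III", index, begin, length))


def cancel(index, begin, length=2 ** 15):
    return frame(CANCEL, struct.pack(">III", index, begin, length))


def piece(index, begin, block):
    return frame(PIECE, struct.pack(">II", index, begin) + block)


def bitfield(have_indices, num_pieces):
    """Set one bit per owned piece, high bit first; spare bits stay zero."""
    bits = bytearray((num_pieces + 7) // 8)
    for i in have_indices:
        bits[i // 8] |= 0x80 >> (i % 8)
    return frame(BITFIELD, bytes(bits))


def parse_bitfield(payload, num_pieces):
    """Return the piece indices whose bit is set."""
    return [i for i in range(num_pieces)
            if payload[i // 8] & (0x80 >> (i % 8))]


def split_messages(buffer):
    """Split received bytes into (type, payload) tuples.

    Keepalives are skipped. Bytes of an incomplete trailing message are
    returned so the caller can prepend them to the next read.
    """
    messages, pos = [], 0
    while len(buffer) - pos >= 4:
        (length,) = struct.unpack_from(">I", buffer, pos)
        if len(buffer) - pos - 4 < length:
            break
        body = buffer[pos + 4:pos + 4 + length]
        pos += 4 + length
        if body:
            messages.append((body[0], body[1:]))
    return messages, buffer[pos:]
```

With ten pieces, a peer holding pieces 0 and 9 would send the two payload bytes 0x80 0x40: piece 0 is the high bit of the first byte and piece 9 is the second-highest bit of the second byte.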
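Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Also, choking lets each peer use a tit-for-tat-style algorithm to ensure that it gets a consistent download rate.

The choking algorithm described below is the currently deployed one. It is very important that all new algorithms work well both in a network consisting entirely of themselves and in a network consisting mostly of this one.

There are several criteria a good choking algorithm should meet. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking quickly, known as fibrillation. It should reciprocate to peers who let it download. Finally, it should try out unused connections once in a while to find out if they might be better than the currently used ones, known as optimistic unchoking.

The currently deployed choking algorithm avoids fibrillation by only changing who is choked once every ten seconds. It does reciprocation and caps the number of uploads by unchoking the four peers which it has the best download rates from and which are interested. Peers which have a better upload rate but are not interested get unchoked, and if they become interested the worst uploader gets choked. If a downloader has a complete file, it uses its upload rate rather than its download rate to decide who to unchoke.

For optimistic unchoking, at any one time there is a single peer which is unchoked regardless of its upload rate (if interested, it counts as one of the four allowed downloaders). Which peer is optimistically unchoked rotates every 30 seconds. To give new connections a decent chance of getting a complete piece to upload, they are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation.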
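One way to read the choking rules above is sketched below. It is a simplified illustration, not the deployed implementation. The `Peer` class, its fields, and both function names are hypothetical, and modelling "three times as likely" as a weighted random choice is an assumption about how the rotation could be approximated.

```python
import random
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    rate: float            # download rate from the peer (upload rate when seeding)
    interested: bool       # whether the peer wants data from us
    is_new: bool = False   # recently connected


def choose_unchoked(peers, optimistic=None, slots=4):
    """Pick the peers to unchoke for the next ten-second round.

    Peers are visited from fastest to slowest until `slots` interested
    peers are unchoked. Faster peers that are not interested are unchoked
    along the way, so they can start at once if they become interested.
    """
    unchoked = set()
    allowed = 0  # interested peers unchoked so far
    if optimistic is not None:
        unchoked.add(optimistic.name)
        if optimistic.interested:
            allowed += 1  # counts as one of the allowed downloaders
    for peer in sorted(peers, key=lambda p: p.rate, reverse=True):
        if allowed >= slots:
            break
        if peer.name in unchoked:
            continue
        unchoked.add(peer.name)
        if peer.interested:
            allowed += 1
    return unchoked


def pick_optimistic(peers, rng=random):
    """Choose the optimistic unchoke; new connections get triple weight."""
    weights = [3 if p.is_new else 1 for p in peers]
    return rng.choices(peers, weights=weights, k=1)[0]
```

In use, `choose_unchoked` would be called every ten seconds and `pick_optimistic` every thirty, with the rates taken from whatever measurement the client keeps.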
BT Communication Protocol

BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when multiple downloads of the same file happen concurrently, the downloaders upload to each other, making it possible for the file source to support very large numbers of downloaders with only a modest increase in its load.

A BitTorrent file distribution consists of these entities:

- An ordinary web server
- A static metainfo file
- A BitTorrent tracker
- An original downloader
- The end user web browsers
- The end user downloaders

There are ideally many end users for a single file.

To start serving, a host goes through the following steps:

1. Start running a tracker (or, more likely, have one running already).
2. Start running an ordinary web server, such as Apache, or have one already.
3. Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already).
4. Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.
5. Put the metainfo file on the web server.
6. Link to the metainfo (.torrent) file from some other web page.
7. Start a downloader which already has the complete file (the origin).

To start downloading, a user does the following:

1. Install BitTorrent (or have done so already).
2. Surf the web.
3. Click on a link to a .torrent file.
4. Select where to save the file locally, or select a partial download to resume.
5. Wait for download to complete.
6. Tell downloader to exit (it keeps uploading until this happens).

The connectivity is as follows:

- The web site is serving up static files as normal, but kicking off the BitTorrent helper app on the clients.
- The tracker is receiving information from all downloaders and giving them random lists of peers. This is done over HTTP or HTTPS.
- Downloaders are periodically checking in with the tracker to keep it informed of their progress, and are uploading to and downloading from each other via direct connections. These connections use the BitTorrent peer protocol, which operates over TCP.
- The origin is uploading but not downloading at all, since it has the entire file. The origin is necessary to get the entire file into the network. Often for popular downloads the origin can be taken down after a while since several downloads may have completed and been left running indefinitely.

Metainfo file and tracker responses are both sent in a simple, efficient, and extensible format called bencoding (pronounced bee encoding). Bencoded messages are nested dictionaries and lists (as in Python), which can contain strings and integers. Extensibility is supported by ignoring unexpected dictionary keys, so additional optional ones can be added later.

Bencoding is done as follows:

- Strings are length-prefixed base ten followed by a colon and the string. For example 4:spam corresponds to 'spam'.
- Integers are represented by an i followed by the number in base 10 followed by an e. For example i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limitation. i-0e is invalid. All encodings with a leading zero, such as i03e, are invalid, other than i0e, which of course corresponds to 0.
- Lists are encoded as an l followed by their elements (also bencoded) followed by an e. For example l4:spam4:eggse corresponds to ['spam', 'eggs'].
- Dictionaries are encoded as a d followed by a list of alternating keys and their corresponding values followed by an e. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).

Metainfo files are bencoded dictionaries with the following keys:

announce: The URL of the tracker.
info: This maps to a dictionary, with keys described below.

The name key maps to a string which is the suggested name to save the file (or directory) as. It is purely advisory.

piece length maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length except for possibly the last one which may be truncated. Piece length is almost always a power of two, most commonly 2^18 = 256 K (BitTorrent prior to version 3.2 uses 2^20 = 1 M as default).

pieces maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.

There is also a key length or a key files, but not both or neither. If length is present then the download represents a single file, otherwise it represents a set of files which go in a directory structure.

In the single file case, length maps to the length of the file in bytes.

For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys:

length: The length of the file, in bytes.

path: A list of strings corresponding to subdirectory names, the last of which is the actual file name (a zero length list is an error case).

In the single file case, the name key is the name of a file; in the multiple file case, it's the name of a directory.

Tracker queries are two way. The tracker receives information via HTTP GET parameters and returns a bencoded message. Note that although the current tracker implementation has its own web server, the tracker could run very nicely as, for example, an Apache module.

Tracker GET requests have the following keys:

info_hash: The 20 byte SHA1 hash of the bencoded form of the info value from the metainfo file. Note that this is a substring of the metainfo file. This value will almost certainly have to be escaped.

peer_id: A string of length 20 which this downloader uses as its id. Each downloader generates its own id at random at the start of a new download. This value will also almost certainly have to be escaped.

ip: An optional parameter giving the IP (or DNS name) which this peer is at. Generally used for the origin if it's on the same machine as the tracker.

port: The port number this peer is listening on. Common behavior is for a downloader to try to listen on port 6881 and if that port is taken try 6882, then 6883, etc. and give up after 6889.

uploaded: The total amount uploaded so far, encoded in base ten ASCII.

downloaded: The total amount downloaded so far, encoded in base ten ASCII.

left: The number of bytes this peer still has to download, encoded in base ten ASCII. Note that this can't be computed from downloaded and the file length since it might be a resume, and there's a chance that some of the downloaded data failed an integrity check and had to be re-downloaded.

event: This is an optional key which maps to started, completed, or stopped (or empty, which is the same as not being present). If not present, this is one of the announcements done at regular intervals. An announcement using started is sent when a download first begins, and one using completed is sent when the download is complete. No completed is sent if the file was complete when started. Downloaders send an announcement using stopped when they cease downloading.

Tracker responses are bencoded dictionaries. If a tracker response has a key failure reason, then that maps to a human readable string which explains why the query failed, and no other keys are required. Otherwise, it must have two keys: interval, which maps to the number of seconds the downloader should wait between regular rerequests, and peers. peers maps to a list of dictionaries corresponding to peers, each of which contains the keys peer id, ip, and port, which map to the peer's self-selected ID, IP address or DNS name as a string, and port number, respectively. Note that downloaders may rerequest on nonscheduled times if an event happens or they need more peers.
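To make the tracker exchange concrete, here is a small sketch that builds an announce URL from the GET keys listed above and interprets an already decoded response. The function names, the example tracker URL in the usage note, and the choice to raise `RuntimeError` on failure are all assumptions made for illustration. Decoding the bencoded response body is assumed to happen elsewhere and to yield a dictionary with byte-string keys.

```python
from urllib.parse import urlencode


def announce_url(tracker, info_hash, peer_id, port,
                 uploaded, downloaded, left, event=None, ip=None):
    """Build the GET URL for one announce.

    info_hash and peer_id are raw 20-byte strings. urlencode escapes
    them, which is why the text says they almost certainly need escaping.
    """
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes long")
    params = [
        ("info_hash", info_hash),
        ("peer_id", peer_id),
        ("port", port),
        ("uploaded", uploaded),      # base ten ASCII
        ("downloaded", downloaded),  # base ten ASCII
        ("left", left),              # base ten ASCII
    ]
    if ip:
        params.append(("ip", ip))
    if event:  # omitted for the regular periodic announcements
        if event not in ("started", "completed", "stopped"):
            raise ValueError("unknown event: " + event)
        params.append(("event", event))
    return tracker + "?" + urlencode(params)


def read_response(response):
    """Interpret a decoded tracker response (dict with bytes keys).

    Returns (interval_seconds, [(ip, port, peer_id), ...]).
    """
    if b"failure reason" in response:
        reason = response[b"failure reason"].decode("utf-8", "replace")
        raise RuntimeError("tracker query failed: " + reason)
    peers = [(p[b"ip"].decode("ascii"), p[b"port"], p[b"peer id"])
             for p in response[b"peers"]]
    return response[b"interval"], peers
```

For example, `announce_url("http://tracker.example/announce", info_hash, peer_id, 6881, 0, 0, total_length, event="started")` would produce the first announce of a fresh download, where `tracker.example` is a placeholder host.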

If you want to make any extensions to metainfo files or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are done compatibly.

BitTorrent's peer protocol operates over TCP. It performs efficiently without setting any socket options.

Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.

The peer protocol refers to pieces of the file by index as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and checks that the hash matches, it announces that it has that piece to all of its peers.

Connections contain two bits of state on either end: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later in this document.

Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times: whenever a downloader doesn't have something they currently would ask a peer for if unchoked, they must express lack of interest, despite being choked. Implementing this properly is tricky, but makes it possible for downloaders to know which peers will start downloading immediately if unchoked.

Connections start out choked and not interested.

When data is being transferred, downloaders should keep several piece requests queued up at once in order to get good TCP performance (this is called pipelining). On the other side, requests which can't be written out to the TCP buffer immediately should be queued up in memory rather than kept in an application-level network buffer, so they can all be thrown out when a choke happens.

The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal) followed by the string BitTorrent protocol. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other.

All later integers sent in the protocol are encoded as four bytes big-endian.

After the fixed headers come eight reserved bytes, which are all zero in all current implementations. If you wish to extend the protocol using these bytes, please coordinate with Bram Cohen to make sure all extensions are done compatibly.

Next comes the 20 byte SHA1 hash of the bencoded form of the info value from the metainfo file. (This is the same value which is announced as info_hash to the tracker, only here it's raw instead of quoted.) If both sides don't send the same value, they sever the connection. The one possible exception is if a downloader wants to do multiple downloads over a single port, they may wait for incoming connections to give a download hash first, and respond with the same one if it's in their list.

After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id doesn't match the one the initiating side expects, it severs the connection.

That's it for handshaking, next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives, and ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be done much more quickly when data is expected.
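The handshake layout described above adds up to 68 bytes: one length byte, the 19-byte protocol string, eight reserved bytes, the 20-byte info hash, and the 20-byte peer id. A minimal sketch of building and checking it follows. The function names and the use of `ValueError` and `ConnectionError` to signal a severed connection are assumptions for illustration only.

```python
PSTR = b"BitTorrent protocol"  # 19 bytes, so the leading length byte is 19


def build_handshake(info_hash, peer_id, reserved=bytes(8)):
    """Assemble the 68-byte handshake."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash/peer_id must be 20 bytes, reserved 8")
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data, expected_info_hash, expected_peer_id=None):
    """Validate a received handshake and return (reserved, peer_id).

    Raises ConnectionError where the text says the connection is severed.
    """
    if len(data) < 68:
        raise ValueError("incomplete handshake")
    if data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ConnectionError("not a BitTorrent handshake")
    reserved, info_hash, peer_id = data[20:28], data[28:48], data[48:68]
    if info_hash != expected_info_hash:
        raise ConnectionError("info hash mismatch")
    if expected_peer_id is not None and peer_id != expected_peer_id:
        raise ConnectionError("peer id mismatch")
    return reserved, peer_id
```

The initiating side would pass the peer id it got from the tracker as `expected_peer_id`. The accepting side usually has no expectation and leaves it as `None`.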