
Chunked encoding problem

2024-08-15 12:55:01

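The data my PHP script fetches arrives with chunked transfer encoding and gzip compression.
As far as I can tell, chunked encoding works like this: the data is sent in blocks, and each block has a header and a body. The header gives the length of the body in hexadecimal, the header and body are separated by CRLF, the body is itself followed by CRLF, and a final block whose size line is a single 0 marks the end of the chunks.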
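
The framing described above can be sketched as a small decoder that works on a fully buffered body. This is only a minimal sketch, not code from the original post: the function name `decode_chunked_string` is hypothetical, and it assumes the whole body is already in a string. It follows the declared sizes rather than searching for CRLF inside the data, since compressed bytes may themselves contain CRLF.

```php
<?php
// Hypothetical helper (not from the original post): decode a fully buffered
// chunked body by following the declared chunk sizes.
function decode_chunked_string(string $raw): string
{
    $body = '';
    $pos  = 0;
    $len  = strlen($raw);

    while ($pos < $len) {
        // The size line ends at the next CRLF.
        $eol = strpos($raw, "\r\n", $pos);
        if ($eol === false) {
            throw new RuntimeException('Missing CRLF after chunk size');
        }

        // Drop optional chunk extensions (";name=value") before parsing the hex size.
        $sizeLine = substr($raw, $pos, $eol - $pos);
        $sizeHex  = trim(explode(';', $sizeLine, 2)[0]);
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            throw new RuntimeException('Invalid chunk size line');
        }
        $size = hexdec($sizeHex);
        $pos  = $eol + 2;

        if ($size === 0) {
            break; // last chunk; any trailer headers are ignored here
        }
        if ($pos + $size > $len) {
            throw new RuntimeException('Truncated chunk data');
        }

        $body .= substr($raw, $pos, $size);
        $pos  += $size + 2; // skip the data and the CRLF that follows it
    }

    return $body;
}

// Two chunks ("Wiki", "pedia") followed by the terminating zero-size chunk.
$encoded = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n";
echo decode_chunked_string($encoded), "\n"; // Wikipedia
```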

Response headers:

Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Server: Dict/34002
    [2] => Date: Wed, 17 Dec 2014 06:49:22 GMT
    [3] => Content-Type: text/html; charset=utf-8
    [4] => Transfer-Encoding: chunked
    [5] => Connection: keep-alive
    [6] => Keep-Alive: timeout=60
    [7] => Cache-Control: private
    [8] => Last-Modified: Wed, 17 Dec 2014 04:57:49 GMT
    [9] => Expires: Wed, 17 Dec 2014 06:49:22 GMT
    [10] => Set-Cookie: uvid=VJEncoTSVYJC; expires=Thu, 31-Dec-37 23:55:55 GMT; domain=.dict.cn; path=/
    [11] => Content-Encoding: gzip
)


if($this->response_num==200)
{
    if($this->is_chunked)
    {
        // Read the chunk header to get the length of the chunk body
        $chunk_size = (int)hexdec(fgets($this->conn));

        while(!feof($this->conn) && $chunk_size > 0)
        {
            // Read the number of bytes given by the chunk header
            $this->response_body .= fread($this->conn, $chunk_size);
            fseek($this->conn, 2, SEEK_CUR);
            $chunk_size = (int)hexdec(fgets($this->conn, 4096));
        }
    }
    else
    {
        $len = 0;
        // Read the body of the response
        while($items = fread($this->conn, $this->response_body_length))
        {
            $len = $len + strlen($items);
            $this->response_body = $items;

            // Break out once the whole body has been read; otherwise the read seems to block!
            if($len >= $this->response_body_length)
            {
                break;
            }
        }
    }

    if($this->is_gzip)
    {
        $this->response_body = gzinflate(substr($this->response_body, 10));
    }

    $this->getTrans($this->response_body);
}


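I get this warning almost every time:
Warning: gzinflate(): data error in E:\CodeEdit\php\http\dict.php on line 384
It occasionally parses correctly, so the chunked decoding is probably at fault. I have read through some references and tried several decoding approaches, but it still falls short.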


Replies (solutions)

You can use gzdecode to decode it.
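
As a rough illustration of this suggestion, the sketch below round-trips a string through `gzencode()` and `gzdecode()`. The payload is made up for the example, since the real response body is not available here. `gzdecode()` parses the gzip header and trailer itself, so no manual `substr(..., 10)` is needed, and it returns `false` when the input is incomplete.

```php
<?php
// Illustrative payload only; the real response body is not available here.
$original   = 'int. hello';
$compressed = gzencode($original); // gzip stream: header + deflate data + trailer

// gzdecode() handles the gzip header and trailer on its own.
$decoded = gzdecode($compressed);
var_dump($decoded === $original); // bool(true)

// Incomplete input (for example a body whose chunks were not fully read)
// makes gzdecode() return false and raise a warning, suppressed here with @.
$broken = @gzdecode(substr($compressed, 0, 12));
var_dump($broken === false); // bool(true)
```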




The odd thing is that sometimes I do get a result, for example:
int.(打招呼)喂;你好
and sometimes I get an error, for example:
Warning: gzinflate(): data error in E:\CodeEdit\php\http\dict.php on line 380

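I still think the error is in the chunked decoding. The point is that the returned data is first gzip-compressed and then sent in chunks, so decoding has to go in the reverse order: remove the chunk framing first, then decompress.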
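
The reverse order described above can be sketched on a stream as follows. This is a sketch under assumptions, not the original class: the helper names `read_exact`, `read_chunked_stream`, and `decode_response_body` are hypothetical, and an in-memory stream stands in for the socket. It loops until each chunk is completely read, because a single `fread()` on a socket may return fewer bytes than requested, and it consumes the CRLF after each chunk by reading it rather than seeking.

```php
<?php
// Hypothetical helper: read exactly $n bytes, looping because one fread()
// call on a socket may return less than was asked for.
function read_exact($stream, int $n): string
{
    $buf = '';
    while (strlen($buf) < $n && !feof($stream)) {
        $part = fread($stream, $n - strlen($buf));
        if ($part === false || $part === '') {
            break;
        }
        $buf .= $part;
    }
    if (strlen($buf) !== $n) {
        throw new RuntimeException('Unexpected end of stream');
    }
    return $buf;
}

// Step 1: remove the chunk framing.
function read_chunked_stream($stream): string
{
    $body = '';
    while (($line = fgets($stream)) !== false) {
        $sizeHex = trim(explode(';', $line, 2)[0]); // ignore chunk extensions
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            throw new RuntimeException('Invalid chunk size line');
        }
        $size = hexdec($sizeHex);
        if ($size === 0) {
            break; // terminating chunk
        }
        $body .= read_exact($stream, $size);
        read_exact($stream, 2); // CRLF that follows the chunk data
    }
    return $body;
}

// Step 2: decompress the reassembled gzip stream.
function decode_response_body($stream): string
{
    $plain = gzdecode(read_chunked_stream($stream));
    if ($plain === false) {
        throw new RuntimeException('gzip decoding failed');
    }
    return $plain;
}

// Demo: gzip first, then chunk (7 bytes per chunk), as the server would.
$wire = '';
foreach (str_split(gzencode('hello chunked gzip'), 7) as $piece) {
    $wire .= dechex(strlen($piece)) . "\r\n" . $piece . "\r\n";
}
$wire .= "0\r\n\r\n";

$stream = fopen('php://memory', 'r+');
fwrite($stream, $wire);
rewind($stream);
echo decode_response_body($stream), "\n"; // hello chunked gzip
```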




if($this->is_chunked)
{
    /*
    // Read the chunk header to get the length of the chunk body
    $chunk_size = (int)hexdec(trim(fgets($this->conn)));

    while(!feof($this->conn) && $chunk_size > 0)
    {
        // Read the number of bytes given by the chunk header
        $this->response_body .= fread($this->conn, $chunk_size);
        fseek($this->conn, 2, SEEK_CUR);
        $next_line = trim(fgets($this->conn));
        if($next_line === '0')
        {
            echo $next_line; exit();
        }
        else
        {
            $chunk_size = (int)hexdec($next_line);
        }
    }
    */

    while(!feof($this->conn))
    {
        $this->response_body .= fread($this->conn, 1024);
    }

    if(preg_match_all("#\r\n#i", $this->response_body, $match))
    {
        $result = preg_split("#\r\n#i", $this->response_body, -1, PREG_SPLIT_NO_EMPTY);

        // echo "\n";
        // print_r($result);
        /* foreach($result as $v) { echo $v."\n\n"; } echo "\n"; */
        /* echo hexdec($result[0])."\n"; echo mb_strlen($result[1])+mb_strlen($result[2])."\n"; */

        $len = count($result);
        $this->response_body = '';
        for($i=1; $i<$len-1; $i++)
        {
            $this->response_body .= $result[$i];
        }
        // echo strlen($this->response_body); exit();
    }
    else
    {
        die("Failed to match the terminator");
    }
}

The basic idea: first change the request header to Connection: close, so that while(!feof($this->conn)) can read all of the data in one go.

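Next, because the header and body of a chunk are separated by CRLF, I simply split on it with a regular expression. That yields an array of lengths and data. The first item is the total length of all the data (rather than the length of each individual chunk, which seems to deviate a little from the chunked encoding rules and may explain why decoding strictly by those rules failed), and the last item is 0, marking the end. After repeated testing, it works.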
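
The first step above, sending Connection: close and reading until the server closes the connection, might look roughly like the sketch below. The helper names `build_close_request`, `read_until_eof`, and `split_response` are hypothetical, `example.com` is a placeholder host, and an in-memory stream stands in for the socket so that the example runs without a network. Note that a server may still use chunked encoding with Connection: close, so the body returned here can still need its chunk framing removed.

```php
<?php
// Hypothetical helper: build a GET request that asks the server to close
// the connection once the response has been sent.
function build_close_request(string $host, string $path = '/'): string
{
    return "GET {$path} HTTP/1.1\r\n"
         . "Host: {$host}\r\n"
         . "Accept-Encoding: gzip\r\n"
         . "Connection: close\r\n"
         . "\r\n";
}

// Read everything until the peer closes the connection.
function read_until_eof($stream): string
{
    $data = '';
    while (!feof($stream)) {
        $part = fread($stream, 1024);
        if ($part === false || $part === '') {
            break;
        }
        $data .= $part;
    }
    return $data;
}

// Split a raw response into its header block and its (still encoded) body.
function split_response(string $raw): array
{
    $parts = explode("\r\n\r\n", $raw, 2);
    return [$parts[0], $parts[1] ?? ''];
}

// Demo with an in-memory stream in place of a socket opened with fsockopen().
$stream = fopen('php://memory', 'r+');
fwrite($stream, "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n5\r\nhello\r\n0\r\n\r\n");
rewind($stream);

[$headers, $body] = split_response(read_until_eof($stream));
echo build_close_request('example.com', '/');
echo strlen($body), " body bytes still chunk-encoded\n";
```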