100 points offered: help with a file download problem!

qiujian5628 2006-10-11 01:43:04
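I want to download a file. If I start from the site's entry page and click through step by step, the download works; but downloading the file directly from its real path fails with a "file does not exist" error.
"Save" on the site is a button. Hovering the mouse over it shows this in the status bar:
javascript:saveasxml("123456","123456");
How can I write a download program that can fetch files from this site?
Offering high points for a solution!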
21 replies
qiujian5628 2006-11-01
Bumping again!
qiujian5628 2006-10-23
load("record49a36bde.xml");
load("text_view.xml");
Which two files are these?
蒋晟, .NET master, could you give me a bit more guidance?
蒋晟 2006-10-20
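#include <windows.h>
#include <comutil.h>   // _variant_t
#include "msxml2.h"

// ...
// Assume that COM is already initialized with CoInitialize(Ex)
// Error checking and handling elided for clarity

// load XML source document
// (MSXML 4.0 exposes the DOMDocument40 coclass through IXMLDOMDocument2)
IXMLDOMDocument2 * pSource = NULL;
VARIANT_BOOL loaded = VARIANT_FALSE;
::CoCreateInstance(CLSID_DOMDocument40, NULL, CLSCTX_INPROC_SERVER,
    IID_IXMLDOMDocument2, (void**)&pSource);
pSource->put_async(VARIANT_FALSE);
// load() takes a VARIANT source and reports success through an out parameter
pSource->load(_variant_t(L"record49a36bde.xml"), &loaded);

// load XSLT stylesheet document
IXMLDOMDocument2 * pStylesheet = NULL;
::CoCreateInstance(CLSID_DOMDocument40, NULL, CLSCTX_INPROC_SERVER,
    IID_IXMLDOMDocument2, (void**)&pStylesheet);
pStylesheet->put_async(VARIANT_FALSE);
pStylesheet->load(_variant_t(L"text_view.xml"), &loaded);

// perform transformation
BSTR result = NULL;
pSource->transformNode(pStylesheet, &result);
// a BSTR is a wide string, so use the wide-character MessageBox
::MessageBoxW(NULL, result, L"Transform Result", MB_OK);

// release the result string and the COM objects
::SysFreeString(result);
pStylesheet->Release();
pSource->Release();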

qiujian5628 2006-10-20
m_sUrl = "patent_eng/XML/1019980023646/1019980023646.TAR";
pConnection = session.GetHttpConnection("patent2.kipris.or.kr");
pFile = pConnection->OpenRequest(CHttpConnection::HTTP_VERB_GET,
m_sUrl,
"http://patent2.kipris.or.kr/patent_eng/KP/KPXV1010.jsp?APPLNO=1019980023646&PUBREG=P",
1,
NULL,
"HTTP/1.1",
INTERNET_FLAG_RELOAD|INTERNET_FLAG_DONT_CACHE);

result = pFile->SendRequest();
pFile->QueryInfoStatusCode(dwRet);

The dwRet value I get at the end is always 404 (Not Found).
How should I change the call? Please help, everyone. Many thanks!
Failed capture:
GET /patent_eng/XML/1019980023646/1019980023646.TAR HTTP/1.1
Referer: http://patent2.kipris.or.kr/patent_eng/KP/KPXV1010.jsp?APPLNO=1019980023646&PUBREG=P
User-Agent: KoreaDownload
Host: patent2.kipris.or.kr
Cache-Control: no-cache

HTTP/1.1 404 Not Found
Date: Fri, 20 Oct 2006 06:37:53 GMT
Server: Oracle-Application-Server-10g/9.0.4.1.0 Oracle-HTTP-Server
Content-Length: 170
Cache-Control: private
Connection: close
Content-Type: text/html

Successful capture:

GET /patent_eng/XML/1019980023646/1019980023646.TAR HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Referer: http://patent2.kipris.or.kr/patent_eng/KP/KPXV1010.jsp?APPLNO=1019980023646&PUBREG=P
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Host: patent2.kipris.or.kr
Connection: Keep-Alive
Cookie: JSESSIONID=cbf2a908ce8d9cfb3543ac242278e7b99b8a2a36d69.oQzKqAzxmh8IoQzKqAzN-AXM-AHMcxaNa3eUePWMa3aIaxeM-x4QcgSS-xqRbN8QnhqS-AnzmNbAb3eLaNiI-huKa30xok5Nml1K-AHDq79DmQ4M-AHDq79DqMTJqwTFqwaTc3uKax8NaNexf2beejf5hzftfiT78QfznA5Pp7ftolbGmkTy; KP_CONFIG=1111211157517311151511115111100000000000; USER_PROF=3711313133133100000073333001110000000000730000000133130000007030003000000000000027331113333331200000200703030330230000003533332013032300000023B421A403A411A521A4000000000004000000000000000000000000000; c_user_id=

HTTP/1.1 200 OK
Date: Wed, 18 Oct 2006 07:02:31 GMT
Server: Oracle-Application-Server-10g/9.0.4.1.0 Oracle-HTTP-Server
Last-Modified: Wed, 18 Oct 2006 07:02:30 GMT
Accept-Ranges: bytes
Content-Length: 112640
Cache-Control: private
Connection: close
Content-Type: application/x-tar
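
One way to narrow down the 404 is to list which request headers the browser sent that the program did not. The sketch below is only an illustration: headerNames and missingHeaders are hypothetical helper names, and header names are compared case-sensitively for simplicity. Applied to the two captures above it would report Accept, Accept-Encoding, Accept-Language, Connection and Cookie, which makes the missing session cookie one candidate worth testing. Note also that in the successful capture Last-Modified is one second before Date, so the TAR may be generated on demand by an earlier request; in that case matching the headers alone might not be enough.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect the header names (text before the first ':') of a captured request.
// The request line is skipped; parsing stops at the first blank line.
static std::set<std::string> headerNames(const std::string& captured) {
    std::set<std::string> names;
    std::istringstream in(captured);
    std::string line;
    std::getline(in, line);  // discard the request line ("GET ... HTTP/1.1")
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) break;                                    // end of headers
        std::string::size_type colon = line.find(':');
        if (colon != std::string::npos) names.insert(line.substr(0, colon));
    }
    return names;
}

// Names of headers present in the working request but absent from the
// failing one, in alphabetical order.
std::vector<std::string> missingHeaders(const std::string& failing,
                                        const std::string& working) {
    std::set<std::string> have = headerNames(failing);
    std::vector<std::string> missing;
    for (const std::string& name : headerNames(working))
        if (!have.count(name)) missing.push_back(name);
    return missing;
}
```

Each reported header can then be added to the program's request one at a time to see which of them the server actually checks.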

qiujian5628 2006-10-19
Whoever solves this for me, I'll open a separate thread and hand out 100 points. Thanks.
椅子 2006-10-19
You need to start from this JS function: saveasxml

qiujian5628 2006-10-19
http://patent2.kipris.or.kr/patent/XMLLIB/UNEXPAT.XSL
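How do I transform the XML into HTML?
I'm not very familiar with web development, and I'd like to be able to batch-download some material from this Korean site as soon as possible.
I'd be very grateful if the posters above could give me a few more pointers.
I've had almost no exposure to the web side of things, so my head is spinning, haha.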
蒋晟 2006-10-18
Get http://patent2.kipris.or.kr/patent/XMLLIB/UNEXPAT.XSL and transform the XML into HTML. Parse the resulting HTML to find out how to send the next request.
sjjf 2006-10-18
mark
蒋晟 2006-10-18
Call IHTMLElement::click on the Save button's element.
Handle BeforeNavigate2 and get the URL of the file.
Download the file.
qiujian5628 2006-10-18
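For example, the downloaded file is 1019980023646.tar. When I click the Save button,
CommView captures only one packet containing the string 1019980023646.tar, shown below.
How should I construct the HTTP headers to fetch the file?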

GET /patent_eng/XML/1019980023646/1019980023646.TAR HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Referer: http://patent2.kipris.or.kr/patent_eng/KP/KPXV1010.jsp?APPLNO=1019980023646&PUBREG=P
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Host: patent2.kipris.or.kr
Connection: Keep-Alive
Cookie: JSESSIONID=cbf2a908ce8d9cfb3543ac242278e7b99b8a2a36d69.oQzKqAzxmh8IoQzKqAzN-AXM-AHMcxaNa3eUePWMa3aIaxeM-x4QcgSS-xqRbN8QnhqS-AnzmNbAb3eLaNiI-huKa30xok5Nml1K-AHDq79DmQ4M-AHDq79DqMTJqwTFqwaTc3uKax8NaNexf2beejf5hzftfiT78QfznA5Pp7ftolbGmkTy; KP_CONFIG=1111211157517311151511115111100000000000; USER_PROF=3711313133133100000073333001110000000000730000000133130000007030003000000000000027331113333331200000200703030330230000003533332013032300000023B421A403A411A521A4000000000004000000000000000000000000000; c_user_id=

HTTP/1.1 200 OK
Date: Wed, 18 Oct 2006 07:02:31 GMT
Server: Oracle-Application-Server-10g/9.0.4.1.0 Oracle-HTTP-Server
Last-Modified: Wed, 18 Oct 2006 07:02:30 GMT
Accept-Ranges: bytes
Content-Length: 112640
Cache-Control: private
Connection: close
Content-Type: application/x-tar

1019980023646.XML...................................................................................100644 .000455 .000454 .00000050721 10515350566 013353. 0....................................................................................................ustar.00webadmin........................dba.............................000000 .000000 ........................................................................................................................................................................<?xml version="1.0" encoding ="euc-kr" ?>
<?xml:stylesheet type ="text/xsl" href="http://patent2.kipris.or.kr/patent/XMLLIB/UNEXPAT.XSL" ?>
<ALLUNEXPAT>
<SDOBI><B190>대한민국특허청(KR)</B190><B121>공개특허공보(A)</B121><B510 VER="6"><B511>B62D 21/00</B511></B510><B110>특2000-0002743</B110><B430>2000년01월15일</B430><B200><B210>10-1998-
..........
mynamelj 2006-10-17
If it is plain JavaScript, you can work it out by reading the script, but if it contains code generated dynamically by ASP, that will be hard.
尘雨 2006-10-17
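Checking the Referer in the HTTP headers (the URL the request came from) is commonly used for hotlink protection. You can use CommView to capture the HTTP headers sent when you click the download button, then use the WinINet API to set up those same headers, submit the request for the file, and you should get it.

In short: capture the packet, construct the packet, submit.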
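
The "construct" step can be sketched roughly as follows: take the text of a captured request and turn it into a CRLF-separated header block that a program can replay. This is a minimal sketch, not a definitive implementation. buildReplayHeaders and toLower are hypothetical helper names, and skipping Host and Connection rests on the assumption that the HTTP library fills those in itself.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Lower-case copy, used for case-insensitive header-name comparison.
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Turn a captured request (request line followed by header lines) into a
// CRLF-separated header block. Headers assumed to be generated by the HTTP
// library itself (Host, Connection) are left out.
std::string buildReplayHeaders(const std::string& captured) {
    static const std::set<std::string> skip = {"host", "connection"};
    std::istringstream in(captured);
    std::string line, out;
    bool first = true;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF input
        if (first) { first = false; continue; }   // skip the request line
        if (line.empty()) break;                  // a blank line ends the headers
        std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue; // not a header line
        if (skip.count(toLower(line.substr(0, colon)))) continue;
        out += line + "\r\n";
    }
    return out;
}
```

With MFC, the resulting string could be passed to CHttpFile::AddRequestHeaders before calling SendRequest. A Cookie copied from an old capture carries a session ID that has probably expired, so the program would likely need to request the entry page first and reuse the cookie it receives.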
qiujian5628 2006-10-17
Could the two experts above go into a bit more detail? I looked at the page source but still have no idea where to start. Please advise!
qiujian5628 2006-10-17
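I don't know much about network programming. Do you mean I should find the previous link, save the page with "Save As", view its source, and then analyze how the values are assigned?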
mynamelj 2006-10-16
javascript:saveasxml("123456","123456");
Which site is this from? Just download its script and take a look.

This download method is really there to protect the file URL from hotlinking.
wangk 2006-10-16
Use "View Source", then analyze how javascript:saveasxml is implemented.
qiujian5628 2006-10-16
Bumping again!
qiujian5628 2006-10-12
Bump. This is worth 100 points! Please help me out and keep this thread bumped until it's solved. Thanks!
qiujian5628 2006-10-11
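The poster above didn't read carefully: I already said that downloading directly like this does not work.
Even when I give URLDownloadToFile() the real URL, it cannot download the file. It only works by going through the site and clicking the SAVE button!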