N wrong ways to get a web page's source file from a WebBrowser control (experts, please come in)

玄之丞 2006-10-20 10:01:35

The following is only my personal opinion; corrections from the experts are welcome:

Method 1: use get_innerHTML or get_outerHTML
What you get is the document as parsed by the WebBrowser control, which does not match the real original file.

You can try this: create an .htm file containing the following
<table><tr><td></td></tr></table>
then retrieve it and compare the two.
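
To make that comparison concrete, here is a minimal sketch in portable C++. FirstMismatch is a hypothetical helper name, not part of any WebBrowser API; it only reports the first offset at which the original file contents and the retrieved string differ.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns the offset of the first differing character,
// or std::string::npos when the two strings are identical.
std::string::size_type FirstMismatch(const std::string& original,
                                     const std::string& retrieved)
{
    const std::string::size_type n = std::min(original.size(), retrieved.size());
    for (std::string::size_type i = 0; i < n; ++i) {
        if (original[i] != retrieved[i]) {
            return i;  // first position where the contents diverge
        }
    }
    // One string is a prefix of the other: they differ where the shorter ends.
    return original.size() == retrieved.size() ? std::string::npos : n;
}
```

Pass the bytes read from the .htm file as the first argument and the string returned by get_outerHTML (converted to the same encoding) as the second. Any result other than std::string::npos means the control returned something other than the original file.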

==================================================

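Method 2: fetch the page with NMHTTP
Inefficient. Leaving aside that it is slow, many pages need a session value or a cookie,
and cannot be retrieved with a plain GET request.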

==================================================
Method 3: use the View Source command
This does not actually give you the contents of the source file.

==================================================

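Method 4: obtain the IPersistStreamInit interface pointer from the document's COM interfaces, then write the page into an IStream.
HTML files work fine.
But please try an XML file. I don't know whether my approach is wrong, but for XML files all I get is a single '?'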

How can I get the real source file of the page loaded in the WebBrowser control?


6 replies


玄之丞 2006-10-20


Is there a COM interface that does this?

玄之丞 2006-10-20


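In that case, how should I get the source code of the page currently loaded in the WebBrowser control?

I mean the same content that appears in Notepad when you view the source through IOleCommandTarget.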

蒋晟 2006-10-20


1 The right method is IHTMLDocument3::get_documentElement followed by IHTMLElement::get_outerHTML. It gives a better form of the parsed HTML.
2 You should store the cookie from the response header and send it in the header of the next request (and don't limit yourself to GET).
3 Use IOleCommandTarget. But the source code will be displayed in a Notepad window.
4 IE has a bug in displaying XML files (I am not sure whether the bug is fixed in IE7). Transform them into HTML before writing to the HTML document.
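
Point 2 above can be sketched in portable C++. This is only an illustration under stated assumptions: CollectCookies and BuildCookieHeader are hypothetical helper names, the response headers are assumed to be available as one raw text block, and cookie attributes such as path, expires and domain are ignored rather than honored.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Trim surrounding whitespace, including the '\r' left over from CRLF line ends.
static std::string Trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Case-insensitive prefix test (HTTP header names are case-insensitive).
static bool StartsWithNoCase(const std::string& line, const std::string& prefix)
{
    if (line.size() < prefix.size()) return false;
    for (std::string::size_type i = 0; i < prefix.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(line[i])) !=
            std::tolower(static_cast<unsigned char>(prefix[i]))) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: collect name=value pairs from every Set-Cookie line
// of a raw response header block. Everything after the first ';' is dropped.
std::map<std::string, std::string> CollectCookies(const std::string& rawHeaders)
{
    std::map<std::string, std::string> jar;
    const std::string prefix = "set-cookie:";
    std::istringstream in(rawHeaders);
    std::string line;
    while (std::getline(in, line)) {
        if (!StartsWithNoCase(line, prefix)) continue;
        std::string pair = line.substr(prefix.size());
        pair = pair.substr(0, pair.find(';'));  // npos keeps the whole string
        const std::string::size_type eq = pair.find('=');
        if (eq == std::string::npos) continue;
        const std::string name = Trim(pair.substr(0, eq));
        if (name.empty()) continue;
        jar[name] = Trim(pair.substr(eq + 1));
    }
    return jar;
}

// Hypothetical helper: build the Cookie request header for the next request.
// Returns an empty string when there is nothing to send.
std::string BuildCookieHeader(const std::map<std::string, std::string>& jar)
{
    std::string header;
    for (const auto& kv : jar) {
        header += header.empty() ? "Cookie: " : "; ";
        header += kv.first + "=" + kv.second;
    }
    return header;
}
```

Feed CollectCookies the header text of each response, then add the line returned by BuildCookieHeader to the next request, whether it is a GET or a POST. A real client would also need to respect the path, domain and expiry attributes that this sketch discards.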

玄之丞 2006-10-20


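Why can't I award you the points?

On the management page it shows this:
Replied by: jiangsheng(蒋晟.Net[MVP]) () Reputation: 100 2006-10-20 12:09:29 Points: 100

But it doesn't show up here.

The same thing happened with my last thread.
I don't know why.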

玄之丞 2006-10-20


Thank you very much.

蒋晟 2006-10-20


void CAView::DocumentComplete(LPDISPATCH pDisp, VARIANT* URL)
{
    IDispatchPtr spDisp;
    HRESULT hr;

    hr = m_pBrowserApp->QueryInterface(IID_IDispatch, (void**) &spDisp);
    // Is the IDispatch* passed to us for the top-level window?
    if (SUCCEEDED(hr) && pDisp == spDisp)
    {
        IHTMLDocument2Ptr spDoc;

        // Get the active document
        spDoc = GetHtmlDocument();
        if ( spDoc )
        {
            IHTMLWindow2Ptr spWin;

            // Get the top-level window
            spDisp = spDoc->Script;
            spWin = spDisp;
            if ( spWin )
            {
                // Get the document
                spDoc = spWin->document;
                if ( spDoc )
                {
                    IDispatchExPtr spDispEx;

                    // Get the document's IDispatchEx
                    spDoc->QueryInterface( IID_IDispatchEx,
                                           (void**)&spDispEx );
                    if ( spDispEx )
                    {
                        _bstr_t bstrName("XMLDocument");
                        DISPID dispid = DISPID_UNKNOWN;

                        // Get the XMLDocument expando property.
                        // Keep the HRESULT so the check below tests this call.
                        hr = spDispEx->GetDispID( bstrName,
                                                  fdexNameCaseSensitive,
                                                  &dispid );
                        if ( SUCCEEDED(hr) && dispid != DISPID_UNKNOWN )
                        {
                            VARIANT var;
                            DISPPARAMS dpNoArgs = {NULL, NULL, 0, 0};

                            VariantInit( &var );

                            // Get the XMLDocument value
                            hr = spDispEx->Invoke( dispid,
                                                   IID_NULL,
                                                   LOCALE_USER_DEFAULT,
                                                   DISPATCH_PROPERTYGET,
                                                   &dpNoArgs,
                                                   &var,
                                                   NULL,
                                                   NULL );
                            if ( SUCCEEDED(hr) && var.vt == VT_DISPATCH )
                            {
                                IXMLDOMDocument* pXMLDoc=NULL;

                                // Get the IXMLDOMDocument interface
                                var.pdispVal->QueryInterface(
                                    IID_IXMLDOMDocument,
                                    (void**)&pXMLDoc );
                                VariantClear( &var );
                                if ( pXMLDoc )
                                {
                                    // Get the root element
                                    IXMLDOMElement* pXMLElem=NULL;

                                    pXMLDoc->get_documentElement( &pXMLElem );
                                    if ( pXMLElem )
                                    {
                                        BSTR bstr = NULL;
                                        USES_CONVERSION;

                                        // Get/display the tag name
                                        if ( SUCCEEDED(pXMLElem->get_tagName( &bstr )) && bstr )
                                        {
                                            AfxMessageBox( OLE2T(bstr) );
                                            SysFreeString( bstr ); // caller owns the BSTR
                                        }
                                        pXMLElem->Release();
                                    }
                                    pXMLDoc->Release();
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}