HTML to DOM


Although you can now parse HTML natively using DOMParser and XMLHttpRequest, this is a new feature that is not yet supported by all browsers in use in the wild. The code snippets on this page will let your site work until these new features are more widely available.
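For browsers that already support native parsing, a feature-detecting wrapper might look like the sketch below. The function name parseHTMLNatively is hypothetical (it is not part of the original snippets), and the sketch assumes that a DOMParser without "text/html" support either throws or returns null; in both cases the caller would fall back to one of the techniques on this page.

```javascript
// Hypothetical helper: try native HTML parsing and return null if it is unsupported.
function parseHTMLNatively(aHTMLString) {
  if (typeof DOMParser === "undefined") {
    return null; // no DOMParser available at all
  }
  try {
    var doc = new DOMParser().parseFromString(aHTMLString, "text/html");
    // Assumption: some older engines return null instead of a document for "text/html".
    return doc || null;
  } catch (e) {
    // Assumption: other older engines throw for MIME types they do not support.
    return null;
  }
}
```

A caller can test the return value and only use the older snippets when it is null.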
Safely parsing simple HTML to DOM
When using XMLHttpRequest to get the HTML of a remote webpage, it is often advantageous to turn that HTML string into DOM for easier manipulation. However, there are potential dangers involved in injecting remote content in a privileged context in your extension, so it can be desirable to parse the HTML safely.
The function below will safely parse simple HTML and return a DOM object which can be manipulated like web page elements. This will remove tags like <script>, <style>, <head>, <body>, <title>, and <iframe>. It will also remove all JavaScript, including element attributes that contain JavaScript.

function HTMLParser(aHTMLString){
  var html = document.implementation.createDocument("http://www.w3.org/1999/xhtml", "html", null),
      body = document.createElementNS("http://www.w3.org/1999/xhtml", "body");
  html.documentElement.appendChild(body);

  // parse the string into a fragment and attach the fragment to the detached <body>
  body.appendChild(Components.classes["@mozilla.org/feed-unescapehtml;1"]
    .getService(Components.interfaces.nsIScriptableUnescapeHTML)
    .parseFragment(aHTMLString, false, null, body));

  return body;
}

It works by creating a content-level (this is safer than chrome-level) <body> element attached to a new document, then parsing the HTML fragment and attaching that fragment to the <body>. The <body> is returned, and it is never actually appended to the current page. The returned <body> object is of type Element.
Here is a sample that counts the number of paragraphs in a string:

var DOMPars = HTMLParser('<p>foo</p><p>bar</p>');
alert(DOMPars.getElementsByTagName('p').length);

If HTMLParser() returns the html variable (the whole document) instead of body, you get a complete document object with its full set of methods, so you can retrieve the content of a <div> like this:

var DOMPars = HTMLParser("<div id='userInfo'>John was a mediocre programmer, but people liked him <strong>anyway</strong>.</div>");
alert(DOMPars.getElementById('userInfo').innerHTML);

To parse a complete HTML page, load it into an iframe whose type is content (not chrome). See "Using a hidden iframe element to parse HTML to a window's DOM" below.
Parsing Complete HTML to DOM
Loading an HTML document is much simpler when it is fetched with the XMLHttpRequest object. First, load the HTML document:

var request = new XMLHttpRequest();
// the third argument (false) makes this a synchronous request
request.open("GET", "http://example.org/file.html", false);
request.send(null);

The next step is to create the DOM document that will receive the HTML we just fetched:

var doc = document.implementation.createHTMLDocument("example");
doc.documentElement.innerHTML = request.responseText;

After this, any manipulation we might want to do is as simple as the following:

doc.body.textContent = "This is inside the body!";
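A synchronous request blocks the page while the file loads. Where the browser supports it, an asynchronous request with responseType set to "document" lets XMLHttpRequest do the parsing itself. The following is only a sketch under that assumption: loadHTMLDocument and its callback signature are hypothetical names chosen for illustration, and browsers without responseType support would report an error here and still need the createHTMLDocument approach.

```javascript
// Hypothetical helper: load a URL asynchronously and hand the parsed document
// to callback(error, doc). Exactly one of the two arguments is non-null.
function loadHTMLDocument(url, callback) {
  var request = new XMLHttpRequest();
  request.open("GET", url, true);    // true = asynchronous
  request.responseType = "document"; // ask the browser to parse the response as a document
  request.onload = function () {
    var ok = request.status >= 200 && request.status < 300;
    if (ok && request.response) {
      callback(null, request.response);
    } else {
      callback(new Error("Could not load document, HTTP status " + request.status), null);
    }
  };
  request.onerror = function () {
    callback(new Error("Network error while loading " + url), null);
  };
  request.send(null);
}
```

A call such as loadHTMLDocument("http://example.org/file.html", function (err, doc) { ... }) would then receive a document whose body can be manipulated in the same way as doc.body in this section.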

Using a hidden iframe element to parse HTML to a window's DOM
The sample code may need more work. Create your own function with a unique name, ID, and so forth.

var frame = document.getElementById("sample-frame");
if (!frame) {
    // create frame
    frame = document.createElement("iframe"); // iframe (or browser on older Firefox)
    frame.setAttribute("id", "sample-frame");
    frame.setAttribute("name", "sample-frame");
    frame.setAttribute("type", "content");
    frame.setAttribute("collapsed", "true");
    document.getElementById("main-window").appendChild(frame);
    // or
    // document.documentElement.appendChild(frame);

    // set restrictions as needed
    frame.webNavigation.allowAuth = false;
    frame.webNavigation.allowImages = false;
    frame.webNavigation.allowJavascript = false;
    frame.webNavigation.allowMetaRedirects = true;
    frame.webNavigation.allowPlugins = false;
    frame.webNavigation.allowSubframes = false;

    // listen for load
    frame.addEventListener("load", function (event) {
        // the document of the HTML in the DOM
        var doc = event.originalTarget;
        // skip blank page or frame
        if (doc.location.href == "about:blank" || doc.defaultView.frameElement) return;

        // do something with the DOM of doc
        alert(doc.location.href);

        // when done remove frame or set location "about:blank"
        setTimeout(function () {
            var frame = document.getElementById("sample-frame");
            // remove frame
            // frame.destroy(); // if using browser element instead of iframe
            frame.parentNode.removeChild(frame);
            // or set location "about:blank"
            // frame.contentDocument.location.href = "about:blank";
        }, 10);
    }, true);
}

// load a page
frame.contentDocument.location.href = "http://www.mozilla.org/";
// or (the second argument is the load flags value)
// frame.webNavigation.loadURI("http://www.mozilla.org/", Components.interfaces.nsIWebNavigation.LOAD_FLAGS_NONE, null, null, null);

If you are starting with an HTML string, you can convert it to a data URI and use that to load in the browser element.
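One way to build such a data URI is sketched below. The helper name htmlToDataURI is hypothetical, and the sketch assumes the HTML should be declared as UTF-8; encodeURIComponent() escapes characters such as # and % that would otherwise break the URI.

```javascript
// Hypothetical helper: wrap an HTML string in a data: URI that a frame can load.
function htmlToDataURI(aHTMLString) {
  return "data:text/html;charset=utf-8," + encodeURIComponent(aHTMLString);
}

// Possible use with the frame from the sample in this section:
// frame.contentDocument.location.href = htmlToDataURI("<h1>Hello</h1>");
```

Keep in mind that the restrictions set on the frame (no JavaScript, no plugins, and so on) still matter here, because the string may come from an untrusted source.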
Using a hidden XUL iframe (alternate example)
Sometimes, a browser element is overkill, or does not meet your needs, or you can't fulfill its requirements. While working on Donkeyfire, I discovered the iframe XUL element, and it is very easy to implement.
As an example, I will show a browser overlay .xul file, and some JavaScript code to access it.
Here is some XUL code you can add to your browser overlay .xul file. Don't forget to modify the id and name!

<vbox hidden="false" height="0">
  <iframe type="content" src="" name="donkey-browser" hidden="false" id="donkey-browser" height="0"/>
</vbox>

Then, in your extension's "load" event handler:

onLoad: function() {
    donkeybrowser = document.getElementById("donkey-browser");
    if (donkeybrowser) {
        donkeybrowser.style.height = "0px";
        // set restrictions as needed
        donkeybrowser.webNavigation.allowAuth = true;
        donkeybrowser.webNavigation.allowImages = false;
        donkeybrowser.webNavigation.allowJavascript = false;
        donkeybrowser.webNavigation.allowMetaRedirects = true;
        donkeybrowser.webNavigation.allowPlugins = false;
        donkeybrowser.webNavigation.allowSubframes = false;
        donkeybrowser.addEventListener("DOMContentLoaded", function (e) { donkeyfire.donkeybrowser_onPageLoad(e); }, true);
    }
},

With that code, we obtain a reference to the iframe element we declared in the .xul file. The most interesting piece of code here is the DOMContentLoaded event listener we define for the element. Let's take a look at the donkeyfire.donkeybrowser_onPageLoad() handler:

donkeybrowser_onPageLoad: function(aEvent) {
    var doc = aEvent.originalTarget;
    var url = doc.location.href;
    if (doc.nodeName == "#document") { // ok, it's a real page, let's do our magic
        dump("[DF] URL = " + url + "\n");
        var text = doc.evaluate("/html/body/h1", doc, null, XPathResult.STRING_TYPE, null).stringValue;
        dump("[DF] TEXT in /html/body/h1 = " + text + "\n");
    }
},

As you can see, we obtain full access to the DOM of the page we loaded in the background, and we can even evaluate XPath expressions. In the example, we dump() to the console the page's URL and the text contained in the first h1 tag of the page's <body>.
But, we still need to see how to execute the famous loadURI() method using our iframe:

donkeybrowser.webNavigation.loadURI("http://developer.mozilla.org", Components.interfaces.nsIWebNavigation.LOAD_FLAGS_NONE, null, null, null);

Also, I recommend you take a look at the nsIWebNavigation interface.

This content comes from MDN (https://developer.mozilla.org/en-US/Add-ons/Code_snippets/HTML_to_DOM) and should be relatively complete.


You may freely share or copy this text, but please keep this document's URL as the reference URL.

Licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
