Getting a clean list of links with JavaScript
If you need practical scraping or something more advanced, please consider a Chrome extension such as Scraper instead.
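Spec
What "clean" means here: duplicate href values are dropped, and only links whose href contains one of the registered search words are kept.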
Code
The snippet uses copy(), a DevTools Console utility, so it assumes Chrome.
Paste it into the browser's Console and run it.
// Change the search words as needed.
const targetLinkWords = ['www.bbc.com'];
const createLinkList = (el) => {
  const existsList = [];
  const lines = [];
  Array.prototype.forEach.call(el, (node) => {
    // Skip a link if its href was already seen or contains none of the words in targetLinkWords.
    if (existsList.indexOf(node.href) === -1 &&
        targetLinkWords.some((val) => node.href.indexOf(val) !== -1)) {
      existsList.push(node.href);
      // Use a placeholder for empty link text; otherwise strip line breaks from the text.
      const text = node.text.trim() === '' ? 'No text' : node.text.replace(/\r?\n/g, '');
      lines.push(`${text}||${node.href}`);
    }
  });
  // One "text||href" entry per line, with no leading blank line.
  return lines.join('\r\n');
};
const result = createLinkList(document.querySelectorAll('a'));
console.log(result);
copy(result);
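For trying the filtering logic outside the browser, the same idea can be sketched as a pure function. This is only an illustration under a few assumptions: formatLinks, its emptyLabel parameter, and the sample array are hypothetical names that are not in the original snippet, and a Set replaces the array lookup for duplicate detection.

```javascript
// Hypothetical helper, not part of the original snippet.
// links: any iterable of objects with string `href` and `text` properties
// words: substrings, at least one of which must appear in the href
// emptyLabel: placeholder used when the link text is blank (assumed default)
const formatLinks = (links, words, emptyLabel = 'No text') => {
  const seen = new Set();
  const lines = [];
  for (const { href, text } of links) {
    // Skip duplicates and links that match none of the search words.
    if (seen.has(href) || !words.some((w) => href.includes(w))) continue;
    seen.add(href);
    const label = text.trim() === '' ? emptyLabel : text.replace(/\r?\n/g, '');
    lines.push(`${label}||${href}`);
  }
  return lines.join('\r\n');
};

// Sample data standing in for document.querySelectorAll('a').
const sample = [
  { href: 'https://www.bbc.com/news', text: 'News' },
  { href: 'https://www.bbc.com/news', text: 'News (duplicate)' },
  { href: 'https://example.com/', text: 'Elsewhere' },
  { href: 'https://www.bbc.com/sport', text: '  ' },
];
console.log(formatLinks(sample, ['www.bbc.com']));
```

In the Console, the same function should also accept the live NodeList, for example formatLinks(document.querySelectorAll('a'), ['www.bbc.com']), since anchor elements expose href and text.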
Result
I tried it on the BBC News Tech page.
Homepage||https://www.bbc.com/
Skip to content||https://www.bbc.com/news/technology#skip-to-content
Accessibility Help||https://www.bbc.com/accessibility/
Sign in||https://session.bbc.com/session?ptrt=https%3A%2F%2Fwww.bbc.com%2Fnews%2Ftechnology&context=news_gnl&userOrigin=news_gnl
Notifications||https://www.bbc.com/news/technology#
News||https://www.bbc.com/news
Sport||https://www.bbc.com/sport
Reel||https://www.bbc.com/reel
Worklife||https://www.bbc.com/worklife
Travel||https://www.bbc.com/travel
Future||https://www.bbc.com/future
Culture||https://www.bbc.com/culture
Music||https://www.bbc.com/culture/music
Weather||https://www.bbc.com/weather
More||https://www.bbc.com/news/technology#orb-footer
Video||https://www.bbc.com/news/video_and_audio/headlines
World||https://www.bbc.com/news/world
Asia||https://www.bbc.com/news/world/asia
UK||https://www.bbc.com/news/uk
Business||https://www.bbc.com/news/business
TechTech selected||https://www.bbc.com/news/technology
Science||https://www.bbc.com/news/science_and_environment
Stories||https://www.bbc.com/news/stories
Entertainment & Arts||https://www.bbc.com/news/entertainment_and_arts
Health||https://www.bbc.com/news/health
World News TV||https://www.bbc.com/news/world_radio_and_tv
In Pictures||https://www.bbc.com/news/in_pictures
Reality Check||https://www.bbc.com/news/reality_check
Newsbeat||https://www.bbc.com/news/newsbeat
Special Reports||https://www.bbc.com/news/special_reports
Explainers||https://www.bbc.com/news/explainers
Long Reads||https://www.bbc.com/news/the_reporters
Have Your Say||https://www.bbc.com/news/have_your_say
Africa||https://www.bbc.com/news/world/africa
Australia||https://www.bbc.com/news/world/australia
Europe||https://www.bbc.com/news/world/europe
Latin America||https://www.bbc.com/news/world/latin_america
・・・
The list was copied correctly.
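If the copied text needs further processing, the "text||href" lines can be split back into objects. The sketch below is an assumption-laden illustration: parseLinkList and the copied sample are hypothetical names, and it assumes no href contains "||", so the last "||" on each line is treated as the separator.

```javascript
// Hypothetical helper, not part of the original snippet.
// Assumes no href contains "||", so the LAST "||" on a line is the separator.
const parseLinkList = (output) =>
  output
    .split(/\r?\n/)
    .filter((line) => line.includes('||')) // drop blank or malformed lines
    .map((line) => {
      const i = line.lastIndexOf('||');
      return { text: line.slice(0, i), href: line.slice(i + 2) };
    });

// Two lines taken from the result shown above.
const copied = [
  'Homepage||https://www.bbc.com/',
  'News||https://www.bbc.com/news',
].join('\r\n');
console.log(parseLinkList(copied));
```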
Reference
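For more on this topic (Getting a clean list of links with JavaScript), see the original article:
https://qiita.com/deadmau5/items/76fd4a7f17eef0007e6d
The text may be freely shared or copied, but please keep this document's URL as the reference URL.
Dedicated to discovering excellent developer content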
(Collection and Share based on the CC Protocol.)