Creating a Github action to detect toxic comments using TensorFlow.js

tensorflow machinelearning javascript tensorflowjs

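Post originally published on my personal site.
Over the weekend, I spent a few hours building a Github action to automatically detect potentially toxic comments and PR reviews.

It uses TensorFlow.js and its toxicity pre-trained model to assess the level of toxicity, based on the following 7 categories: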

Identity attack

Insult

Obscene

Severe toxicity

Sexual explicit

Threat

Toxicity

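When a user posts a new comment or reviews a PR, the action is triggered. If there is a high probability that the content would be classified as toxic, the bot creates a comment tagging the author and advising them to update the content.
Here's a quick demo: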

repo

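Setup

Before diving into the code, it is important to note that this is a JavaScript action. I read that actions can also run in Docker containers, but for simplicity I stuck with JS.
First, I created an action.yml file at the root of my project folder.
Inside this file, I wrote the following code: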

name: "Safe space"
description: "Detect the potential toxicity of PR comments"
inputs:
  GITHUB_TOKEN:
    required: true
  message:
    required: false
  toxicity_threshold:
    required: false
runs:
  using: "node12"
  main: "dist/index.js"

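The first couple of lines are self-explanatory. Then, the inputs property contains 3 different elements.

GITHUB_TOKEN is a secret token required to authenticate during the workflow run; it is generated automatically.

The message property is optional. People can use it to customise the content of the comment the bot posts when the action detects a toxic comment.

The toxicity_threshold property is also optional. It lets people set a custom threshold that the machine learning model uses when making predictions about a comment.

Finally, the settings under runs indicate the version of Node.js we want the action to run with, as well as the file the action code lives in.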
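
To make the inputs a little more concrete: the Actions runner hands each `with:` value to the action as an environment variable named `INPUT_` plus the upper-cased input name, and that is what `core.getInput` reads. The sketch below mimics that lookup without the toolkit. `readInput` and `resolveThreshold` are hypothetical helper names, not part of the action, and the 0.9 fallback simply mirrors the default used later in this post.

```javascript
// Simplified stand-in for core.getInput: inputs arrive as INPUT_<NAME> env vars.
function readInput(name, env = process.env) {
  const key = `INPUT_${name.replace(/ /g, "_").toUpperCase()}`;
  return (env[key] || "").trim();
}

// Hypothetical helper: turn the optional toxicity_threshold input into a number,
// falling back to 0.9 when it is missing or not a valid probability.
function resolveThreshold(raw, fallback = 0.9) {
  const value = Number.parseFloat(raw);
  return Number.isFinite(value) && value > 0 && value <= 1 ? value : fallback;
}

// Made-up environment, as if the workflow had set both optional inputs.
const fakeEnv = { INPUT_TOXICITY_THRESHOLD: " 0.7 ", INPUT_MESSAGE: "Hello" };
console.log(readInput("message", fakeEnv));
console.log(resolveThreshold(readInput("toxicity_threshold", fakeEnv)));
console.log(resolveThreshold(readInput("toxicity_threshold", {})));
```

Inputs always arrive as strings, so converting the threshold to a number up front avoids relying on implicit coercion further down.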

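Action code

To create a JavaScript action, you need to install and require at least 2 Node.js modules: @actions/core and @actions/github. As this particular action uses a TensorFlow.js model, I also installed and required @tensorflow-models/toxicity and @tensorflow/tfjs.
Then, in my dist/index.js file, I started writing the action code.
The core setup looks something like this: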

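// Toolkit packages for reading inputs and accessing the event context.
const core = require("@actions/core");
const github = require("@actions/github");

async function run() {
  const tf = require("@tensorflow/tfjs");
  const toxicity = require("@tensorflow-models/toxicity");
  await tf.setBackend("cpu");

  try {
    // Inputs declared in action.yml.
    const githubToken = core.getInput("GITHUB_TOKEN");
    const customMessage = core.getInput("message");
    const toxicityThreshold = core.getInput("toxicity_threshold");
    // Details of the event that triggered this run.
    const { context } = github;
  } catch (error) {
    core.setFailed(error.message);
  }
}

run();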

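The main run function requires the necessary packages and sets the backend for TensorFlow.js. Then, in the try/catch statement, the code reads the 3 parameters mentioned earlier, which we'll use shortly.
Finally, we get the context of the event that triggered the action.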
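
For a sense of where that context comes from: the runner writes the webhook payload to a JSON file and exposes its path in `GITHUB_EVENT_PATH`, alongside the event name in `GITHUB_EVENT_NAME`, and `github.context` is built from these. Here is a minimal sketch of that idea, where `loadContext` is a hypothetical name and the sample payload is made up for illustration:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: rebuild a tiny "context" object from the runner's env vars.
function loadContext(env = process.env) {
  const hasEventFile = env.GITHUB_EVENT_PATH && fs.existsSync(env.GITHUB_EVENT_PATH);
  const payload = hasEventFile
    ? JSON.parse(fs.readFileSync(env.GITHUB_EVENT_PATH, "utf8"))
    : {};
  return { eventName: env.GITHUB_EVENT_NAME || "", payload };
}

// Simulate a run locally with a made-up issue_comment payload.
const eventFile = path.join(os.tmpdir(), `event-${process.pid}.json`);
fs.writeFileSync(
  eventFile,
  JSON.stringify({ action: "created", comment: { body: "Nice work!" } })
);

const context = loadContext({
  GITHUB_EVENT_NAME: "issue_comment",
  GITHUB_EVENT_PATH: eventFile,
});
console.log(context.eventName, context.payload.action, context.payload.comment.body);
```

Writing a payload file like this can be a handy way to exercise the action's logic locally before pushing anything.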

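Creating a bot comment when a user posts a comment on an issue or PR

A few different events can trigger a Github action. As this action is interested in the comments posted on an issue or PR, we start by looking at the payload of the event to see if the comment property is defined. Then we look at the type of action (here created and edited), so predictions only run when a comment is added or edited, and not when one is deleted, for example.
More details are available in the official Github documentation.
I then access a few parameters needed to comment on the right issue or PR and load the machine learning model. If the match property is true for one of the prediction results, the comment has been classified as toxic, and I generate a new comment with the warning message.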

if (context.payload.comment) {
  if (
    context.payload.action === "created" ||
    context.payload.action === "edited"
  ) {
    const issueNumber = context.payload.issue.number;
    const repository = context.payload.repository;
    const octokit = new github.GitHub(githubToken);
    const threshold = toxicityThreshold ? toxicityThreshold : 0.9;
    const model = await toxicity.load(threshold);
    const comments = [];
    const commentsObjects = [];
    const latestComment = [context.payload.comment.body];
    const latestCommentObject = context.payload.comment;
    let toxicComment = undefined;

    model.classify(latestComment).then((predictions) => {
      predictions.forEach((prediction) => {
        if (toxicComment) {
          return;
        }
        prediction.results.forEach((result, index) => {
          if (toxicComment) {
            return;
          }
          if (result.match) {
            const commentAuthor = latestCommentObject.user.login;
            toxicComment = latestComment;
            const message = customMessage
              ? customMessage
              : `<img src="https://media.giphy.com/media/3ohzdQ1IynzclJldUQ/giphy.gif" width="400"/> </br>
                                      Hey @${commentAuthor}! 👋 <br/> PRs and issues should be safe environments but your comment: <strong>"${toxicComment}"</strong> was classified as potentially toxic! 😔</br>
                                      Please consider spending a few seconds editing it and feel free to delete me afterwards! 🙂`;

            return octokit.issues.createComment({
              owner: repository.owner.login,
              repo: repository.name,
              issue_number: issueNumber,
              body: message,
            });
          }
        });
      });
    });
  }
}
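
The nested `forEach` loops above walk the array returned by `model.classify`: one entry per label, each with a `results` array holding a `match` flag per input sentence. As a sketch of the same check in isolation, the hypothetical helper below collects the labels that matched. The sample predictions are hand-written for illustration and are not real model output.

```javascript
// Hypothetical helper: list the labels whose result for one sentence is a match.
// `match` can be true, false, or null (when neither probability passes the threshold).
function matchedLabels(predictions, sentenceIndex = 0) {
  return predictions
    .filter((prediction) => {
      const result = prediction.results[sentenceIndex];
      return Boolean(result) && result.match === true;
    })
    .map((prediction) => prediction.label);
}

// Hand-written sample in the shape of the toxicity model's output.
const samplePredictions = [
  { label: "identity_attack", results: [{ probabilities: [0.98, 0.02], match: false }] },
  { label: "insult", results: [{ probabilities: [0.04, 0.96], match: true }] },
  { label: "threat", results: [{ probabilities: [0.55, 0.45], match: null }] },
  { label: "toxicity", results: [{ probabilities: [0.03, 0.97], match: true }] },
];

const labels = matchedLabels(samplePredictions);
console.log(labels.length > 0 ? `Toxic: ${labels.join(", ")}` : "Looks fine");
```

Collecting every matching label, rather than stopping at the first one, would also make it possible to tell the author which categories were flagged.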

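Creating a bot comment when a user submits a PR review

The code to check PR reviews is very similar; the main difference is in the first couple of lines. Instead of looking for the comment property on the payload, we look for review, and the action I am interested in is submitted.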

if (context.payload.review) {
  if (context.payload.action === "submitted") {
    const issueNumber = context.payload.pull_request.number;
    const repository = context.payload.repository;
    const octokit = new github.GitHub(githubToken);
    const threshold = toxicityThreshold ? toxicityThreshold : 0.9;
    const model = await toxicity.load(threshold);
    const reviewComment = [context.payload.review.body];
    const reviewObject = context.payload.review;
    let toxicComment = undefined;
    model.classify(reviewComment).then((predictions) => {
      predictions.forEach((prediction) => {
        if (toxicComment) {
          return;
        }
        prediction.results.forEach((result, index) => {
          if (toxicComment) {
            return;
          }
          if (result.match) {
            const commentAuthor = reviewObject.user.login;
            toxicComment = reviewComment[0];
            const message = customMessage
              ? customMessage
              : `<img src="https://media.giphy.com/media/3ohzdQ1IynzclJldUQ/giphy.gif" width="400"/> </br>
                                      Hey @${commentAuthor}! 👋 <br/> PRs and issues should be safe environments but your comment: <strong>"${toxicComment}"</strong> was classified as potentially toxic! 😔</br>
                                      Please consider spending a few seconds editing it and feel free to delete me afterwards! 🙂`;

            return octokit.issues.createComment({
              owner: repository.owner.login,
              repo: repository.name,
              issue_number: issueNumber,
              body: message,
            });
          }
        });
      });
    });
  }
}
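
Since the two branches differ only in where the text, author, and number live in the payload, that lookup could be pulled into one function. The following is one possible refactoring, not the code the action ships with. `getTextToCheck` is a hypothetical name, and the sample payloads are trimmed-down illustrations.

```javascript
// Hypothetical helper: return what needs classifying, or null if the event is irrelevant.
function getTextToCheck(payload) {
  const { action, comment, review, issue, pull_request: pullRequest } = payload;

  if (comment && issue && (action === "created" || action === "edited")) {
    return { text: comment.body, author: comment.user.login, issueNumber: issue.number };
  }
  // A review can be submitted without any text, so skip empty bodies.
  if (review && pullRequest && action === "submitted" && review.body) {
    return { text: review.body, author: review.user.login, issueNumber: pullRequest.number };
  }
  return null;
}

// Trimmed-down sample payloads for illustration.
const commentEvent = {
  action: "edited",
  issue: { number: 12 },
  comment: { body: "some comment", user: { login: "octocat" } },
};
const reviewEvent = {
  action: "submitted",
  pull_request: { number: 34 },
  review: { body: "some review", user: { login: "hubot" } },
};
const deletedEvent = { ...commentEvent, action: "deleted" };

console.log(getTextToCheck(commentEvent));
console.log(getTextToCheck(reviewEvent));
console.log(getTextToCheck(deletedEvent));
```

With a helper like this, the classification and comment-creation code would only need to be written once.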

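Using the action

To use an action in your repository, you need to create a workflow file.
First, the repository needs a .github folder with a workflows folder inside it. Then you can add a new .yml file with the details of the action you want to run.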

on: [issue_comment, pull_request_review]

jobs:
  toxic_check:
    runs-on: ubuntu-latest
    name: Safe space
    steps:
      - uses: actions/checkout@v2
      - name: Safe space - action step
        uses: charliegerard/safe-space@master
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

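In this code sample, we indicate that we want to trigger the action when events happen around comments on an issue, and when a pull request review event happens.
Then, we start by using the default actions/checkout@v2 action and finally add this toxicity classification action with some additional parameters, including the required GITHUB_TOKEN.
If you want to use the optional message and toxicity_threshold properties, you can do so like this: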

on: [issue_comment, pull_request_review]

jobs:
  toxic_check:
    runs-on: ubuntu-latest
    name: Safe space
    steps:
      - uses: actions/checkout@v2
      - name: Safe space - action step
        uses: charliegerard/safe-space@master
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          message: "Hello"
          toxicity_threshold: 0.7

If you are developing your own action, you can test it by changing the line

uses: charliegerard/safe-space@master

to

uses: ./

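If you want to build your own Github action, one important thing to note is that with the issue_comment and pull_request_review event types, you need to push your code to your main (usually called "master") branch before you can test that it works in another branch. If you develop everything in a separate branch, the action will not be triggered when writing a comment or reviewing a PR.
And that's it! 🎉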

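Potential improvements

At the moment, I invite users to manually delete the bot's comment once they have updated the content of their toxic comment, but I think this could be done automatically on edit. When a user edits a comment, I could run the check again and, if it is predicted as safe, automatically delete the bot comment so users don't have to.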
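
A rough sketch of how that cleanup might pick the comments to remove. It rests on two assumptions that are not part of the current action: the bot would append a hidden marker to its warning, and comments created with GITHUB_TOKEN would show up under the `github-actions[bot]` login. `findBotWarnings` and the marker string are hypothetical.

```javascript
// Assumptions (not in the original action): the bot appends a hidden marker to
// its warning, and comments created with GITHUB_TOKEN come from "github-actions[bot]".
const BOT_LOGIN = "github-actions[bot]";
const MARKER = "<!-- safe-space-warning -->";

// Hypothetical helper: pick the bot warnings addressed to a given author.
// Note: a plain substring test would also match longer usernames that start
// the same way, so a stricter match would be needed in practice.
function findBotWarnings(comments, author) {
  return comments.filter(
    (comment) =>
      comment.user.login === BOT_LOGIN &&
      comment.body.includes(MARKER) &&
      comment.body.includes(`@${author}`)
  );
}

// Made-up thread in the shape of the issue comments returned by the REST API.
const thread = [
  { id: 1, user: { login: "octocat" }, body: "an edited, friendlier comment" },
  { id: 2, user: { login: BOT_LOGIN }, body: `Hey @octocat! Please edit. ${MARKER}` },
  { id: 3, user: { login: BOT_LOGIN }, body: "Unrelated bot comment" },
];

// After re-classifying the edited comment as safe, these ids could be deleted.
const idsToDelete = findBotWarnings(thread, "octocat").map((comment) => comment.id);
console.log(idsToDelete);
```

The resulting ids could then be passed to the issue comment deletion endpoint, for example through `octokit.issues.deleteComment`, once the edited comment is classified as safe.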

Reference

Original post: https://dev.to/devdevcharlie/creating-a-github-action-to-detect-toxic-comments-using-tensorflow-js-13bo
