Building an OCR Application Using the Google Vision API


In this tutorial we will build an OCR application in Node.js using the Google Vision API.
An OCR application performs text recognition on an image. It can be used to extract the text from a picture.

Getting started with the Google Vision API

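To get started with the Google Vision API, visit the link below.
https://cloud.google.com/vision/docs/setup
Follow the instructions there to set up the Google Vision API and obtain your Google application credentials. This is a JSON file containing your service keys, and it is downloaded to your computer once the setup is complete. The credentials are essential, because the application we are about to build cannot work without them.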
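
Before going further, it can help to confirm that the downloaded key file is readable. The sketch below is not part of the original tutorial: checkKeyFile is a hypothetical helper, and the list of required fields is an assumption about what a service-account key file contains.

```javascript
const fs = require('fs');

// Fields assumed to be present in a service-account key file.
const REQUIRED_FIELDS = ['type', 'project_id', 'private_key', 'client_email'];

// Hypothetical helper: returns a list of problems found with the key file.
// An empty list means the file looks usable.
function checkKeyFile(keyPath) {
  if (!keyPath || !fs.existsSync(keyPath)) {
    return [`key file not found: ${keyPath}`];
  }
  let key;
  try {
    key = JSON.parse(fs.readFileSync(keyPath, 'utf8'));
  } catch (err) {
    return [`key file is not valid JSON: ${err.message}`];
  }
  return REQUIRED_FIELDS
    .filter((field) => !key[field])
    .map((field) => `missing field: ${field}`);
}

// Usage (the path is whatever file you downloaded):
// console.log(checkKeyFile(process.argv[2]));
```

An empty array means the file exists, parses as JSON, and has the fields we assumed. Anything else is a hint about what to fix before calling the API.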

Using the Node.js client library

To use the Node.js client library, visit the link below to get started.
https://cloud.google.com/vision/docs/quickstart-client-libraries
This page shows how to use the Google Vision API in your favorite programming language. Now that we have seen what is on the page, we can go straight to implementing it in our own code.
Create a directory called ocrGoogle and open it in your favorite code editor.
Run

npm init -y

to create a package.json file. Then run

npm install --save @google-cloud/vision

to install the Google Vision API client library. Create a resources folder, download the wakeupcat.jpg image into it, then create an index.js file and fill it with the following code

process.env.GOOGLE_APPLICATION_CREDENTIALS = 'C:/Users/lenovo/Documents/readText-f042075d9787.json'

async function quickstart() {
  // Imports the Google Cloud client library
  const vision = require('@google-cloud/vision');

  // Creates a client
  const client = new vision.ImageAnnotatorClient();

  // Performs label detection on the image file
  const [result] = await client.labelDetection('./resources/wakeupcat.jpg');
  const labels = result.labelAnnotations;
  console.log('Labels:');
  labels.forEach(label => console.log(label.description));
}

quickstart()

The first line sets the GOOGLE_APPLICATION_CREDENTIALS environment variable to the JSON file we downloaded earlier. The async function quickstart contains the Google logic, and the last line calls that function.
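
The path in the first line is specific to one machine. As a possible variation (not something the original code does), we could set the variable only when it has not already been exported in the shell. ensureCredentialsPath and the fallback path in the usage comment are hypothetical names used for illustration.

```javascript
// Hypothetical helper: fall back to a hard-coded path only when
// GOOGLE_APPLICATION_CREDENTIALS has not already been set.
// `env` defaults to process.env and is a parameter so it can be tested.
function ensureCredentialsPath(fallbackPath, env = process.env) {
  if (!env.GOOGLE_APPLICATION_CREDENTIALS) {
    env.GOOGLE_APPLICATION_CREDENTIALS = fallbackPath;
  }
  return env.GOOGLE_APPLICATION_CREDENTIALS;
}

// Usage (the path below is a placeholder, not a real file):
// ensureCredentialsPath('./keys/service-account.json');
```

With this in place, a value exported in the terminal wins, and the hard-coded path is only used as a fallback.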
Run

node index.js

to process the image. This should print the image's labels to the console.
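
Each entry in labelAnnotations carries more than a description. The sketch below assumes every label also has a numeric score between 0 and 1, and uses made-up sample data rather than a real API response. formatLabels is a hypothetical helper.

```javascript
// Sample data shaped like result.labelAnnotations; the values are made up.
const sampleLabels = [
  { description: 'Cat', score: 0.98 },
  { description: 'Whiskers', score: 0.91 },
  { description: 'Carnivore', score: 0.64 },
];

// Hypothetical helper: keep labels at or above minScore and format them
// as "Description (NN%)".
function formatLabels(labels, minScore = 0) {
  return labels
    .filter((label) => label.score >= minScore)
    .map((label) => `${label.description} (${Math.round(label.score * 100)}%)`);
}

console.log(formatLabels(sampleLabels, 0.9));
// [ 'Cat (98%)', 'Whiskers (91%)' ]
```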

That looks fine, but we don't want label detection, so go ahead and update index.js as follows:

// Imports the Google Cloud client library
const vision = require('@google-cloud/vision');


process.env.GOOGLE_APPLICATION_CREDENTIALS = 'C:/Users/lenovo/Documents/readText-f042075d9787.json'

async function quickstart() {
    try {
        // Creates a client
        const client = new vision.ImageAnnotatorClient();

        // Performs text detection on the local file
        const [result] = await client.textDetection('./resources/wakeupcat.jpg');
        const detections = result.textAnnotations;
        const [ text, ...others ] = detections
        console.log(`Text: ${ text.description }`);
    } catch (error) {
        console.log(error)
    }

}

quickstart()

The logic above returns the text in the image. It is the same as the previous logic except for a few changes.

We now use the client.textDetection method instead of client.labelDetection.

We destructure the detections array into two parts, text and others. The text variable contains the full text of the image.
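
The destructuring step can be tried without calling the API. The sketch below uses a made-up array shaped like textAnnotations and a hypothetical helper, splitDetections, which also guards against an image with no text (an empty array), a case the code above does not handle.

```javascript
// Hypothetical helper: split the detections array into the first entry
// (the full text) and the remaining entries.
function splitDetections(detections) {
  const [text, ...others] = detections || [];
  return {
    // An image without text yields no first entry, so fall back to ''.
    fullText: text ? text.description : '',
    parts: others.map((item) => item.description),
  };
}

// Made-up sample shaped like result.textAnnotations.
const sample = [
  { description: 'WAKE UP\nCAT' },
  { description: 'WAKE' },
  { description: 'UP' },
  { description: 'CAT' },
];

console.log(splitDetections(sample).fullText);
console.log(splitDetections([]).fullText === ''); // true
```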
Now, running

node index.js

returns the text in the image.

Installing and using Express.js

We need to install Express.js to create a server and an API endpoint that sends requests to the Google Vision API.

npm install express --save

Now we can update index.js to

const express = require('express');
// Imports the Google Cloud client library
const vision = require('@google-cloud/vision');
const app = express();

const port = 3000

process.env.GOOGLE_APPLICATION_CREDENTIALS = 'C:/Users/lenovo/Documents/readText-f042075d9787.json'

app.use(express.json())

async function quickstart(req, res) {
    try {
        // Creates a client
        const client = new vision.ImageAnnotatorClient();

        // Performs text detection on the local file
        const [result] = await client.textDetection('./resources/wakeupcat.jpg');
        const detections = result.textAnnotations;
        const [ text, ...others ] = detections
        console.log(`Text: ${ text.description }`);
        res.send(`Text: ${ text.description }`)
    } catch (error) {
        console.log(error)
        // Reply with an error so the request does not hang
        res.status(500).send('Text detection failed')
    }

}

app.get('/detectText', async(req, res) => {
    res.send('welcome to the homepage')
})

app.post('/detectText', quickstart)

//listen on port
app.listen(port, () => {
    console.log(`app is listening on ${port}`)
})

Open Insomnia and send a POST request to http://localhost:3000/detectText. The text in the image will be sent back as the response.
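
If you would rather not use a GUI client, the same request can be made from a small Node script. This is a minimal sketch using only the built-in http module. postDetectText is a hypothetical helper, and it assumes the server above is already running on port 3000.

```javascript
const http = require('http');

// Hypothetical helper: send an empty POST request and collect the response.
function postDetectText(url = 'http://localhost:3000/detectText') {
  return new Promise((resolve, reject) => {
    const req = http.request(url, { method: 'POST' }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve({ status: res.statusCode, body }));
    });
    req.on('error', reject);
    req.end();
  });
}

// Usage, with the Express server running in another terminal:
// postDetectText().then((result) => console.log(result.body));
```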

Uploading images with multer

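This application would be no fun if it could only work with the one image in the resources folder, or if we had to edit the backend every time we wanted to process a different image. We want to upload any image to the route for processing, and for that we use an npm package called multer. Multer lets us send images to a route.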

npm install multer --save

To configure multer, create a file called multerLogic.js and edit it with the following code

const multer = require('multer')
const path = require('path')

const storage = multer.diskStorage({
    destination: function (req, file, cb) {
      cb(null, path.join(process.cwd() + '/resources'))
    },
    filename: function (req, file, cb) {
      cb(null, file.fieldname + '-' + Date.now() + path.extname(file.originalname))
    }
})

const upload = multer( { storage: storage, fileFilter } ).single('image')

function fileFilter(req, file, cb) {
    const fileType = /jpg|jpeg|png/;

    const extname = fileType.test(path.extname(file.originalname).toLowerCase())

    const mimeType = fileType.test(file.mimetype)

    if(mimeType && extname){
        return cb(null, true)
    } else {
        cb('Error: images only')
    }
}

const checkError = (req, res, next) => {
    return new Promise((resolve, reject) => {
        upload(req, res, (err) => {
            if(err) {
                res.send(err)
            } 
            else if (req.file === undefined){
                res.send('no file selected')
            }
            resolve(req.file)
        })
    }) 
}

module.exports = { 
  checkError
}

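Let's take a moment to understand the logic above. This is all multer logic, and it is what lets us send images to the detectText route. We specify a storage with two properties:

destination: specifies where uploaded files are stored, and

filename: lets us rename the file before it is saved. Here we rename the file by concatenating the field name (literally the name of the form field, in our case image), the current timestamp, and the extension of the original file.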
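
The renaming rule can be checked on its own. The sketch below repeats the same concatenation in a hypothetical helper, buildFileName, with the timestamp passed in so the result is predictable.

```javascript
const path = require('path');

// Hypothetical helper mirroring the filename callback:
// <field name>-<timestamp><original extension>
function buildFileName(fieldname, originalname, timestamp = Date.now()) {
  return fieldname + '-' + timestamp + path.extname(originalname);
}

console.log(buildFileName('image', 'wakeupcat.jpg', 1700000000000));
// image-1700000000000.jpg
```

Because the timestamp is part of the name, uploading the same picture twice produces two different files instead of overwriting the first one.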

We create a variable upload, which is the result of calling multer with an object containing storage and fileFilter. We also create a function fileFilter to check the file type (here we accept the png, jpg, and jpeg file types).
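
The file type check boils down to two regular expression tests. As a sketch, here is the same check as a standalone function, isAllowedImage (a hypothetical name), so it can be tried without multer.

```javascript
const path = require('path');

// Same pattern as in fileFilter above.
const fileType = /jpg|jpeg|png/;

// Hypothetical helper: true only when both the extension and the
// MIME type match the pattern.
function isAllowedImage(originalname, mimetype) {
  const extOk = fileType.test(path.extname(originalname).toLowerCase());
  const mimeOk = fileType.test(mimetype);
  return extOk && mimeOk;
}

console.log(isAllowedImage('cat.JPG', 'image/jpeg'));    // true
console.log(isAllowedImage('notes.txt', 'text/plain'));  // false
console.log(isAllowedImage('cat.png', 'application/pdf')); // false
```

Note that the pattern is not anchored, so it matches anywhere in the string. That is good enough for this tutorial, but a stricter pattern may be worth considering in a real application.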
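Next we create a function checkError that checks for errors. It returns a promise that resolves with req.file when there are no errors, and otherwise sends the error back in the response. Finally, we export checkError. That's the explanation, so now we can move on with our code.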
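
checkError wraps a callback-style function in a promise. One possible variation, shown here as a sketch and not as the tutorial's code, is to reject the promise on failure so the caller decides how to respond. runUpload is a hypothetical helper that takes the upload function as a parameter, which lets us try it with a stand-in instead of multer.

```javascript
// Hypothetical variation of checkError: reject instead of responding,
// and resolve only when a file was actually uploaded.
function runUpload(upload, req, res) {
  return new Promise((resolve, reject) => {
    upload(req, res, (err) => {
      if (err) {
        // fileFilter passes a plain string, so normalise it to an Error.
        return reject(err instanceof Error ? err : new Error(String(err)));
      }
      if (req.file === undefined) {
        return reject(new Error('no file selected'));
      }
      resolve(req.file);
    });
  });
}

// Stand-in for multer's upload function, for illustration only.
const fakeUpload = (req, res, cb) => {
  req.file = { path: 'resources/image-1700000000000.jpg' };
  cb();
};

runUpload(fakeUpload, {}, {}).then((file) => console.log(file.path));
```

With this shape, the route handler's catch block becomes the single place where an error response is sent.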
To use checkError, we require it in index.js as follows:

const { checkError } = require('./multerLogic')

and then edit the quickstart function as follows

async function quickstart(req, res) {
    try {

        //Creates a client
        const client = new vision.ImageAnnotatorClient();
        const imageDesc = await checkError(req, res)
        console.log(imageDesc)
        // Performs text detection on the local file
        // const [result] = await client.textDetection('');
        // const detections = result.textAnnotations;
        // const [ text, ...others ] = detections
        // console.log(`Text: ${ text.description }`);
        // res.send(`Text: ${ text.description }`)

    } catch (error) {
        console.log(error)
    }

}

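We call the checkError function (which returns a promise), assign the resolved req.file to imageDesc, and print imageDesc to the console. Make a POST request with Insomnia, attaching an image file in a multipart form field named image (the name passed to single() in multerLogic.js).

We should see the details of the uploaded file printed to the console.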

Now that image uploading is up and running, it's time to update our code to process the uploaded image. Edit the quickstart function with the following code:

        //Creates a client
        const client = new vision.ImageAnnotatorClient();
        const imageDesc = await checkError(req, res)
        console.log(imageDesc)
        //Performs text detection on the local file
        const [result] = await client.textDetection(imageDesc.path);
        const detections = result.textAnnotations;
        const [ text, ...others ] = detections
        res.send(`Text: ${ text.description }`)

Finally, make a POST request to our route with Insomnia and you should get a similar result.
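
The upload can also be scripted. The sketch below assumes Node 18 or later, where fetch, FormData, and Blob are available as globals. buildUploadForm and uploadImage are hypothetical helpers, and the MIME type table is an assumption covering only the three formats the filter accepts.

```javascript
const fs = require('fs');
const path = require('path');

// Assumed mapping from extension to MIME type for the accepted formats.
const MIME_TYPES = { '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg', '.png': 'image/png' };

// Hypothetical helper: build a multipart form with the file in a field
// named "image", matching single('image') in multerLogic.js.
function buildUploadForm(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  const blob = new Blob([fs.readFileSync(filePath)], { type: MIME_TYPES[ext] });
  const form = new FormData();
  form.append('image', blob, path.basename(filePath));
  return form;
}

// Hypothetical helper: POST the form and return the response text.
async function uploadImage(filePath, url = 'http://localhost:3000/detectText') {
  const response = await fetch(url, { method: 'POST', body: buildUploadForm(filePath) });
  return response.text();
}

// Usage, with the server running:
// uploadImage('./resources/wakeupcat.jpg').then(console.log);
```

The field name matters: if the form field is not called image, multer will not pick up the file.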

This tutorial is a simple example of what can be built with the Google Vision API. The GitHub repo is here.
For a more robust version, see this repo.
Please follow me on Twitter. Thanks, and have a nice day.

Reference

The original article for this tutorial (Building an OCR App Using Google Vision API) is available at https://dev.to/oviecodes/building-an-ocr-app-using-google-vision-api-1mg0
