How to Optimize Parameters for Multiple QR Code Decoding in a Single Image with YOLO

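The Dynamsoft Barcode Reader SDK lets developers customize algorithm parameters for different barcode scanning scenarios. It offers several pre-configured parameter templates tuned for decoding performance. Whether you want the fastest decoding speed, the highest decoding accuracy, or a trade-off between the two, there is a suitable template. Since the parameters matter so much, how do you set them correctly? For most barcode scanning scenarios, the default parameters should work well. This article covers a harder scenario: decoding multiple QR codes in a single image. We train a YOLO tiny model to determine the barcode type and the expected number of QR codes, then pass both to the Dynamsoft Barcode Reader SDK. You will see how these two parameters affect decoding performance.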

Installation

Dynamsoft Barcode Reader for Python

pip install dbr

OpenCV Python

pip install opencv-python

Darknet

git clone https://github.com/AlexeyAB/darknet.git

How to Get and Use the Pre-configured Parameter Templates of the Dynamsoft Barcode SDK

Visit the Dynamsoft Barcode Reader online demo.

Select a mode, then click Advanced Settings.

Scroll down to the Template section. Change ResultCoordinateType to Pixel, then click the copy button.

Save the template as a JSON file. For comparison, save all the templates, from best speed to best coverage, and name them l1.json through l5.json.
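Before running any comparison, it can help to confirm that every saved template is present and parses as JSON. The following is a minimal sketch, assuming the five files sit in one directory; the helper name load_templates is hypothetical and is not part of the Dynamsoft SDK.

```python
import json
from pathlib import Path

# l1.json ... l5.json, the file names used in this article
TEMPLATE_NAMES = [f"l{i}.json" for i in range(1, 6)]

def load_templates(directory="."):
    """Return {file name: parsed JSON}; fail early on missing or malformed files."""
    templates = {}
    for name in TEMPLATE_NAMES:
        path = Path(directory) / name
        if not path.is_file():
            raise FileNotFoundError(f"template not found: {path}")
        with path.open(encoding="utf-8") as f:
            # json.load raises json.JSONDecodeError if the copy/paste was incomplete
            templates[name] = json.load(f)
    return templates
```

Calling load_templates() once at startup catches a truncated paste from the online demo before any decoding time is measured.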

The test image, test.jpg, contains multiple QR codes.

We can write a Python program to compare QR code decoding performance across the parameter templates.

import cv2 as cv
import numpy as np
import time
from dbr import *
import os

reader = BarcodeReader()
# Apply for a trial license: https://www.dynamsoft.com/customer/license/trialLicense?product=dbr
license_key = "LICENSE-KEY"
reader.init_license(license_key)

def decode(filename, template_name):
    frame = cv.imread(filename)

    template_path = os.path.dirname(os.path.abspath(__file__)) + os.path.sep + template_name
    settings = reader.reset_runtime_settings() 
    error = reader.init_runtime_settings_with_file(template_path, EnumConflictMode.CM_OVERWRITE)

    before = time.time()
    results = reader.decode_buffer(frame)
    after = time.time()

    COLOR_RED = (0,0,255)
    thickness = 2
    if results is not None:
        found = len(results)
        for result in results:
            text = result.barcode_text 
            points = result.localization_result.localization_points
            data = np.array([[points[0][0], points[0][1]], [points[1][0], points[1][1]], [points[2][0], points[2][1]], [points[3][0], points[3][1]]])
            cv.drawContours(image=frame, contours=[data], contourIdx=-1, color=COLOR_RED, thickness=thickness, lineType=cv.LINE_AA)
            cv.putText(frame, result.barcode_text, points[0], cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)

        cv.putText(frame, '%.2f s, Qr found: %d' % (after - before, found), (20, 20), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
    else:
        cv.putText(frame, '%.2f s, Qr found: %d' % (after - before, 0), (20, 20), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)

    cv.imshow(template_name, frame)

decode("test.jpg", "l1.json")
decode("test.jpg", "l2.json")
decode("test.jpg", "l3.json")
decode("test.jpg", "l4.json")
decode("test.jpg", "l5.json")
cv.waitKey(0)

Performance

| Speed and accuracy | l1 | l2 | l3 | l4 | l5 |
| Elapsed time | 0.07 s | 0.13 s | 6.23 s | 10.63 s | 22.83 s |
| QR count | 1 | 1 | 4 | 5 | 6 |

[Figure: decoding result with the l1 template]

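In a multiple-QR-code scenario, we want to find as many QR codes as possible. The l5 template finds the most QR codes, but its time cost is unacceptable. If we know how many QR codes an image contains, can we speed up QR decoding? Let's test this hypothesis by using machine learning to detect the QR codes first.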

Training a QR Code Detector with YOLOv4

We use the YOLO model to train a QR code detector.

Get the public dataset of QR code images from boofcv.

Annotate the QR images with labelImg.
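labelImg can save annotations directly in YOLO format, so no conversion is normally needed. For readers who want to check what each line of an annotation .txt file means, here is a small sketch of the standard YOLO label format (class id, then box center and size, all normalized to the image dimensions). The function name to_yolo_line is hypothetical.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (xmin, ymin, xmax, ymax) to one YOLO annotation line."""
    xmin, ymin, xmax, ymax = box
    # YOLO stores the box center and size as fractions of the image size
    x_center = (xmin + xmax) / 2 / image_width
    y_center = (ymin + ymax) / 2 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 200x200 px QR code in a 640x640 image, class 0 (QR_CODE)
print(to_yolo_line(0, (100, 200, 300, 400), 640, 640))
# prints "0 0.312500 0.468750 0.312500 0.312500"
```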

Download yolov4-tiny.conv.29.

Customize the configuration file based on darknet/cfg/yolov4-tiny-custom.cfg:

batch=64              # line 6
subdivisions=16       # line 7
width=640             # line 8
height=640            # line 9

max_batches = 6000    # line 20

steps=4800,5400       # line 22

filters=18            # line 212
classes=1             # line 220

filters=18            # line 263
classes=1             # line 269
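The values above follow the rules of thumb given in the AlexeyAB darknet README for custom training: filters = (classes + 5) x 3 in the convolutional layer before each [yolo] layer, max_batches = classes x 2000 (but not less than 6000 and not less than the number of training images), and steps at 80% and 90% of max_batches. The sketch below recomputes them; the function name and the num_train_images parameter are illustrative, not part of darknet.

```python
def yolo_cfg_values(num_classes, num_train_images=0):
    """Recompute the cfg values for a custom YOLO detector from the class count."""
    # 3 anchors per [yolo] layer, each predicting 4 box coords + 1 objectness + classes
    filters = (num_classes + 5) * 3
    max_batches = max(num_classes * 2000, num_train_images, 6000)
    # Learning-rate steps at 80% and 90% of max_batches (integer arithmetic)
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    return {"classes": num_classes, "filters": filters,
            "max_batches": max_batches, "steps": steps}

print(yolo_cfg_values(1))
# prints {'classes': 1, 'filters': 18, 'max_batches': 6000, 'steps': (4800, 5400)}
```

For the single QR_CODE class this reproduces filters=18, max_batches=6000, and steps=4800,5400 as used in the configuration above.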

Create the obj.names file:

QR_CODE

Create the obj.data file:

classes = 1
train  = data/train.txt
valid  = data/test.txt
names = data/obj.names
backup = backup/

Generate the training and validation data with the following script:

import os
import re
from shutil import copyfile
import argparse
import math
import random

def iterate_dir(source, ratio):
    source = source.replace('\\', '/')
    train_dir = 'data/obj/train'
    test_dir = 'data/obj/test'

    if not os.path.exists(train_dir):
        os.makedirs(train_dir)
    if not os.path.exists(test_dir):
        os.makedirs(test_dir)

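    # Match image files case-insensitively. The inline (?i) flag in the middle of
    # a pattern raises an error on Python 3.11+, so pass re.IGNORECASE instead.
    images = [f for f in os.listdir(source)
              if re.search(r'([a-zA-Z0-9\s_\\.\-\(\):])+\.(jpg|jpeg|png)$', f, re.IGNORECASE)]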

    num_images = len(images)
    num_test_images = math.ceil(ratio*num_images)

    image_files = []

    for i in range(num_test_images):
        idx = random.randint(0, len(images)-1)
        filename = images[idx]
        image_files.append("data/obj/test/" + filename)
        copyfile(os.path.join(source, filename),
                os.path.join(test_dir, filename))
        txt_filename = os.path.splitext(filename)[0]+'.txt'
        copyfile(os.path.join(source, txt_filename),
                os.path.join(test_dir, txt_filename))

        images.remove(images[idx])

    with open("data/test.txt", "w") as outfile:
        for image in image_files:
            outfile.write(image)
            outfile.write("\n")

    image_files = []

    for filename in images:
        image_files.append("data/obj/train/" + filename)
        copyfile(os.path.join(source, filename),
                os.path.join(train_dir, filename))
        txt_filename = os.path.splitext(filename)[0]+'.txt'
        copyfile(os.path.join(source, txt_filename),
                os.path.join(train_dir, txt_filename))

    with open("data/train.txt", "w") as outfile:
        for image in image_files:
            outfile.write(image)
            outfile.write("\n")

def main():
    parser = argparse.ArgumentParser(description="Partition dataset of images into training and testing sets",
                                    formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument(
        '-i', '--imageDir',
        help='Path to the folder where the image dataset is stored. If not specified, the CWD will be used.',
        type=str,
        default=os.getcwd()
    )
    parser.add_argument(
        '-r', '--ratio',
        help='The ratio of the number of test images over the total number of images. The default is 0.1.',
        default=0.1,
        type=float)
    args = parser.parse_args()
    iterate_dir(args.imageDir, args.ratio)

if __name__ == '__main__':
    main()

Run the script:

python partition_dataset.py -i ../images -r 0.1
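The script picks test images with unseeded random draws, so each run produces a different split. If a repeatable split is wanted, for example to compare two training runs on the same validation set, the selection step could be written as below. This is a sketch only: split_dataset and its seed parameter are hypothetical and not part of the original script.

```python
import math
import random

def split_dataset(filenames, test_ratio=0.1, seed=42):
    """Return (train_files, test_files) with a deterministic shuffle."""
    files = sorted(filenames)          # fix the starting order across platforms
    rng = random.Random(seed)          # local generator, leaves global state alone
    rng.shuffle(files)
    num_test = math.ceil(test_ratio * len(files))
    return files[num_test:], files[:num_test]

train, test = split_dataset([f"img{i}.jpg" for i in range(20)], test_ratio=0.25)
print(len(train), len(test))  # prints "15 5"
```

The copy and list-writing logic of the original script can stay unchanged; only the way test files are chosen differs.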

Train the model:

darknet detector train data/obj.data yolov4-tiny-custom.cfg yolov4-tiny.conv.29

Validate the model:

darknet detector test data/obj.data yolov4-tiny-custom.cfg backup/yolov4-tiny-custom_last.weights sample/test.png

The following code first uses the machine learning model to detect QR codes in the image. For each QR code found, we can set the parameters expected_barcodes_count = 1 and barcode_format_ids = EnumBarcodeFormat.BF_QR_CODE.

import cv2 as cv
import numpy as np
import time
from dbr import *
import os

# Initialize Dynamsoft Barcode Reader
reader = BarcodeReader()
# Apply for a trial license: https://www.dynamsoft.com/customer/license/trialLicense
license_key = "LICENSE-KEY"
reader.init_license(license_key)

# Load YOLOv4-tiny model
class_names = open('obj.names').read().strip().split('\n')
net = cv.dnn.readNetFromDarknet('yolov4-tiny-custom.cfg', 'yolov4-tiny-custom_last.weights')
net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)

model = cv.dnn_DetectionModel(net)
width = 640
height = 640

CONFIDENCE_THRESHOLD = 0.2
NMS_THRESHOLD = 0.4
COLOR_RED = (0,0,255)
COLOR_BLUE = (255,0,0)

def decode(filename, template_name):
    frame = cv.imread(filename)

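    # Use a larger network input for large images, otherwise the training size.
    # Separate local names are needed: assigning to width/height here would make
    # them local and raise UnboundLocalError for images up to 1024 px.
    input_width, input_height = width, height
    if frame.shape[1] > 1024 or frame.shape[0] > 1024:
        input_width, input_height = 1024, 1024
    model.setInputParams(size=(input_width, input_height), scale=1/255, swapRB=True)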

    template_path = os.path.dirname(os.path.abspath(__file__)) + os.path.sep + template_name
    settings = reader.reset_runtime_settings() 
    error = reader.init_runtime_settings_with_file(template_path, EnumConflictMode.CM_OVERWRITE)

    # YOLO detection
    yolo_start = time.time()
    classes, scores, boxes = model.detect(frame, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)
    yolo_end = time.time()
    print("YOLO detection time: %.2f s" % (yolo_end - yolo_start))

    index = 0
    dbr_found = 0
    total_dbr_time = 0
    for (classid, score, box) in zip(classes, scores, boxes):
        label = "%s : %f" % (class_names[classid], score) 
        tmp = frame[box[1]:box[1] + box[3], box[0]: box[0] + box[2]]

        # Set parameters for DBR
        settings = reader.get_runtime_settings()
        settings.expected_barcodes_count = 1
        settings.barcode_format_ids = EnumBarcodeFormat.BF_QR_CODE
        reader.update_runtime_settings(settings)

        before = time.time()
        results = reader.decode_buffer(tmp)
        after = time.time()

        total_dbr_time += after - before

        if results is not None:
            found = len(results)
            for result in results:
                text = result.barcode_text 
                dbr_found += 1
                points = result.localization_result.localization_points
                data = np.array([[points[0][0], points[0][1]], [points[1][0], points[1][1]], [points[2][0], points[2][1]], [points[3][0], points[3][1]]])
                cv.drawContours(image=tmp, contours=[data], contourIdx=-1, color=(0, 0, 255), thickness=2, lineType=cv.LINE_AA)
                cv.putText(frame, text, (box[0], box[1] + 10), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
        else:
            found = 0

        index += 1
        cv.rectangle(frame, box, COLOR_BLUE, 2)
        cv.putText(frame, label, (box[0], box[1] - 10), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_BLUE, 2)

    cv.putText(frame, 'DBR+YOLO %.2f s, DBR found: %d, YOLO found: %d' % (yolo_end - yolo_start + total_dbr_time, dbr_found, len(classes)), (0, 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
    cv.imshow(template_name, frame)

decode("test.jpg", "l1.json")
decode("test.jpg", "l2.json")
decode("test.jpg", "l3.json")
decode("test.jpg", "l4.json")
decode("test.jpg", "l5.json")
cv.waitKey(0)
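One detail the listing does not handle: a detection box can extend past the image border or sit tight against the QR code, and a negative slice index silently produces a wrong or empty crop. A possible guard is sketched below. The helper clamp_box and the padding value are assumptions for illustration; the original code crops with the raw box.

```python
def clamp_box(box, image_width, image_height, padding=0):
    """Expand an (x, y, w, h) box by padding and clip it to the image.

    Returns (x0, y0, x1, y1) for slicing, or None if nothing is left.
    """
    x, y, w, h = box
    x0 = max(0, x - padding)
    y0 = max(0, y - padding)
    x1 = min(image_width, x + w + padding)
    y1 = min(image_height, y + h + padding)
    if x1 <= x0 or y1 <= y0:
        return None  # box lies entirely outside the image
    return x0, y0, x1, y1

print(clamp_box((-5, 10, 50, 40), 100, 100, padding=8))  # prints "(0, 2, 53, 58)"
```

With a NumPy image, the crop would then be frame[y0:y1, x0:x1], skipping the detection when the function returns None. Note that localization points returned for a padded crop are relative to the crop, not the full frame.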

Performance

| Speed and accuracy | l1 | l2 | l3 | l4 | l5 |
| Elapsed time | 0.23 s | 0.4 s | 2.84 s | 3.99 s | 7.79 s |
| QR count | 4 | 4 | 5 | 6 | 6 |

[Figures: detection and decoding results for l1 + YOLO, l2 + YOLO, l3 + YOLO, l4 + YOLO, and l5 + YOLO]

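Without machine learning, finding all 6 QR codes in this test image requires the l5 template and takes about 22.83 seconds. With YOLO detection, the l4 template already finds all 6 QR codes, and the time cost drops to 3.99 seconds.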
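The comparison can be reproduced from the numbers reported in the two performance tables of this article. The sketch below picks, for each setup, the fastest template that reaches a target QR count and computes the speedup; the dictionary layout and the function name fastest_template are illustrative.

```python
# (elapsed seconds, QR codes found) per template, as reported in this article
BASELINE = {"l1": (0.07, 1), "l2": (0.13, 1), "l3": (6.23, 4),
            "l4": (10.63, 5), "l5": (22.83, 6)}
WITH_YOLO = {"l1": (0.23, 4), "l2": (0.40, 4), "l3": (2.84, 5),
             "l4": (3.99, 6), "l5": (7.79, 6)}

def fastest_template(results, target_count):
    """Return (name, seconds) of the fastest template finding target_count codes."""
    candidates = [(seconds, name) for name, (seconds, count) in results.items()
                  if count >= target_count]
    if not candidates:
        return None
    seconds, name = min(candidates)
    return name, seconds

base_name, base_time = fastest_template(BASELINE, 6)
yolo_name, yolo_time = fastest_template(WITH_YOLO, 6)
print(f"{base_name}: {base_time} s, {yolo_name} + YOLO: {yolo_time} s, "
      f"speedup {base_time / yolo_time:.2f}x")
```

On this single test image the ratio is roughly 5.7x; it is one measurement, not a general guarantee.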

Benchmark on the Boofcv QR Image Set

We benchmark the boofcv QR image set with the l3 template, which is the most balanced one. The decoding speed improves significantly.
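The article does not show the benchmark loop itself. A rough sketch of how such a measurement could be structured is given below, assuming a decode function that takes an image path and returns a list of results or None (as decode_buffer does in the listings above). The names benchmark and decode_fn are hypothetical, and the stub decoder exists only so the sketch runs without the SDK.

```python
import time

def benchmark(decode_fn, image_paths):
    """Time decode_fn over all images and total the number of codes found."""
    total_seconds = 0.0
    total_found = 0
    for path in image_paths:
        start = time.perf_counter()
        results = decode_fn(path)
        total_seconds += time.perf_counter() - start
        total_found += len(results) if results is not None else 0
    return {"images": len(image_paths), "found": total_found,
            "seconds": total_seconds}

def stub_decode(path):
    # Stand-in for a real decoder: pretends every image holds two QR codes
    return ["code-a", "code-b"]

print(benchmark(stub_decode, ["a.jpg", "b.jpg", "c.jpg"])["found"])  # prints "6"
```

Running the same image list through a plain DBR decoder and a YOLO-assisted one, both with the l3 template, would give comparable totals for time and QR count.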

Source Code

https://github.com/yushulx/barcode-qrcode-images/tree/main/darknet/sample/qr_decoding

Reference

Original article: https://dev.to/yushulx/how-to-optimize-parameters-for-multiple-qr-code-decoding-in-a-single-image-with-yolo-1c26
