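Elastic's Official Benchmarking Tool: Rally

This article is the day 6 entry of the ZOZO Advent Calendar 2021.

Overview

Rally is Elastic's official benchmarking tool: a CLI written in Python that puts load on Elasticsearch according to scenarios defined in JSON.
I used it in the past to verify performance after substantially increasing the number of fields, and it proved convenient, so this article gives a rough overview of how to use it.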

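Use cases

As a benchmarking tool

As mentioned above, Rally is a benchmarking tool. It can be used for load verification when upgrading Elasticsearch, when introducing a new plugin, and in similar situations.
This is only my impression from using it, but I think it is better suited to measuring load under mixed requests than under search requests alone.
Roughly the following metrics can be obtained.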

Request-related:

Throughput

Latency

Error rate

Elasticsearch cluster-related:

GC duration

Shard merge time

Index size

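The tool can also provision Elasticsearch itself, so there is no need to build a cluster separately. Installing the plugins and dictionaries needed for Japanese search only requires adding a definition.
To benchmark an existing cluster instead, you can pass the connection information as command-line parameters.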
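As a minimal sketch of what "passing the connection information" can look like, the helper below assembles an esrally command line for an already running cluster. It relies on Rally's `--pipeline=benchmark-only` and `--target-hosts` options; the function name and the host `127.0.0.1:9200` are illustrative assumptions, not something taken from this article.

```python
from typing import List


def benchmark_only_command(track: str, target_hosts: List[str]) -> List[str]:
    """Build an esrally argument list that targets an existing cluster."""
    if not target_hosts:
        raise ValueError("at least one host:port is required")
    return [
        "esrally",
        "race",
        f"--track={track}",
        # benchmark-only: Rally does not provision Elasticsearch itself.
        "--pipeline=benchmark-only",
        f"--target-hosts={','.join(target_hosts)}",
    ]


# Hypothetical host; replace it with the address of your own cluster.
print(" ".join(benchmark_only_command("percolator", ["127.0.0.1:9200"])))
```

The list form can be handed to `subprocess.run` as-is, which avoids shell quoting problems.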

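As a data extraction tool

Rally can build a track (that is, a benchmark scenario) from existing data. This means it can also serve as a tool for extracting data from one environment and moving it to another. This kind of operation usually ends up being done with a homegrown tool, but Rally makes it considerably easier, so I recommend it.

Trying out Rally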

Prerequisites

The verification in this article was performed in the following environment.

macOS Big Sur

Python : 3.8.1

Java : openjdk 11.0.9

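Rally is written in Python and requires Python 3.8 or later.
The required Java version depends on the version of Elasticsearch you want to launch. For details, see the official Rally documentation.

Installation

Rally can be installed with pip. As noted above, Python 3.8 or later is required.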

pip3 install esrally
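A quick way to confirm the prerequisite before or after installing is a small check like the one below. This is only a sketch: the minimum version comes from this article, and the helper names are made up for illustration.

```python
import shutil
import sys
from typing import Tuple

MIN_PYTHON = (3, 8)  # minimum version stated in this article


def python_is_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True when the given (major, minor) version meets the minimum."""
    return tuple(version) >= MIN_PYTHON


def esrally_on_path() -> bool:
    """Return True when an `esrally` executable can be found on PATH."""
    return shutil.which("esrally") is not None


print("Python version OK:", python_is_supported())
print("esrally found on PATH:", esrally_on_path())
```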

A quick trial run

The officially provided load test scenarios (tracks) can be listed with the following command.

% esrally list tracks

Available tracks:

Name           Description                                                              Documents    Compressed Size    Uncompressed Size    Default Challenge        All Challenges
-------------  -----------------------------------------------------------------------  -----------  -----------------  -------------------  -----------------------  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
geonames       POIs from Geonames                                                       11,396,503   252.9 MB           3.3 GB               append-no-conflicts      append-no-conflicts,append-no-conflicts-index-only,append-fast-with-conflicts,significant-text
percolator     Percolator benchmark based on AOL queries                                2,000,000    121.1 kB           104.9 MB             append-no-conflicts      append-no-conflicts
http_logs      HTTP server log data                                                     247,249,096  1.2 GB             31.1 GB              append-no-conflicts      append-no-conflicts,runtime-fields,append-no-conflicts-index-only,append-sorted-no-conflicts,append-index-only-with-ingest-pipeline,update,append-no-conflicts-index-reindex-only
geoshape       Shapes from PlanetOSM                                                    60,523,283   13.4 GB            45.4 GB              append-no-conflicts      append-no-conflicts
metricbeat     Metricbeat data                                                          1,079,600    87.7 MB            1.2 GB               append-no-conflicts      append-no-conflicts
geopoint       Point coordinates from PlanetOSM                                         60,844,404   482.1 MB           2.3 GB               append-no-conflicts      append-no-conflicts,append-no-conflicts-index-only,append-fast-with-conflicts
nyc_taxis      Taxi rides in New York in 2015                                           165,346,692  4.5 GB             74.3 GB              append-no-conflicts      indexing-querying,append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts-index-only,update,append-ml,date-histogram
geopointshape  Point coordinates from PlanetOSM indexed as geoshapes                    60,844,404   470.8 MB           2.6 GB               append-no-conflicts      append-no-conflicts,append-no-conflicts-index-only,append-fast-with-conflicts
so             Indexing benchmark using up to questions and answers from StackOverflow  36,062,278   8.9 GB             33.1 GB              append-no-conflicts      append-no-conflicts
eql            EQL benchmarks based on endgame index of SIEM demo cluster               60,782,211   4.5 GB             109.2 GB             default                  default
nested         StackOverflow Q&A stored as nested docs                                  11,203,029   663.3 MB           3.4 GB               nested-search-challenge  nested-search-challenge,index-only
noaa           Global daily weather measurements from NOAA                              33,659,481   949.4 MB           9.0 GB               append-no-conflicts      sql,append-no-conflicts,append-no-conflicts-index-only,top_metrics,aggs
pmc            Full text benchmark with academic papers from PMC                        574,199      5.5 GB             21.7 GB              append-no-conflicts      indexing-querying,append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts,append-fast-with-conflict

This time we use percolator, the lightest of these tracks, for the verification.

% esrally race --distribution-version=7.15.0 --track=percolator

[INFO] Race id is [6d4f67de-b291-4950-8318-072a638027c8]
[INFO] Preparing for race ...
[INFO] Downloading track data (121.1 kB total size)                               [100.0%]
[INFO] Decompressing track data from [/Users/pakio/.rally/benchmarks/data/percolator/queries-2.json.bz2] to [/Users/pakio/.rally/benchmarks/data/percolator/queries-2.json] (resulting size: [0.10] GB) ... [OK]
[INFO] Preparing file offset table for [/Users/pakio/.rally/benchmarks/data/percolator/queries-2.json] ... [OK]
[INFO] Racing on track [percolator], challenge [append-no-conflicts] and car ['defaults'] with version [7.15.0].

Running delete-index                                                           [100% done]
Running create-index                                                           [100% done]
Running check-cluster-health                                                   [100% done]
Running index                                                                  [100% done]
Running refresh-after-index                                                    [100% done]
Running force-merge                                                            [100% done]
Running refresh-after-force-merge                                              [100% done]
Running wait-until-merges-finish                                               [100% done]
Running percolator_with_content_president_bush                                 [100% done]
Running percolator_with_content_saddam_hussein                                 [100% done]
Running percolator_with_content_hurricane_katrina                              [100% done]
Running percolator_with_content_google                                         [100% done]
Running percolator_no_score_with_content_google                                [100% done]
Running percolator_with_highlighting                                           [100% done]
Running percolator_with_content_ignore_me                                      [ 46% done]
Running percolator_with_content_ignore_me                                      [100% done]
Running percolator_no_score_with_content_ignore_me                             [100% done]

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                                                         Metric |                                       Task |       Value |   Unit |
|---------------------------------------------------------------:|-------------------------------------------:|------------:|-------:|
|                     Cumulative indexing time of primary shards |                                            |     2.94787 |    min |
|             Min cumulative indexing time across primary shards |                                            |   0.0317167 |    min |
|          Median cumulative indexing time across primary shards |                                            |    0.507367 |    min |
|             Max cumulative indexing time across primary shards |                                            |     0.80695 |    min |
|            Cumulative indexing throttle time of primary shards |                                            |   0.0583667 |    min |
|    Min cumulative indexing throttle time across primary shards |                                            |           0 |    min |
| Median cumulative indexing throttle time across primary shards |                                            |           0 |    min |
|    Max cumulative indexing throttle time across primary shards |                                            |   0.0303167 |    min |
|                        Cumulative merge time of primary shards |                                            |   0.0116333 |    min |
|                       Cumulative merge count of primary shards |                                            |           1 |        |
|                Min cumulative merge time across primary shards |                                            |           0 |    min |
|             Median cumulative merge time across primary shards |                                            |           0 |    min |
|                Max cumulative merge time across primary shards |                                            |   0.0116333 |    min |
|               Cumulative merge throttle time of primary shards |                                            |           0 |    min |
|       Min cumulative merge throttle time across primary shards |                                            |           0 |    min |
|    Median cumulative merge throttle time across primary shards |                                            |           0 |    min |
|       Max cumulative merge throttle time across primary shards |                                            |           0 |    min |
|                      Cumulative refresh time of primary shards |                                            |     1.15005 |    min |
|                     Cumulative refresh count of primary shards |                                            |          59 |        |
|              Min cumulative refresh time across primary shards |                                            |   0.0188167 |    min |
|           Median cumulative refresh time across primary shards |                                            |     0.22465 |    min |
|              Max cumulative refresh time across primary shards |                                            |    0.275417 |    min |
|                        Cumulative flush time of primary shards |                                            |      0.0307 |    min |
|                       Cumulative flush count of primary shards |                                            |           9 |        |
|                Min cumulative flush time across primary shards |                                            |           0 |    min |
|             Median cumulative flush time across primary shards |                                            |           0 |    min |
|                Max cumulative flush time across primary shards |                                            |      0.0307 |    min |
|                                        Total Young Gen GC time |                                            |      15.778 |      s |
|                                       Total Young Gen GC count |                                            |        8995 |        |
|                                          Total Old Gen GC time |                                            |       0.319 |      s |
|                                         Total Old Gen GC count |                                            |           3 |        |
|                                                     Store size |                                            |   0.0781473 |     GB |
|                                                  Translog size |                                            | 3.07336e-07 |     GB |
|                                         Heap used for segments |                                            |   0.0661278 |     MB |
|                                       Heap used for doc values |                                            |   0.0077858 |     MB |
|                                            Heap used for terms |                                            |    0.036499 |     MB |
|                                            Heap used for norms |                                            |           0 |     MB |
|                                           Heap used for points |                                            |           0 |     MB |
|                                    Heap used for stored fields |                                            |    0.021843 |     MB |
|                                                  Segment count |                                            |          45 |        |
~~~ (omitted) ~~~
|                                                 Min Throughput | percolator_no_score_with_content_ignore_me |       15.05 |  ops/s |
|                                                Mean Throughput | percolator_no_score_with_content_ignore_me |       15.08 |  ops/s |
|                                              Median Throughput | percolator_no_score_with_content_ignore_me |       15.07 |  ops/s |
|                                                 Max Throughput | percolator_no_score_with_content_ignore_me |       15.11 |  ops/s |
|                                        50th percentile latency | percolator_no_score_with_content_ignore_me |     11.0354 |     ms |
|                                        90th percentile latency | percolator_no_score_with_content_ignore_me |     12.6811 |     ms |
|                                        99th percentile latency | percolator_no_score_with_content_ignore_me |     13.2097 |     ms |
|                                       100th percentile latency | percolator_no_score_with_content_ignore_me |     15.3898 |     ms |
|                                   50th percentile service time | percolator_no_score_with_content_ignore_me |     6.67196 |     ms |
|                                   90th percentile service time | percolator_no_score_with_content_ignore_me |     7.16561 |     ms |
|                                   99th percentile service time | percolator_no_score_with_content_ignore_me |     7.63037 |     ms |
|                                  100th percentile service time | percolator_no_score_with_content_ignore_me |     10.8921 |     ms |
|                                                     error rate | percolator_no_score_with_content_ignore_me |           0 |      % |

As shown above, a benchmark using an official track can be run with a single command.
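If you want to post-process these numbers, the summary table is regular enough to parse by hand. The following is a rough sketch, assuming the report keeps the four-column `Metric | Task | Value | Unit` layout shown above; `parse_report` is a hypothetical helper, not part of Rally.

```python
from typing import Dict, List

COLUMNS = ("metric", "task", "value", "unit")


def parse_report(text: str) -> List[Dict[str, str]]:
    """Turn the pipe-delimited summary table into a list of row dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # banner, log line, or blank line
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if len(cells) != len(COLUMNS):
            continue
        # Skip the header row and the |----:| separator row.
        if cells[0] == "Metric" or set(cells[0]) <= set("-:"):
            continue
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


sample = """
|                  Metric |                                       Task |     Value |  Unit |
|------------------------:|-------------------------------------------:|----------:|------:|
|              Store size |                                            | 0.0781473 |    GB |
|         Mean Throughput | percolator_no_score_with_content_ignore_me |     15.08 | ops/s |
| 99th percentile latency | percolator_no_score_with_content_ignore_me |   13.2097 |    ms |
"""

for row in parse_report(sample):
    print(row["metric"], row["task"] or "-", float(row["value"]), row["unit"])
```

Values are kept as strings so that the caller decides how to convert them.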

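Creating a custom track from an existing index

The steps above used a track officially provided by Rally, but in real use you will probably want to benchmark with your own data.
For that case, Rally provides a feature that connects to an existing cluster and creates a track from an arbitrary index.
It can be run with the following commands.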

# For a cluster with HTTPS and authentication (give the address without the https:// prefix)
esrally create-track --track={track_name} --target-hosts={cluster_address:port} --client-options="use_ssl:true,verify_certs:true,basic_auth_user:'{username}',basic_auth_password:'{password}',http_compression:true" --indices="{source_index_name}" --output-path={output_path}

# For a cluster with HTTP and no authentication (give the address without the http:// prefix)
esrally create-track --track={track_name} --target-hosts={cluster_address:port} --client-options="http_compression:true" --indices="{source_index_name}" --output-path={output_path}

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/
 
[INFO] Connected to Elasticsearch cluster [XXXX] version [7.15.0].
 
Extracting documents for index [example...   1000/1000 docs [100.0% done]
Extracting documents for index [example]...  nnnn/nnnn docs [100.0% done]
 
[INFO] Track example has been created. Run it with: esrally --track-path=/path/to/tracks/example
 
---------------------------------
[INFO] SUCCESS (took 266 seconds)
---------------------------------
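The `--client-options` value used in the commands above is a comma-separated list of `key:value` pairs, which is easy to get wrong by hand. Here is a minimal sketch that renders it from a dictionary; the helper name and the sample credentials are assumptions for illustration, and values containing commas or quotes are not handled.

```python
from typing import Dict, Union

OptionValue = Union[bool, str]


def format_client_options(options: Dict[str, OptionValue]) -> str:
    """Render options in the key:value,key:value form used above."""
    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        else:
            # Strings are wrapped in single quotes, as in the commands above.
            rendered = f"'{value}'"
        parts.append(f"{key}:{rendered}")
    return ",".join(parts)


# Placeholder credentials; never hard-code real ones.
print(format_client_options({
    "use_ssl": True,
    "verify_certs": True,
    "basic_auth_user": "elastic",
    "basic_auth_password": "changeme",
    "http_compression": True,
}))
```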

Let's check the files that were generated.

% cd tracks/example
% tree .
.
├── track.json
├── example-documents-1k.json
├── example-documents-1k.json.bz2
├── example-documents.json
├── example-documents.json.bz2
└── example.json

You can see that the track definition, the data extracted from the index, a sample of only 1,000 of those documents, and the index definition have each been created.
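To sanity-check an extracted corpus, you can count its documents. The sketch below assumes the uncompressed documents file holds one JSON document per line, which should be verified against your own output; `count_documents` is a hypothetical helper.

```python
import json
from pathlib import Path


def count_documents(path: Path) -> int:
    """Count the JSON documents in a line-delimited file, validating each line."""
    count = 0
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(f"{path}:{line_number} is not valid JSON") from error
            count += 1
    return count


# Example call, using the file names from the tree output above:
# count_documents(Path("tracks/example/example-documents-1k.json"))
```

For the 1k sample file in this article, the result would be expected to match the 1000 documents reported during extraction.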

Running the custom track

To run a benchmark with the track created in the steps above, use the following command.

esrally race --track-path=./tracks/example --distribution-version=7.15.0

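Note that the Elasticsearch instance launched this way has no plugins installed.

Launching a cluster with plugins

Rally also supports benchmarking a cluster that has plugins installed.
The plugins that can be installed with a single command are basically the ones officially provided by Elasticsearch. They can be listed with the following command.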

% esrally list elasticsearch-plugins

Available Elasticsearch plugins:

Name                     Configuration
-----------------------  ---------------
analysis-icu
analysis-kuromoji
analysis-phonetic
analysis-smartcn
analysis-stempel
analysis-ukrainian
discovery-azure-classic
discovery-ec2
discovery-file
discovery-gce
ingest-attachment
lang-javascript
lang-python
mapper-attachments
mapper-murmur3
mapper-size
repository-azure
repository-gcs
repository-hdfs
repository-s3
store-smb
transport-nio
transport-nio            transport
transport-nio            http

To run with plugins from this list, add --elasticsearch-plugins="{plugin names}" to the command-line parameters, separating multiple names with commas.

esrally race --distribution-version=7.15.0 --track=percolator --elasticsearch-plugins="analysis-icu,analysis-kuromoji"

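Plugins other than these can also be installed at launch. See the official documentation for details.

Launching a cluster with dictionaries added

For Japanese search, it is common to configure synonyms, user dictionaries, and the like. Rally can launch a cluster that includes custom dictionaries too, but the procedure is somewhat involved, so I am recording the steps below.

Defining a custom test environment (Car)

In Rally, each environment in which a benchmark runs is called a Car. Cars are defined under ~/.rally/benchmarks/teams/default/cars/v1, where each car consists of a single XXX.ini file, optionally accompanied by a directory of the same name.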

% vi with-dictionary.ini

with-dictionary.ini

[meta]
description=Includes custom dictionary
type=mixin

[config]
base=with-dictionary,vanilla

% mkdir -p with-dictionary/templates/config

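This completes the car definition and the directory in which the dictionaries are placed. Put your custom dictionary and synonym files in this folder.
Next, print the list of cars and check that the newly defined car is included. If it is, the setup succeeded.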

% esrally list cars

Available cars:

Name                     Type    Description
-----------------------  ------  --------------------------------------
16gheap                  car     Sets the Java heap to 16GB
1gheap                   car     Sets the Java heap to 1GB
24gheap                  car     Sets the Java heap to 24GB
2gheap                   car     Sets the Java heap to 2GB
4gheap                   car     Sets the Java heap to 4GB
8gheap                   car     Sets the Java heap to 8GB
defaults                 car     Sets the Java heap to 1GB
basic-license            mixin   Basic License
debug-non-safepoints     mixin   More accurate CPU profiles
ea                       mixin   Enables Java assertions
fp                       mixin   Preserves frame pointers
g1gc                     mixin   Enables the G1 garbage collector
parallelgc               mixin   Enables the Parallel garbage collector
trial-license            mixin   Trial License
unpooled                 mixin   Enables Netty's unpooled allocator
with-dictionary          mixin   Includes custom dictionary
x-pack-ml                mixin   X-Pack Machine Learning
x-pack-monitoring-http   mixin   X-Pack Monitoring (HTTP exporter)
x-pack-monitoring-local  mixin   X-Pack Monitoring (local exporter)
x-pack-security          mixin   X-Pack Security
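The car definition steps above can also be scripted. The following is a sketch that writes the same `with-dictionary.ini` content and creates the `templates/config` directory; the function name is made up, and the target directory is passed in explicitly so that nothing is written to `~/.rally` by accident.

```python
import configparser
from pathlib import Path


def write_dictionary_car(cars_dir: Path, name: str = "with-dictionary") -> Path:
    """Create <name>.ini and <name>/templates/config under cars_dir."""
    config = configparser.ConfigParser()
    config["meta"] = {"description": "Includes custom dictionary", "type": "mixin"}
    config["config"] = {"base": f"{name},vanilla"}

    cars_dir.mkdir(parents=True, exist_ok=True)
    ini_path = cars_dir / f"{name}.ini"
    with ini_path.open("w", encoding="utf-8") as handle:
        # Write key=value without surrounding spaces, as in the file above.
        config.write(handle, space_around_delimiters=False)

    # Dictionary and synonym files go into this directory.
    (cars_dir / name / "templates" / "config").mkdir(parents=True, exist_ok=True)
    return ini_path


# Example call, using the path mentioned in this article:
# write_dictionary_car(Path.home() / ".rally/benchmarks/teams/default/cars/v1")
```

After running it against the real cars directory, `esrally list cars` should show the new entry as in the listing above.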

Specifying the car to run

To use a car created with the steps above, or any car other than the default, pass the command-line parameter --car="{car name}" when running a race.

% esrally race --distribution-version=7.15.0 --track=percolator --car="with-dictionary"
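Because the race options shown in this article combine freely, a small builder can keep scripted runs readable. This is a sketch only: `build_race_command` is a hypothetical helper that assembles the flags used above (`--distribution-version`, `--track`, `--elasticsearch-plugins`, `--car`) and does not run anything.

```python
import shlex
from typing import List, Optional, Sequence


def build_race_command(
    distribution_version: str,
    track: str,
    plugins: Sequence[str] = (),
    car: Optional[str] = None,
) -> List[str]:
    """Assemble an `esrally race` argument list from the options above."""
    command = [
        "esrally",
        "race",
        f"--distribution-version={distribution_version}",
        f"--track={track}",
    ]
    if plugins:
        # Multiple plugins are passed as one comma-separated value.
        command.append(f"--elasticsearch-plugins={','.join(plugins)}")
    if car:
        command.append(f"--car={car}")
    return command


command = build_race_command(
    "7.15.0",
    "percolator",
    plugins=["analysis-icu", "analysis-kuromoji"],
    car="with-dictionary",
)
print(shlex.join(command))  # shlex.join requires Python 3.8+
```

The returned list can be passed directly to `subprocess.run(command, check=True)` once Rally is installed.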

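Summary

This article introduced Rally, Elastic's own benchmarking tool.
It is convenient to use, but I could find few usage examples outside the official documentation, and honestly my first impression was that it was hard to approach. Working through this verification changed that impression considerably.
Since all you need is Java installed and the track and car definitions fetched from a repository, the possibilities are exciting, for example running benchmarks on a schedule with GitHub Actions.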

Reference

Original article (Elastic's Official Benchmarking Tool: Rally): https://zenn.dev/pakio/articles/esrally-tutorial
