Displaying formatted CSV files in the terminal


Stack Overflow - Command line CSV viewer?
Like the asker of that question, I was looking for a CLI tool that displays CSV in a formatted layout, so I am writing down the solutions I found.

Japanese CSV files are often encoded in Shift_JIS or EUC-JP (all because of Excel, all because of Gates), so I also cover how to convert them to UTF-8 before displaying them.

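CSV display commands

column

tty-table (Node.js command)

The column command

If you display a CSV with plain cat or head, you get something like ,,344,,,22,, and I suspect most people have found that very hard to read.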

$ head macrodata.csv                                                                                                                                       
year,quarter,realgdp,realcons,realinv,realgovt,realdpi,cpi,m1,tbilrate,unemp,pop,infl,realint
1959.0,1.0,2710.349,1707.4,286.898,470.045,1886.9,28.98,139.7,2.82,5.8,177.146,0.0,0.0
1959.0,2.0,2778.801,1733.7,310.859,481.301,1919.7,29.15,141.7,3.08,5.1,177.83,2.34,0.74
1959.0,3.0,2775.488,1751.8,289.226,491.26,1916.4,29.35,140.5,3.82,5.3,178.657,2.74,1.09
1959.0,4.0,2785.204,1753.7,299.356,484.052,1931.3,29.37,140.0,4.33,5.6,179.386,0.27,4.06
1960.0,1.0,2847.699,1770.5,331.722,462.199,1955.5,29.54,139.6,3.5,5.2,180.007,2.31,1.19
1960.0,2.0,2834.39,1792.9,298.152,460.4,1966.1,29.55,140.2,2.68,5.2,180.671,0.14,2.55
1960.0,3.0,2839.022,1785.8,296.375,474.676,1967.8,29.75,140.9,2.36,5.6,181.528,2.7,-0.34
1960.0,4.0,2802.616,1788.2,259.764,476.434,1966.6,29.84,141.1,2.29,6.3,182.287,1.21,1.08
1961.0,1.0,2819.264,1787.7,266.405,475.854,1984.5,29.81,142.1,2.37,6.8,182.992,-0.4,2.77

The column command adjusts the whitespace so that the columns line up nicely.
It was installed by default on the Ubuntu 16.04 and Arch Linux systems I use.

$ column -s, -t macrodata.csv | head                                                                                                                             
year    quarter  realgdp    realcons  realinv   realgovt  realdpi  cpi      m1      tbilrate  unemp  pop      infl   realint
1959.0  1.0      2710.349   1707.4    286.898   470.045   1886.9   28.98    139.7   2.82      5.8    177.146  0.0    0.0
1959.0  2.0      2778.801   1733.7    310.859   481.301   1919.7   29.15    141.7   3.08      5.1    177.83   2.34   0.74
1959.0  3.0      2775.488   1751.8    289.226   491.26    1916.4   29.35    140.5   3.82      5.3    178.657  2.74   1.09
1959.0  4.0      2785.204   1753.7    299.356   484.052   1931.3   29.37    140.0   4.33      5.6    179.386  0.27   4.06
1960.0  1.0      2847.699   1770.5    331.722   462.199   1955.5   29.54    139.6   3.5       5.2    180.007  2.31   1.19
1960.0  2.0      2834.39    1792.9    298.152   460.4     1966.1   29.55    140.2   2.68      5.2    180.671  0.14   2.55
1960.0  3.0      2839.022   1785.8    296.375   474.676   1967.8   29.75    140.9   2.36      5.6    181.528  2.7    -0.34
1960.0  4.0      2802.616   1788.2    259.764   476.434   1966.6   29.84    141.1   2.29      6.3    182.287  1.21   1.08
1961.0  1.0      2819.264   1787.7    266.405   475.854   1984.5   29.81    142.1   2.37      6.8    182.992  -0.4   2.77
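
One caveat: some older column implementations are reported to merge adjacent delimiters, so rows with empty fields (such as ,,344,,,22,,) can end up shifted to the left. The following is a minimal sketch of a workaround and is not part of the original article. pad_empty_fields is a hypothetical helper name, and it assumes a simple CSV without quoted fields that contain commas.

```shell
# pad_empty_fields: hypothetical helper (not from the original article).
# Replaces every empty comma-separated field with a single space so that
# no two delimiters are adjacent when the text reaches column.
# Assumption: plain CSV, no quoted fields containing commas.
pad_empty_fields() {
  awk 'BEGIN { FS = ","; OFS = "," }
       {
         for (i = 1; i <= NF; i++) {
           if ($i == "") $i = " "
         }
         print
       }'
}
```

It can be placed in front of column like this:

$ pad_empty_fields < macrodata.csv | column -s, -t | head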

column usage

$ column -h
Usage:
 column [options] [<file>...]

Columnate lists.

Options:
 -t, --table                      create a table
 -n, --table-name <name>          table name for JSON output
 -O, --table-order <columns>      specify order of output columns
 -N, --table-columns <names>      comma separated columns names
 -E, --table-noextreme <columns>  don't count long text from the columns to column width
 -d, --table-noheadings           don't print header
 -e, --table-header-repeat        repeat header for each page
 -H, --table-hide <columns>       don't print the columns
 -R, --table-right <columns>      right align text in these columns
 -T, --table-truncate <columns>   truncate text in the columns when necessary
 -W, --table-wrap <columns>       wrap text in the columns when necessary
 -J, --json                       use JSON output format for table

 -r, --tree <column>              column to use tree-like output for the table
 -i, --tree-id <column>           line ID to specify child-parent relation
 -p, --tree-parent <column>       parent to specify child-parent relation

 -c, --output-width <width>       width of output in number of characters
 -o, --output-separator <string>  columns separator for table output (default is two spaces)
 -s, --separator <string>         possible table delimiters
 -x, --fillrows                   fill rows before columns

 -h, --help                       display this help
 -V, --version                    display version

For more details see column(1).
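
As far as I can tell, column -t has to read all of its input before it can work out the column widths, so column ... | head still processes the entire file. For a quick look at a large file it may be cheaper to cut first and align second. A minimal sketch, using only the -t and -s options listed above (csv_peek is a hypothetical name):

```shell
# csv_peek: hypothetical helper (not from the original article).
# Usage: csv_peek FILE [ROWS]
# Takes the first ROWS lines (default 10) and only then aligns them,
# so column receives just the previewed rows.
csv_peek() {
  head -n "${2:-10}" "$1" | column -s, -t
}
```

Note that the widths are then computed from the previewed rows only, so they can differ from what the full file would produce.

$ csv_peek macrodata.csv 5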

The tty-table command (Node.js)

Here I use kakeibo2016Jun-utf8.csv, a CSV file that has already been converted to UTF-8 with nkf or a similar command. Displayed with plain head, it is a mess of commas and hard to read.

$ head kakeibo2016Jun-utf8.csv
日付,品名/店名,カテゴリ,支払方法,金額
1月1日,神社,娯楽費,現金,135
1月1日,飲み代,飲み代,クレジット,23410
1月1日,飲み代清算,飲み代,現金,-20000
1月7日,飲み代,,,
,,,,
,,,,
,,,,
,,,,
,,,,

With the Node.js tool tty-table you get a cleanly drawn table.
First install Node.js together with its package manager npm, then install tty-table with npm.

$ sudo pacman -S nodejs npm
$ sudo npm i -g tty-table
$ head kakeibo2016Jun-utf8.csv | tty-table
   ┌──────┬───────────┬──────────┬──────────┬────────┐
  │ 日付 │ 品名/店名 │ カテゴリ │ 支払方法 │  金額  │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │ 1月1 │   神社    │  娯楽費  │   現金   │  135   │
  │  日  │           │          │          │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │ 1月1 │  飲み代   │  飲み代  │ クレジッ │ 23410  │
  │  日  │           │          │    ト    │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │ 1月1 │ 飲み代清  │  飲み代  │   現金   │ -20000 │
  │  日  │    算     │          │          │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │ 1月7 │  飲み代   │          │          │        │
  │  日  │           │          │          │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │      │           │          │          │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │      │           │          │          │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │      │           │          │          │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │      │           │          │          │        │
  ├──────┼───────────┼──────────┼──────────┼────────┤
  │      │           │          │          │        │
  └──────┴───────────┴──────────┴──────────┴────────┘

Pasted as text, the table looks misaligned, but in the terminal it is displayed as a cleanly formatted table.

tty-table usage

$ tty-table -h
Options:
  --version           Show version number                              [boolean]
  --csv-delimiter     Set the field delimiter. One character only.
                                                                  [default: ","]
  --csv-escape        Set the escape character. One character only.
  --csv-rowDelimiter  String used to delimit record rows. You can also use a
                      special constant: "auto","unix","max","windows","unicode".
                                                                     [default: "
                                                                              "]
  --format            Set input data format
                                       [choices: "json", "csv"] [default: "csv"]
  --options-*         Specify an optional setting where * is the setting name.
                      See README.md for a complete list.
  -h                  Show help                                        [boolean]

Copyight github.com/tecfu 2018

The csvlook command from the Python csvkit package

=== Added 2018/12/29 ===

Install it with pip or conda.

$ pip install csvkit
# or
$ conda install csvkit

$ csvlook imdb-250.csv

| Title                                                                       | title trim                                                     |  Year | Rank |
| --------------------------------------------------------------------------- | -------------------------------------------------------------- | ----- | ---- |
| Sherlock Jr. (1924)                                                         | SherlockJr.(1924)                                              | 1,924 |  221 |
| The Passion of Joan of Arc (1928)                                           | ThePassionofJoanofArc(1928)                                    | 1,928 |  212 |
| His Girl Friday (1940)                                                      | HisGirlFriday(1940)                                            | 1,940 |  250 |
| Tokyo Story (1953)                                                          | TokyoStory(1953)                                               | 1,953 |  248 |
| The Man Who Shot Liberty Valance (1962)                                     | TheManWhoShotLibertyValance(1962)                              | 1,962 |  237 |
| Persona (1966)                                                              | Persona(1966)                                                  | 1,966 |  200 |
| Stalker (1979)                                                              | Stalker(1979)                                                  | 1,979 |  243 |
| Fanny and Alexander (1982)                                                  | FannyandAlexander(1982)                                        | 1,982 |  210 |
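... (omitted)

csvkit also includes a command that converts Excel files (.xlsx) to CSV, so running

$ in2csv imdb-250.xlsx | head | csvlook

converts the Excel file to CSV on standard output, takes the first 10 lines, and displays them with ruled borders as above.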
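
In the csvlook output shown earlier, the Year column is rendered as 1,924 because csvlook infers column types and formats numbers. If that is not wanted, recent csvkit releases provide a -I (--no-inference) option. The sketch below assumes such a release is installed; csvlook_raw is a hypothetical wrapper name.

```shell
# csvlook_raw: hypothetical wrapper (not from the original article).
# Passes -I (--no-inference) so csvlook prints every value as plain text
# instead of guessing types (which is what turns 1924 into 1,924).
# Assumption: the installed csvkit supports -I; check csvlook --help.
csvlook_raw() {
  csvlook -I "$@"
}
```

It drops into the same pipeline:

$ in2csv imdb-250.xlsx | head | csvlook_raw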

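Reference: Chapter 3 of Data Science at the Command Line [PDF]

Encoding problems

If the methods above do not produce a properly formatted table, the file may be in an encoding such as Shift_JIS, which can cause errors.
Convert it to UTF-8 with nkf or iconv, then try column or tty-table again.

All because of Excel, all because of Gates (and so on).

Checking the file encoding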

file -i <filepath>

nkf --guess <filepath>

uchardet <filepath>

$ file -i kakeibo2016Jun.csv
kakeibo2016Jun.csv: text/plain; charset=unknown-8bit

$ yaourt -S nkf
$ nkf --guess kakeibo2016Jun.csv
Shift_JIS (CRLF)
$ nkf -g kakeibo2016Jun.csv
Shift_JIS

$ sudo pacman -S uchardet
$ uchardet kakeibo2016Jun.csv
SHIFT_JIS
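
The three tools above guess the encoding. If all that is needed is a yes/no answer to "is this file already valid UTF-8?", iconv can serve as a rough check, since it exits with a non-zero status when the input is not valid in the stated source encoding. A minimal sketch, with is_utf8 as a hypothetical helper name:

```shell
# is_utf8: hypothetical helper (not from the original article).
# Returns 0 if FILE decodes cleanly as UTF-8, non-zero otherwise.
# Pure ASCII files also count as valid UTF-8.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

$ is_utf8 kakeibo2016Jun.csv || echo "needs conversion"

This only tells you that a file is not UTF-8. To find out what it actually is, the detection commands above are still needed.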

Reference: Checking a file's character encoding (UTF-8 or Shift_JIS or EUC-JP…) on Linux/UNIX

Converting to UTF-8

Convert the Shift_JIS file kakeibo2016Jun.csv into the UTF-8 file kakeibo2016Jun-utf8.csv.

Converting with the iconv command

Shift_JIS -> UTF-8

$ iconv -f sjis -t utf8 -o kakeibo2016Jun{-utf8,}.csv

iconv usage: iconv -f ENCODING -t ENCODING INPUTFILE

-f : encoding of the source text

-t : encoding to convert to

-o : output file

INPUTFILE : name of the file to convert
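
The iconv command above relies on shell brace expansion: kakeibo2016Jun{-utf8,}.csv expands to the two words kakeibo2016Jun-utf8.csv (which becomes the argument of -o) and kakeibo2016Jun.csv (the input file). The same naming rule can be written out as a small function for shells without brace expansion. This is a sketch, not the author's method: sjis_to_utf8 is a hypothetical name, and it writes through a redirect instead of -o.

```shell
# sjis_to_utf8: hypothetical helper (not from the original article).
# Usage: sjis_to_utf8 NAME.csv   ->   writes NAME-utf8.csv next to it
# Assumption: the input really is Shift_JIS; check it first with the
# detection commands shown earlier.
sjis_to_utf8() {
  src=$1
  dst="${src%.csv}-utf8.csv"
  # On failure, remove the partial output so a broken file is not left behind.
  iconv -f SHIFT_JIS -t UTF-8 "$src" > "$dst" || { rm -f "$dst"; return 1; }
}
```

$ sjis_to_utf8 kakeibo2016Jun.csv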

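iconv help output

$ iconv --help
Usage: iconv [OPTION...] [FILE...]
Convert encoding of given files from one encoding to another.
 Input/Output format specification:
  -f, --from-code=NAME       encoding of original text
  -t, --to-code=NAME         encoding for output

 Information:
  -l, --list
                             list all known coded character sets

 Output control:
  -c                         omit invalid characters from output
  -o, --output=FILE          output file
  -s, --silent               suppress warnings
      --verbose              print progress information

  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional for any corresponding short options.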

For bug reporting instructions, please see:
<https://bugs.archlinux.org/>.

Converting with the nkf command

Convert to UTF-8 with nkf -w

$ nkf -w kakeibo2016Jun.csv > kakeibo2016Jun-utf8-nkf.csv
$ nkf -g kakeibo2016Jun-utf8-nkf.csv  # check the file encoding
UTF-8

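With the --overwrite option, nkf apparently overwrites the file in place. (According to the help output below, --in-place overwrites the original file, and --overwrite does so while preserving the original timestamp.)

nkf command usage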

$ nkf --help
Usage:  nkf -[flags] [--] [in file] .. [out file for -O flag]
 j/s/e/w  Specify output encoding ISO-2022-JP, Shift_JIS, EUC-JP
          UTF options is -w[8[0],{16,32}[{B,L}[0]]]
 J/S/E/W  Specify input encoding ISO-2022-JP, Shift_JIS, EUC-JP
          UTF option is -W[8,[16,32][B,L]]
 m[BQSN0] MIME decode [B:base64,Q:quoted,S:strict,N:nonstrict,0:no decode]
 M[BQ]    MIME encode [B:base64 Q:quoted]
 f/F      Folding: -f60 or -f or -f60-10 (fold margin 10) F preserve nl
 Z[0-4]   Default/0: Convert JISX0208 Alphabet to ASCII
          1: Kankaku to one space  2: to two spaces  3: HTML Entity
          4: JISX0208 Katakana to JISX0201 Katakana
 X,x      Convert Halfwidth Katakana to Fullwidth or preserve it
 O        Output to File (DEFAULT 'nkf.out')
 L[uwm]   Line mode u:LF w:CRLF m:CR (DEFAULT noconversion)
 --ic=<encoding>        Specify the input encoding
 --oc=<encoding>        Specify the output encoding
 --hiragana --katakana  Hiragana/Katakana Conversion
 --katakana-hiragana    Converts each other
 --{cap, url}-input     Convert hex after ':' or '%'
 --numchar-input        Convert Unicode Character Reference
 --fb-{skip, html, xml, perl, java, subchar}
                        Specify unassigned character's replacement
 --in-place[=SUF]       Overwrite original files
 --overwrite[=SUF]      Preserve timestamp of original files
 -g --guess             Guess the input code
 -v --version           Print the version
 --help/-V              Print this help / configuration
Network Kanji Filter Version 2.1.4 (2015-12-12) 
Copyright (C) 1987, FUJITSU LTD. (I.Ichikawa).
Copyright (C) 1996-2015, The nkf Project.

My recommendation is nkf, since both checking and converting take only a short command. file and iconv are probably installed by default. uchardet has few options, and I concluded there is no need to install it.
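
Putting the pieces together, conversion and display can also be chained in a pipe so that no intermediate -utf8 file is needed. A minimal sketch, assuming the input is Shift_JIS and column is installed. view_sjis_csv is a hypothetical name, and nkf -w FILE could stand in for the iconv call.

```shell
# view_sjis_csv: hypothetical helper (not from the original article).
# Converts a Shift_JIS CSV to UTF-8 on the fly and aligns it as a table.
# Assumptions: the file is Shift_JIS and has no quoted fields with commas.
view_sjis_csv() {
  iconv -f SHIFT_JIS -t UTF-8 "$1" | column -s, -t
}
```

$ view_sjis_csv kakeibo2016Jun.csv | head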

Reference: [Linux] Converting a file's character encoding with vi, iconv, and nkf (nkf's encoding detection and batch conversion are handy)

Reference

Source of this article (Displaying formatted CSV files in the terminal): https://qiita.com/u1and0/items/b8c412bebeb86c042829
