Data Processing in Shell
options: the difference between - and -- (example below)
- : short (abbreviated) form of an option
-- : long (descriptive) form of an option
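A minimal illustration with csvkit (the long option spellings here, e.g. --columns and --help, are the standard csvkit names and are shown as an assumption rather than taken from these notes):
# Short and long forms are interchangeable; each pair runs the same command
csvsort -c 2 Spotify_Popularity.csv         # short form
csvsort --columns 2 Spotify_Popularity.csv  # long form
csvlook -h       # short form
csvlook --help   # long form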
data processing with csvkit
# Take top 15 rows from sorted output and save to new file
csvsort -c 2 Spotify_Popularity.csv | head -n 15 > Spotify_Popularity_Top15.csv
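Note that csvsort sorts in ascending order by default, so the first 15 rows are the lowest values in column 2. If "top 15" should mean the highest popularity scores, a descending sort works; a minimal sketch assuming csvsort's -r / --reverse flag:
# Sort descending so the first 15 rows hold the highest values in column 2
csvsort -c 2 -r Spotify_Popularity.csv | head -n 15 > Spotify_Popularity_Top15.csv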
# Convert the Spotify201809 tab into its own csv file
in2csv Spotify_201809_201810.xlsx --sheet "Spotify201809" > Spotify201809.csv
# Check to confirm name and location of data file
ls
# Preview the file using a csvkit function
csvlook Spotify201809.csv
# Create a new csv with 2 columns: track_id and popularity
csvcut -c "track_id","popularity" Spotify201809.csv > Spotify201809_subset.csv
# While stacking the 2 files, create a data source column
csvstack -g "Sep2018","Oct2018" Spotify201809_subset.csv Spotify201810_subset.csv > Spotify_all_rankings.csv
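csvstack's -g flag adds a grouping column, named "group" by default (it can be renamed with -n); a quick check that it was added:
# Preview the stacked file; the first column should hold the Sep2018/Oct2018 labels
csvlook Spotify_all_rankings.csv | head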
pulling data from databases
databases supported by sql2csv : Firebird, Microsoft SQL Server, MySQL, PostgreSQL, SQLite
- NoSQL databases such as MongoDB are not supported (connection-string sketch below)
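The --db argument is a SQLAlchemy connection string, so switching databases only changes that string. A sketch with hypothetical credentials and database names, assuming the relevant Python drivers are installed:
# SQLite: path to a local .db file
sql2csv --db "sqlite:///SpotifyDatabase.db" --query "SELECT 1"
# PostgreSQL (hypothetical user, password, host, and database name)
sql2csv --db "postgresql://username:password@localhost:5432/spotify" --query "SELECT 1"
# MySQL (hypothetical credentials)
sql2csv --db "mysql://username:password@localhost:3306/spotify" --query "SELECT 1"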
# Verify database name
ls
# Save query results to new file Spotify_Popularity_5Rows.csv
sql2csv --db "sqlite:///SpotifyDatabase.db" \
--query "SELECT * FROM Spotify_Popularity LIMIT 5" \
> Spotify_Popularity_5Rows.csv
# Verify newly created file
ls
# Print preview of newly created file
csvlook Spotify_Popularity_5Rows.csv
pushing data into DB
# Preview file
ls
# Upload Spotify_MusicAttributes.csv to database
csvsql --db "sqlite:///SpotifyDatabase.db" --insert Spotify_MusicAttributes.csv
# Store SQL query as shell variable
sqlquery="SELECT * FROM Spotify_MusicAttributes"
# Apply SQL query to pull the newly created table back from the database
sql2csv --db "sqlite:///SpotifyDatabase.db" --query "$sqlquery"
# Store SQL for querying from SQLite database
sqlquery_pull="SELECT * FROM SpotifyMostRecentData"
# Apply SQL to save table as local file
sql2csv --db "sqlite:///SpotifyDatabase.db" --query "$sqlquery_pull" > SpotifyMostRecentData.csv
# Store SQL for UNION of the two local CSV files
sqlquery_union="SELECT * FROM SpotifyMostRecentData UNION ALL SELECT * FROM Spotify201812"
# Apply SQL to union the two local CSV files and save as local file
csvsql --query "$sqlquery_union" SpotifyMostRecentData.csv Spotify201812.csv > UnionedSpotifyData.csv
# Push UnionedSpotifyData.csv to database as a new table
csvsql --db "sqlite:///SpotifyDatabase.db" --insert UnionedSpotifyData.csv
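If the target table already exists, re-running csvsql --insert will fail; csvkit's --overwrite flag (an assumption here, not shown in the original notes) drops and recreates the table first:
# Re-push the file, dropping the existing table before recreating it
csvsql --db "sqlite:///SpotifyDatabase.db" --insert --overwrite UnionedSpotifyData.csv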
$ : prefix used to reference a shell variable's value (quoting sketch below)
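A minimal sketch of why the snippets above wrap the variable in double quotes (single quotes suppress the expansion):
# Store a query in a shell variable
sqlquery="SELECT * FROM Spotify_MusicAttributes"
# Double quotes expand the variable; single quotes print it literally
echo "$sqlquery"    # prints: SELECT * FROM Spotify_MusicAttributes
echo '$sqlquery'    # prints: $sqlquery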
python in shell
- pip list : show the Python packages installed in the current environment
- crontab -l : list the CRON jobs currently scheduled (usage sketch below)
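A sketch of how these fit into a shell workflow (the package name is an illustrative assumption; create_model.py is the script used below):
# Check which packages and versions are available to the script
pip list
# Install or upgrade a required package (scikit-learn is illustrative)
pip install --upgrade scikit-learn
# Run the Python script from the shell
python create_model.py
# Review currently scheduled CRON jobs
crontab -l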
# Add CRON job that runs create_model.py every minute
echo "* * * * * python create_model.py" | crontab
# Verify that the CRON job has been scheduled via CRONTAB
crontab -l
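The five asterisks are the CRON schedule fields: minute, hour, day of month, month, day of week. A couple of alternative schedules as a sketch (note that piping to crontab replaces the existing crontab, as in the example above):
# Run create_model.py every day at 06:30
echo "30 6 * * * python create_model.py" | crontab
# Run create_model.py every Monday at midnight (day of week 1 = Monday)
echo "0 0 * * 1 python create_model.py" | crontab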