コラム

2021.04.28

【API・データ検証奮闘記】#7.WorkflowでJsonファイルを展開してみた

本コラムでは、APIやデータに関連する言語などを無邪気に触ってみた備忘録として、ライトに記載していきます!
「ちょっと違くない」「他にいい方法あるのに」といったご意見もあるかと思いますが、何卒お手柔らかに!!

<プロフィール>
富松 良介
2017年、株式会社サイバー・コミュニケーションズ(CCI)入社。Oracle Bluekai・Treasure Data等のDMPや、AWS・GCP等のPublicCloud領域を担務。2019年6月よりデータの利活用を推進するコンサルティング会社「株式会社DataCurrent」に出向し、事業会社の基盤構築・運用や自社ソリューション開発を担当。

●WorkflowでJsonファイルを展開してみた

ボス
ボス
データを格納して欲しい。Json形式らしいがよろしく頼む。
モブ
モブ
!?
かしこまり!!

TreasureData workflowを活用したJsonファイル展開の手順から格納した結果までをご紹介致します。

ステップ1. どうやってJsonファイルを展開できるか調査してみた

モブ1
モブ1
Json形式ってどうやって取り込む!?
モブ2
モブ2
1レコード単位に加工して取り込めばいけそう!?
モブ
モブ
Treasure Data Workflowで加工することにしよう!!
ボス
ボス
それでやってくれ。

●ステップ2. データ加工してみた

早速Workflowでデータを加工してみる。

インポート対象のJsonファイルは下記のとおりとなっている

全部展開して格納する

timezone: Asia/Tokyo
_export:
  td:
    database: your_database
    table: your_table

+json_parse:
  td_load>: json_parse_1.yml
  database: ${td.database}
  table: ${td.table}
in:
  type: gcs
  bucket: your_bucket
  path_prefix: path/results-${session_date_compact}.json
  auth_method: json_key
  json_keyfile: 
    content: ${secret:gcp.credential}
  parser:
    type: json
    escape: "\\\\"
    strip_whitespace: true
    delimiter: ","
    charset: UTF-8
    skip_header_lines: 0
    newline: LF
    strip_quote: true
    trim_if_not_quoted: false
    quote: "\\\""
    stop_on_invalid_record: false
    columns:
    - {name: event_date, type: string}
    - {name: event_timestamp, type: string}
    - {name: event_name, type: string}
    - {name: event_params_key, type: string, element_at: /event_params/key}
    - {name: event_params_value_string_value, type: string, element_at: /event_params/value/string_value}
    - {name: event_params_value_int_value, type: string, element_at: /event_params/value/int_value}
    - {name: event_params_value_float_value, type: string, element_at: /event_params/value/float_value}
    - {name: event_params_value_double_value, type: string, element_at: /event_params/value/double_value}
    - {name: event_previous_timestamp, type: string}
    - {name: event_value_in_usd, type: string}
    - {name: event_bundle_sequence_id, type: string}
    - {name: event_server_timestamp_offset, type: string}
    - {name: user_id, type: string}
    - {name: user_pseudo_id, type: string}
    - {name: user_properties_key, type: string, element_at: /user_properties/key}
    - {name: user_properties_value_string_value, type: string, element_at: /user_properties/value/string_value}
    - {name: user_properties_value_int_value, type: string, element_at: /user_properties/value/int_value}
    - {name: user_properties_value_float_value, type: string, element_at: /user_properties/value/float_value}
    - {name: user_properties_value_double_value, type: string, element_at: /user_properties/value/double_value}
    - {name: user_properties_value_set_timestamp_micros, type: string, element_at: /user_properties/value/set_timestamp_micros}
    - {name: user_first_touch_timestamp, type: string}
    - {name: user_ltv_revenue, type: string, element_at: /user_ltv/revenue}
    - {name: user_ltv_currency, type: string, element_at: /user_ltv/currency}
    - {name: device_category, type: string, element_at: /device/category}
    - {name: device_mobile_brand_name, type: string, element_at: /device/mobile_brand_name}
    - {name: device_mobile_model_name, type: string, element_at: /device/mobile_model_name}
    - {name: device_mobile_marketing_name, type: string, element_at: /device/mobile_marketing_name}
    - {name: device_mobile_os_hardware_model, type: string, element_at: /device/mobile_os_hardware_model}
    - {name: device_operating_system, type: string, element_at: /device/operating_system}
    - {name: device_operating_system_version, type: string, element_at: /device/operating_system_version}
    - {name: device_vendor_id, type: string, element_at: /device/vendor_id}
    - {name: device_advertising_id, type: string, element_at: /device/advertising_id}
    - {name: device_language, type: string, element_at: /device/language}
    - {name: device_is_limited_ad_tracking, type: string, element_at: /device/is_limited_ad_tracking}
    - {name: device_time_zone_offset_seconds, type: string, element_at: /device/time_zone_offset_seconds}
    - {name: device_browser, type: string, element_at: /device/browser}
    - {name: device_browser_version, type: string, element_at: /device/browser_version}
    - {name: device_web_info_browser, type: string, element_at: /device/web_info/browser}
    - {name: device_web_info_browser_version, type: string, element_at: /device/web_info/browser_version}
    - {name: device_web_info_hostname, type: string, element_at: /device/web_info/hostname}
    - {name: geo_continent, type: string, element_at: /geo/continent}
    - {name: geo_country, type: string, element_at: /geo/country}
    - {name: geo_region, type: string, element_at: /geo/region}
    - {name: geo_city, type: string, element_at: /geo/city}
    - {name: geo_sub_continent, type: string, element_at: /geo/sub_continent}
    - {name: geo_metro, type: string, element_at: /geo/metro}
    - {name: app_info_id, type: string, element_at: /app_info/id}
    - {name: app_info_version, type: string, element_at: /app_info/version}
    - {name: app_info_install_store, type: string, element_at: /app_info/install_store}
    - {name: app_info_firebase_app_id, type: string, element_at: /app_info/firebase_app_id}
    - {name: app_info_install_source, type: string, element_at: /app_info/install_source}
    - {name: traffic_source_name, type: string, element_at: /traffic_source/name}
    - {name: traffic_source_install_source, type: string, element_at: /traffic_source/install_source}
    - {name: traffic_source_medium, type: string, element_at: /traffic_source/medium}
    - {name: traffic_source_source, type: string, element_at: /traffic_source/source}
    - {name: stream_id, type: string}
    - {name: platform, type: string}
    - {name: event_dimensions_hostname, type: string, element_at: /event_dimensions/hostname}
    - {name: ecommerce_total_item_quantity, type: string, element_at: /ecommerce/total_item_quantity}
    - {name: ecommerce_purchase_revenue_in_usd, type: string, element_at: /ecommerce/purchase_revenue_in_usd}
    - {name: ecommerce_purchase_revenue, type: string, element_at: /ecommerce/purchase_revenue}
    - {name: ecommerce_refund_value_in_usd, type: string, element_at: /ecommerce/refund_value_in_usd}
    - {name: ecommerce_refund_value, type: string, element_at: /ecommerce/refund_value}
    - {name: ecommerce_shipping_value_in_usd, type: string, element_at: /ecommerce/shipping_value_in_usd}
    - {name: ecommerce_shipping_value, type: string, element_at: /ecommerce/shipping_value}
    - {name: ecommerce_tax_value_in_usd, type: string, element_at: /ecommerce/tax_value_in_usd}
    - {name: ecommerce_tax_value, type: string, element_at: /ecommerce/tax_value}
    - {name: ecommerce_unique_items, type: string, element_at: /ecommerce/unique_items}
    - {name: ecommerce_transaction_id, type: string, element_at: /ecommerce/transaction_id}
    - {name: items_item_id, type: string, element_at: /items/item_id}
    - {name: items_item_name, type: string, element_at: /items/item_name}
    - {name: items_item_brand, type: string, element_at: /items/item_brand}
    - {name: items_item_variant, type: string, element_at: /items/item_variant}
    - {name: items_item_category, type: string, element_at: /items/item_category}
    - {name: items_item_category2, type: string, element_at: /items/item_category2}
    - {name: items_item_category3, type: string, element_at: /items/item_category3}
    - {name: items_item_category4, type: string, element_at: /items/item_category4}
    - {name: items_item_category5, type: string, element_at: /items/item_category5}
    - {name: items_price_in_usd, type: string, element_at: /items/price_in_usd}
    - {name: items_price, type: string, element_at: /items/price}
    - {name: items_quantity, type: string, element_at: /items/quantity}
    - {name: items_item_revenue_in_usd, type: string, element_at: /items/item_revenue_in_usd}
    - {name: items_item_revenue, type: string, element_at: /items/item_revenue}
    - {name: items_item_refund_in_usd, type: string, element_at: /items/item_refund_in_usd}
    - {name: items_item_refund, type: string, element_at: /items/item_refund}
    - {name: items_coupon, type: string, element_at: /items/coupon}
    - {name: items_affiliation, type: string, element_at: /items/affiliation}
    - {name: items_location_id, type: string, element_at: /items/location_id}
    - {name: items_item_list_id, type: string, element_at: /items/item_list_id}
    - {name: items_item_list_name, type: string, element_at: /items/item_list_name}
    - {name: items_item_list_index, type: string, element_at: /items/item_list_index}
    - {name: items_promotion_id, type: string, element_at: /items/promotion_id}
    - {name: items_promotion_name, type: string, element_at: /items/promotion_name}
    - {name: items_creative_name, type: string, element_at: /items/creative_name}
    - {name: items_creative_slot, type: string, element_at: /items/creative_slot}
    
out: {mode: append}
exec: {}
filters:
- type: add_time
  to_column: {name: time}
  from_value: {mode: upload_time}

できた!!!

ステップ3. 取得結果

意識高いモブ
意識高いモブ
ちなみに、GoogleAnalytics(Firebase)のデータを取得したよ。

最後に

弊社では、定点的なリサーチやトレンドの分析をおこなっています。性別や年代等の属性を検索トレンドのダッシュボード提供等様々なパッケージをご用意しておりますので、お気軽にお問い合わせください。

本データに関するお問い合わせは下記にて承ります。
株式会社DataCurrent
info@datacurrent.co.jp