Automate Google BigQuery data import with Google Cloud Functions

We are constantly working with Google BigQuery: we import data about users, their orders, and advertising costs from various sources so that we can combine them with each other. What does this give us? For example, if you have an online store and a customer places an order by phone and later logs into the site, Google BigQuery lets you link all of his actions retroactively. You can track the client's entire journey through the marketing funnel, from the first visit to the site to a purchase in a brick-and-mortar store, and evaluate advertising campaigns with such offline sales taken into account.

We have prepared Google Cloud Functions for loading data into Google BigQuery from the following sources: FTP, FTPS, HTTP(s), Intercom, MySQL, and SFTP. The principle of operation is the same for all of them: an HTTP POST request calls a Cloud Function that receives data from the source and loads it into a Google BigQuery table. If the table already exists in the selected dataset, it will be overwritten.
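The article does not show the server side of this pattern, but as a sketch, a handler might map the "bq" section of the POST payload into BigQuery load-job settings before overwriting the table. The helper below is a hypothetical illustration (the function and setting names are our own, not taken from the modules' main.py), using only the standard library; WRITE_TRUNCATE matches the note that an existing table is overwritten.

```python
import json

# Formats the modules accept, per the article.
ALLOWED_FORMATS = {"CSV", "NEWLINE_DELIMITED_JSON"}

def build_load_settings(bq: dict) -> dict:
    """Turn the "bq" section of the POST payload into load-job settings.

    Hypothetical helper: the real main.py in each module may differ.
    """
    source_format = bq.get("source_format", "CSV")
    if source_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported source_format: {source_format}")
    settings = {
        # Fully-qualified destination table: project.dataset.table
        "destination": f'{bq["project_id"]}.{bq["dataset_id"]}.{bq["table_id"]}',
        "source_format": source_format,
        "write_disposition": "WRITE_TRUNCATE",  # overwrite an existing table
        "location": bq.get("location", "US"),
    }
    if source_format == "CSV":
        # The delimiter only matters for CSV input.
        settings["field_delimiter"] = bq.get("delimiter", ",")
    return settings

# Example: parse a request body and build the settings.
body = json.loads('{"bq": {"project_id": "my_bq_project", '
                  '"dataset_id": "my_bq_dataset", "table_id": "my_bq_table", '
                  '"source_format": "CSV", "delimiter": ";"}}')
print(build_load_settings(body["bq"]))
```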

Basic requirements

- A Google Cloud Platform project with billing enabled.
- Edit access (the BigQuery Data Editor role) and job-execution access (the BigQuery Job User role) for the Cloud Function's service account in the BigQuery project where the table will be loaded.
- An HTTP client to perform the POST requests that call the Cloud Function.
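The two BigQuery roles above can be granted to the function's service account with gcloud. This is a sketch, not part of the original article; PROJECT_ID and SA_EMAIL are placeholders for your own values.

```shell
# Placeholders: substitute your BigQuery project ID and the email of the
# service account the Cloud Function runs as.
PROJECT_ID="my_bq_project"
SA_EMAIL="my-function@my-gcp-project.iam.gserviceaccount.com"

# BigQuery Data Editor: lets the function create and overwrite tables.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:$SA_EMAIL" \
  --role="roles/bigquery.dataEditor"

# BigQuery Job User: lets the function run load jobs.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:$SA_EMAIL" \
  --role="roles/bigquery.jobUser"
```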
Setup steps
1. Go to the Google Cloud Platform Console and log in with your Google account, or sign up if you don't have an account yet.
2. Go to a project with billing enabled, or create a new billing account for the project.
3. Go to the Cloud Functions section and click "Create function". Please note that a fee is charged for using Cloud Functions.
4. Fill in the following fields:
   - Name: for example, ftp-bq-integration, or any other suitable name;
   - Memory allocated: 2 GB or less, depending on the size of the file being processed;
   - Trigger: HTTP;
   - Source code: Inline editor;
   - Runtime: Python 3.X.
5. Copy the contents of the main.py file into the inline editor, on the main.py tab.
6. Copy the contents of the requirements.txt file into the inline editor, on the requirements.txt tab.
7. Specify ftp / ftps / https, etc. as the function to execute, depending on the module you are using.
8. In the advanced settings, increase the timeout from 60 seconds to 540 seconds or less, depending on the size of the file being processed.
9. Finish creating the Cloud Function by clicking the "Create" button.
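The same configuration can be applied from the command line instead of the Console UI. This is a sketch under the assumption that you run it from the directory containing main.py and requirements.txt; the runtime version and entry point ("ftp" here) should match the module you deploy.

```shell
# CLI equivalent of the Console steps above: HTTP trigger, 2 GB of memory,
# 540-second timeout, Python runtime, entry point named after the module.
gcloud functions deploy ftp-bq-integration \
  --runtime=python39 \
  --trigger-http \
  --entry-point=ftp \
  --memory=2048MB \
  --timeout=540s
```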

FTP / FTPS / SFTP

This module transfers files from FTP (FTPS, SFTP) servers to Google BigQuery using a Google Cloud Function. The solution lets you automatically load data into Google BigQuery from a file that is regularly updated on an FTP server.

The file to be fetched from the server can have any suitable extension (.json, .txt, .csv), but it must be in one of the following formats: JSON (newline-delimited) or comma-separated values (CSV).

Usage example
import json
from httplib2 import Http

trigger_url = "https://REGION-PROJECT_ID.cloudfunctions.net/ftp/"
headers = {"Content-Type": "application/json"}
payload = {
    "ftp": {
        "user": "ftp.user_name",
        "psswd": "ftp.password",
        "path_to_file": "ftp://server_host/path/to/file/"
    },
    "bq": {
        "project_id": "my_bq_project",
        "dataset_id": "my_bq_dataset",
        "table_id": "my_bq_table",
        "delimiter": ",",
        "source_format": "NEWLINE_DELIMITED_JSON",
        "location": "US"
    }
}
Http().request(trigger_url, "POST", body=json.dumps(payload), headers=headers)
HTTP(s)

Module for transferring files from HTTPS servers to Google BigQuery.

Usage example
import json
from httplib2 import Http

trigger_url = "https://REGION-PROJECT_ID.cloudfunctions.net/https/"
headers = {"Content-Type": "application/json"}
payload = {
    "https": {
        "path_to_file": "https://server_host/path/to/file/",
        "user": "https.user_name",
        "psswd": "https.password"
    },
    "bq": {
        "project_id": "my_bq_project",
        "dataset_id": "my_bq_dataset",
        "table_id": "my_bq_table",
        "delimiter": ",",
        "source_format": "CSV",
        "location": "US"
    }
}
Http().request(trigger_url, "POST", body=json.dumps(payload), headers=headers)
Intercom

Module for automating data transfer from Intercom to Google BigQuery using a Google Cloud Function. Currently the module can import the following Intercom entities: users, companies, contacts, admins, conversations, teams, tags, and segments. The module does not support custom attributes.

Usage example
import json
from httplib2 import Http

trigger_url = "https://REGION-PROJECT_ID.cloudfunctions.net/intercom/"
headers = {"Content-Type": "application/json"}
payload = {
    "intercom": {
        "accessToken": "INTERCOM ACCESS TOKEN",
        "entities": [
            "users",
            "companies",
            "contacts",
            "admins",
            "conversations",
            "teams",
            "tags",
            "segments"
        ]
    },
    "bq": {
        "project_id": "YOUR GCP PROJECT",
        "dataset_id": "YOUR DATASET NAME",
        "location": "US"
    }
}
Http().request(trigger_url, "POST", body=json.dumps(payload), headers=headers)
MySQL

Used to transfer data from MySQL servers to Google BigQuery using a Google Cloud Function. This solution lets you automatically load data into Google BigQuery from tables that are regularly updated on a MySQL server.
Usage example
import json
from httplib2 import Http

trigger_url = "https://REGION-PROJECT_ID.cloudfunctions.net/mysql/"
headers = {"Content-Type": "application/json"}
payload = {
    "mysql": {
        "user": "mysql.user",
        "psswd": "mysql.password",
        "host": "host_name",
        "port": 3306,
        "database": "database_name",
        "table_id": "table_name",
        "query": "SELECT * FROM table_name"
    },
    "bq": {
        "project_id": "my_bq_project",
        "dataset_id": "my_bq_dataset",
        "table_id": "my_bq_table"
    }
}
Http().request(trigger_url, "POST", body=json.dumps(payload), headers=headers)

More detailed documentation for each module can be found in the readme files in each section.

This is just the beginning: we are now working on scripts for Bitrix and amoCRM, because we see that they are the most popular among our customers. Share what methods you use to merge data and what integrations you are missing for this.