Mungeol Heo

Saturday, July 2, 2022

Data Architecture, GCP

Data

BigQuery

App Architecture, GCP

App

App, headless

App, headless, solution

GCP

DNS	domain name system
CDN	content delivery network

Armor	defense against web and DDoS attacks
Apigee Sense	behavior detection to protect APIs
reCAPTCHA Enterprise	protect your website from fraudulent activity, spam, and abuse

VPC	virtual network for Google Cloud resources
NAT	giving private instances internet access

Load Balancing	distributing traffic				global, regional

Run	running containerized apps	serverless	auto scaling		regional
App Engine	apps and backends	serverless	auto scaling		regional
GKE	running containerized apps		auto scaling	HA	regional

Functions	creating functions that respond to cloud events	serverless	auto scaling		regional

SQL	MySQL, PostgreSQL, SQL Server		scalable	HA	regional
Spanner	cloud-native relational database		auto sharding	HA	multi region

Firestore	cloud-native document database	serverless	auto scaling		multi region

Memorystore	managed Redis and Memcached		scalable	HA	regional

Pub/Sub	event ingestion and delivery

Storage	object storage

API Gateway	develop, deploy, secure, and manage APIs
Endpoints
Apigee	API management, development, and security platform

Security Command Center	a platform for defending against threats to your Google Cloud assets

Operations suite
Monitoring	infrastructure and application health
Logging	audit, platform, and application logs management
Error Reporting	exception monitoring and alerting
Debugger	app state inspection and in-production debugging
Trace	collecting latency data from an app
Profiler	app performance

Firebase	app development platform

Discovery solutions for Retail	search and recommendation

commercetools	https://console.cloud.google.com/marketplace/product/commercetools-public/commercetools-platform
Elastic Path	https://aws.amazon.com/marketplace/pp/prodview-ili75oolxnkcg?sr=0-2&ref_=beagle&applicationId=AWSMPContessa
x2bee	https://x2bee.plateer.com/gac/concept

CI/CD

Solution candidates

commercetools	100% cloud-native, 100% API-first and 100% global	Headless	GCP
Elastic Path	The enterprise cloud infrastructure, 99.99% uptime	Headless	AWS
x2bee		Headless	Korea (domestic)

Competitiveness of global commerce solutions

Saturday, June 25, 2022

AI - Monitoring

Data drift

What

Feature
Population
Covariate shift

Enough of the data needs to be labeled to introduce new classes
Retrain the model
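
One way to put a number on data drift is the Population Stability Index (PSI), which compares how a feature is distributed in a reference sample and in a current sample. This is a minimal sketch, not necessarily the method these notes had in mind: the function name `psi`, the equal-width binning, and the `1e-6` floor for empty bins are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a reference and a current sample."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the reference range; fall back to 1.0 if all values match.
    width = (hi - lo) / bins or 1.0

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp so values outside the reference range land in the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps log() defined when a bin is empty (assumed value).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the threshold should be tuned per feature.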

Concept drift

What

P_t(X, y) != P_{t+1}(X, y)
Sudden
Gradual
Incremental
Recurring

The old data needs to be relabeled. Retrain the model
Use an ensemble approach to train your new model
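
Concept drift usually shows up as a rising error rate once fresh labels arrive. The sketch below watches a sliding window of labeled predictions and flags when the recent error rate exceeds a baseline by some margin. The class name `ErrorRateMonitor`, the window size, and the tolerance are hypothetical choices, and this simple check is a stand-in rather than a specific published drift detector.

```python
from collections import deque
from typing import Deque


class ErrorRateMonitor:
    """Flags possible concept drift when the recent error rate exceeds a baseline."""

    def __init__(self, baseline_error: float, window: int = 100, tolerance: float = 0.1):
        self.baseline_error = baseline_error
        self.tolerance = tolerance
        self.window = window
        self.recent: Deque[int] = deque(maxlen=window)

    def update(self, predicted, actual) -> bool:
        """Record one labeled outcome; return True if drift is suspected."""
        self.recent.append(int(predicted != actual))
        if len(self.recent) < self.window:
            return False  # not enough labeled outcomes yet
        error_rate = sum(self.recent) / len(self.recent)
        return error_rate > self.baseline_error + self.tolerance
```

A sudden drift trips the flag within a few samples, while gradual or incremental drift needs a window long enough to accumulate the extra errors.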

Prediction drift

Label drift

Training, serving skew

TensorFlow Data Validation

Thursday, June 9, 2022

Help 4 Other

Google Sheets

IMPORTRANGE does not support importing data from a connected sheet

Create a pivot table or an extract from the connected sheet first
Then use IMPORTRANGE on that data

Python

sqlalchemy.exc.ObjectNotExecutableError

Check the version
sqlalchemy==1.4.42

TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.

Check the version
pandas==1.3.5

__init__() got multiple values for argument 'schema'

pip install sqlalchemy==1.4.46
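
All three fixes above come down to which package version is installed, so a quick check of the environment can save time. A minimal sketch using only the standard library: the `PINS` mapping and the `check_pins` helper are hypothetical names, and which sqlalchemy pin applies depends on the error being fixed (1.4.46 is used here only because it is the later of the two mentioned).

```python
from importlib import metadata
from typing import Dict

# Pins taken from the notes above; adjust to what the project actually needs.
PINS: Dict[str, str] = {"sqlalchemy": "1.4.46", "pandas": "1.3.5"}


def check_pins(pins: Dict[str, str]) -> Dict[str, str]:
    """Return a status per package: 'ok', 'missing', or a mismatch message."""
    report: Dict[str, str] = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if installed == expected:
            report[name] = "ok"
        else:
            report[name] = f"installed {installed}, expected {expected}"
    return report


if __name__ == "__main__":
    for package, status in check_pins(PINS).items():
        print(f"{package}: {status}")
```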

VS Code

(.venv) workspace-vs % python -V

zsh: command not found: python

Add "python.experiments.optOutFrom": ["pythonTerminalEnvVarActivation"] to settings.json

Monday, May 16, 2022

Help 4 GCP

Cloud Workflows

Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials

Option 1

http.post + googleapis.bigquery.v2.jobs.query
call: http.post
args:
  url: ${"https://bigquery.googleapis.com/bigquery/v2/projects/"+project+"/queries"}
  headers:
    Content-type: "application/json"
  auth:
    type: OAuth2
    scope: ["https://www.googleapis.com/auth/drive","https://www.googleapis.com/auth/cloud-platform","https://www.googleapis.com/auth/bigquery"]
  body:
    query: select * from sheets.sheets_data
    timeoutMs: 200000
    useLegacySql: false
result: response
Google Sheets > Share > Add the service account that runs the query > Viewer > Done
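
For testing outside Workflows, the same jobs.query call can be assembled by hand. This sketch only builds the URL, headers, and JSON body that the step above sends; it does not obtain a token or send anything. The helper name `build_query_request` is hypothetical, and the access token (not shown) would have to be minted with the scopes in `SCOPES`, since the Drive scope is what lets BigQuery read the sheet.

```python
import json
from typing import Dict, List, Tuple

# Scopes copied from the Workflows step; the Drive scope is the important one.
SCOPES: List[str] = [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/bigquery",
]


def build_query_request(
    project: str, sql: str, timeout_ms: int = 200000
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for a BigQuery jobs.query REST call."""
    url = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project}/queries"
    headers = {"Content-Type": "application/json"}
    payload = {"query": sql, "timeoutMs": timeout_ms, "useLegacySql": False}
    return url, headers, json.dumps(payload).encode("utf-8")
```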

Option 2

Scheduled query + googleapis.bigquerydatatransfer.v1
call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.startManualRuns
args:
  parent: ${scheduled_query_name}
  body:
    requestedRunTime: ${time.format(sys.now())}
result: response
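
The connector call above maps to a plain REST request on the BigQuery Data Transfer API. As a rough sketch of what gets sent, the helper below builds the URL and body with an RFC 3339 timestamp in UTC. The name `build_manual_run_request` and the example resource name in the test are hypothetical; authentication is left out.

```python
import json
from datetime import datetime, timezone
from typing import Optional, Tuple


def build_manual_run_request(
    transfer_config: str, now: Optional[datetime] = None
) -> Tuple[str, bytes]:
    """Return (url, body) for transferConfigs.startManualRuns.

    transfer_config is the full resource name of the scheduled query,
    i.e. projects/*/locations/*/transferConfigs/*.
    """
    # Use the given time (converted to UTC) or the current UTC time.
    run_time = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    url = f"https://bigquerydatatransfer.googleapis.com/v1/{transfer_config}:startManualRuns"
    body = {"requestedRunTime": run_time.strftime("%Y-%m-%dT%H:%M:%SZ")}
    return url, json.dumps(body).encode("utf-8")
```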

BigQuery

Column names appear as string_field_0 after a Google Sheets file is created as an external table
- Add a numeric column if all columns are in text format, so that the header row can be detected

Permission denied while getting Drive credentials

Google Sheets > Share > Add the service account that runs the query > Viewer > Done

Cloud Functions

Your client does not have permission to get URL
- Grant the Cloud Functions Developer role
  - The change can take some time to take effect

failed to export: failed to write image to the following tags

Use “gcloud beta functions deploy” with “--docker-registry=artifact-registry”

Your client does not have permission to get URL /yourUrl from this server

auth:
  type: OIDC
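
Outside Workflows, the equivalent of `type: OIDC` is to send an ID token as a Bearer token. When the caller runs on Google Cloud, one way to get that token is the metadata server's identity endpoint. This sketch only builds the pieces and performs no network calls; the helper names are hypothetical and the audience is assumed to be the URL of the function being called.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode

METADATA_IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def identity_token_request(audience: str) -> Tuple[str, Dict[str, str]]:
    """URL and headers for fetching an ID token from the metadata server."""
    url = f"{METADATA_IDENTITY_URL}?{urlencode({'audience': audience})}"
    # The metadata server rejects requests that lack this header.
    return url, {"Metadata-Flavor": "Google"}


def bearer_headers(id_token: str) -> Dict[str, str]:
    """Headers to attach when calling the protected function URL."""
    return {"Authorization": f"Bearer {id_token}"}
```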

Cloud Run

DefaultCredentialsError: Neither metadata server or valid service account credentials are found
- Use a service account
- Grant it the Cloud Run Invoker role
  - The change can take some time to take effect

Data Fusion

The client is not authorized to make this request

Add the related role to the Data Fusion service account
E.g., add the Cloud SQL Client role to service-[project-number]@gcp-sa-datafusion.iam.gserviceaccount.com
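
To avoid typos in the service account address, the member string can be generated from the project number. A small sketch: the helper names are hypothetical, and `roles/cloudsql.client` is the role ID assumed to correspond to Cloud SQL Client. The returned dict has the shape of one binding in an IAM policy.

```python
from typing import Dict, List, Union


def data_fusion_service_agent(project_number: str) -> str:
    """Email of the Data Fusion service account for a project number."""
    return f"service-{project_number}@gcp-sa-datafusion.iam.gserviceaccount.com"


def iam_binding(
    project_number: str, role: str = "roles/cloudsql.client"
) -> Dict[str, Union[str, List[str]]]:
    """One IAM policy binding granting `role` to the Data Fusion service account."""
    member = f"serviceAccount:{data_fusion_service_agent(project_number)}"
    return {"role": role, "members": [member]}
```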

MongoSocketException, UnknownHostException

List all shard hosts in the connection string

MongoSocketReadException, Prematurely reached end of stream

Add ssl=true

mongodb-plugins, Authentication failed

Add authSource=admin
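
The three MongoDB fixes above (all shard hosts, ssl=true, authSource=admin) can be combined into one connection string. A minimal sketch, assuming the standard `mongodb://` URI format; the function name `mongo_uri` and the host names in the usage example are made up.

```python
from typing import Sequence
from urllib.parse import quote_plus, urlencode


def mongo_uri(
    hosts: Sequence[str], user: str, password: str, database: str = "", **options: str
) -> str:
    """Build a mongodb:// URI listing every host, with ssl and authSource set."""
    # Defaults reflect the fixes in these notes; keyword options override them.
    params = {"ssl": "true", "authSource": "admin", **options}
    # Percent-encode credentials so characters like '@' or ':' do not break the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{','.join(hosts)}/{database}?{urlencode(params)}"


if __name__ == "__main__":
    print(mongo_uri(["shard-0.example:27017", "shard-1.example:27017"], "user", "p@ss"))
```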

Data Studio

interval 1 day, Invalid formula

Use interval 24 hour

Cloud Storage

Error getting access token from metadata server at

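// Authenticate the GCS connector with a service account key instead of the metadata server
val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.set("google.cloud.auth.service.account.enable", "true")
// Path to the service account JSON key file
hadoopConf.set("google.cloud.auth.service.account.json.keyfile", "yourKey.json")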