
Pensieve

Last updated at 2018-02-12, posted at 2016-09-08

A place to dump the things I am very curious about and want to verify but keep putting off, so that I can clear my head.

Kubernetes

oracle/navarkos
- Rebalances Pods across Federated Clusters

Pod Disruption Budget

Spent ~2 hours yesterday trying out Kubernetes Pod Disruption Policies and see if they work as advertised.

I hit multiple issues and ended up writing a few pages long friction log instead.

Great in theory, but doesn’t work the way users expect. https://t.co/ddgim1Hq3L
-- Ahmet Alp Balkan (@ahmetb), December 14, 2017

Helm

Stages
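- helm install only supports the basic deployment strategies that K8S provides, such as rolling updates. This is a proposal to let a deployment include multiple stages instead, for example running a DB migration before deploying.
  - The direction seems to be to use Brigade
    - Example: https://github.com/technosophos/hello-helm/blob/7b8b7fb4dd0d9cb31b7e789b8874800b2a80ae31/brigade.js#L22-L27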
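
The multi-stage idea above can be sketched independently of Helm or Brigade. The following is only an illustration of "run the migration, then deploy, and stop if a stage fails"; `run_stages`, `migrate_database`, and `deploy_release` are hypothetical names, not part of any Helm or Brigade API.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an action that reports success (True) or failure (False).
Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, action in stages:
        if not action():
            break  # later stages (e.g. the deploy) must not run
        completed.append(name)
    return completed


def migrate_database() -> bool:
    # Hypothetical placeholder: a real stage would run the migration job.
    return True


def deploy_release() -> bool:
    # Hypothetical placeholder: a real stage would roll out the release.
    return True


if __name__ == "__main__":
    print(run_stages([("migrate", migrate_database), ("deploy", deploy_release)]))
```

If the migration stage returns False, the deploy stage is never invoked, which is the property the proposal is after.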

Service Mesh

Memory consumption comparison of istio and linkerd
- https://groups.google.com/forum/#!topic/istio-users/Rd3aJSzvWaw
- Squeezing blood from a stone: small-memory JVM techniques for microservice sidecars » Buoyant » open source service mesh for cloud native applications
Scalability
- Scale issues in istio-pilot discovery service, affecting alpha perf targets and installation · Issue #1485 · istio/istio

Service Proxy

lyft/envoy

API Gateway

Kong
- The proxy layer uses nginx
- No OpenTracing support
  - There is an open issue for it
Ambassador
- The proxy layer uses envoy
- No OpenTracing support
- Can be combined with istio
  - istio handles internal service communication, ambassador acts as the API gateway
  - ambassador's features will gradually be ported to istio

Distributed Tracing

OpenTracing
- The Sampling specification is not settled yet
  - https://github.com/opentracing/specification/issues/11
  - jaeger has its own sampling implementation; Instana does not
- grpc-opentracing-java
  - Provides Interceptors for the gRPC Server and Client
    - https://github.com/grpc-ecosystem/grpc-opentracing/tree/master/java
  - The HTTP server/client part should be coverable by istio or similar
  - What remains is how to trace things like method calls inside Java
Tracers
- http://opentracing.io/documentation/pages/supported-tracers
- LightStep
  - The tracers for each language lack gRPC support and auto instrumentation. There is no particular reason to choose it
- sky-walking
  - Supports auto instrumentation for jdbc and others, as well as opentracing
    - Combining skywalking with grpc-opentracing-java is an option
    - Or implement it yourself using opentracing
      - https://github.com/wu-sheng/sky-walking/issues/169
- Hawkular APM
  - https://medium.com/opentracing/building-the-open-source-hawkular-apm-at-red-hat-c3e4f0157513
  - The person in charge has moved to jaeger (http://www.hawkular.org/hawkular-apm/)
- jaeger
  - Supports various sampling strategies such as forced and random
    - https://github.com/uber/jaeger-client-node/tree/master/src/samplers
  - jaeger-client-node
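    - Comes with a handler for tchannel-node
      - RPCs over tchannel are traced automatically via jaeger, so calls to other services over tchannel need no manual instrumentation
      - Everything else has to be instrumented by hand
      - jaeger-client provides an implementation of the opentracing tracer, so it can be used like any other opentracing tracer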
- Instana
  - instana/ruby-sensor
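    - Instrumentation is automatic for the most part
      - gRPC is already supported (https://github.com/instana/ruby-sensor/pull/77)
    - For places that cannot be instrumented automatically:
      - It conforms to opentracing-ruby, so it can be used like any other opentracing-ruby client
      - It also provides its own API that can trace asynchronous processing, other threads, and jobs pushed to a job queue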
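
The "forced" and "random" sampling strategies mentioned for jaeger above can be illustrated with a small sketch. This is not jaeger-client-node's implementation or API; the class names and the `is_sampled` method are assumptions made for illustration only.

```python
import random
from typing import Optional


class ConstSampler:
    """Always returns the same decision ("forced" on or off)."""

    def __init__(self, decision: bool) -> None:
        self.decision = decision

    def is_sampled(self, operation: str) -> bool:
        return self.decision


class ProbabilisticSampler:
    """Samples each trace at random with the given probability."""

    def __init__(self, rate: float, rng: Optional[random.Random] = None) -> None:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0.0 and 1.0")
        self.rate = rate
        # An injectable RNG keeps the sampler testable.
        self.rng = rng or random.Random()

    def is_sampled(self, operation: str) -> bool:
        # random() returns a float in [0.0, 1.0), so rate=0.0 never samples
        # and rate=1.0 always samples.
        return self.rng.random() < self.rate


if __name__ == "__main__":
    sampler = ProbabilisticSampler(0.1)
    sampled = sum(sampler.is_sampled("GET /") for _ in range(10_000))
    print(f"sampled {sampled} of 10000 traces")
```

A tracer would ask the sampler once when a root span starts and propagate the decision to child spans, so a trace is either recorded in full or not at all.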

Web Server vs Load Balancer

Interesting article: https://thehftguy.com/2016/10/03/haproxy-vs-nginx-why-you-should-never-use-nginx-for-load-balancing/

Nginx as web server
Nginx + nginx-module-vts as load balancer
- https://github.com/vozlt/nginx-module-vts
- nginx-vts-exporter to integrate with prometheus
HAProxy as load balancer

Observability

Monitoring

Relevant discussions & articles:

2015.10.26 database - Usecases: InfluxDB vs. Prometheus - Stack Overflow
2017.4.17 Prometheus and InfluxDB: InfluxDB as a Remote Storage Back End

SparkSQL

Dashboarding

Relevant discussion:

OSSes:

Redash
Superset
Atlasboard
Cyclotron

Which dashboard should I choose for BI?

Redash
- May be able to support SparkSQL
  - Can support queries that go beyond standard SQL
    https://discuss.redash.io/t/creating-a-new-query-runner-data-source-in-redash/347
  - Reference: MemSQL support https://github.com/getredash/redash/pull/1746
- Filters and parameters shown in the UI can also be defined in SQL
  - https://news.ycombinator.com/item?id=13598402
Superset
- SparkSQL
  - Waiting on SQLAlchemy support, which looks difficult
    - https://github.com/apache/incubator-superset/issues/241
  - There are also reports that it works via pyhive
    - https://github.com/apache/incubator-superset/pull/803/files
- JOINs are possible, more or less
  - https://github.com/apache/incubator-superset/issues/875
- Does not support Postgres TIMESTAMP WITH TIME ZONE
  - https://github.com/apache/incubator-superset/issues/1900

Self-hosting a log storage and search platform

ELK?
Grafana + Graylog2?
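- Logs stored in Graylog2 can be used to annotate Grafana graphs
  - http://www.no-tomato.com/2015/02/04/annotations-with-grafana-and-graylog2/

I want to run actions in response to alerts directly from the dashboard

Build a kubectl run panel as a Grafana plugin?

I want to fix the situation where business, infrastructure, and application metrics are scattered across multiple dashboards and data sources

I want to view and search logs and metrics from the same service
- Grafana + Prometheus + Elasticsearch
  - Since Grafana 2.6, text data can also be displayed on dashboards https://sematext.com/blog/2015/12/14/using-grafana-with-elasticsearch-for-log-analytics-2/
- K8S monitoring with the Elastic Stack
  - Put the logs and metrics of all Pods into Elasticsearch
  - Kibana shows logs and metrics together. No need to go back and forth between separate services for metrics and logs, as with Datadog and Stackdriver Logging
  - Logs can be streamed/tailed in Kibana https://github.com/sivasamyk/logtrail
  - Aggregated log results can be graphed in Kibana https://www.scaleway.com/docs/how-to-use-the-elk-stack-instant-apps/
I also want to analyze business-oriented metrics with SQL and put them on dashboards
- A MySQL data source has been added to Grafana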
  - https://github.com/grafana/grafana/pull/5364
  - http://play.grafana.org/dashboard/db/new-features-in-v4-3?orgId=1
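Tail logs in Grafana?
- If a Grafana plugin can include a panel built with js+html, could it be implemented that way?
- Apparently queries are slow unless they are issued against the alias called deflector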
  https://groups.io/g/grafana/topic/3040000

Distributed Tracing

loki: a pull-based Zipkin https://github.com/weaveworks-experiments/loki
jaeger: https://github.com/uber/jaeger

OSS I want to build

A golang implementation of lviz
- lviz reference URL: https://blog.codeship.com/logfmt-a-log-format-thats-easy-to-read-and-write/
  - Any well-known structured log format would do, so it might be fine to base it on bunyan instead of logfmt
  - It has no spec, but basing it on jsonlog might also work
    - https://github.com/docker/docker/blob/master/pkg/jsonlog/jsonlog.go
    - Log Everything as JSON. Make Your Life Easier. | Treasure Data Blog
A tool that tails Google Stackdriver Logging logs in logfmt format
- The bunyan format might be better for this one as well
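
To make the logfmt versus JSON trade-off above concrete, here is a minimal sketch that renders one JSON log line as logfmt. logfmt has no formal specification, so the quoting rules below (quote values that are empty or contain a space, `=`, or `"`) are an assumption, and `to_logfmt` and `json_line_to_logfmt` are hypothetical helper names. It only handles flat records.

```python
import json


def to_logfmt(record: dict) -> str:
    """Render a flat dict as a single logfmt line (key=value pairs)."""
    pairs = []
    for key, value in record.items():
        if isinstance(value, bool):
            text = "true" if value else "false"
        elif value is None:
            text = ""
        else:
            text = str(value)
        # Assumed quoting rule: quote empty values and values that would
        # otherwise be ambiguous when split on spaces and '='.
        if text == "" or any(ch in text for ch in ' "='):
            text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        pairs.append(f"{key}={text}")
    return " ".join(pairs)


def json_line_to_logfmt(line: str) -> str:
    """Convert one JSON-encoded log line into logfmt."""
    return to_logfmt(json.loads(line))


if __name__ == "__main__":
    print(json_line_to_logfmt('{"level": "info", "msg": "server started", "port": 8080}'))
```

Going from JSON to logfmt is lossy for nested values, which is one reason a JSON-based format such as bunyan or jsonlog may be the more practical base.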

OSS I want to try

The fast logging library uber-go/zap
- Once this PR is merged and it reaches 1.0

Things I wish Datadog had (or maybe it already does)

ioping monitor
- Apparently ioping can measure io latency, so it would be nice to be able to see that from datadog
  https://www.cyberciti.biz/faq/linux-freebsd-openbsd-macosx-find-disk-io-latency-with-ioping/
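
One way the ioping idea above might be wired up is to parse the latency from ioping's per-request output and forward it as a gauge in the StatsD datagram format (`name:value|g`) that DogStatsD accepts over UDP. This is a sketch under assumptions: the sample line format (`time=187.4 us`) may differ between ioping versions, and the metric name `system.io.ioping.latency_ms` is made up.

```python
import re
import socket
from typing import Optional

# Assumed shape of an ioping request line, e.g.
#   "4 KiB <<< . (ext4 /dev/sda1): request=1 time=187.4 us"
_TIME = re.compile(r"time=(\d+(?:\.\d+)?)\s*(ns|us|ms|s)\b")
_TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "s": 1e3}


def parse_latency_ms(line: str) -> Optional[float]:
    """Extract the request latency in milliseconds, or None if absent."""
    match = _TIME.search(line)
    if match is None:
        return None
    return float(match.group(1)) * _TO_MS[match.group(2)]


def gauge_datagram(metric: str, value: float) -> bytes:
    """Build a StatsD gauge datagram: b"<metric>:<value>|g"."""
    return f"{metric}:{value:.3f}|g".encode("ascii")


def send_gauge(metric: str, value: float,
               host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one gauge to a local DogStatsD agent over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(gauge_datagram(metric, value), (host, port))


if __name__ == "__main__":
    sample = "4 KiB <<< . (ext4 /dev/sda1): request=1 time=2.5 ms"
    latency = parse_latency_ms(sample)
    if latency is not None:
        send_gauge("system.io.ioping.latency_ms", latency)
```

Because UDP is connectionless, `send_gauge` does not fail when no agent is listening, which makes the sketch safe to run on a machine without Datadog installed.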

MySQL

Multi-source replication
- http://yoshiki-utakata.hatenablog.com/entry/2017/12/15/100000

Fargate

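"Time to launch is roughly 60 seconds regardless of the desired count"
- https://qiita.com/nabeken/items/69b47e2d346a61d34176
- Integration with EKS is planned, but if launching takes this long, simply adding C5 instances would probably take about the same time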

JVM

On caching JIT-compiled code
https://stackoverflow.com/questions/1992486/why-doesnt-the-jvm-cache-jit-compiled-code
I wonder how much Java 9's AOT helps in this context
