PostgreSQL at 20TB and Beyond
Analytics at a Massive Scale
Chris Travers, Adjust GmbH
January 25, 2018

Sections: Our Environment, Challenges and Solutions, Conclusions

About Adjust
Adjust is a market leader in mobile advertisement attribution. We basically act as a referee in pay-per-install advertising. We focus on fairness, fraud prevention, and other ways of ensuring that advertisers pay for the services they are receiving fairly and with accountability.

Our Environment
• About Adjust
• Traffic Statistics
• Analytics Environment
Challenges and Solutions
• Staffing
• Throughput
• Autovacuum
• Data Modeling
• Backups and Operations
Conclusions

Basic Facts About Us
• We are a PostgreSQL/Kafka shop
• Around 200 employees worldwide
• Link advertisements to installs
• Delivering near-real-time analytics to software vendors
• Our Data is “Big Data”

Just How Big?
• Over 100k requests per second
• Over 2 trillion data points tracked in 2017
• Over 400 TB of data to analyze
• Very high data velocity

General Architecture
• Requests come from Internet
• Written to backends
• Materialized to analytics shards
• Shown in dashboard

Common Elements of Infrastructure
• Bare metal
• Stripped down Gentoo
• Lots of A/B performance testing
• Approx 50% more throughput than stock Linux systems
• Standard PostgreSQL + extensions

Backend Servers
• Original point of entry
• Data distributed by load balancer
• No data-dependent routing
• Data distributed more or less randomly
• Around 20TB per backend server gets stored
• More than 20 backend servers

Materializer
• Aggregates new events
• Copies the aggregations to the shards
• Runs every few minutes
• New data only

Materializer and MapReduce
Our materializer aggregates data from many servers and transfers it to many servers. It functions sort of like a MapReduce, with the added complication that it is a many-server to many-server transformation.

Analytics Shards
• About 2TB each
• 16 shards currently, may grow
• Our own custom analytics software for managing and querying
• Custom sharding/locating software
• Paired for redundancy

Staffing: Challenges
• No Junior Database Folks
• Demanding environment
• Very little room for error
• Need people who are deeply grounded in both theory and practice

Staffing: Solutions
• Look for people with enough relevant experience that they can learn
• Be picky with what we are willing to teach new folks
• Look for self-learners with enough knowledge to participate
• Expect people to grow into the role
• We also use code challenges

Throughput Challenges
• Most data is new
• It is a lot of data
• Lots of btrees with lots of random inserts
• Ideally, every row inserted once and updated once
• Long-term retention

Throughput Solutions: Backend Servers
• Each point of entry server has its own database
• Transactional processing is separate
• Careful attention to alignment issues
• We write our own data types in C to help
• Tables partitioned by event time

Throughput Solutions: Analytics Shards
• Pre-aggregated data for client-facing metrics
• Sharded at roughly 2TB per shard
• 16 shards currently
• Custom sharding framework optimized to reduce network usage
• Goal is to have dashboards load fast
• We know where data is on these shards

Throughput Solutions: Materializer
• Two phases
• First phase runs on original entry servers
• Aggregates and copies data to analytics shards
• Second phase runs on analytics shards
• Further aggregates and copies

Materializer: A Special Problem
• Works great when you only have one data center
• Foreign data wrapper bulk writes are very slow across data centers
• This is a known issue with the Postgres FDW
• This is a blocking issue

Materializer: Solution
• C extension using COPY
• Acts as libpq client
• Wrote a global transaction manager
• Throughput restored

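
The slides do not show the C extension itself. As a rough illustration of why COPY suits bulk transfer, the sketch below serializes many rows into a single payload in PostgreSQL's COPY text format (tab-delimited fields, `\N` for NULL, backslash escapes), so the rows can travel as one stream rather than as one statement per row. The function names and sample rows are hypothetical, and actually sending the payload over a connection is left out.

```python
def copy_escape(value):
    """Render one field in PostgreSQL's COPY text format."""
    if value is None:
        return r"\N"  # COPY's default NULL marker
    text = str(value)
    # Escape the backslash first so the later escapes are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_copy_text(rows):
    """Serialize rows into one tab-delimited payload for COPY ... FROM STDIN."""
    return "".join("\t".join(copy_escape(v) for v in row) + "\n" for row in rows)


# Hypothetical aggregated rows: (app id, hour, event count, note)
payload = to_copy_text([
    (42, "2018-01-25 10:00", 1234, None),
    (43, "2018-01-25 10:00", 7, "tab\there"),
])
```

A client would hand `payload` to the server as the data stream of a `COPY ... FROM STDIN` command. The real extension does this from C through libpq and adds transaction coordination that this sketch does not attempt.
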
Introducing Autovacuum
• Queries have to provide consistent snapshots
• All updates in PostgreSQL are copy-on-write
• In our case, we write once and then update once
• Have to clean up old data at some point
• By default, 50 rows plus 20% of table being “dead” triggers autovacuum

Autovacuum Problems
• Great for small tables, but we have tables with 200M rows
• 20% of 200M rows is 40 million dead tuples...
• Autovacuum does nothing and then undertakes a heavy task...
• Performance suffers and tables bloat

Autovacuum Solutions
• Change to 150k rows plus 0%
• Tuning requires a lot of hand-holding
• Roll out change to servers gradually to avoid overloading system

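
The arithmetic behind these slides can be checked with a few lines. PostgreSQL vacuums a table once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`. The sketch below evaluates that rule for the 200M-row table mentioned above, first with the defaults and then with the 150k plus 0% setting. The function name is made up for illustration.

```python
def vacuum_trigger(reltuples, threshold=50, scale_factor=0.2):
    """Dead tuples needed before autovacuum vacuums a table.

    Mirrors PostgreSQL's documented rule:
    autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
    """
    return threshold + scale_factor * reltuples


rows = 200_000_000
default_trigger = vacuum_trigger(rows)              # defaults: 50 rows + 20%
tuned_trigger = vacuum_trigger(rows, 150_000, 0.0)  # slides: 150k rows + 0%

print(f"default: {default_trigger:,.0f} dead tuples")
print(f"tuned:   {tuned_trigger:,.0f} dead tuples")
```

With a zero scale factor the trigger no longer grows with table size, so vacuum runs in small, frequent passes instead of one heavy pass after tens of millions of dead tuples have piled up.
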
Why it Matters
• Under heavy load, painful to change
• Want to avoid rewriting tables
• Want to minimize disk usage
• Want to maximize alignment to pages
• Lots of little details really matter

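
To make the alignment point concrete, here is a simplified sketch of how column order changes the bytes a row needs. It assumes typical 64-bit alignment for a few fixed-width types (an 8-byte `bigint` starts on an 8-byte boundary, and so on) and deliberately ignores the tuple header, the null bitmap, variable-length types, and end-of-row padding. The table layouts are invented examples, not Adjust's schema.

```python
# (size in bytes, required alignment) for a few fixed-width PostgreSQL types
TYPES = {
    "boolean": (1, 1),
    "smallint": (2, 2),
    "integer": (4, 4),
    "bigint": (8, 8),
}


def row_data_size(columns):
    """Bytes of column data, padding each column to its alignment boundary."""
    offset = 0
    for name in columns:
        size, align = TYPES[name]
        offset += -offset % align  # padding needed to reach the boundary
        offset += size
    return offset


interleaved = row_data_size(["boolean", "bigint", "boolean", "bigint"])
ordered = row_data_size(["bigint", "bigint", "boolean", "boolean"])

print(interleaved, ordered)
```

Under these assumptions the interleaved layout needs 32 bytes of column data and the ordered one 18, for the same four values. Across tens of billions of rows a difference like that is the kind of small gain the slides describe.
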
Custom 1-byte Enums
• Country
• Language
• OS Name
• Device Type

IStore: Like HStore but for Integers
• Supports addition, etc., between values of same key
• Useful for time series and other modelling problems
• Supports GIN indexing among others

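
The idea of adding values that share a key can be sketched in plain Python. This is only an illustration of the semantics described on the slide, not the IStore extension's API. It assumes that a key present on just one side carries over unchanged, and the sample data is hypothetical.

```python
def istore_add(left, right):
    """Add two integer-to-integer maps key by key; missing keys count as 0."""
    result = dict(left)
    for key, value in right.items():
        result[key] = result.get(key, 0) + value
    return result


# Hypothetical per-hour install counts keyed by a numeric country id.
hour_10 = {1: 120, 2: 15}
hour_11 = {1: 80, 3: 4}
total = istore_add(hour_10, hour_11)

print(total)
```

Because the result has the same shape as its inputs, hourly maps can be rolled up into daily ones, and daily into monthly, with the same operation. That is what makes this kind of structure handy for time series.
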
The Elephant in the Room
How do we aggregate that much data?
• Basically incremental MapReduce
• Map and first phase aggregation on backends
• Reduce and second phase aggregation on shards
• Further reduction and aggregation possible on demand

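
The two-phase flow can be sketched as a toy model. Each backend first aggregates its own new events, then the partial results from all backends are merged on whichever shard owns the key. The modulo routing rule, the function names, and the event shape are all assumptions made for illustration. The slides say only that the sharding software is custom.

```python
from collections import defaultdict


def aggregate_on_backend(events):
    """Phase 1 (map + first aggregation): count events per (app, hour)."""
    partial = defaultdict(int)
    for app_id, hour in events:
        partial[(app_id, hour)] += 1
    return dict(partial)


def shard_for(app_id, shard_count):
    """Hypothetical routing rule: place each app on one shard by modulo."""
    return app_id % shard_count


def reduce_on_shards(partials, shard_count):
    """Phase 2 (reduce): merge the partial counts from every backend."""
    shards = [defaultdict(int) for _ in range(shard_count)]
    for partial in partials:
        for (app_id, hour), count in partial.items():
            shards[shard_for(app_id, shard_count)][(app_id, hour)] += count
    return [dict(s) for s in shards]


# Two backends receive a random mix of events for the same apps.
backend_a = [(1, "10:00"), (2, "10:00"), (1, "10:00")]
backend_b = [(1, "10:00"), (2, "11:00")]
partials = [aggregate_on_backend(backend_a), aggregate_on_backend(backend_b)]
shards = reduce_on_shards(partials, shard_count=2)

print(shards)
```

Only the small partial aggregates cross the network, not the raw events, and each run handles just the new data, which is what keeps the process incremental.
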
Operations Tools
• Sqitch
• Rex
• Our own custom tools

Backups
• Home grown system
• Base backup plus WAL
• Runs as a Rex task
• We can also do logical backups (but...)

Ongoing Distributed Challenges
• Major Upgrades
• Storage Space
• Multi-datacenter challenges
• Making it all fast

Overview
This environment is all about careful attention to detail and being willing to write C code when needed. Space savings, better alignment, and other seemingly small gains add up over tens of billions of rows.

Major Points of Interest
• We are using PostgreSQL as a big data platform.
• We expect this architecture to scale very far.
• Provides near-real-time analytics on user actions.

PostgreSQL Makes All This Possible
In building our 400TB analytics environment we have yet to outgrow PostgreSQL. In fact, this is one of the few pieces of our infrastructure we are perfectly confident in scaling.

PostgreSQL at 20TB and Beyond

  • 1. Our Environment Challenges and Solutions Conclusions PostgreSQL at 20TB and Beyond Analytics at a Massive Scale Chris Travers Adjust GmbH January 25, 2018
  • 2. About Adjust: Adjust is a market leader in mobile advertisement attribution. We basically act as a referee in pay-per-install advertising. We focus on fairness, fraud prevention, and other ways of ensuring that advertisers pay fairly, and with accountability, for the services they receive.
  • 3. Our Environment: About Adjust, Traffic Statistics, Analytics Environment • Challenges and Solutions: Staffing, Throughput, Autovacuum, Data Modeling, Backups and Operations • Conclusions
  • 4. Basic Facts About Us • We are a PostgreSQL/Kafka shop • Around 200 employees worldwide • Link advertisements to installs • Delivering near-real-time analytics to software vendors • Our Data is “Big Data”
  • 5. Just How Big? • Over 100k requests per second • Over 2 trillion data points tracked in 2017 • Over 400 TB of data to analyze • Very high data velocity
  • 6. General Architecture • Requests come from the Internet • Written to backends • Materialized to analytics shards • Shown in dashboard
  • 7. Common Elements of Infrastructure • Bare metal • Stripped-down Gentoo • Lots of A/B performance testing • Approx. 50% more throughput than stock Linux systems • Standard PostgreSQL + extensions
  • 8. Backend Servers • Original point of entry • Data distributed by load balancer • No data-dependent routing • Data distributed more or less randomly • Around 20TB stored per backend server • More than 20 backend servers
  • 9. Materializer • Aggregates new events • Copies the aggregations to the shards • Runs every few minutes • New data only
  • 10. Materializer and MapReduce: Our materializer aggregates data from many servers and transfers it to many servers. It functions somewhat like MapReduce, with the added complication that it is a many-server-to-many-server transformation.
  • 11. Analytics Shards • About 2TB each • 16 shards currently, may grow • Our own custom analytics software for managing and querying • Custom sharding/locating software • Paired for redundancy
  • 12. Staffing: Challenges • No junior database folks • Demanding environment • Very little room for error • Need people who are deeply grounded in both theory and practice
  • 13. Staffing: Solutions • Look for people with enough relevant experience that they can learn • Be picky about what we are willing to teach new folks • Look for self-learners with enough knowledge to participate • Expect people to grow into the role • We also use code challenges
  • 14. Throughput Challenges • Most data is new • It is a lot of data • Lots of btrees with lots of random inserts • Ideally, every row is inserted once and updated once • Long-term retention
  • 15. Throughput Solutions: Backend Servers • Each point-of-entry server has its own database • Transactional processing is separate • Careful attention to alignment issues • We write our own data types in C to help • Tables partitioned by event time
  • 16. Throughput Solutions: Analytics Shards • Pre-aggregated data for client-facing metrics • Sharded at roughly 2TB per shard • 16 shards currently • Custom sharding framework optimized to reduce network usage • Goal is to have dashboards load fast • We know where data is on these shards
  • 17. Throughput Solutions: Materializer • Two phases • First phase runs on original entry servers • Aggregates and copies data to analytics shards • Second phase runs on analytics shards • Further aggregates and copies
  • 18. Materializer: A Special Problem • Works great when you only have one data center • Foreign data wrapper bulk writes are very slow across data centers • This is a known issue with the Postgres FDW • This is a blocking issue
  • 19. Materializer: Solution • C extension using COPY • Acts as libpq client • Wrote a global transaction manager • Throughput restored
  • 20. Introducing Autovacuum • Queries have to provide consistent snapshots • All updates in PostgreSQL are copy-on-write • In our case, we write once and then update once • Have to clean up old data at some point • By default, 50 rows plus 20% of table being “dead” triggers autovacuum
  • 21. Autovacuum Problems • Great for small tables, but we have tables with 200M rows • 20% of 200M rows is 40 million dead tuples • Autovacuum does nothing for a long time and then undertakes a heavy task • Performance suffers and tables bloat
  • 22. Autovacuum Solutions • Change to 150k rows plus 0% • Tuning requires a lot of hand-holding • Roll out change to servers gradually to avoid overloading system
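The arithmetic behind slides 20 to 22 can be checked with a short sketch. It assumes the usual PostgreSQL rule that a table is vacuumed once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`; the function name and the 200M-row figure are only illustrative, and the talk does not say whether the change was made globally or per table.

```python
def autovacuum_trigger(reltuples, threshold=50, scale_factor=0.2):
    """Approximate dead-tuple count at which autovacuum vacuums a table.

    Defaults mirror the stock settings quoted on slide 20
    (50 rows plus 20% of the table).
    """
    return threshold + round(scale_factor * reltuples)


ROWS = 200_000_000  # table size used as the example on slide 21

# Stock settings: vacuum only starts after roughly 40 million dead tuples.
default_trigger = autovacuum_trigger(ROWS)

# Settings from slide 22: fixed 150k rows, no proportional component.
tuned_trigger = autovacuum_trigger(ROWS, threshold=150_000, scale_factor=0.0)

print(f"default: {default_trigger:,} dead tuples")
print(f"tuned:   {tuned_trigger:,} dead tuples")
```

With a zero scale factor the trigger no longer grows with the table, so vacuum runs are smaller and more frequent instead of rare and heavy.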
  • 23. Why it Matters • Under heavy load, painful to change • Want to avoid rewriting tables • Want to minimize disk usage • Want to maximize alignment to pages • Lots of little details really matter
  • 24. Custom 1-byte Enums • Country • Language • OS Name • Device Type
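To see why 1-byte enums and alignment matter, here is a rough sketch of how column padding adds up. It assumes PostgreSQL's general rule that each fixed-width column starts at an offset aligned to its type's alignment and that the row's data area is rounded up to 8 bytes; it ignores the tuple header, null bitmap, and variable-length columns. The `created_at` and `installs` columns and both layouts are hypothetical, not Adjust's actual schema. Built-in PostgreSQL enums occupy 4 bytes on disk, which is the baseline used here.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def row_data_size(columns, max_align=8):
    """Bytes used by fixed-width columns, including alignment padding.

    columns: list of (name, size_in_bytes, alignment_in_bytes).
    """
    offset = 0
    for _name, size, alignment in columns:
        offset = align_up(offset, alignment) + size
    return align_up(offset, max_align)


# Hypothetical layout: 4-byte built-in enums interleaved with 8-byte columns.
naive = [
    ("country", 4, 4), ("created_at", 8, 8), ("os_name", 4, 4),
    ("installs", 8, 8), ("device_type", 4, 4), ("language", 4, 4),
]

# Same data with 1-byte enum types, wide columns first.
packed = [
    ("created_at", 8, 8), ("installs", 8, 8), ("country", 1, 1),
    ("language", 1, 1), ("os_name", 1, 1), ("device_type", 1, 1),
]

naive_size = row_data_size(naive)
packed_size = row_data_size(packed)
print(naive_size, packed_size)
```

In this toy case the data area shrinks from 40 to 24 bytes per row, the kind of saving that compounds over billions of rows.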
  • 25. IStore: Like HStore but for Integers • Supports addition, etc., between values of the same key • Useful for time series and other modelling problems • Supports GIN indexing among others
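The key-wise addition described on slide 25 can be illustrated with plain Python dictionaries. This is only a model of the idea, not the extension's implementation; it assumes that keys present on just one side are carried over unchanged, and the hour-of-day keys are a made-up time-series example.

```python
def istore_add(a, b):
    """Add two integer-to-integer maps key by key.

    Keys found in only one input are kept with their original value
    (an assumption about the semantics, see the note above).
    """
    result = dict(a)
    for key, value in b.items():
        result[key] = result.get(key, 0) + value
    return result


# Hypothetical time series: hour of day -> number of installs.
monday = {9: 120, 10: 340, 11: 90}
tuesday = {10: 60, 11: 10, 12: 5}

combined = istore_add(monday, tuesday)
print(combined)
```

Storing a whole series in one value and merging it with a single operation is what makes this shape convenient for rolling up aggregates.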
  • 26. The Elephant in the Room: How do we aggregate that much data? • Basically incremental MapReduce • Map and first-phase aggregation on backends • Reduce and second-phase aggregation on shards • Further reduction and aggregation possible on demand
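The two-phase flow on slide 26 can be sketched as follows. This is a toy model under stated assumptions: events are `(app, event_type)` pairs, shards are in-memory counters, and routing by CRC32 of the app name is a hypothetical stand-in for the custom sharding software, which the talk does not describe.

```python
import zlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical; the talk mentions 16 analytics shards


def map_phase(events):
    """Phase 1, on a backend: aggregate only the new events by key."""
    partial = Counter()
    for app, event_type in events:
        partial[(app, event_type)] += 1
    return partial


def shard_for(key):
    """Pick a shard for a key (hypothetical routing by app name)."""
    app, _event_type = key
    return zlib.crc32(app.encode("utf-8")) % NUM_SHARDS


def reduce_phase(shards, partials):
    """Phase 2, on the shards: merge partial aggregates into running totals."""
    for partial in partials:
        for key, count in partial.items():
            shards[shard_for(key)][key] += count


def lookup(shards, key):
    """Read a total from the one shard that owns the key."""
    return shards[shard_for(key)][key]


shards = [Counter() for _ in range(NUM_SHARDS)]

# First run: two backends each hold a random slice of the traffic.
backend_1 = [("app_a", "install"), ("app_a", "click"), ("app_b", "install")]
backend_2 = [("app_a", "install"), ("app_b", "click")]
reduce_phase(shards, [map_phase(backend_1), map_phase(backend_2)])

# A few minutes later: only new events are aggregated and merged in.
later_batch = [("app_a", "install")]
reduce_phase(shards, [map_phase(later_batch)])

print(lookup(shards, ("app_a", "install")))
```

Because each run only adds partial counts to existing totals, the work per run scales with the volume of new data rather than with the full history.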
  • 27. Operations Tools • Sqitch • Rex • Our own custom tools
  • 28. Backups • Home-grown system • Base backup plus WAL • Runs as a Rex task • We can also do logical backups (but...)
  • 29. Ongoing Distributed Challenges • Major upgrades • Storage space • Multi-datacenter challenges • Making it all fast
  • 30. Overview: This environment is all about careful attention to detail and being willing to write C code when needed. Space savings, better alignment, and other seemingly small gains add up over tens of billions of rows.
  • 31. Major Points of Interest • We are using PostgreSQL as a big data platform • We expect this architecture to scale very far • Provides near-real-time analytics on user actions
  • 32. PostgreSQL Makes All This Possible: In building our 400TB analytics environment, we have yet to outgrow PostgreSQL. In fact, this is one of the few pieces of our infrastructure we are perfectly confident in scaling.