User use case: Mogujie

Guest post by FengLin (MengYe Shen), Mogujie

Mogujie is a large fashion retail site with more than 3 billion Yuan in turnover, serving more than 60 million online users shopping for clothes, shoes, bags, accessories, cosmetics, and beauty products. With that kind of traffic, a resilient, low-latency database is critical.

Mogujie has now been running ScyllaDB for the last six months, most of that time in production. The current deployment is six servers serving 600,000 requests per second with Scylla 1.0.1.

Mogujie has more than 60 million users

In November 2015 we started exploring wide-column databases such as Cassandra; it was our first deployment of a column-store database. Our target use case is real-time storage of monitoring data, along with offline query functionality.

Cassandra or HBase?

We decided to use Cassandra for its no-single-point-of-failure architecture and high scalability, while maintaining good performance for mixed read and write workloads. HBase was, at that time, missing those features.

1. Testing Cassandra

Our target performance is 50k TPS. We deployed Cassandra on five on-premise servers, each with the following configuration:

  • 24 cores, Intel® Xeon® processor E5-2600 product family
  • 64GB memory
  • 60TB on magnetic hard drives
  • Gigabit Ethernet

For our test data model we created a keyspace, a custom type, and two tables, with the following settings:

// 1. Create keyspace
create keyspace sentry2 with replication = { 'class': 'SimpleStrategy', 'replication_factor': '2'};

// 2. Custom data type
create type sentry2.scmm (sum double, count int, min double, max double);

// 3. Data table
create table sentry2.datapoints (key text, time timestamp, data FROZEN <sentry2.scmm>, PRIMARY KEY (key, time));

// 4. Index table
create table sentry2.key_index (metric text, tags text, tagsMap map <text, text>, PRIMARY KEY (metric, tags));
create index tagMap_idx on sentry2.key_index (ENTRIES (tagsMap));
  • scmm: a user-defined type describing an aggregated sample with the fields (sum double, count int, min double, max double)
  • datapoints: a particular data point can be looked up by its key and timestamp
  • key_index: maps each metric to its multiple tags
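
To illustrate how this model is used, here is a minimal sketch of one write and two reads against these tables. The key format, timestamps, and values are hypothetical placeholders, not our production data; the last query relies on Cassandra's ENTRIES index syntax for the tagsMap index created above.

// Write one aggregated data point (hypothetical key and values)
insert into sentry2.datapoints (key, time, data)
values ('cpu.idle|host=web01', '2016-05-01 10:00:00', {sum: 95.2, count: 4, min: 90.1, max: 99.0});

// Read all points for that key within one hour
select time, data from sentry2.datapoints
where key = 'cpu.idle|host=web01' and time >= '2016-05-01 10:00:00' and time < '2016-05-01 11:00:00';

// Find keys by a tag value, using the ENTRIES index on tagsMap
select metric, tags from sentry2.key_index where tagsMap['host'] = 'web01';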

At the time we created this data model we were not very familiar with Cassandra, and we later realized the model was not optimal. However, it was good enough to test Cassandra, with the following results:

  1. Transactions per second (TPS) stayed at 30k-40k/s; when we pushed the rate higher, we experienced dropped mutations
  2. After 2 to 3 days of continuous operation, Cassandra would crash

Based on the above issues, we were ready to increase our Cassandra cluster footprint to handle the additional capacity requirements. Since we would have had to purchase additional servers to continue, we suspended the Cassandra testing.

2. Meeting ScyllaDB

We met ScyllaDB by chance while trying to resolve the above issues. One of our team members discovered an open-source C++ implementation of Cassandra called Scylla. We were excited! As C++ developers ourselves, we are less familiar with Java. We found information about Scylla on a Chinese site, oschina.net, and on the ScyllaDB website, and believed we had found the right solution to our issues.

At that time, we had a limited number of adequate servers to meet the requirements. We only had six machines, but after deploying Scylla on them our performance SLA requirements were met: Scylla delivered 100k TPS. Our resource and efficiency issues were solved instantly!

Thanks to Scylla's superior performance we saved three servers, along with the associated operational and capital costs. Scylla is highly efficient, so we meet our production TPS requirements with no problem. However, we still had to measure Scylla under a high write workload, with TPS exceeding 40k/s, to discover if and when there would be dropped mutations. For this testing, we used Scylla version 0.11 with the same server setup mentioned above.

When we concluded our testing with Scylla, we found it met our expectations, so we decided to drop the remaining Cassandra tests and move Scylla to our production environment.

Issues and lessons learned

As expected, we did face some challenges while deploying Scylla, which was pre-GA at the time. We received excellent support from the Scylla team (Tzach Livyatan, Asias He, and others) in mitigating those issues. In our initial tests Scylla 0.11 had good performance, but it did suffer from problems and glitches. Every problem we encountered was solved in a follow-up release, if not earlier.

One of the key new concepts we had to learn was correct data modeling. Scylla and Cassandra have a few concepts that need to be understood (Stack Overflow original):

  • Primary-key: the combination of the partition key and the clustering key;
  • Partition-key: determines how data is distributed across, and located on, the nodes;
  • Composite (compound) key: a key made up of more than one column;
  • Clustering-key: determines the ordering of rows within a partition;
create table stackoverflow (
      k_part_one text,
      k_part_two int,
      k_clust_one text,
      k_clust_two int,
      k_clust_three uuid,
      data text,
      PRIMARY KEY ((k_part_one, k_part_two), k_clust_one, k_clust_two, k_clust_three)
  );
  • Primary-key: ((k_part_one, k_part_two), k_clust_one, k_clust_two, k_clust_three)
  • Partition-key: (k_part_one, k_part_two)
  • Clustering-key: k_clust_one, k_clust_two, k_clust_three

When performing a CQL query, specifying the partition key is the right way to locate the right node in a timely fashion.
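
For example, against the stackoverflow table above (the literal values are only placeholders): the first query restricts the full partition key and can be routed straight to the owning node, while the second restricts only part of it, so no token can be computed and the request is rejected by default.

// Good: the full partition key (k_part_one, k_part_two) is specified
select data from stackoverflow
where k_part_one = 'a' and k_part_two = 1 and k_clust_one = 'x';

// Anti-pattern: only part of the partition key is restricted,
// so the coordinator cannot compute a token to route the request
select data from stackoverflow where k_part_one = 'a';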

According to Scylla and Cassandra recommendations, no single partition (row) should consume more than 50% of the memory available to it; for Scylla, where memory is divided per shard, that limit works out to (total_machine_memory / (number_of_lcores * 2)). Initially we violated this advice, which led us to extremely wide rows (many GBs). To solve this issue, we remodeled the data so that each row stays smaller than 10MB.
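
As a rough worked example, using the 64GB servers described earlier and assuming for simplicity that all 24 cores are used as lcores: 64GB / (24 * 2) ≈ 1.3GB per partition at most. Our multi-GB rows clearly exceeded that, while the 10MB target leaves a wide safety margin.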

Lesson Learned

  • Data model. Before using Scylla, I highly recommend getting to know your data model well. Understanding your data will help you design an efficient system.

  • Monitoring. I recommend using a Scylla monitoring application to help you monitor and observe Scylla operations.

Hitting hardware limitations and solving them

Disk

After the trial with Scylla, we found its performance exceeded our expectations; but as we used it, we found that when we increased query rates up to 300K/s, requests would time out. Our initial suspect was the network card, but once we started using netdata monitoring, we discovered that our timeouts were in fact the result of disk IOPS.

When the rate reached 300K/s, disk utilization hit 100%, which led to Scylla latencies. It looked like we had to follow the ScyllaDB team's recommendation and move from spinning disks to SSDs. Trying to delay the move to SSD a little longer, we decided to optimize our data model, updating the schema from

create table sentry2.datapoints (key text, time timestamp, data FROZEN <sentry2.scmm>, PRIMARY KEY (key, time));

To

CREATE TABLE sentrydata.datapoints (
 metric bigint,
 tags bigint,
 wholehour bigint,
 data map <int, frozen <tuple <double, bigint, bigint, double, double >>>,
 PRIMARY KEY (metric, tags, wholehour)
);

Inserting 360 data points at a time allowed 100 million data points per minute. With this update the bottleneck moved from the disk to the network card.
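
A sketch of such a batched write with the new schema is shown below. The metric/tags identifiers, hour bucket, map keys, and tuple values are hypothetical placeholders; each map value simply follows the declared tuple<double, bigint, bigint, double, double> layout.

// Append several of the hour's (up to 360) points in one statement
update sentrydata.datapoints
set data = data + {
  0:  (95.2, 4, 2, 90.1, 99.0),
  10: (93.7, 4, 3, 88.4, 97.5)
}
where metric = 1001 and tags = 2002 and wholehour = 405584;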

AIO Optimization

To use the full power of asynchronous I/O (AIO), the Scylla documentation recommends kernel version 3.15 or later. Since we had an earlier kernel version, we had to use developer mode. Once we moved to a later kernel version, our tests showed a 20% increase in throughput. Below are the test details:

  • Test Platform - Server
    • Three server machines
    • 40 cores
    • 64GB memory
    • 60TB mechanical hard drives
    • Gigabit Ethernet
  • Test Platform - Client
    • Four machines
    • 24 cores
    • 64GB memory
    • 400GB SSD
    • Gigabit Ethernet
  • Table structure
CREATE TABLE sentrydata.newdatapoint1 (
 metric bigint,
 tags bigint,
 wholehour bigint,
 data map <int, frozen <tuple <double, bigint, bigint, double, double >>>,
 PRIMARY KEY (metric, tags, wholehour)
)
  • Results for kernel version 3.10

Data points/req | Req/min | Inserted data points/min | CPU | RAM | Commit disk utilization | Commit disk BW | Disk use rate | Disk BW | Ingress network BW | Egress network BW
1    | 15M  | 15M  | 75%    | 62.70% | 2.60% | 45M  | 99%    | 37M | 510 | 448
60   | 1.5M | 92M  | 50%    | 62.70% | 5.10% | 95M  | 94%    | 20M | 840 | 573
360  | 288K | 102M | 30-40% | 63%    | 5.50% | 105M | 80~98% | 15M | 915 | 623
3600 | 288K | 102M | 10-30% | 62.80% | 6.50% | 110M | 70~95% | 14M | 925 | 640
  • Results for kernel version 3.18

Data points/req | Req/min | Inserted data points/min | CPU | RAM | Commit disk utilization | Commit disk BW | Disk use rate | Disk BW | Ingress network BW | Egress network BW
1    | 17M    | 17M      | 75%    | 60%    | 2.33% | 59M  | 99%    | 44M    | 610 | 550
60   | 1.908M | 114.2M   | 35~40% | 60.70% | 6.50% | 110M | 99%    | 36.5M  | 930 | 640
360  | 357K   | 128.2M   | 24~30% | 61%    | 7.80% | 110M | 67~90% | 21M    | 934 | 575
3600 | 35.8K  | 128~130M | 10~30% | 62.80% | 6.50% | 110M | 50~70% | 10~22M | 937 | 480
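
Comparing the two tables at the larger batch sizes, the inserted data points per minute rose from roughly 102M on kernel 3.10 to about 128M on kernel 3.18, a 20-25% gain, consistent with the throughput improvement quoted above.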

Summary

Once we had Scylla up and running, we took several steps to remove bottlenecks and improve performance:

  • To bypass the spinning-disk IO problem, we moved to a data model that inserts data points in batches of 60
  • To bypass network bottlenecks, we increased the batch size further to 360 and 3600
  • Moving to the latest kernel and enabling Scylla production mode increased performance by 20%

4. Future plans

We plan to migrate more and more applications to Scylla. We are now planning to:

  • Build data backup and restore architecture
  • Integrate Spark, for off-line analysis
  • Use Scylla to store Kafka time index offset, etc.
  • Create a storage platform (relatively long-term plans)

My team and I are very excited to be using Scylla and Seastar and have confidence in the team behind these products. We are constantly learning from these projects' ideas, hoping to apply similar methods and concepts to our other products.

Lastly, although Scylla is still growing, it really is a super powerful product. To the ScyllaDB team: I hope you continue to develop great products. My team and I will continue to follow your contributions.

