Comprehensive guide to Redis. Part 2.

In this tutorial, we are going to discover a few more data types like Set, Sorted Set, Bitmap, and HyperLogLog.

Sets

A Set is a collection of distinct strings. It is impossible to add repeated elements to a Set. A Set can hold more than 4 billion elements per Set.

You can use Sets for:
– data grouping. Something like recommended purchases in e-commerce sites.
– data filtering. Grouping all customers who bought the same product.
– checking for membership. Check whether the customer is in the patron group.

Commands:

SADD: add one or many members to a Set. It ignores members that already exist in a Set and returns the number of added members.
SINTER: expects one or many Sets and returns an array of members that belong to every Set.
SDIFF: expects one or many Sets and returns an array of members of the first Set which do not exist in the following one.
SUNION: returns a union of Sets. All elements are unique.
SRANDMEMBER: returns a random member from a Set.
SISMEMBER: checks whether a member exists in a Set. Returns 1(exist) or 0(not exist).
SREM: removes and returns members from a Set.
SCARD: returns the number of members in a Set. (cardinality)
SMEMBERS: returns an array of all members in a Set.

img1

Let’s create an example. Imagine we have an e-commerce site and we want to send our special Christmas offers to the users. We have to implement the following functions:
– mark offer as sent
– check whether a user received a group of offers
– gather metrics

Every offer is a Set containing user IDs that have received the offer. Create a file called christmas-offers.js with the following code:

Now run the code via node christmas-offers. You should see the following console output:

img2

Sorted Sets

A Sorted Set is a collection of nonrepeating Strings sorted by score. You can have elements with repeated scores. In this case, the repeated elements are ordered in alphabetical order. Sorted Set operations run in O(log(N)) time.

You can use Sorted Sets for:
– leaderboard for online games
– input autocomplete
– real time queue

Commands:

ZADD: adds one or many members to a Sorted Set. Returns the number of added members. Elements are added with a score and a String value.
ZRANGE: returns elements from the lowest to the highest score.
ZREVRANGE: returns elements from the highest to the lowest score. If you pass an optional parameter WITHSCORES it will return elements with their scores.
ZREM: removes a member from a Sorted Set
ZSCORE: returns the score of a member
ZRANK: returns the member index ordered from low to high. The member with the lowest score has index 0.
ZREVRANK: returns the member index ordered from high to low. The member with the highest score has index 0.

img3

Let’s create an example. We will build a leaderboard for an online game. We have to implement the following functions:
– add and remove users
– display detailed user info
– show the top COUNT users
– show the users above and below a given user

Create a file leaderboard.js with the following code:

Now run the code via node leaderboard. You should see the following console output:

img4

Bitmaps

A Bitmap is a sequence of bits. Each of them can store 0 or 1. Basically, Bitmap is an array of ones and zeroes.

See an example:
10011
This is a bitmap with one set on offsets 0,3,4 and zero set on offsets 1 and 2

Bitmaps are great for real-time analytics. For example, to make a report “How many users bought a product A yesterday?”.

The examples will show how to use Bitmap commands to userIDs of users who bought a given product. Each user is identified by an ID. Each Bitmap offset represents a user. For example, user 10 is offset 10, user 50 is offset 50, and so on.

Commands:

SETBIT: gives a value to a Bitmap offset.
GETBIT: returns a value of a Bitmap offset.
BITCOUNT: returns the number of bits marked as 1.
BITOP: requires a bitwise operation(OR, AND, XOR, NOT) and a list of keys to apply to this operation. It stores the result in the destination key.

img5

The example application will be a web analytics system that saves and counts daily user visits and the retrieves userIDs from the visits on a given date.

Create a file metrics.js with the following code:

Now run node metrics. You should see the following console output:

img6

HyperLogLogs

A HyperLogLog is not a real data type. It is an algorithm that uses randomization in order to provide a very good approximation of the number of unique members in a Set. It runs in O(1) constant time.

The HyperLogLog algorithm is probabilistic. It does not ensure 100 percent accuracy. The Redis implementation has an error of 0.81 percent. In some cases, 99.19 percent is good enough.

There are only three commands for HyperLogLogs: PFADD, PFCOUNT, PFMERGE. The prefix PF is in honor of Philippe Flajolet who is the author of the algorithm.

You can use HyperLogLogs for:
– counting the number of search terms on your website
– counting the number of unique users
– counting the number of distinct words on a page
– counting the number of distinct hashtags used by a user

Commands:

PFADD: adds one or many strings to a HyperLogLog. Returns 1(if the cardinality was changed) or 0(the cardinality remains the same).
PFCOUNT: accepts one or many keys as arguments. When there is a single argument, it returns the approximate cardinality. When there are multiple keys, it returns the approximate cardinality of the union of all unique elements.
PFMERGE: accepts a destination key and one or many HyperLogLog keys. It merges all the specified keys and stores the result in the destination key.

img7

The example application will be a web system for counting and retrieving unique website visits. Each date will have 24 keys that represent each hour of a day.

Create a file called unique-analytics.js and add the following code:

Now run node unique-anaytics. You should see the following console output:

img8

In this tutorial, we discovered Sets, Sorted Sets, Bitmaps, and HyperLogLogs. That’s all for today 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *