Generating Alerts from Twitter


If you’d like to run through this tutorial more quickly, be sure you’ve loaded Cyohon’s starter settings before you begin. These settings provide many of the configurations used in this tutorial.


Suppose we want to capture a tweet like this:

The raw data for this tweet might look like this:

    "created_at": "Wed Apr 05 19:06:12 +0000 2017",
    "id": 849699711376293888,
    "id_str": "849699711376293888",
    "text": "Cyphon Release 1.0.0",
    "truncated": false,
    "entities": {
        "hashtags": [],
        "symbols": [],
        "user_mentions": [],
        "urls": [{
            "url": "",
            "expanded_url": "",
            "display_url": "\u2026",
            "indices": [23, 46]
    "source": "\u003ca href=\"\" rel=\"nofollow\"\u003eSquarespace\u003c/a\u003e",
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
        "id": 834789750624116736,
        "id_str": "834789750624116736",
        "name": "The Cyphon Project",
        "screen_name": "cyphonproject",
        "location": "Maryland, USA",
        "description": "An open platform for incident management and response. Maintained by @DunbarCyber.",
        "url": "",
        "entities": {
            "url": {
                "urls": [{
                    "url": "",
                    "expanded_url": "",
                    "display_url": "",
                    "indices": [0, 23]
            "description": {
                "urls": []
        "protected": false,
        "followers_count": 20,
        "friends_count": 3,
        "listed_count": 0,
        "created_at": "Thu Feb 23 15:39:21 +0000 2017",
        "favourites_count": 1,
        "utc_offset": null,
        "time_zone": null,
        "geo_enabled": false,
        "verified": false,
        "statuses_count": 9,
        "lang": "en",
        "contributors_enabled": false,
        "is_translator": false,
        "is_translation_enabled": false,
        "profile_background_color": "F5F8FA",
        "profile_background_image_url": null,
        "profile_background_image_url_https": null,
        "profile_background_tile": false,
        "profile_image_url": "",
        "profile_image_url_https": "",
        "profile_banner_url": "",
        "profile_link_color": "1DA1F2",
        "profile_sidebar_border_color": "C0DEED",
        "profile_sidebar_fill_color": "DDEEF6",
        "profile_text_color": "333333",
        "profile_use_background_image": true,
        "has_extended_profile": false,
        "default_profile": true,
        "default_profile_image": false,
        "following": true,
        "follow_request_sent": false,
        "notifications": false,
        "translator_type": "none"
    "geo": {
        "type": "Point",
        "coordinates": [39.489207, -76.661301]
    "coordinates": {
        "type": "Point",
        "coordinates": [-76.661301, 39.489207]
    "place": {
        "id": "014f4cd64f60daea",
        "url": "https:\/\/\/1.1\/geo\/id\/014f4cd64f60daea.json",
        "place_type": "city",
        "name": "Cockeysville",
        "full_name": "Cockeysville, MD",
        "country_code": "US",
        "country": "United States",
        "contained_within": [],
        "bounding_box": {
            "type": "Polygon",
            "coordinates": [
                    [-76.6729123, 39.453647],
                    [-76.58212, 39.453647],
                    [-76.58212, 39.534252],
                    [-76.6729123, 39.534252]
        "attributes": {}
    "contributors": null,
    "is_quote_status": false,
    "retweet_count": 2,
    "favorite_count": 2,
    "favorited": true,
    "retweeted": true,
    "possibly_sensitive": false,
    "possibly_sensitive_appealable": false,
    "lang": "en"

Yikes! We’re only interested in a fraction of this information, so we’d like to condense all this into the following format:

    "created_date": "2017-04-16T14:05:17.000Z",
    "link": "",
    "location": [-76.661301, 39.489207],
    "ref_id": "849699711376293888",
    "text": "Cyphon Release 1.0.0",
    "user": {
        "link": "",
        "name": "The Cyphon Project",
        "profile_pic": "",
        "ref_id": "834789750624116736",
        "screen_name": "cyphonproject"

We also want to analyze the tweet to determine whether it reflects a positive, negative, or neutral sentiment. If it’s negative, we want to be alerted to that fact.

In this tutorial, we’ll show you how to use Cyphon’s DataChutes to parse and save tweets in a condensed form, perform a sentiment analyses on them, and generate Alerts for tweets with negative sentiments.

Step 1: Container

We’ll start by creating a Container to store the condensed tweet. The Container will consist of a Bottle and a Label.



Open the “Shaping Data” panel on Cyphon’s main admin page, and create the following BottleFields:

Field Name Field Type Target Type
created_date DateTimeField DateTime
link URLField  
location PointField Location
name CharField Account
profile_pic URLField  
ref_id CharField  
screen_name CharField Account
text TextField Keyword

Now we’ll create two Bottles (one will be nested within the other). The first Bottle will contain data specific to the user who posted the tweet.

Using the BottleFields we just created, add these fields to a Bottle called “user”:

  • link
  • name
  • profile_pic
  • ref_id
  • screen_name

Now we’ll work on the Bottle for the tweet itself. Create a Bottle called “post” and add the following BottleFields:

  • created_date
  • link
  • location
  • ref_id
  • text

Finally, we need to embed the “user” Bottle within the “post” Bottle. To do this, we’ll create a special BottleField that refers to the “user” Bottle. You can create this new BottleField from the admin page of the “post” Bottle by clicking the “+” sign by the list of fields.

The new BottleField should have the following properties:

BottleField.field_name user
BottleField.field_type EmbeddedDocument
BottleField.embedded_doc user

Confirm that this new BottleField was added to the “post” Bottle and save the Bottle.



Next, we’ll need to create a couple of Protocols that will perform the sentiment analyses, if they don’t already exist.

Under “App Configurations,” click “Protocols,” and create two new Protocols with the following properties:

Name Package Module Function
sentiment sentiment sentiment get_sentiment
polarity sentiment sentiment get_polarity

These Protocols will analyze text using the get_sentiment() and get_polarity() functions in the sentiment module within the Lab package’s sentiment subpackage. The get_sentiment() function will determine whether text is “positive,” “negative,” or “neutral.” The get_polarity() function will assign a numerical value for polarity (e.g., 0.1, -0.3, etc.). This will allow us to distinguish between tweets that are mildy negative and those that are strongly negative.



Now we’ll specify which field of the Twitter data object we want to analyze with these Protocols.

Under “Enhancing Data,” create two new Procedures with the following properties:

Name Protocol Field Name
text_sentiment sentiment text
text_polarity polarity text

These Procedures will be used to analyze the “text” field in our “post” Bottle.



Next, we’ll create the Label that will contain the results of our analyses.

First we’ll create LabelFields that will store the results of the two Procedures we created. Under “Shaping Data,” create two new LabelFields with the following properties:

Field Name Field Type Analyzer
polarity FloatField text_polarity (procedure)
sentiment CharField text_sentiment (procedure)

Add the two LabelFields to a new Label called “post”.



We now have all the pieces to make a Container. Create a Container called “post” using the “post” Bottle and the “post” Label. The preview should look like this:

    "_metadata": {
        "polarity": "FloatField",
        "sentiment": "CharField"
    "created_date": "DateTimeField (DateTime)",
    "link": "URLField",
    "location": "PointField (Location)",
    "ref_id": "CharField",
    "text": "TextField (Keyword)",
    "user": {
        "link": "URLField",
        "name": "CharField (Account)",
        "profile_pic": "URLField",
        "ref_id": "CharField",
        "screen_name": "CharField (Account)"

Note that our LableFields are stored under the “_metadata” field.

Define a Taste for this Container with the following properties: user.screen_name
Taste.title text
taste.content text
taste.location location
taste.location_format Longitude, Latitude
taste.datetime created_date

Step 2: Condenser


We now need to map the fields from the Twitter data to the fields of the Container we created. We can do this by creating a Condenser.

Under the “Condensing Data” panel on the main admin page, click “JSON Data,” and create the following DataParsers:

Name Source Field Method Formatter
coordinates.coordinates__COPY coordinates.coordinates Copy  
created_at__DATE created_at Date from string  
id_str__COPY id_str Copy  
text__COPY text Copy  
twitter__content_link user.screen_name, id_str Copy{}/statuses/{}
twitter__user_link user.screen_name Copy{}/
user.id_str__COPY user.id_str Copy  
user.name__COPY Copy  
user.profile_image_url__COPY user.profile_image_url Copy  
user.screen_name__COPY user.screen_name Copy  

Next, we’ll create DataCondensers to map these DataParsers to the fields of the “post” Bottle and its nested “user” Bottle.

Create a DataCondenser called “twitter__user” and associate it with the “user” Bottle. Add the following DataFittings:

Target Field Parser (data parser)
link twitter__user_link (data parser)
name user.name__COPY (data parser)
profile_pic user.profile_image_url__COPY (data parser)
ref_id user.id_str__COPY (data parser)
screen_name user.screen_name__COPY (data parser)

Now create another DataCondenser called “twitter__post,” and associate it with the “post” Bottle. Add the following DataFittings:

Target Field Parser
created_date created_at__DATE (data parser)
link twitter__content_link (data parser)
location coordinates.coordinates__COPY (data parser)
ref_id id_str__COPY (data parser)
text text__COPY (data parser)
user twitter__user (data condenser)

Step 3: Distillery


Next, we need to specify where the data will be stored. To do this, we’ll create a Distillery, which defines the data store and the data schema for the tweets. The data schema is represented by the Distillery’s Container (which we already created). The data store is represented by the Distillery’s Collection.

Before we create the Collection, we first need to establish a Warehouse for it. Open the “Storing Data” panel, and create a Warehouse with the following properties:

Warehouse.backend elasticsearch cyphon
Warehouse.time_series True

This will establish daily indexes in Elasticsearch with the pattern “cyphon-YYYY-MM-dd.”


Time-series indexes make it easier to archive or delete old data. See Elasticsearch’s documentation on retiring data for more info.

Next, create a Collection in this Warehouse for social media posts:

Collection.warehouse cyphon social

This will create a “social” document type within the “cyphon” Elasticsearch index. (If you were using MongoDB as a backend instaad, it would create a collection called “social” in a database called “cyphon.”)

Under the “Distilling Data” panel, create a Distillery that uses this Collection and the “post” Container we previously created:

Distillery.container post

Step 4: Munger


Now that we have a Condenser to process the data and a Distillery to store it, we can put them together to create a Munger.

Open the “Sifting Data” panel, and click on “JSON Data.” Create a DataMunger with the following properties: tweets
DataMunger.condenser twitter__post

Step 5: Chute


Now that we hava a Munger, we can create a Chute to catch incoming tweets.

Create a DataChute with the following properties:

DataChute.munger tweets
DataChute.endpoint Twitter PublicStreamsAPI
DataChute.enabled True

We’re not associating a Sieve with this Chute, which means it will capture all incoming tweets and send them to the “Social Media” Munger. This Munger will parse the tweets and save them in the “” Collection.


If you’d only like to send a subset of tweets to a particular Munger, you can create a Chute with a Sieve that captures only the documents you want. Creating several Chutes with different Sieves and Mungers will allow you to save tweets on different topics to different Collections.

Step 6: Watchdog


Now that we’ve made a Chute to save the tweets, we need to create a Watchdog to generate Alerts.

Let’s suppose we want to generate an Alert whenever we get a negative tweet about a company called “Acme”. We’ll start by making some DataRules to identify these tweets.

Under the “Sifting Data” panel, click on “JSON Data” and create the following DataRules:

DataRule Name Field Name Operator Value
text_contains_acme text contains acme
sentiment_equals_negative _metadata.sentiment equals negative

We can then use these DataRules to create a DataSieve that a Watchdog will use to inspect the document:

DataSieve Name Logic Node 1 (data rule) Node 2 (data rule)
Negative Acme posts AND text_contains_acme sentiment_equals_negative

Now go to the “Configuring Alerts” panel and create the Watchdog: Acme
Watchdog.enabled True

Add the following Trigger to the Watchdog:

DataSieve Alert Level Rank
Negative Acme posts High 0

Step 7: Filter


Now that we’ve configured Cyphon to handle the tweets, we need to set up a Filter that will collect tweets about “Acme.”

Under the “Filtering Data” panel, create a SearchTerm for the term “acme”. Then create a Filter based on the SearchTerm you just added.

Step 8: Stream


Finally, we need to connect Cyphon to a Twitter stream, so it can start gathering tweets. To do this, follow the instructions under Twitter Setup. When you’re done, make sure the “twitter” Reservoir is enabled (under “App Configurations > Data Collection”).

You should now be receiving tweets! You can double-check this by clicking on “Streams” under the “Records” panel of the admin page. You should see a stream for “Twitter PublicStreamsAPI: Twitter”.


In this tutorial, we showed you how to collect, analyze, and generate Alerts from tweets. Now that you have a basic idea of how to filter tweets and set up rules, you can experiment with more advanced rulesets that generate high-, medium-, and low-level Alerts.