# Homework 9

We're going to make a Lil' QuakeBot!!!! Please write this code as a .py file, not a Notebook.

## First, tips:

Be careful that you're doing all of your math with ints or floats instead of strings that look like ints or floats.
When you write your functions, you can pass either the entire dictionary to the function OR just the part you're curious about (e.g., when you're getting the day, you could send the whole earthquake dictionary or just what's in the `'time'` key).
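
To see why the first tip matters, here's a small sketch using values from the sample earthquake. Strings compare character by character and "add" by gluing together, so the results look plausible but are wrong. The variable names are just for illustration.

```python
# Values arrive as strings, just like in the sample earthquake dictionary
mag_a = '5.7'
mag_b = '10'

# String comparison goes character by character: '5' comes after '1'
print(mag_a > mag_b)                # True, which is not what we want

# Converting first gives a real numeric comparison
print(float(mag_a) > float(mag_b))  # False

# Adding strings glues them together instead of doing math
depth = '10'
gap = '48'
print(depth + gap)                  # 1048
print(int(depth) + int(gap))        # 58
```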

Writing empty functions that always return the same thing is a great way to start off. You can start by saying every earthquake is shallow and then fill in the actual code later.
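
A minimal sketch of what that could look like. The return values are placeholders, and the shortened sentence is only there to show that the pieces already fit together before any real logic exists.

```python
# Stub versions: each one ignores its input and returns a fixed answer
def depth_to_words(earthquake):
    return "shallow"  # placeholder until the real depth logic is written

def magnitude_to_words(earthquake):
    return "huge"     # placeholder

def eq_to_sentence(earthquake):
    # The sentence can already be assembled and tested with the stubs
    return "A {} {} earthquake was reported".format(
        depth_to_words(earthquake), magnitude_to_words(earthquake))

print(eq_to_sentence({'depth': '10', 'mag': '5.7'}))
```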

Find out what each column name in the database means by visiting http://earthquake.usgs.gov/earthquakes/feed/v1.0/csv.php and clicking the links for each column.

## PART ZERO: Overall description

Given an earthquake defined like this...

    earthquake = {
      'rms': '1.85',
      'updated': '2014-06-11T05:22:21.596Z',
      'type': 'earthquake',
      'magType': 'mwp',
      'longitude': '-136.6561',
      'gap': '48',
      'depth': '10',
      'dmin': '0.811',
      'mag': '5.7',
      'time': '2014-06-04T11:58:58.200Z',
      'latitude': '59.0001',
      'place': '73km WSW of Haines, Alaska',
      'net': 'us',
      'nst': '',
      'id': 'usc000rauc'}

I want to be able to run

    print(eq_to_sentence(earthquake))

and get the following:

> A **DEPTH**, **POWER** **MAGNITUDE** earthquake was reported **DAY** **TIME_OF_DAY** on **DATE** **LOCATION**.

So, for example, "A **deep**, **huge** **4.5** magnitude earthquake was reported **Monday** **morning** on **June 22** **73km WSW of Haines, Alaska**".

**DEPTH**, **POWER**, **MAGNITUDE**, **DAY**, and **TIME_OF_DAY** should all come from separate functions. More details are in **PART ONE** and **PART TWO**.

In [1]:
# Start by saving the sample earthquake
earthquake = {
  'rms': '1.85',
  'updated': '2014-06-11T05:22:21.596Z',
  'type': 'earthquake',
  'magType': 'mwp',
  'longitude': '-136.6561',
  'gap': '48',
  'depth': '10',
  'dmin': '0.811',
  'mag': '5.7',
  'time': '2014-06-04T11:58:58.200Z',
  'latitude': '59.0001',
  'place': '73km WSW of Haines, Alaska',
  'net': 'us',
  'nst': '',
  'id': 'usc000rauc'}

## PART ONE: Write your few tiny functions

First you'll need to write a few functions to help describe an earthquake. Try out each of these functions individually. You will probably require:

* **`depth_to_words`** will describe the earthquake's depth
* **`magnitude_to_words`** will describe the earthquake's power
* **`day_in_words`** should be the day of the week
* **`time_in_words`** should be "morning", "afternoon", "evening" or "night"
* **`date_in_words`** should be "Monthname day", e.g. "June 22"
* Any other functions as necessary

**DEPTH** can be determined from the USGS website - http://earthquake.usgs.gov/learn/topics/seismology/determining_depth.php - it should be either 'shallow', 'intermediate' or 'deep'

**POWER** should be evocative words like 'easily ignored' or 'huge' or 'very destructive' (feel free to pick your own) - look on Google Image Search for "richter scale" to see some possible descriptors.

**MAGNITUDE** should be the actual numerical magnitude.

**DAY** should be the day of the week.

**TIME_OF_DAY** should be morning, afternoon, evening or night.

**DATE** should be "Monthname day", e.g. "June 22".

**TIP:** You probably (a.k.a. definitely) need to convert 'time' - which is a string - into a Python datetime object which can do **`.hour`**, **`.day`**, **`.strftime("%Y %b %d")`** and other fun things. Convert it and test the conversion like this:

````py
import dateutil.parser
timestring = '2014-06-04T11:58:58.200Z'
yourdate = dateutil.parser.parse(timestring)
print("The hour is", yourdate.hour)
print("We can do things with strftime like", yourdate.strftime("%Y %b"))
````

You'll need to **`pip install python-dateutil`** (the package is imported as `dateutil`).

In [None]:
# I hate running command line tools in the notebook, but let's do it anyway...
!pip install python-dateutil

In [4]:
import dateutil

## Our first function: **`depth_to_words`**

There are a few different ways to build `depth_to_words`. Let's take a look at a few.

* **`depth_to_words`** will describe the earthquake's depth
* **DEPTH** can be determined from the USGS website - http://earthquake.usgs.gov/learn/topics/seismology/determining_depth.php  - it should be either 'shallow', 'intermediate' or 'deep'

From that URL:

> Shallow earthquakes are between 0 and 70 km deep; intermediate earthquakes, 70 - 300 km deep; and deep earthquakes, 300 - 700 km deep.

### Attempt One

We'll take an integer, and just follow what the text says.

In [16]:
def depth_to_words(depth):
    if depth > 0 and depth < 70:
        return "shallow"
    elif depth > 70 and depth < 300:
        return "intermediate"
    elif depth > 300 and depth < 700:
        return "deep"

# You would call it like this
depth_to_words(70)

# Now let's test it with some sample values
test_values = [30, 100, 70, 0, 1000, -1]
for value in test_values:
    print(value, "gives us", depth_to_words(value))

30 gives us shallow
100 gives us intermediate
70 gives us None
0 gives us None
1000 gives us None
-1 gives us None


70, 0, 1000 and -1 all give us **`None`** as our depth! This is because we listened a little too closely to the description given to us by USGS.

* Shallow: ABOVE 0 and BELOW 70
* Intermediate: ABOVE 70 and BELOW 300
* Deep: ABOVE 300 and BELOW 700

Unfortunately 70 slipped through the cracks (we looked for BELOW 70 and ABOVE 70) as did values outside of 0-700. Should a 70km earthquake be shallow or intermediate? Should a 1000km earthquake be considered deep, or should it encounter an error? How about 0km, or negative values?

Sometimes decisions like this take more research, and sometimes it takes you making a decision.

* 0km and negative-km earthquakes are generally explosions at the surface, but there can also be slight errors in measurement. I think making them shallow isn't a big deal.
* Is the 70km cutoff for shallow/intermediate an official term? I'm going to say **no** (even though I didn't check!), so we can count it as either.
* 1000km earthquakes seem deep to me, so let's consider them deep.

### Attempt Two: Taking care with edge cases

In [20]:
def depth_to_words(depth):
    if depth < 70:
        return "shallow"
    elif depth < 300:
        return "intermediate"
    else:
        return "deep"

# You would call it like this
depth_to_words(70)

# Now let's test it with some sample values
test_values = [30, 100, 70, 0, 1000, -1]
for value in test_values:
    print(value, "gives us", depth_to_words(value))

30 gives us shallow
100 gives us intermediate
70 gives us intermediate
0 gives us shallow
1000 gives us deep
-1 gives us shallow


That looks a little better. **Opening up the ends** instead of being firm gives us a little flexibility with our input. Whether that's a good or bad idea depends on what your data is!
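
If your data should never fall outside the USGS range, the opposite choice is to stay firm and complain loudly. A minimal sketch: the name `strict_depth_to_words` and the decision to raise a `ValueError` are assumptions for illustration, not part of the assignment.

```python
def strict_depth_to_words(depth):
    # Refuse anything outside the 0-700 km range quoted from USGS
    if depth < 0 or depth > 700:
        raise ValueError("depth out of range: {}".format(depth))
    if depth < 70:
        return "shallow"
    elif depth < 300:
        return "intermediate"
    else:
        return "deep"

for value in [30, 70, 300, 700]:
    print(value, "gives us", strict_depth_to_words(value))

# Out-of-range values now stop the program instead of getting a label
try:
    strict_depth_to_words(1000)
except ValueError as error:
    print("Rejected:", error)
```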

We do have one big problem, though, and it shows up when we start working with our real data.

In [19]:
depth_to_words(earthquake['depth'])

TypeError: unorderable types: str() < int()

Uh oh, turns out **`earthquake['depth']`** is actually a `string`, which you can't compare to an integer. Guess we need to convert that.

### Attempt Three: Preparing our input

In [21]:
def depth_to_words(str_depth):
    depth = int(str_depth)
    if depth < 70:
        return "shallow"
    elif depth < 300:
        return "intermediate"
    else:
        return "deep"

# You would call it like this
depth_to_words(70)

# Now let's test it with some sample values
test_values = [30, 100, 70, 0, 1000, -1]
for value in test_values:
    print(value, "gives us", depth_to_words(value))

30 gives us shallow
100 gives us intermediate
70 gives us intermediate
0 gives us shallow
1000 gives us deep
-1 gives us shallow


Luckily there isn't a problem with converting an integer into an integer: `int(70)` just becomes `70` - it doesn't have to be `int("70")`. Now let's try it with our data.

In [22]:
depth_to_words(earthquake['depth'])

'shallow'
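
One caveat: the sample depth is the whole number `'10'`, but depths in the USGS feed often have decimals (the data in Part Three includes values like 10.52), and `int('10.52')` raises a `ValueError`. Here's a sketch of a more forgiving variant that converts with `float` instead. The name `depth_to_words_float` is made up for this example.

```python
def depth_to_words_float(str_depth):
    # float() accepts '10', '10.52', 10 and 10.52 alike
    depth = float(str_depth)
    if depth < 70:
        return "shallow"
    elif depth < 300:
        return "intermediate"
    else:
        return "deep"

print(depth_to_words_float('10.52'))

# int() is pickier about strings with a decimal point
try:
    int('10.52')
except ValueError as error:
    print("int() failed:", error)
```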

### Attempt Four: Passing a dictionary

Now you get to make an **executive decision**. Which one of the following looks better to you?

* `depth_to_words(earthquake['depth'])`
* `depth_to_words(earthquake)`

I like the second one: it moves more of the "magic" out into the function. So let's rewrite our function **to accept a dictionary instead of a number.**

In [29]:
def depth_to_words(earthquake):
    depth = int(earthquake['depth'])
    if depth < 70:
        return "shallow"
    elif depth < 300:
        return "intermediate"
    else:
        return "deep"

# You would call it like this
depth_to_words(earthquake)

'shallow'

This **looks far nicer**, but is also **much more difficult to test**.

Before, we could just try things out by sending integers (or even strings) to the function. But now if we want to test a bunch of different depths, we need to create a bunch of **fake earthquakes**.

At least we only need to give them the `depth` key, since that's the only key we use in our function.

In [30]:
test_earthquakes = [{'depth': 30}, {'depth': 100}, {'depth': 70}, {'depth': 0}, {'depth': 1000}, {'depth': 0}]
for test_earthquake in test_earthquakes:
    print(test_earthquake['depth'], "gives us", depth_to_words(test_earthquake))

30 gives us shallow
100 gives us intermediate
70 gives us intermediate
0 gives us shallow
1000 gives us deep
0 gives us shallow


We could also create the dictionary just at the moment we're calling the function, which is a little cleaner-looking.

In [31]:
test_values = [30, 100, 70, 0, 1000, -1]
for value in test_values:
    print(value, "gives us", depth_to_words({'depth': value}))

30 gives us shallow
100 gives us intermediate
70 gives us intermediate
0 gives us shallow
1000 gives us deep
-1 gives us shallow


I personally think testing with real data is the way to go - I'd give a few tests myself, then plug it into the real data that comes in Part Two.
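
If you do want a few quick checks that complain on their own instead of making you read printed output, plain `assert` statements are one option. This is only a sketch: the function is repeated so the snippet runs by itself, and the expected answers reflect the edge-case decisions made earlier (70 is intermediate, 1000 is deep).

```python
def depth_to_words(earthquake):
    depth = int(earthquake['depth'])
    if depth < 70:
        return "shallow"
    elif depth < 300:
        return "intermediate"
    else:
        return "deep"

# Each pair is (depth, the answer we decided on)
expected = [(30, "shallow"), (70, "intermediate"), (100, "intermediate"),
            (0, "shallow"), (-1, "shallow"), (1000, "deep")]

for depth, words in expected:
    # An AssertionError here means the function disagrees with our decision
    assert depth_to_words({'depth': depth}) == words, depth

print("All depth checks passed")
```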

## Function Two: **`magnitude_in_words`**

* **`magnitude_to_words`** will describe the earthquake's power
* **POWER** should be evocative words like 'easily ignored' or 'huge' or 'very destructive' (feel free to pick your own) - look on Google Image Search for "richter scale" to see some possible descriptors.

I'm going to use this image: http://wiki.ubc.ca/images/e/ea/Richterscale.png

|Range|Description|
|---|---|
|0-2|Never felt|
|2-3|Minor|
|3-4|Minor|
|4-5|Light|
|5-6|Moderate|
|6-7|Strong|
|7-8|Major|
|8-9|Great|
|9-10|Great|
|10+|Epic|

It doesn't seem like the *best* set of descriptions, but we'll go with it.

After the lessons learned with the last function, we're going to make sure to **convert to a float** (not an integer, since the decimals matter) and **accept a dictionary**.

In [34]:
def magnitude_in_words(earthquake):
    mag = float(earthquake['mag'])
    if mag < 2:
        return "unnoticeable"
    elif mag < 4:
        return "minor"
    elif mag < 5:
        return "light"
    elif mag < 6:
        return "moderate"
    elif mag < 7:
        return "strong"
    elif mag < 8:
        return "major"
    elif mag < 10:
        return "great"
    else:
        return "epic"

# And we'll test it with our sample
print(earthquake['mag'], "registers as", magnitude_in_words(earthquake))

5.7 registers as moderate


Looks good to me, on to the next one.
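
As an aside, a long `if`/`elif` chain is fine here, but the same cutoffs can also live in a list that you loop over, which makes them easier to tweak. A sketch using the same thresholds as above. `MAGNITUDE_LABELS` and `magnitude_from_table` are names invented for this example.

```python
# (upper limit, description) pairs, in increasing order
MAGNITUDE_LABELS = [
    (2, "unnoticeable"),
    (4, "minor"),
    (5, "light"),
    (6, "moderate"),
    (7, "strong"),
    (8, "major"),
    (10, "great"),
]

def magnitude_from_table(earthquake):
    mag = float(earthquake['mag'])
    for upper_limit, description in MAGNITUDE_LABELS:
        # The first limit the magnitude falls under wins
        if mag < upper_limit:
            return description
    return "epic"  # 10 and above

print(magnitude_from_table({'mag': '5.7'}))
```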

## Function Three: Day of the Week

* **`day_in_words`** should be the day of the week
* **DAY** should be the day of the week.
* **TIP:** You probably (a.k.a. definitely) need to convert 'time' - which is a string - into a Python datetime object which can do .hour, .day, .strftime("%Y %b %d") and other fun things. Convert it and test the conversion like this:

````py
import dateutil.parser
timestring = '2014-06-04T11:58:58.200Z'
yourdate = dateutil.parser.parse(timestring)
print("The hour is", yourdate.hour)
print("We can do things with strftime like", yourdate.strftime("%Y %b"))
````

Let's look at our earthquake's **`time`**

In [35]:
earthquake['time']

'2014-06-04T11:58:58.200Z'

Now let's try out the sample code from up above with our earthquake's **`time`**

In [42]:
import dateutil.parser
yourdate = dateutil.parser.parse(earthquake['time'])
print("The hour is", yourdate.hour)
print("We can do things with strftime like", yourdate.strftime("%Y %b"))

The hour is 11
We can do things with strftime like 2014 Jun


If we're trying to get the day, and we can use **`.hour`**, can we use... **`.day`**?

In [44]:
yourdate.day

4

Uh, not really, I guess: **`.day`** is the day of the month, not the day of the week. **This is where `strftime` comes into play.** It will convert a **Python representation of time** into a **string**. We can use the [handy reference on strftime.org](http://strftime.org/) to know what the secret codes are when using **`.strftime`**.

In [45]:
yourdate.strftime("%A")

'Wednesday'

Worked like a charm, so now all we need to do is make a function that **accepts the earthquake dictionary** and **returns the day name**.

In [65]:
def day_in_words(earthquake):
    parsed_date = dateutil.parser.parse(earthquake['time'])
    return parsed_date.strftime("%A")

day_in_words(earthquake)

'Wednesday'

Beautiful! On to the next one.
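
If you can't install `dateutil`, the standard library can parse this particular timestamp layout too. A sketch, assuming every `time` value looks exactly like the sample (date, `T`, time with fractional seconds, trailing `Z`). The function name `day_in_words_stdlib` is hypothetical.

```python
from datetime import datetime

def day_in_words_stdlib(earthquake):
    # The format string must match the timestamp exactly, including the 'T' and 'Z'
    parsed_date = datetime.strptime(earthquake['time'], "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed_date.strftime("%A")

print(day_in_words_stdlib({'time': '2014-06-04T11:58:58.200Z'}))
```

The trade-off is that `strptime` fails on any timestamp that doesn't match the format string, while `dateutil.parser.parse` is much more relaxed about what it accepts.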

## Function Four: `time_in_words`

* **`time_in_words`** should be "morning", "afternoon", "evening" or "night"

Unfortunately we *can't* use **`strftime`** for this. What we can do, though, is think about **what makes morning, afternoon, evening and night?**

The hour. And we have the hour!

In [49]:
yourdate = dateutil.parser.parse(earthquake['time'])
print("The hour is", yourdate.hour)

The hour is 11


**Time for some decisions.** Morning is probably before 12pm, afternoon is before 6pm, evening is 6-8, and night is after that.

We won't see this unless we test specific times, **but what about 2am?** You could either have been out all night (still nighttime) or waking up early for a flight (which seems like morning). What do you pick?

In this case, it's really your choice. Decisions like this are what make computer-generated stories *seem* independent but show that they're really just as human-being-written as any other story. I'm going to say 3am is the beginning of morning.

In [52]:
def time_in_words(earthquake):
    parsed_date = dateutil.parser.parse(earthquake['time'])
    hour = parsed_date.hour
    if hour < 3:
        return "night"
    elif hour < 12:
        return "morning"
    # .hour uses the 24-hour clock, so 6pm is 18 and 8pm is 20
    elif hour < 18:
        return "afternoon"
    elif hour < 20:
        return "evening"
    else:
        return "night"

print(earthquake['time'])
time_in_words(earthquake)

2014-06-04T11:58:58.200Z


'morning'

Yup, 11AM is morning, so that sounds good.

I made a choice to have **two separate returns for night**. I could have done `if hour < 3 or hour >= 20` but I felt that keeping the times in order was important. It also would have been easy to do `if hour < 3 or hour > 20` and accidentally miss out on 8pm.
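
Since the sample only exercises 11am, it's worth poking at the other boundaries. One way is to pull the hour logic into its own small function that takes a number, so you can test it without building timestamps. A sketch, assuming `.hour` runs from 0 to 23 (so 6pm is 18 and 8pm is 20). `hour_to_words` is a made-up helper name.

```python
def hour_to_words(hour):
    # hour is on the 24-hour clock, 0 through 23
    if hour < 3:
        return "night"
    elif hour < 12:
        return "morning"
    elif hour < 18:
        return "afternoon"
    elif hour < 20:
        return "evening"
    else:
        return "night"

# Check the hours on either side of every cutoff
for hour in [0, 2, 3, 11, 12, 17, 18, 19, 20, 23]:
    print(hour, "gives us", hour_to_words(hour))
```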

## Function Five: `date_in_words`

* **`date_in_words`** should be "Monthname day", e.g. "June 22"

This is easy, it's just another **`strftime`** one! Hop on over to [strftime.org](http://strftime.org/) and we'll be good to go. Just be careful to use `%-d`, which gives "4", rather than `%d`, which would give "04".

In [64]:
def date_in_words(earthquake):
    parsed_date = dateutil.parser.parse(earthquake['time'])
    return parsed_date.strftime("%B %-d")

date_in_words(earthquake)

'June 4'
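
One caution: `%-d` is a platform extension rather than part of Python itself, and it isn't available everywhere (Windows in particular). Here's a sketch of a version that avoids it by taking the month name from `strftime` and the day number from `.day`. It parses with the standard library so it runs without `dateutil`, and `date_in_words_portable` is a hypothetical name.

```python
from datetime import datetime

def date_in_words_portable(earthquake):
    parsed_date = datetime.strptime(earthquake['time'], "%Y-%m-%dT%H:%M:%S.%fZ")
    # .day is already an int, so there's no zero padding to strip
    return "{} {}".format(parsed_date.strftime("%B"), parsed_date.day)

print(date_in_words_portable({'time': '2014-06-04T11:58:58.200Z'}))
```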

## PART TWO: Write the `eq_to_sentence` function

Write a function called `eq_to_sentence` that, when called, returns the whole sentence mentioned above, "A DEPTH, POWER MAGNITUDE earthquake was reported DAY TIME_OF_DAY on DATE LOCATION."

Print out the result for the sample earthquake.

In [67]:
def eq_to_sentence(earthquake):
    print("A", depth_to_words(earthquake), magnitude_in_words(earthquake), earthquake['mag'], "earthquake was reported", day_in_words(earthquake), time_in_words(earthquake), "on", date_in_words(earthquake), earthquake['place'])

eq_to_sentence(earthquake)

A shallow moderate 5.7 earthquake was reported Wednesday morning on June 4 73km WSW of Haines, Alaska


This isn't what I asked for, though! I asked for something that *returns a string*, and then the *string gets printed*. It's just a matter of moving **`print`**, maybe?

In [69]:
def eq_to_sentence(earthquake):
    return "A", depth_to_words(earthquake), magnitude_in_words(earthquake), earthquake['mag'], "earthquake was reported", day_in_words(earthquake), time_in_words(earthquake), "on", date_in_words(earthquake), earthquake['place']

print(eq_to_sentence(earthquake))

('A', 'shallow', 'moderate', '5.7', 'earthquake was reported', 'Wednesday', 'morning', 'on', 'June 4', '73km WSW of Haines, Alaska')


Hmmm, doesn't look beautiful. Now, what you **could do** is use the **`+`** to stick all of those together, or maybe treat them as a list and use `' '.join` to join them all together with spaces. I'm going to show you a secret trick, though, called **`.format`**!

In [70]:
name = "Smushface"
animal = "cat"
"My name is {} and I am a {}".format(name, animal)

'My name is Smushface and I am a cat'

**`.format`** allows you to do a "fill in the blanks" with variables when creating a string. It's pretty useful!

In [74]:
depthwords = depth_to_words(earthquake)
magwords = magnitude_in_words(earthquake)
daywords = day_in_words(earthquake)
timewords = time_in_words(earthquake)
datewords = date_in_words(earthquake)
"A {} {} {} earthquake was reported {} {} on {}, {}".format(depthwords, magwords, earthquake['mag'], daywords, timewords, datewords, earthquake['place'])

'A shallow moderate 5.7 earthquake was reported Wednesday morning on June 4, 73km WSW of Haines, Alaska'

If we want to get even crazier, you can use a dictionary with **`.format`** so that you don't have to pay attention to order.

In [77]:
eq_words = {
    'depthwords': depth_to_words(earthquake),
    'magwords': magnitude_in_words(earthquake),
    'daywords': day_in_words(earthquake),
    'timewords': time_in_words(earthquake),
    'datewords': date_in_words(earthquake),
    'location': earthquake['place'],
    'magnitude': earthquake['mag']
}
"A {depthwords} {magwords} {magnitude} earthquake was reported {daywords} {timewords} on {datewords}, {location}".format(**eq_words)

'A shallow moderate 5.7 earthquake was reported Wednesday morning on June 4, 73km WSW of Haines, Alaska'

Whichever version you choose, now you can pop it into a function.

In [79]:
def eq_to_words(earthquake):
    eq_words = {
        'depthwords': depth_to_words(earthquake),
        'magwords': magnitude_in_words(earthquake),
        'daywords': day_in_words(earthquake),
        'timewords': time_in_words(earthquake),
        'datewords': date_in_words(earthquake),
        'location': earthquake['place'],
        'magnitude': earthquake['mag']
    }
    return "A {depthwords} {magwords} {magnitude} earthquake was reported {daywords} {timewords} on {datewords}, {location}".format(**eq_words)

eq_to_words(earthquake)

'A shallow moderate 5.7 earthquake was reported Wednesday morning on June 4, 73km WSW of Haines, Alaska'

Looking good!

## PART THREE: Doing it in bulk

Read in the csv of the past 30 days of 1.0+ earthquake activity from http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/1.0_month.csv (tip: `read_csv` works with URLs!)

Because we haven't covered looping through pandas, use the following code to convert a pandas DataFrame into a list of dictionaries that you can loop through.

    earthquakes_df = pd.read_csv("1.0_month.csv")
    earthquakes = earthquakes_df.to_dict('records')

(If you really want to do it with pandas, it's `for index, row in earthquakes_df.iterrows():`)

Loop through each earthquake, printing sentence descriptions for the ones that are above or equal to 4.0 on the Richter scale.

In [80]:
import pandas as pd
earthquakes_df = pd.read_csv("http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/1.0_month.csv")
earthquakes_df.head()

| |time|latitude|longitude|depth|mag|magType|nst|gap|dmin|rms|...|updated|place|type|horizontalError|depthError|magError|magNst|status|locationSource|magSource|
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|0|2016-06-26T11:55:51.850Z|37.506168|-118.797333|-1.19|1.47|md|10.0|219.0|0.06295|0.03|...|2016-06-26T12:18:03.754Z|22km SE of Mammoth Lakes, California|earthquake|0.72|2.26|0.08|10.0|automatic|nc|nc|
|1|2016-06-26T11:50:06.540Z|34.832|-118.922167|13.5|1.99|ml|43.0|36.0|0.05095|0.25|...|2016-06-26T12:00:31.920Z|2km ENE of Frazier Park, CA|earthquake|0.34|0.57|0.196|26.0|automatic|ci|ci|
|2|2016-06-26T11:41:38.920Z|38.793167|-122.759163|1.09|1.13|md|17.0|69.0|0.02014|0.02|...|2016-06-26T12:07:02.715Z|1km N of The Geysers, California|earthquake|0.19|0.53|0.1|6.0|automatic|nc|nc|
|3|2016-06-26T11:17:11.690Z|39.4867|73.3252|15.5|6.4|mww| |57.0|1.176|1.17|...|2016-06-26T12:18:32.170Z|27km SSE of Sary-Tash, Kyrgyzstan|earthquake|5.8|1.8| | |reviewed|us|us|
|4|2016-06-26T11:13:00.970Z|33.009667|-116.485|10.52|1.1|ml|34.0|77.0|0.1183|0.17|...|2016-06-26T11:24:18.280Z|13km SE of Julian, CA|earthquake|0.21|0.45|0.204|27.0|automatic|ci|ci|


Now all we need to do is convert the dataframe into a list of dictionaries and pass them to our function, and suddenly we have a hundred sentences without doing much more work at all!

In [84]:
earthquakes = earthquakes_df.to_dict('records')
# I'm just going to slice off the first 100
for earthquake in earthquakes[0:100]:
    print(eq_to_words(earthquake))

A shallow unnoticeable 1.47 earthquake was reported Sunday morning on June 26, 22km SE of Mammoth Lakes, California
A shallow unnoticeable 1.99 earthquake was reported Sunday morning on June 26, 2km ENE of Frazier Park, CA
A shallow unnoticeable 1.13 earthquake was reported Sunday morning on June 26, 1km N of The Geysers, California
A shallow strong 6.4 earthquake was reported Sunday morning on June 26, 27km SSE of Sary-Tash, Kyrgyzstan
A shallow unnoticeable 1.1 earthquake was reported Sunday morning on June 26, 13km SE of Julian, CA
A shallow unnoticeable 1.2 earthquake was reported Sunday morning on June 26, 27km ENE of Cantwell, Alaska
A shallow unnoticeable 1.04 earthquake was reported Sunday morning on June 26, 8km E of Mammoth Lakes, California
A shallow unnoticeable 1.9 earthquake was reported Sunday morning on June 26, 83km E of Sutton-Alpine, Alaska
A shallow unnoticeable 1.0 earthquake was reported Sunday morning on June 26, 81km NNW of Valdez, Alaska
A intermediate minor 2.2 earthquake was reported Sunday
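
The assignment only wants the 4.0-and-up earthquakes, which is one extra `if` inside the loop. A sketch using a small hand-made list in place of the real data: the three records are trimmed copies of rows shown in the table above, and `describe` is a stand-in for the full sentence function.

```python
# Trimmed stand-ins for the records that to_dict('records') produces
sample_earthquakes = [
    {'mag': 1.47, 'place': '22km SE of Mammoth Lakes, California'},
    {'mag': 6.4, 'place': '27km SSE of Sary-Tash, Kyrgyzstan'},
    {'mag': 1.1, 'place': '13km SE of Julian, CA'},
]

def describe(earthquake):
    # Placeholder for the real sentence-building function
    return "A {} earthquake was reported {}".format(earthquake['mag'], earthquake['place'])

for earthquake in sample_earthquakes:
    # float() keeps this working whether 'mag' is a number or a string
    if float(earthquake['mag']) >= 4.0:
        print(describe(earthquake))
```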

## PART FOUR: The other bits

If the earthquake is anything other than an earthquake (e.g. explosion or quarry blast), print

> There was also a magnitude MAGNITUDE TYPE_OF_EVENT on DATE LOCATION.

For example,

> There was also a magnitude 1.29 quarry blast on June 19 12km SE of Tehachapi, California.

with TYPE_OF_EVENT being explosion, quarry blast, etc. and LOCATION being 'place' - e.g. '0km N of The Geysers, California'.
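
Here's a sketch of how that sentence could be built by a function that returns a string, with the DATE included. It leans on the standard library for date parsing so it runs by itself. `event_to_sentence` and the sample record (including its timestamp) are invented for illustration, not taken from the feed.

```python
from datetime import datetime

def event_to_sentence(event):
    parsed_date = datetime.strptime(event['time'], "%Y-%m-%dT%H:%M:%S.%fZ")
    # Month name from strftime, day number straight from .day (no zero padding)
    date_words = "{} {}".format(parsed_date.strftime("%B"), parsed_date.day)
    return "There was also a magnitude {} {} on {} {}.".format(
        event['mag'], event['type'], date_words, event['place'])

# Hypothetical record, shaped like one row of the feed
sample_event = {
    'time': '2016-06-19T10:15:00.000Z',
    'mag': 1.29,
    'type': 'quarry blast',
    'place': '12km SE of Tehachapi, California',
}
print(event_to_sentence(sample_event))
```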


In [90]:
# Just add in an 'if' statement - we COULD make another function to build that string... but I'm not doing it.

earthquakes = earthquakes_df.to_dict('records')
for earthquake in earthquakes:
    if earthquake['type'] != 'earthquake':
        print("There was also a magnitude", earthquake['mag'], earthquake['type'], "at", earthquake['place'])
    else:
        # I'm going to comment out the earthquakes just so we can make sure quarry blasts etc show up
        #print(eq_to_words(earthquake))
        pass


There was also a magnitude 1.58 quarry blast at 45km NNW of Los Algodones, B.C., MX
There was also a magnitude 1.73 explosion at 3km SSE of Princeton, Canada
There was also a magnitude 1.19 quarry blast at 8km SSW of Mojave, CA
There was also a magnitude 1.78 quarry blast at 10km NNW of Big Bear Lake, CA
There was also a magnitude 1.25 explosion at 13km NE of Abbotsford, Canada
There was also a magnitude 1.84 explosion at 5km E of White Salmon, Washington
There was also a magnitude 1.11 quarry blast at 1km SE of Quarry near Aromas, CA
There was also a magnitude 1.31 quarry blast at 13km SE of Tehachapi, CA
There was also a magnitude 1.22 quarry blast at 11km E of Quarry near Portola Valley, CA
There was also a magnitude 1.34 quarry blast at 11km E of Quarry near Portola Valley, CA
There was also a magnitude 1.08 explosion at 2km ENE of Eatonville, Washington
There was also a magnitude 1.94 quarry blast at 5km ENE of Butte, Montana
There was also a magnitude 1.27 explosion at 7km SSE of