BOS311

311 SERVICE REQUESTS - 2020
Download Dataset


Presentation


Introduction

System Blueprint:

Background Program Flow Chart:

Take a glance of the official database:


About BOS:311

  • Non-emergency city service company
  • Incidents are reported by residents of Boston
  • Public service whose motto is to resolve reported issues effectively

Data Visualization of BOS:311 datasets


Problem Statements

  • Potential Duplicate issue creations.
  • No feedback is given to the users.
  • Hundred of new cases created every day disperse in Boston city.
  • No rule for reasonably distributing resources to handle the cases.

Our Targets

1. Improve system efficiency.

2. Reduce staff workload.

3. Increase user satisfaction.

4. Integrate the ChatBot in BOS:311 Mobile Application.

Details:

  1. The first target is to improve system efficiency through Amazon RDS. We defined a new variable named priority from 1~5 to mark the emergence level of each case.
  2. The second target is to reduce the staff workload through the background program. Our system will detect the validation of every new report.
  3. The third target is to increase user satisfaction through Amazon SNS. Our system will follow up on the cases, and inform the latest cast status to the users.
  4. The fourth target is to integrate the ChatBot in the BOS:311 application, which created base on Amazon Lex. The users could talk with the phone and submit a new report. That’s cool, right?

P.S: Priority is a big issue that is worth discussing more. Also, a different case should have a different weighted priority. And the default priority from 1 to 5 could be customized again.


Our Future Plan

Amazon Lambda - running background program.

Amazon S3 - storing case pictures.

Amazon Rekognition - analyze closed today case for detecting validation.

Amazon SNS - inform user the latest case status.

Amazon Cloud Formation - for future upgrade.


Contributions

This program is designed for BOS311 and related database. Basically:

  1. the AWS Chatbot, created by Gowtham Badu, will gain users’ input and send a JSON file to the background program.
  2. the background program, created by Zacks Shen, will detect the case then decide to UPDATE, INSERT our database or make a deeper analysis. The database request will send to AWS RDS in a JSON file.
  3. the AWS RDS, maintained by Deepak Prasad Baskaran, will receive the request from the background program then manipulate the database.
  4. the report and first version of the slides, was created by Jaswanth Satya Harsha Mattaparthy.
  5. the video, created by us together, was edited by Zacks Shen.


One more thing

Besides on the program itself, the article contains several crucial system design ideas and methods in Python and MySQL:

  • How to read .csv files in dataframe style in Python.
  • How to import text files such as .csv to the database in OS X.
  • How to connect to MySQL server through Python.
  • How to using Python manipulate MySQL.
  • The ideas of handling users’ case.
    • Checking the case validity.
    • Saving system and human resource.
    • Providing better follow-up services.

Offline Preparation

  1. Download dataset and Gather Information in Python
  2. Load data in MySQL
    • Get columns name
    • Modify the columns
      • Add column email
      • Add column priority

If you do NOT have MySQL option file (https://zacks.one/sql-tutorial/#2.3)


1. Gather data information in Python

pandas.read_csv
DataFrame

Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
import pandas

bos311 = pandas.read_csv('https://raw.githubusercontent.com/ZacksAmber/Code/master/Python/BOS311/tmpr2k_j017.csv')

bos311.dtypes
"""
case_enquiry_id int64 # BIGINT
open_dt object
target_dt object
closed_dt object
ontime object
case_status object
closure_reason object
case_title object
subject object
reason object
type object
queue object
department object
submittedphoto object
closedphoto object
location object
fire_district object
pwd_district object
city_council_district object
police_district object
neighborhood object
neighborhood_services_district object
ward object
precinct object
location_street_name object
location_zipcode float64 # DOUBLE
latitude float64
longitude float64
source object
dtype: object
"""

2. Load data in MySQL

Login to MySQL as root:

Shell
1
mysql -u root -p

Create database named bosbot then use it:

SQL
1
2
CREATE DATABASE bosbot;
USE bosbot;

Create a table named “test” with the same columns as the csv file.

  • Remember the data_type must be equal or longer than it in the text file.
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
CREATE TABLE sample_311 (
case_enquiry_id BIGINT NOT NULL PRIMARY KEY,
open_dt VARCHAR(100),
target_dt VARCHAR(100),
closed_dt VARCHAR(100),
ontime VARCHAR(100),
case_status VARCHAR(100),
closure_reason TEXT,
case_title VARCHAR(100),
subject VARCHAR(100),
reason VARCHAR(100),
type VARCHAR(100),
queue VARCHAR(100),
department VARCHAR(100),
submittedphoto TEXT,
closedphoto TEXT,
location VARCHAR(100),
fire_district VARCHAR(100),
pwd_district VARCHAR(100),
city_council_district VARCHAR(100),
police_district VARCHAR(100),
neighborhood VARCHAR(100),
neighborhood_services_district VARCHAR(100),
ward VARCHAR(100),
precinct VARCHAR(100),
location_street_name VARCHAR(100),
location_zipcode VARCHAR(100),
latitude DOUBLE,
longitude DOUBLE,
source VARCHAR(100),
email TEXT,
priority INT NOT NULL DEFAULT 1
)ENGINE=InnoDB DEFAULT CHARSET=utf8;

ALTER TABLE sample_311 MODIFY open_dt DATETIME;
ALTER TABLE sample_311 AUTO_INCREMENT=101003186441;
ALTER TABLE sample_311 MODIFY case_enquiry_id BIGINT AUTO_INCREMENT;

Import your data from the text file to table sample_311:
Remember copy your text file such as .csv to /tmp first

SQL
1
2
3
4
5
6
LOAD DATA INFILE '/tmp/tmpr2k_j017.csv'
INTO TABLE sample_311
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

If the system pops up the error: ERROR 1290 (https://zacks.one/sql-tutorial/#2.3)

Check the top 10 rows:

SQL
1
SELECT * FROM sample_311 LIMIT 10 \G

Compare it to the original file:

  • Return to Python interface and input:
Python
1
bos311.head(10)


Code Offline

Source Code


Modules

Modules:

  • mysql.connector: for connecting to MySQL.
    • python3 -m pip install mysql-connector-python
    • OR conda install mysql-connector-python
  • datetime: for requesting the date.
Python
1
2
3
# Import modules
import mysql.connector
import datetime

Code Part

Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
################################################################################

# Title: bos311_offline.py #
# File: /Users/zacks/Desktop/Code/Python/BOS311/bos311_offline.py #
# Project: /Users/zacks/Desktop/Code/Python/BOS311 #
# Created Date: 2020-02-07 #
# ----- #
# Author: Zacks Shen #
# Blog: https://zacks.one #
# Email: <zacks.shen@gmail.com> #
# Github: https://github.com/ZacksAmber #
# ----- #
# Last Modified: 2020-02-07 8:48:47 pm #
# Modified By: Zacks Shen <zacks.shen@gmail.com> #
# ----- #
# Copyright (c) 2020 Zacks Shen #
################################################################################

# Import modules
import mysql.connector
import datetime

##########################

# Define Query Functions #
##########################

# Define a function to connect MySQL
def mysql_login(mysql_host, mysql_user, mysql_passwd, mysql_database):
global mysql_connect

mysql_connect = mysql.connector.connect(
host = mysql_host,
user = mysql_user,
passwd = mysql_passwd,
database = mysql_database
)

# Define a function to collect case information
def case(open_dt, reason, location, case_status = 'Open', case_title = None, source = 'ChatBot', email = None, priority = 1):
global case_information

try:
case_information
except NameError:
pass
else:
case_information.clear()

case_information = {
'case_enquiry_id': None,
'open_dt': open_dt,
'target_dt': None,
'closed_dt': None,
'ontime': None,
'case_status': case_status,
'closure_reason': None,
'case_title': case_title,
'subject': None,
'reason': reason,
'type': None,
'queue': None,
'department': None,
'submittedphoto': None,
'closedphoto': None,
'location': location,
'fire_district': None,
'pwd_district': None,
'city_council_district': None,
'police_district': None,
'neighborhood': None,
'neighborhood_services_district': None,
'ward': None,
'precinct': None,
'location_street_name': None,
'location_zipcode': None,
'latitude': None,
'longitude': None,
'source': source,
'email': email,
'priority:': priority
}

return(case_information)

# Define a MySQL SELECT function. This function will not be used in this program.
def mysql_select():
sql = input("Input your SELECT sentence: ")

mysql.execute(sql)
result = mysql.fetchall()

if mysql.rowcount == 0:
print("Empty set")
elif mysql.rowcount == 1:
print("1 row in set")
else:
print("{0} rows in set".format(mysql.rowcount))

return(result)

# Define a function to judge if the case is duplicate.
def judge_duplicate():
sql = "SELECT * FROM {0} WHERE reason = '{1}' AND location = '{2}'".format(
mysql_table,
case_information['reason'],
case_information['location']
)

mysql.execute(sql)
result = mysql.fetchall()

return(result)

# Define a function to judge if the case is Open.
def judge_open():
sql = "SELECT * FROM {0} WHERE reason = '{1}' AND location = '{2}'".format(
mysql_table,
case_information['reason'],
case_information['location']
)

mysql.execute(sql)
result = mysql.fetchall()

l = []
for i in result:
if i[5] == 'Open':
l.append(i)

if len(l) > 0:
case_information['priority'] = len(l)

return(result)

# Define a function to judge if the case is created today.
def judge_date():
open_dt = case_information['open_dt'][0:10] # Extract the data

sql = "SELECT * FROM {0} WHERE reason = '{1}' AND location = '{2}' AND open_dt LIKE '%{3}%'".format(
mysql_table,
case_information['reason'],
case_information['location'],
open_dt
)

mysql.execute(sql)
result = mysql.fetchall()

l = []
for i in result:
l.append(i[1].strftime("%Y-%m-%d %H:%M:%S")[0:10]) # change datetime.datetime type to string first then extract date

if datetime.datetime.now().date().isoformat() in l:
return(result)
else:
return(None)

##########################

# Define MySQL Functions #
##########################

# Define the UPDATE function to raise the priority
def mysql_UPDATE_priority():
sql = "SELECT * FROM {0} WHERE reason = '{1}' AND location = '{2}'".format(
mysql_table,
case_information['reason'],
case_information['location']
)

mysql.execute(sql)
result = mysql.fetchall()

# Get the priority of the existing case.
l = []
for i in result:
l.append(i[-1])

if case_information['priority'] + max(l) > 5:
case_information['priority'] = 5
else:
case_information['priority'] = case_information['priority'] + max(l)

sql = "UPDATE {0} SET priority = {1} WHERE case_status = 'Open' AND reason = '{2}' AND location = '{3}'".format(
mysql_table,
case_information['priority'],
case_information['reason'],
case_information['location']
)

mysql.execute(sql)
mysql_connect.commit()
print(mysql.rowcount, "record(s) affected")

# Define the UPDATE function to change the case_status
def mysql_UPDATE_case_status(case_status):
case_information['case_status'] = case_status

sql = "UPDATE {0} SET case_status = '{1}' WHERE reason = '{2}' AND location = '{3}'".format(
mysql_table,
case_information['case_status'],
case_information['reason'],
case_information['location']
)

mysql.execute(sql)
mysql_connect.commit()
print(mysql.rowcount, "record(s) affected")

# Define the INSERT function to insert new case
def mysql_INSERT():
sql = "INSERT INTO " + mysql_table + " (case_enquiry_id, open_dt, target_dt, closed_dt, ontime, case_status, closure_reason, case_title, subject, reason, type, queue, department, submittedphoto, closedphoto, location, fire_district, pwd_district, city_council_district, police_district, neighborhood, neighborhood_services_district, ward, precinct, location_street_name, location_zipcode, latitude, longitude, source, email, priority) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"

val = (0, case_information['open_dt'], None, None, None, 'Open', None, case_information['case_title'], None, case_information['reason'], None, None, None, None, None, case_information['location'], None, None, None, None, None, None, None, None, None, None, None, None, case_information['source'], case_information['email'], 1)

mysql.execute(sql, val)
mysql_connect.commit()
print(mysql.rowcount, "record inserted.")

#################

# Main function #
#################

def main_function():
if judge_duplicate():
if judge_open(): # Situation 4: Open case
print("Situation 4: This is a duplicate case. You should UPDATE database!")
mysql_UPDATE_priority()
else: # Must be Closed case
if judge_date(): # Situation 3: closed today days
print("Situation 3: This may not be a new case. You should use AWS Rekognition!")
else: # Situation 2: closed several days ago
print("Situation 2: This is a new case. You should INSERT database!")
mysql_INSERT()
else: # Situation 1: NOT duplicate case
print("Situation 1: This is a new case. You should INSERT database!")
mysql_INSERT()

###################

# Set Information #
###################

# Set login information
mysql_login(
mysql_host = "localhost",
mysql_user = "test",
mysql_passwd = "passwd",
mysql_database = "bosbot"
)

# Set MySQL cursor
mysql = mysql_connect.cursor()

# Set MySQL table
# mysql_table = 'sample_311'
mysql_table = 'sample_311'

Testing Part

Background Program Flow Chart:

The Background Program Flow Chart simulates the whole process from creating a case to manipulating our database. Once the user clicks Submit Report, he/she has to select a reason, which we need to classify the case. Then the user is required to provide the location to help us handle the case. Photo and description are optional. You can also leave your contact information. After updating the case status, we will inform you. Our Chatbot will record all the above information then send it to our Background program.

Here our Background program received all the information it needed. It will start to compare the new case to our existing cases. According to the reason, location, and date, we can make sure if it is a valid case. And the cased will be changed from Open to Closed when we complete your case. There are four situations:

Situation 1: The case is an absolute new case that is no similar to all of the cases stored in our database.

  • reason = NOT same
  • location = OR NOT same
  • ACTION: INSERT Database; inform the user; return the case ID.
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# Situation 1: NOT duplicate case > INSERT Database
def situation_1():
case(
open_dt = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
case_status = "Open",
case_title = "Situation 1",
reason = "Illegal Parking",
location = "360 Huntington Ave Boston MA 02115",
source = "ChatBot",
email = "test@gmail.com",
priority = 1
)
return(case_information)

situation_1()
main_function()

SQL
1
SELECT * FROM sample_311 WHERE case_title  = "Situation 1" \G

Python
1
mysql_UPDATE_case_status(case_status = "Closed") # Change the case_status of situation 1 from "Open" to "Closed"

SQL
1
SELECT * FROM sample_311 WHERE case_title  = "Situation 1" \G

Situation 2: The case is similar to an existing case. However, the existing case has been Closed days ago, which means things happen again. It should be a new case.

  • reason = SAME
  • location = SAME
  • case_status = Closed
  • open_dt = NOT today
  • ACTION: INSERT Database; inform the user; return the case ID.
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# Situation 2: duplicate case but closed several days ago > New case and INSERT database
def situation_2():
case(
open_dt = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
case_status = "Open",
case_title = "Situation 2",
reason = "Enforcement & Abandoned Vehicles",
location = "15 Sudbury St Boston MA 02203",
source = "ChatBot",
email = "test@gmail.com",
priority = 1
)
return(case_information)

situation_2()
main_function()

SQl
1
2
SELECT * FROM sample_311 WHERE reason = "Enforcement & Abandoned Vehicles" AND location = "15 Sudbury St  Boston  MA  02203" \G
DELETE FROM sample_311 WHERE case_title = "Situation 2";

Situation 3: The case is similar to an existing case. However, the existing case just has been Closed today, which means the new case might be the duplicate case. (Remember we just created a case today and closed it in Situation 1.) For example, several users submit the same case; or the same case is submitted several times due to network or system error; or maybe things happen again. If so, we need to analyze it through AWS Rekognition and make a decision.

  • reason = SAME
  • location = SAME
  • case_status = Closed
  • open_dt = Today
  • ACTION: Analyze the case; inform the user.
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# Situation 3: # duplicate case but closed today > Analyze(Rekognition)
def situation_3():
case(
open_dt = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
case_status = "Open",
case_title = "Situation 3",
reason = "Illegal Parking",
location = "360 Huntington Ave Boston MA 02115",
source = "ChatBot",
email = "test@gmail.com",
priority = 1
)
return(case_information)

situation_3()
main_function()

SQL
1
2
SELECT * FROM sample_311 WHERE reason = "Illegal Parking" AND location = "360 Huntington Ave  Boston  MA  02115" \G
DELETE FROM sample_311 WHERE case_title = "Situation 1";

Situation 4: The case is similar to an existing case. However, the existing case is Open, which means it may be an urgent case. We should raise the priority of the case. And our priority is from 1 to 5.

  • reason = SAME
  • location = SAME
  • case_status = Open
  • ACTION: UPDATE database. Raise the priority; Inform the user, return the case ID.
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# Situation 4: duplicate and open case > UPDATE priority
def situation_4():
case(
open_dt = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
case_status = "Open",
case_title = "Situation 4",
reason = "Street Lights",
location = "15 Sudbury St Boston MA 02203",
source = "ChatBot",
email = "test@gmail.com",
priority = 1
)
return(case_information)

situation_4()
main_function()

SQL
1
2
SELECT * FROM sample_311 WHERE priority != 1 \G
UPDATE sample_311 SET priority = 1 WHERE priority != 1;