Talk:Machine Learning: Difference between revisions
Revision as of 00:02, 28 February 2014
Feb. 27, 2014
Folks met and hacked on the Noisebridge discuss mailing list. We created a 93 MB text dump and a Python script to parse it, File:Py-piper-parser.txt. We wrote pseudocode for a naive Bayes filter to protect the world from trolls. Will implement soon.
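The pseudocode itself is not on this page, so the following is only a minimal sketch of what a naive Bayes filter over word lists could look like, not what was written at the meetup. It assumes Python 3, messages already split into words, and hand-assigned labels; the names `train` and `classify` and the labels 'troll' and 'ok' are made up for illustration. It uses add-one (Laplace) smoothing so that unseen words do not zero out a score.

```python
import math
from collections import Counter


def train(labeled_messages):
    """Count words per label.

    labeled_messages: iterable of (list_of_words, label) pairs.
    The labels are hypothetical; any hashable value works.
    """
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of messages seen
    for words, label in labeled_messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
    return word_counts, label_counts


def classify(words, word_counts, label_counts):
    """Return the label with the highest log-probability for these words."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_messages = sum(label_counts.values())

    best_label, best_score = None, float('-inf')
    for label, n in label_counts.items():
        # Log prior: fraction of training messages carrying this label.
        score = math.log(n / total_messages)
        total_words = sum(word_counts[label].values())
        for w in words:
            # Add-one smoothing over the vocabulary size.
            score += math.log((word_counts[label][w] + 1) /
                              (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tiny made-up training set, for illustration only.
training = [
    ("you are all idiots".split(), "troll"),
    ("idiots everywhere".split(), "troll"),
    ("meeting tonight at eight".split(), "ok"),
    ("laser cutter class tonight".split(), "ok"),
]
word_counts, label_counts = train(training)
```

With this toy data, `classify(["idiots"], word_counts, label_counts)` picks 'troll' and `classify(["tonight"], word_counts, label_counts)` picks 'ok'. A real filter would need a labeled sample of the mailing list dump to train on.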
Word parsing Python script
- Function 'get_words' takes a list of dictionaries, one per email.
- Yields:
- A list of the words in the message, for each message.
def get_words(lst):
    for d in lst:
        m = d['messageline']
        yield m.split()
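
As a rough usage sketch, the function can be run on a couple of made-up messages. Only the 'messageline' key comes from the script above; the sample dictionaries are invented, and the real parser output may carry more fields.

```python
def get_words(lst):
    # Same generator as above, repeated so this example runs on its own.
    for d in lst:
        m = d['messageline']
        yield m.split()


# Hypothetical sample data, not taken from the mailing list dump.
emails = [
    {'messageline': 'Meeting tonight at eight'},
    {'messageline': 'Who took the soldering iron?'},
]

# get_words is a generator, so wrap it in list() to see all results at once.
word_lists = list(get_words(emails))
print(word_lists)
```

Note that `split()` with no arguments splits on whitespace only, so case and punctuation are kept ('iron?' stays one token). Lowercasing and stripping punctuation before counting words would probably help a filter, but that is a design choice the page does not settle.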