Talk:Machine Learning: Difference between revisions

From Noisebridge

Revision as of 00:04, 28 February 2014

Feb. 27, 2014

Folks met and hacked on the Noisebridge discuss mailing list. We created a 93MB text dump and a Python script to parse it, File:Py-piper-parser.txt. We wrote pseudocode for a naive Bayes filter to protect the world from trolls. We'll implement it soon.
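
For anyone picking this up, here is a rough sketch of what a naive Bayes filter over word lists could look like. It is not the pseudocode from the meetup. The function names 'train' and 'classify', the labels "troll" and "ok", and the toy messages are all made up for illustration, and add-one (Laplace) smoothing is an assumption on our part:

```python
import math
from collections import Counter

def train(labeled_messages):
    """labeled_messages: iterable of (list_of_words, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of messages seen
    for words, label in labeled_messages:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
    return word_counts, doc_counts

def classify(words, word_counts, doc_counts):
    """Return the label with the highest log-posterior score."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        # Log prior: fraction of training messages carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for w in words:
            # Add-one smoothing (assumption) so unseen words do not zero the score.
            score += math.log((counts[w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training data, not taken from the mailing list dump.
training = [
    ("you are all idiots".split(), "troll"),
    ("idiots idiots everywhere".split(), "troll"),
    ("meeting tonight at the space".split(), "ok"),
    ("thanks for the meeting notes".split(), "ok"),
]
word_counts, doc_counts = train(training)
print(classify("idiots".split(), word_counts, doc_counts))
print(classify("meeting notes".split(), word_counts, doc_counts))
```

Working in log space avoids underflow when messages get long. A real filter would need hand-labeled messages from the dump, which we don't have yet.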


Word-parsing Python script

The function 'get_words' takes a list of dictionaries, one per email. For each message, it yields the list of words in that message:

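 def get_words(lst):
   # lst: list of dicts, one per email, each with a 'messageline' key
   for d in lst:
     m = d['messageline']
     # split on whitespace and yield this message's words as a list
     yield m.split()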
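
A quick usage sketch, in case it helps. Only the 'messageline' key comes from the script above. The sample records and the lowercased word count are assumptions for illustration, not part of the parser:

```python
from collections import Counter

def get_words(lst):
    # Same generator as above: one list of words per email dict.
    for d in lst:
        m = d['messageline']
        yield m.split()

# Hypothetical sample records; real ones would come from the parsed dump.
emails = [
    {'messageline': 'Hello Noisebridge hackers'},
    {'messageline': 'hello again'},
]

# get_words is a generator, so wrap it in list() to see everything at once.
word_lists = list(get_words(emails))
print(word_lists)

# Count word frequencies across all messages, ignoring case.
counts = Counter(w.lower() for words in word_lists for w in words)
print(counts.most_common(1))
```

Since 'get_words' yields lazily, it can also be iterated directly without building the full list, which may matter for a 93MB dump.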