Machine Learning Meetup Notes: 2010-05-19
Latest revision as of 22:04, 23 May 2010
- Erin provided a list of unique SubSkills and TracedSkills with frequencies, as well as a Python script to normalize the skill values in the challenge sets.
- Vikram gave a presentation and demo on Hadoop, EC2 and MapReduce. He created a set of scripts for EC2 MapReduce, which are available on GitHub: http://github.com/voberoi/hadoop-mrutils
Here are some MapReduce notes:
Word Counts (let line number be the key):
1 hello how are you
2 how is it going
3 are you happy
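def map(key, value):
    words = value.split()  # e.g. ["hello", "how", "are", "you"]
    for word in words:
        emit(word, 1)  # emit is supplied by the MapReduce framework

def reduce(key, values):
    emit(key, len(values))  # values holds one 1 per occurrence of the word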
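The map and reduce steps above rely on a framework to group the emitted pairs by key. As a minimal sketch of that flow in a single process, the following Python can be run locally on the three sample lines. The names map_fn, reduce_fn and run_word_count are hypothetical and are not part of Hadoop or of the scripts mentioned above; generators stand in for emit.

```python
from collections import defaultdict


def map_fn(key, value):
    """Yield a (word, 1) pair for every word on the line (key is the line number)."""
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    """Return the word with the number of 1s collected for it."""
    return key, len(values)


def run_word_count(lines):
    """Hypothetical local driver: map each line, group by word, then reduce."""
    grouped = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        for word, count in map_fn(line_number, line):
            grouped[word].append(count)  # stands in for the framework's shuffle step
    return dict(reduce_fn(word, counts) for word, counts in grouped.items())


lines = ["hello how are you", "how is it going", "are you happy"]
counts = run_word_count(lines)
```

Here grouped holds the per-word lists of 1s that each reduce call receives, and counts holds the final totals. Using len(values) is only correct while every emitted value is 1; sum(values) would be the safer choice if partial counts were ever combined before the reduce step.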
results:
hello [1]
how [1,1]
are [1,1]