Proposed Sessions for DataScienceCamp 2014
Propose new sessions, then read, comment and vote!

11 votes
Model Selection Methods
Submitted on Aug 26, 2014 at 08:57 AM
What is the best model? Larger data sets and more computational power make possible more comprehensive methods for finding the correct regression or classification model. Building on the conventional methods of statistical significance testing and relating them to more powerful Bayesian methods, we'll...
Proposed by: John Mark Agosta

9 votes
KEYNOTE PRESENTATION: An Update on Distributed Computing with Spark
Submitted on Jul 16, 2014 at 04:49 PM
As computer clusters scale up, data flow models such as MapReduce have emerged as a way to run fault-tolerant computations on commodity hardware. Unfortunately, MapReduce is limited in efficiency for many numerical algorithms. We show how new data flow engines, such as Apache Spark, enable much fast...
Proposed by: SFBayACM, Reza Zadeh

9 votes
Panel Discussion on hiring Data Scientists: finding, interviewing, hiring
Submitted on Jul 16, 2014 at 04:46 PM
I would like to invite others who hire Data Scientists to join a panel discussion.
* Initial 3-5 minutes per panelist giving context and background
* General Q&A with the audience
What are sub-types of candidates? What blend of experiences may be desired?
* big data software developer
* data...
Proposed by: SFBayACM, Greg Makowski

7 votes
Data Science and Product Management
Submitted on Oct 22, 2014 at 01:44 AM
Many of today's products are fueled by buzzwords such as Big Data and Data Scientist. How can product owners leverage data science resources to build better products and market to more targeted customers? How can Data Scientists work with product owners and become an integral part of the Prod...
Proposed by: Jay Chen

7 votes
Case Studies Deploying Cluster Analysis
Submitted on Aug 20, 2014 at 10:05 PM
Greg will go through a number of case studies applying cluster analysis in consulting or embedded in an enterprise software solution. The audience will learn about general challenges and solutions that can be applied to other projects. Customer description: A major credit card company had an a...
Proposed by: SFBayACM

5 votes
Tree Kernel Learning for Textual Data
Submitted on Oct 21, 2014 at 11:03 AM
We show how one can learn rich linguistic data using tree kernel learning. This is a form of support vector learning in which the features are fragments of parse trees for sentences and parse thickets for paragraphs of text. Learning parse trees instead of keyword statistics gets much deeper into language unde...
Proposed by: Boris Galitsky

5 votes
Intro to competing in Kaggle Machine Learning Competitions using Azure ML
Submitted on Oct 16, 2014 at 11:16 PM
Azure ML is a new, fully cloud-based product from Microsoft that allows people to create fairly sophisticated machine learning applications and take them to production with a simple push of a button. It's easy to get going with Azure ML using its drag-and-drop Studio interface. In this session, we w...
Proposed by: Eugene Chuvyrov

3 votes
Data Science for Healthcare
Submitted on Oct 25, 2014 at 12:19 PM
A brainstorming session on how data science can help in the biggest challenge of the century: healthcare.
Proposed by: Vishnu Pendyala

3 votes
plyrmr: a plyr-like R package for big data manipulation
Submitted on Sep 2, 2014 at 10:19 AM
The latest effort in a never-ending quest to make big data and MapReduce more accessible, plyrmr takes a page out of the popular R package plyr and the SQL language to define a DSL that covers a variety of data manipulations and runs them on MapReduce and, soon, Spark, without you having to think a...
Proposed by: Antonio Piccolboni

2 votes
RecSys 2014 Recap & Recommendation Systems
Submitted on Oct 23, 2014 at 02:21 PM
[Placeholder for discussion about Recommendation Systems. Feel free to propose specific sessions, especially if you presented at RecSys] The ACM Recommender Systems conference (RecSys) is the premier international forum for the presentation of new research results, systems and techniques in the broa...
Proposed by: Karl Anderson

DataScienceCamp Silicon Valley 2014
When: Oct 25, 2014, 9:00 am - 6:00 pm
Where: eBay Town Hall, 2161 North 1st Street, San Jose, CA 95131